# The goal of this file is to introduce you to the
# R programming language. Let's start by unraveling a
# little mystery!

# 1. Run the code below to create the vector 'udacious'.
# You need to highlight all of the lines of the code and then
# run it. You should see "udacious" appear in the workspace.

udacious <- c("Chris Saden", "Lauren Castellano",
              "Sarah Spikes", "Dean Eckles",
              "Andy Brown", "Moira Burke",
              "Kunal Chawla")

# You should see something like "chr[1:7]" in the 'Environment'
# or 'Workspace' tab. This is because you created a 'vector' with
# 7 names that have a 'type' of character. The arrow-like
# '<-' symbol is the assignment operator in R, similar to the
# equal sign '=' in other programming languages. c() is a
# generic function that combines its arguments, in this case the
# names of people, to form a vector.

# A 'vector' is one of the data structures in R. Vectors must contain
# the same type of data, that is, the entries must all be of the
# same type, such as character (most programmers call these strings),
# logical (TRUE or FALSE), or numeric.

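# A small sketch of the "same type" rule described above. The vector
# names (mixed, flags) are made up for illustration and are not part
# of the lesson. When c() is given values of different types, R
# converts them to one common type instead of raising an error.

mixed <- c(1, "two", TRUE)
class(mixed)    # "character": every entry was converted to a string
mixed

flags <- c(TRUE, FALSE, 3)
class(flags)    # "numeric": TRUE became 1 and FALSE became 0
flags
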
# Print out the vector udacious by running this next line of code.

udacious

# Notice how there are numbers next to the output.
# Each number corresponds to the index of the entry in the vector.
# Chris Saden is the first entry so [1]
# Dean Eckles is the fourth entry so [4]
# Kunal Chawla is the seventh entry so [7]

# Depending on the size of your window, you may see different numbers
# in the output.

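# The index numbers shown in the output can also be used to pull out
# entries. This sketch uses a made-up vector (letters_demo) so that
# it runs on its own; it is an illustration, not part of the lesson.

letters_demo <- c("a", "b", "c", "d", "e")
letters_demo[1]         # the first entry
letters_demo[4]         # the fourth entry
letters_demo[c(2, 5)]   # the second and fifth entries
length(letters_demo)    # how many entries the vector has
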
# ANOTHER HELPFUL TIP: You can add values to a vector.
# Run each line of code one at a time below to see what is happening.

numbers <- c(1:10)

numbers

numbers <- c(numbers, 11:20)

numbers


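# Another way to add values is the base R function append().
# This is a sketch with a made-up vector (evens), not part of the
# lesson. The 'after' argument controls where the new values go.

evens <- c(2, 4, 6)
evens <- append(evens, c(8, 10))        # add to the end, like c(evens, 8, 10)
evens
evens <- append(evens, 0, after = 0)    # after = 0 inserts at the front
evens
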
# 2. Replace YOUR_NAME with your actual name in the vector
# 'udacious' and run the code. Be sure to use quotes around it.

udacious <- c("Chris Saden", "Lauren Castellano",
              "Sarah Spikes", "Dean Eckles",
              "Andy Brown", "Moira Burke",
              "Kunal Chawla", "Dustin Pianalto")

# Notice how R updates 'udacious' in the workspace.
# It should now say something like 'chr[1:8]'.

# 3. Run the following two lines of code. You can highlight both lines
# of code and run them.

mystery <- nchar(udacious)
mystery

# You just created a new vector called mystery. What do you
# think is in this vector? (scroll down for the answer)








# Mystery is a vector that contains the number of characters
# for each of the names in udacious, including your name.

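# A quick sketch of how nchar() counts, using made-up names
# (demo_names is not part of the lesson). Note that spaces count
# as characters too.

demo_names <- c("Ada Lovelace", "Alan Turing", "R")
nchar(demo_names)    # 12 11 1
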
# 4. Run this next line of code.

mystery == 11

# Here we get a logical (or boolean) vector that tells us
# which locations or indices in the vector contain a name
# that has exactly 11 characters.

# 5. Let's use this boolean vector, mystery == 11, to subset our
# udacious vector. What do you think the result will be when
# running the line of code below?

# Think about the output before you run this next line of code.
# Notice how there are brackets in the code. Brackets are often
# used in R for subsetting.

udacious[mystery == 11]


# Scroll down for the answer









# It's your Udacious Instructors for the course!
# (and you may be in the output if you're lucky enough
# to have 11 characters in YOUR_NAME) Either way, we
# think you're pretty udacious for taking this course.





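# The subsetting step above can be broken into smaller pieces.
# This sketch uses made-up vectors (fruit, n_chars, keep) so that
# it runs on its own; it is an illustration, not part of the lesson.

fruit <- c("apple", "fig", "mango", "kiwi")
n_chars <- nchar(fruit)     # 5 3 5 4
keep <- n_chars == 5        # TRUE FALSE TRUE FALSE
fruit[keep]                 # only the entries where keep is TRUE
which(keep)                 # the positions of the TRUE values
sum(keep)                   # how many TRUE values there are
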
# 6. Alright, all mystery aside...let's dive into some data!
# The R installation has a few datasets already built into it
# that you can play with. Right now, you'll load one of these,
# which is named mtcars.

# Run this next command to load the mtcars data.

data(mtcars)


# You should see mtcars appear in the 'Environment' tab with
# <Promise> listed next to it.

# The object (mtcars) appears as a 'Promise' object in the
# workspace until we run some code that uses the object.

# R has stored the mtcars data into a spreadsheet-like object
# called a data frame. Run the next command to see what variables
# are in the data set and to fully load the data set as an
# object in R. You should see <Promise> disappear when you
# run the next line of code.

# Visit http://cran.r-project.org/doc/manuals/r-release/R-lang.html#Promise-objects
# if you want the expert insight on Promise objects. You won't
# need the info on Promise objects to be successful in this course.

names(mtcars)

# names(mtcars) should output all the variable
# names in the data set. You might notice that the car names
# are not a variable in the data set. The car names have been saved
# as row names. More on this later.

# You should also see how many observations (obs.) are in
# the data frame and the number of variables on each observation.

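# A sketch of the difference between variable names and row names,
# using a tiny made-up data frame (pets) rather than mtcars. The
# data frame and its values are invented for illustration.

pets <- data.frame(age = c(3, 7), weight = c(4.5, 30),
                   row.names = c("Whiskers", "Rex"))
names(pets)        # "age" "weight": the variables (columns)
row.names(pets)    # "Whiskers" "Rex": the label for each row
pets
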
# 7. To get more information on the data set and the variables,
# run this next line of code.

?mtcars

# You can type a '?' before any command or a data set to learn
# more about it. The details and documentation will appear in
# the 'Help' tab.


# 8. To print out the data, run this next line of code.

mtcars

# Scroll up and down in the console to check out the data.
# This is the entire data frame printed out.

# 9. Run these next two functions, one at a time,
# and see if you can figure out what they do.

str(mtcars)

dim(mtcars)

# Scroll down for the answer.









# The first command, str(mtcars), gives us the structure of the
# data frame. It lists the variable names, the type of each variable
# (all of these variables are numerics) and some values for each
# variable.


# The second command, dim(mtcars), should output '[1] 32 11'
# to the console. The [1] indicates that 32 is the first value
# in the output.

# R starts indexing at 1 (NOT ZERO-BASED INDEXING, as is true
# of many other programming languages).

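# dim() returns an ordinary vector, so its two values can be picked
# out by index. This sketch assumes mtcars still has its original
# 32 rows and 11 columns; the name 'dims' is made up for illustration.

dims <- dim(mtcars)
dims[1]         # the number of rows (observations)
dims[2]         # the number of columns (variables)
nrow(mtcars)    # the same as dims[1]
ncol(mtcars)    # the same as dims[2]
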
# 10. Read the documentation for row.names if you want to know more.
?row.names

# Run this code to see the current row names in the data frame.
row.names(mtcars)

# Run this code to change the row names of the cars to numbers.
row.names(mtcars) <- c(1:32)

# Now print out the data frame by running the code below.
mtcars

# It's tedious to relabel our data frame with the right car names
# so let's reload the data set and print out the first ten rows.

data(mtcars)
head(mtcars, 10)

# The head() function prints out the first six rows of a data frame
# by default. Run the code below to see.
head(mtcars)

# I think you'll know what this does.
tail(mtcars, 3)


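# head() and tail() also work on plain vectors. This is a small
# sketch with a made-up vector (counts), not part of the lesson.
# A negative second argument means "all but that many".

counts <- 1:10
head(counts, 3)     # 1 2 3
tail(counts, 3)     # 8 9 10
head(counts, -2)    # everything except the last two entries
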
# 11. We've run nine commands so far:
# c, nchar, data, str, dim, names, row.names, head, and tail.

# All of these commands took some inputs or arguments.
# To determine if a command takes more arguments or to learn
# about any default settings, you can look up the documentation
# using '?' before the command, much like you did to learn about
# the mtcars data set and the row.names function.



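# Besides '?', the base R function args() prints a function's
# arguments and their default values right in the console. This is
# a sketch using nchar as the example; 'arg_names' is a made-up name.

args(nchar)

# formals() returns the same arguments as a list you can inspect.
arg_names <- names(formals(nchar))
arg_names
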
# 12. Let's examine our car data more closely. We can access
# an individual variable (or column) from the data frame using
# the '$' sign. Run the code below to print out the variable
# miles per gallon. This is the mpg column in the data frame.

mtcars$mpg

# Print out any two other variables to the console.



# mtcars$mpg is a vector containing the mpg (miles per gallon) of
# the 32 cars. Run this next line of code to get the average mpg
# for all the cars. What is it?

# Enter this number for the quiz on the Udacity website.
# https://www.udacity.com/course/viewer#!/c-ud651/l-729069797/e-804129314/m-830829287

mean(mtcars$mpg)
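
# The '$' sign is one of several ways to reach a column. This sketch
# shows two base R alternatives that return the same vector; the
# names mpg_a, mpg_b, and mpg_c are made up for illustration.

mpg_a <- mtcars$mpg
mpg_b <- mtcars[["mpg"]]    # double brackets with the column name
mpg_c <- mtcars[, "mpg"]    # all rows, the column named "mpg"
identical(mpg_a, mpg_b)     # TRUE
identical(mpg_a, mpg_c)     # TRUE
length(mpg_a)               # one value per car
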