R Basics Flashcards
What is the key binding on Mac to create a new R script?
command + shift + N
how to install a package in R? how to load the package into R?
install.packages("dslabs")
library(dslabs)
how to install two packages at the same time?
install.packages(c("dslabs", "tidyverse"))
how to install packages in RStudio?
Tools -> Install Packages -> type in the names
how to see what packages you have installed in R?
installed.packages()
keyboard shortcut on Mac to save (and name) a new script in RStudio
cmd + s
the key binding on Mac to run an entire R script
cmd + shift + return
the key binding on Mac to run a single line of an R script
cmd + return
how to assign a value to a variable?
using the assignment operator <- (a less-than sign followed by a hyphen), i.e. a <- 1
how to see the value stored in a variable?
1. use the function print(xx) in R then hit enter; here xx is the name of the variable
2. type the name of the variable in R then hit enter
what are objects in R? how to show the names of the defined objects in workspace?
Objects are things that are stored in named containers in R. They can be variables, functions, etc.
The ls() function shows the names of the objects saved in the workspace.
how to get a square root of a number in R?
sqrt(xx)
how to get more explanation about a function?
1. type help("xx"), i.e. help("log")
2. directly type ?log in R. quotes are not needed here.
how to show the arguments of a function? i.e. the ones for the log function.
args(log)
then you will get "function (x, base = exp(1))", meaning the default base for log is exp(1), i.e. e (about 2.718), so log() is the natural logarithm by default. to change the base, we can type "log(8, base=2)", which gives the result 3.
we can also type log(8,2) or log(x=8, base=2). we will get the same result.
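A minimal sketch of the argument rules on this card; the variable names are only for illustration:

```r
# args() shows the argument names and default values of a function
args(log)

# the default base is exp(1), so log() is the natural logarithm
natural <- log(exp(1))               # 1

# three equivalent ways of asking for the base-2 logarithm of 8
by_name     <- log(x = 8, base = 2)  # arguments matched by name
by_position <- log(8, 2)             # arguments matched by position
mixed       <- log(8, base = 2)      # first by position, second by name
```

Naming the arguments makes the order irrelevant, so log(base = 2, x = 8) also works.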
what are the rules of naming an object?
it has to begin with a letter
it cannot contain a space
if we want to add comments to a script without disrupting the logic of the code, what sign can we use to start the comment?
# R ignores everything after this sign on a line, so we can use # (or ##) to start a reminder note for ourselves in a script.
what sign is used to specify an argument?
equal sign =
i.e. log(x=8, base=2)
what does the str function mean?
"str" means "structure"; str() shows the structure of an object
how to inspect a data frame?
use either the str() or head() function
how to access data from columns of a data frame?
using the $ sign, which is called the accessor. i.e. mvt$population - here mvt is the name of the data frame, population is the name of the column
or use [[ ]], i.e. mvt[["population"]]
how to know the names of columns of the data frames?
names()
how to know how many entries a column has?
length()
i.e. length(mvt$population)
what function can help us know if the vector is numeric or character?
class()
if it’s character, it will return “character”
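A minimal sketch of the data frame cards above. The data frame here is a made-up stand-in for mvt; its column names and values are assumptions:

```r
# hypothetical data frame standing in for mvt (values are made up)
mvt <- data.frame(
  state = c("A", "B", "C"),
  population = c(100, 250, 75),
  stringsAsFactors = FALSE
)

str(mvt)                          # structure of the object
head(mvt)                         # first rows (at most 6)
names(mvt)                        # column names

pop <- mvt$population             # accessor
pop_again <- mvt[["population"]]  # double square brackets give the same vector

length(pop)                       # 3 entries
class(pop)                        # "numeric"
class(mvt$state)                  # "character"
```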
what is a vector? what types of vectors are there in R?
a vector is an object consisting of several entries. it can be numeric, character, logical (TRUE or FALSE), or factor (categorical).
# loading the dslabs package and the murders dataset
library(dslabs)
data(murders)
# determining that the murders dataset is of the "data frame" class
class(murders)
# finding out more about the structure of the object
str(murders)
# showing the first 6 lines of the dataset
head(murders)
# using the accessor operator to obtain the population column
murders$population
# displaying the variable names in the murders dataset
names(murders)
# determining how many entries are in a vector pop
# logical vectors are either TRUE or FALSE z
# factors are another type of class
class(murders$region)
# obtaining the levels of a factor
levels(murders$region)
how to use the data of a column to create a table?
- assign the data to a letter i.e. x
if a variable is a factor, we can use the nlevels() function to determine the number of levels of the factor.
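A minimal sketch of tabulating a column and counting factor levels. The vector x is made up here and stands in for a column such as murders$region:

```r
# hypothetical factor of regions (values are made up)
x <- factor(c("South", "West", "South", "Northeast", "West", "South"))

counts <- table(x)   # frequency of each level
levels(x)            # "Northeast" "South" "West"
nlevels(x)           # 3
```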
square brackets are useful for subsetting, that is, accessing specific elements of a vector. i.e. codes[2], codes[c(1,3)], codes[1:2].
if the entries of a vector are named, they may be accessed by referring to their names. i.e. codes["canada"], codes[c("egypt", "italy")]
what are the functions that can create vectors?
c()
seq()
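A minimal sketch of creating vectors with c() and seq() and then subsetting them with square brackets. The codes vector and its values are illustrative assumptions:

```r
# c() concatenates entries; names can be attached when the vector is created
codes <- c(italy = 380, canada = 124, egypt = 818)

# seq() generates sequences
one_to_ten <- seq(1, 10)      # same as 1:10
odds <- seq(1, 10, by = 2)    # 1 3 5 7 9

# subsetting by position
codes[2]                      # canada
codes[c(1, 3)]                # italy and egypt
codes[1:2]                    # italy and canada

# subsetting by name
codes["canada"]
codes[c("egypt", "italy")]
```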
what is coercion in R?
coercion is an attempt by R to be flexible with data types by guessing what was meant when an entry does not match the expected type. i.e.
x
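A minimal sketch of coercion and explicit conversion, using made-up values:

```r
# mixing numbers and a character string: R coerces every entry to character
x <- c(1, "canada", 3)
class(x)                    # "character"

# explicit conversion between types
y <- as.character(1:3)      # "1" "2" "3"
z <- as.numeric(y)          # 1 2 3

# entries that cannot be converted become NA (R normally warns about this)
w <- suppressWarnings(as.numeric(c("1", "b", "3")))
```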
what function turns numbers into characters?
as.character()
what function turns characters into numbers?
as.numeric()
what function associates the entries of a numeric vector with names from a character vector?
names(numeric vector)
what is the function sort() used for?
it sorts a vector in increasing order
what is the function order() used for?
The function order() produces the indices needed to obtain the sorted vector, e.g. a result of 2 3 1 5 4 means the sorted vector will be produced by listing the 2nd, 3rd, 1st, 5th, and then 4th item of the original vector.
what is the function rank() used for?
The function rank() gives us the ranks of the items in the original vector.
how to get the biggest and smallest value in a vector?
The function max() returns the largest value, while which.max() returns the index of the largest value. The functions min() and which.min() work similarly for minimum values.
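A minimal sketch comparing sort(), order(), rank(), max() and which.max() on one made-up vector:

```r
x <- c(31, 4, 15, 92, 65)   # made-up values

sort(x)        # 4 15 31 65 92
order(x)       # 2 3 1 5 4
rank(x)        # 3 1 2 5 4

x[order(x)]    # same result as sort(x)

max(x)         # 92
which.max(x)   # 4
min(x)         # 4
which.min(x)   # 2
```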
how to find the name of the state with the smallest population in a data frame?
- pop
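One way to do this, sketched with a made-up data frame (murders_toy and its values are assumptions, not the real murders data):

```r
# hypothetical data frame (state names and populations are made up)
murders_toy <- data.frame(
  state = c("A", "B", "C", "D"),
  population = c(500, 120, 980, 310),
  stringsAsFactors = FALSE
)

pop <- murders_toy$population
i_min <- which.min(pop)               # index of the smallest population
smallest <- murders_toy$state[i_min]  # "B"
```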
how to know how many NAs are in a vector?
1. use is.na() to find out which entries are NA
2. define a variable to this i.e. ind
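A minimal sketch of counting NAs, using a made-up vector. Summing a logical vector counts its TRUE entries:

```r
x <- c(2, NA, 5, NA, NA, 7)   # made-up vector

ind <- is.na(x)               # TRUE where the entry is NA
n_missing <- sum(ind)         # 3

mean_known <- mean(x[!ind])   # mean of the non-missing entries
```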
how to list the states in decreasing order of temperature?
measure$state[order(measure$temp, decreasing=TRUE)]
here, measure is the name of the data frame; state and temp are two of its columns
how to get a sum?
use sum() function
how to create a logical vector that specifies whether each entry of a variable is less than a certain number? how to get the names of the states that meet this requirement?
how to count how many states meet it?
for example:
index
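A minimal sketch with made-up states and rates; the cutoff of 1 is an arbitrary assumption:

```r
# made-up states and rates
state <- c("A", "B", "C", "D")
rate  <- c(0.5, 2.1, 0.7, 3.4)

index <- rate < 1        # logical vector: TRUE FALSE TRUE FALSE
low_states <- state[index]  # "A" "C"
n_low <- sum(index)      # 2, because each TRUE counts as 1
```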
how to create a logical vector that meets multiple logical requirements?
use the & sign to connect two logical requirements i.e.
west
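A minimal sketch of combining two conditions with &. The data and the names west and safe are illustrative assumptions:

```r
# made-up states, regions and rates
state  <- c("A", "B", "C", "D")
region <- c("West", "South", "West", "Northeast")
rate   <- c(0.8, 0.4, 2.3, 0.9)

west <- region == "West"   # TRUE FALSE TRUE FALSE
safe <- rate <= 1          # TRUE TRUE FALSE TRUE
index <- safe & west       # TRUE only where both conditions hold

state[index]               # "A"
```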
what is the use of the which() function?
it returns the indices of the entries that are TRUE in a logical vector i.e.
x
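A minimal sketch of which(), first on a logical vector and then on a comparison. The values are made up:

```r
x <- c(FALSE, TRUE, FALSE, TRUE, TRUE)
which(x)                       # 2 4 5

# finding the position of one value in a vector
state <- c("A", "B", "C")
index <- which(state == "B")   # 2
```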
how to quickly find the value in a vector?
use which() with a logical comparison and store the result in a variable i.e. index
how to find multiple values in a vector?
index
how to check if a vector is contained in another vector?
%in%
i.e. x
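A minimal sketch of the two cards above. Using match() to locate several values is an assumption about what the original example showed; the vectors are made up:

```r
state <- c("A", "B", "C", "D")

# match() returns the position of each requested value, in the order asked
index <- match(c("D", "B"), state)   # 4 2
state[index]                         # "D" "B"

# %in% checks, entry by entry, whether the left vector appears in the right one
x <- c("a", "b", "c", "d", "e")
y <- c("a", "d", "f")
found <- y %in% x                    # TRUE TRUE FALSE
```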
which function can change a data table by adding a new column or change the existing one?
mutate()
i.e. mutate(data frame name, new column name = expression)
which function can filter the data by subsetting rows?
filter()
i.e. filter(murders, rate<0.7)
which function can subset the data by selecting specific columns?
select()
i.e. new_table
what operator can perform a series of operations by sending the results of one function to another?
%>%
i.e. murders %>% select(state, region, rate) %>% filter(rate < 0.7)
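A minimal sketch of mutate(), filter(), select() and the pipe, assuming the dplyr package is installed. The data frame murders_toy is made up and only mimics the columns of murders:

```r
library(dplyr)

# hypothetical stand-in for the murders data frame (values are made up)
murders_toy <- data.frame(
  state = c("A", "B", "C", "D"),
  region = c("West", "South", "West", "Northeast"),
  population = c(1000000, 2000000, 500000, 4000000),
  total = c(5, 40, 2, 120),
  stringsAsFactors = FALSE
)

# add a new column: rate per 100,000 people
murders_toy <- mutate(murders_toy, rate = total / population * 100000)

# keep only the rows with a low rate (states A and C here)
low <- filter(murders_toy, rate < 0.7)

# keep only some columns
new_table <- select(murders_toy, state, region, rate)

# the same two steps chained with the pipe
piped <- murders_toy %>%
  select(state, region, rate) %>%
  filter(rate < 0.7)
```

The pipe passes the result on its left as the first argument of the function on its right, so the data frame is not repeated in each call.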
data.frame() used to turn characters into factors by default (before R 4.0.0). which argument avoids turning characters into factors?
stringsAsFactors = FALSE
i.e. grades
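A minimal sketch; the names and scores in grades are made up for illustration:

```r
# hypothetical grades table (values are made up)
grades <- data.frame(
  names = c("John", "Juan", "Jean", "Yao"),
  exam_1 = c(95, 80, 90, 85),
  exam_2 = c(90, 85, 85, 90),
  stringsAsFactors = FALSE
)

class(grades$names)   # "character"
```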
which function gives rank from the lowest to highest?
how to filter the rows ranked 1-5?
rank(x)
rank(-x) gives the rank from highest to lowest
filter(object name, rank <= 5)
how to filter out the rows where the region is "South"? how to count the number of remaining rows?
no_south
how to filter the rows where a variable takes any one of several values?
use the %in% operator with c() i.e.
murders_nw
how to filter the rows with variables meeting multiple requirements?
filter(object name, condition 1 & condition 2)
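A minimal sketch of the filtering cards above, assuming the dplyr package is installed. The data frame murders_toy and the cutoffs are made-up assumptions:

```r
library(dplyr)

# hypothetical stand-in for the murders data frame (values are made up)
murders_toy <- data.frame(
  state = c("A", "B", "C", "D", "E"),
  region = c("West", "South", "West", "Northeast", "South"),
  rate = c(0.5, 2, 0.4, 3, 1.2),
  stringsAsFactors = FALSE
)

# drop the rows in the South, then count what is left
no_south <- filter(murders_toy, region != "South")
nrow(no_south)                                   # 3

# keep the rows whose region is either Northeast or West
murders_nw <- filter(murders_toy, region %in% c("Northeast", "West"))

# two conditions at once
my_states <- filter(murders_toy,
                    region %in% c("Northeast", "West") & rate < 1)

# rank from highest to lowest rate, then keep the top 2
ranked <- mutate(murders_toy, rank = rank(-rate))
top_2 <- filter(ranked, rank <= 2)
```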
what is the any() function?
the any() function takes a vector of logicals and returns TRUE if any of the entries are TRUE
what is the all() function?
the all() function takes a vector of logicals and returns TRUE if all of the entries are TRUE
how to use an if-else statement?
if (condition) {perform some expression} else {alternative expression}
for vectors, there is also the vectorized function ifelse(condition, value if TRUE, value if FALSE), which checks the condition entry by entry
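A minimal sketch of if-else, ifelse(), any() and all(), using made-up values:

```r
# if-else works on a single condition
a <- 0
if (a != 0) {
  result <- 1 / a
} else {
  result <- NA
}

# ifelse() works entry by entry on a vector
x <- c(0, 1, 2, -4, 5)
recip <- ifelse(x > 0, 1 / x, NA)   # NA 1 0.5 NA 0.2

# any() and all() summarise a logical vector
z <- c(TRUE, TRUE, FALSE)
any(z)   # TRUE
all(z)   # FALSE
```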
How to check if two values are the same?
use the identical(x, y) function
how to define your own function?
use function() in the format function(VARIABLE_NAME){
perform operations on VARIABLE_NAME and calculate VALUE
VALUE
}
i.e.
avg
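A minimal sketch of a user-defined function following the format above. The body of avg shown here is an assumption (a plain arithmetic mean):

```r
# avg() returns the arithmetic mean of a numeric vector
avg <- function(x) {
  s <- sum(x)      # total of the entries
  n <- length(x)   # number of entries
  s / n            # the last expression is the returned value
}

avg(c(2, 4, 9))    # 5
```

Variables created inside the function, such as s and n, exist only while the function runs and do not appear in the workspace.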