dplyr Flashcards
a special type of data frame that makes data easier to work with
tbl
function to create a tbl
tbl_df(mydata) (deprecated in current dplyr; as_tibble(mydata) is the replacement)
utilize lookup tables to clean data (similar to case when)
two
glimpse
shows some data from every column in a tbl
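A minimal sketch of the tbl and glimpse cards above, assuming a small made-up data frame. The name mydata follows the card; its contents and the name my_tbl are hypothetical, and as_tibble() is used because tbl_df() is deprecated in recent dplyr releases.

```r
library(dplyr)

# Hypothetical example data; the values are not from the original cards.
mydata <- data.frame(id = 1:3, score = c(90, 85, 70))

# Convert the plain data frame to a tbl (tibble).
my_tbl <- as_tibble(mydata)

# glimpse() prints one line per column: its name, type, and first values.
glimpse(my_tbl)
```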
What are the 5 main data manipulation functions in dplyr?
select, mutate, arrange, filter, summarise
arrange()
reorders the rows according to one or more variables
summarise()
reduces each group to a single row by calculating aggregate measures
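The arrange() and summarise() cards can be sketched as follows. The scores table and its class and score columns are made up for illustration.

```r
library(dplyr)

# Hypothetical data: test scores for two classes.
scores <- tibble(
  class = c("a", "a", "b", "b"),
  score = c(90, 70, 85, 95)
)

# arrange(): sort by class, then by descending score within each class.
sorted <- arrange(scores, class, desc(score))

# summarise(): collapse each class to one row holding its mean score.
by_class <- summarise(group_by(scores, class), mean_score = mean(score))
```

Without group_by(), summarise() treats the whole table as one group and returns a single row.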
select()
select(data, column1, column2, etc.) keeps only the named columns
starts_with("X")
returns every column whose name starts with "X"
ends_with("X")
returns every column whose name ends with "X"
contains("X")
returns every column whose name contains "X"
matches("X")
returns every column whose name matches the regular expression "X"
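A short sketch of the four select() helpers on a made-up table. The column names are hypothetical and chosen so that each helper picks a different set.

```r
library(dplyr)

# Hypothetical one-row table with four columns.
df <- tibble(x_id = 1, x_val = 2, total_x = 3, note = "a")

starts  <- names(select(df, starts_with("x")))   # x_id, x_val
ends    <- names(select(df, ends_with("x")))     # total_x
has_x   <- names(select(df, contains("x")))      # x_id, x_val, total_x
pattern <- names(select(df, matches("^x_")))     # x_id, x_val (regular expression)
```

By default these helpers ignore case, so starts_with("X") would select the same columns as starts_with("x") here.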
“Not equal to” operator
!=
“Equal to” operator
==
%in%
used inside filter() as a logical operator to test whether each value of a variable is found in a vector.
ex. filter(grades, grade %in% c("a", "b")) returns only the rows whose grade is "a" or "b" (assuming grades has a column named grade)
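The three operators on these cards can be compared side by side. This is a sketch on a made-up grades table; the student and grade columns are assumptions, not part of the original example.

```r
library(dplyr)

# Hypothetical data: one letter grade per student.
grades <- tibble(
  student = c("Ann", "Bo", "Cy", "Di"),
  grade   = c("a", "b", "c", "a")
)

# %in%: keep rows whose grade appears in the vector.
a_or_b <- filter(grades, grade %in% c("a", "b"))

# ==: keep rows whose grade equals a single value.
only_a <- filter(grades, grade == "a")

# !=: keep rows whose grade differs from that value.
not_a <- filter(grades, grade != "a")
```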