R FUNCTIONS Flashcards
keeps only the rows that meet the given conditions
filter()
removes duplicate rows, optionally judged only by the specified columns
distinct()
selects specific rows by position
slice()
pipes the data set on the left into the command that follows, as its first argument
%>%
collapses a data frame to one summary row per group (or a single row if the data is not grouped)
summarize()
counts the number of rows in each group defined by the selected variables
count()
keeps only the named columns of a data set; the result is still a data frame (pull() extracts a single column as a vector)
select()
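Example for the cards above: a minimal sketch that runs the row and column verbs on a tiny table. The `pets` data and every object name here are made up for illustration.

```r
library(dplyr)

# Hypothetical toy data; the first and third rows are identical
pets <- tibble(
  name    = c("Rex", "Tom", "Rex", "Kit"),
  species = c("dog", "cat", "dog", "cat"),
  age     = c(3, 5, 3, 1)
)

unique_pets <- pets %>% distinct()               # drops the repeated Rex row
older_pets  <- unique_pets %>% filter(age >= 3)  # rows meeting a condition
first_two   <- unique_pets %>% slice(1:2)        # rows 1 and 2 by position
per_species <- unique_pets %>% count(species)    # rows per species, in column n
names_only  <- unique_pets %>% select(name)      # one-column data frame
```

Each step starts from `unique_pets`, so the results can be inspected separately.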
tests each value for missingness, returning TRUE where the value is NA; often used inside summarize() or filter()
is.na()
sorts the rows from low to high by a given variable
arrange()
within the arrange() function; orders values from high to low
desc()
groups the data frame by the given variable(s) so that later commands such as summarize() or mutate() run separately within each group
group_by()
creates a new column (or overwrites an existing one) from an expression
mutate()
renames a column, written as new_name = old_name
rename()
adds up values; on a logical vector it counts the TRUE values, often used with summarize()
sum()
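Example for the cards above: a small sketch that chains several of these verbs in one pipeline. The `scores` table and its column names are hypothetical, chosen only to show how the pieces fit together.

```r
library(dplyr)

# Hypothetical toy data with one missing score
scores <- tibble(
  student = c("Ana", "Ben", "Cy", "Dee", "Eli"),
  class   = c("A", "A", "B", "B", "B"),
  score   = c(90, NA, 70, 85, 60)
)

class_summary <- scores %>%
  filter(!is.na(score)) %>%               # drop rows where score is NA
  mutate(passed = score >= 70) %>%        # new logical column
  rename(pupil = student) %>%             # new_name = old_name
  group_by(class) %>%                     # later steps run per class
  summarize(
    n_passed   = sum(passed),             # sum() counts the TRUE values
    mean_score = mean(score)
  ) %>%
  arrange(desc(mean_score))               # highest mean first
```

With this data, class A comes first with one pass, followed by class B with two.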
pastes two tables side by side as they are, matching rows by position only (both tables need the same number of rows)
bind_cols()
keeps every row of x and adds the columns from y wherever the key values match; unmatched rows get NA
left_join()
keeps all rows from both x and y, matching on the key and filling in NA where there is no match
full_join()
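Example for the cards above: a sketch contrasting the three ways of combining tables. The tables `x`, `y`, and `extra` are invented for illustration, with `id` assumed as the shared key.

```r
library(dplyr)

# Hypothetical toy tables sharing the key column id
x     <- tibble(id = c(1, 2, 3), name = c("a", "b", "c"))
y     <- tibble(id = c(2, 3, 4), value = c(20, 30, 40))
extra <- tibble(flag = c(TRUE, FALSE, TRUE))

side_by_side <- bind_cols(x, extra)          # row 1 with row 1, and so on
left_result  <- left_join(x, y, by = "id")   # all rows of x; id 1 gets NA
full_result  <- full_join(x, y, by = "id")   # ids 1 to 4; gaps filled with NA
```

Note that bind_cols() does no matching at all, so it is only safe when both tables are already in the same row order.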
starts a plot and sets the data set for a visual representation
ggplot()
maps variables to the x and y axes, colors, and other aesthetics
aes()
creates a scatter plot of the data
geom_point()
adds a smoothed trend line that follows the conditional mean of a scatter plot
geom_smooth()
creates a bar chart that needs both an x AND a y variable; bar heights come from the y values in the data
geom_col()
creates a boxplot from the median and the lower and upper quartiles, with whiskers toward the min and max and outliers drawn as points
geom_boxplot()
creates a histogram by binning a continuous variable set as the x variable and counting the rows in each bin
geom_histogram()
creates a bar graph from one variable set as the x axis; bar heights are the counts of each value
geom_bar()
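Example for the cards above: a sketch of each geom, using the `mpg` data set that ships with ggplot2. The object names and the choice of `bins = 20` are arbitrary, and the averaged table for geom_col() is built with base R's aggregate() as one possible approach.

```r
library(ggplot2)

# Scatter plot with a smoothed trend line on top
scatter <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth()

# geom_bar(): only x is mapped; heights are row counts per class
bars <- ggplot(mpg, aes(x = class)) + geom_bar()

# geom_col(): x and y are both mapped; heights come from the data
avg_hwy <- aggregate(hwy ~ class, data = mpg, FUN = mean)
cols <- ggplot(avg_hwy, aes(x = class, y = hwy)) + geom_col()

# Distribution plots
hist_plot <- ggplot(mpg, aes(x = hwy)) + geom_histogram(bins = 20)
box_plot  <- ggplot(mpg, aes(x = class, y = hwy)) + geom_boxplot()
```

Printing any of these objects (for example `scatter`) draws the plot.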
returns the date part of a date-time input (lubridate version)
date()
rounds the date down to nearest unit
floor_date()
rounds the date up to the nearest unit
ceiling_date()
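Example for the cards above: a sketch assuming these three functions come from the lubridate package. The timestamp and object names are made up for illustration.

```r
library(lubridate)

# Hypothetical timestamp
when <- ymd_hms("2024-03-15 14:35:20", tz = "UTC")

day_only    <- date(when)                          # date part: 2024-03-15
month_start <- floor_date(when, unit = "month")    # down to 2024-03-01
next_month  <- ceiling_date(when, unit = "month")  # up to 2024-04-01
```

Other units such as "day", "week", or "year" work the same way.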