R Flashcards
Modulo operation
The modulo operator returns the remainder of dividing the number on its left by the number on its right. For example, 5 modulo 3, written 5 %% 3, is 2.
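A quick sketch of the operator in the console. The negative-number case and the integer-division operator %/% are not part of the card above; they are added here only for contrast.

```r
# Remainder of division with %%
r1 <- 5 %% 3       # 2
r2 <- 10 %% 5      # 0: 10 is an exact multiple of 5
r3 <- (-5) %% 3    # 1: in R the result takes the sign of the divisor

# Companion operator (not on the card): integer division
q1 <- 5 %/% 3      # 1
```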
Check the data type of a variable
class()
assign value to variable
var <- value
how to create a vector
with the combine function c()
assign names to vector
names()
[2:5] –> which values does this include?
includes the second through the fifth value of a vector (elements 2, 3, 4 and 5)
Define a new variable based on a selection from a vector
new_var <- some_vector[c(…, 3, 4, …)] or, for a range of positions, some_vector[…:…]
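To see the difference between a range and a list of positions, here is a minimal sketch. The vector contents are made up for illustration.

```r
# Hypothetical example vector
some_vector <- c(10, 20, 30, 40, 50, 60)

# A range selects every position from 2 up to 5
range_sel <- some_vector[2:5]       # 20 30 40 50

# c() selects only the listed positions
pick_sel <- some_vector[c(2, 5)]    # 20 50
```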
Construct a matrix with 3 rows that contain the numbers 1 up to 9
matrix(1:9, byrow = TRUE, nrow = 3)
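A short sketch of what byrow changes. The variable names are illustrative.

```r
# Fill row by row: first row is 1 2 3
m_byrow <- matrix(1:9, byrow = TRUE, nrow = 3)

# Default (byrow = FALSE) fills column by column: first row is 1 4 7
m_bycol <- matrix(1:9, nrow = 3)

first_row_byrow <- m_byrow[1, ]
first_row_bycol <- m_bycol[1, ]
```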
Name the columns and rows of a matrix
rownames()
colnames()
calculate sums of rows of matrix or of columns
rowSums() or colSums()
Merge matrices and/or vectors together side by side (as new columns) or stacked below (as new rows)
cbind() for columns; rbind() for rows
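The naming, summing and binding cards above can be combined in one sketch. The row and column names are placeholders, not taken from any real dataset.

```r
m <- matrix(1:9, byrow = TRUE, nrow = 3)

# Name the rows and columns (placeholder names)
rownames(m) <- c("r1", "r2", "r3")
colnames(m) <- c("c1", "c2", "c3")

row_totals <- rowSums(m)   # 6 15 24
col_totals <- colSums(m)   # 12 15 18

# Attach the totals as an extra column or an extra row
wider  <- cbind(m, total = row_totals)   # 3 rows, 4 columns
taller <- rbind(m, total = col_totals)   # 4 rows, 3 columns
```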
in console, check out contents of workspace
ls()
data on the rows 1, 2, 3 and columns 2, 3, 4.
my_matrix[1:3,2:4]
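A minimal sketch of that selection, assuming a 3 x 4 matrix built just for this example.

```r
# Example matrix with 3 rows and 4 columns
my_matrix <- matrix(1:12, byrow = TRUE, nrow = 3)

# Rows 1 to 3, columns 2 to 4; the first column is dropped
sub_matrix <- my_matrix[1:3, 2:4]

# Leaving one side of the comma empty selects everything on that dimension
second_row <- my_matrix[2, ]   # 5 6 7 8
```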
encode the vector as a factor –> and, optional, also give them an order
factor() –> factor(temperature_vector, ordered = TRUE, levels = c("Low", "Medium", "High"))
change factor levels of a factor vector to …
levels(factor_vector) <- c("", "") –> the order in which you assign the levels is important. If you don't specify the levels of the factor when creating it, R will automatically sort them alphabetically, so check levels(factor_vector) before renaming.
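Both factor cards in one sketch. The data values and the survey_vector name are invented for illustration.

```r
# Ordered factor: Low < Medium < High
temperature_vector <- c("High", "Low", "High", "Low", "Medium")
factor_temperature <- factor(temperature_vector,
                             ordered = TRUE,
                             levels = c("Low", "Medium", "High"))

# Ordered factors can be compared element by element
is_hotter <- factor_temperature[1] > factor_temperature[2]   # TRUE

# Unordered factor: levels default to alphabetical order, "F" then "M"
survey_vector <- c("M", "F", "F", "M", "M")
factor_survey <- factor(survey_vector)
default_levels <- levels(factor_survey)

# Rename in that same order, otherwise the labels end up swapped
levels(factor_survey) <- c("Female", "Male")
```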
quick overview of the contents of a variable
summary()
see first or last rows of a built-in dataframe
head() or tail()
select a subset based on a certain condition from your dataset
subset(dataframe, condition)
order a vector
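order() –> returns the positions that would sort the vector; some_vector[order(some_vector)] gives the sorted values (sort() returns them directly)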
Order a dataframe based on a certain column with order()
dataframe[order(dataframe$column), ]
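A small sketch of order() on a vector and on a dataframe column. The data are made up.

```r
a <- c(100, 10, 1000)

# order() gives positions, not values
positions <- order(a)          # 2 1 3
a_sorted  <- a[positions]      # 10 100 1000

# Same idea for the rows of a dataframe (hypothetical columns)
df <- data.frame(name = c("x", "y", "z"), size = c(3, 1, 2))
df_sorted <- df[order(df$size), ]
```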
see structure of a dataframe
str()
create a dataframe
data.frame()
add components to a list, then assign names to the components
my_list <- list(your_comp1, your_comp2)
names(my_list) <- c("name1", "name2")
or
my_list <- list(name1 = your_comp1,
name2 = your_comp2)
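Once the components are named, they can be reached by name. A minimal sketch with invented component names:

```r
# Hypothetical components: a vector and a matrix
my_list <- list(vec = 1:3, mat = matrix(1:4, nrow = 2))

component_names <- names(my_list)   # "vec" "mat"

# Two equivalent ways to pull out a component by name
v1 <- my_list$vec
v2 <- my_list[["vec"]]
```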
filter dataframe
filter()
sort the rows of a df based on a positions vector
planets_df[positions, ]
sort values in a dataset
arrange(column_to_use) or arrange(desc(column_to_use)) for descending
pipe
%>%
change values in a dataframe
mutate(what_is_replaced = what_is_calculated)
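The filter, arrange, pipe and mutate cards can be chained together. This sketch assumes the dplyr package is installed; the toy dataframe and its column names are invented.

```r
library(dplyr)

# Toy data (hypothetical values)
toy_df <- data.frame(
  country = c("A", "B", "C", "D"),
  pop     = c(5, 50, 20, 1),
  gdp     = c(10, 200, 60, 3)
)

result <- toy_df %>%
  filter(pop > 2) %>%                   # keep rows matching the condition
  mutate(gdp_per_cap = gdp / pop) %>%   # add a calculated column
  arrange(desc(gdp_per_cap))            # sort, highest first
```

Each step receives the dataframe produced by the step before it, so the order of the verbs matters.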
package for data visualization
ggplot2
create visualization
ggplot(dataset, aes(aesthetic mapping of variables)) + type of graph
function for creating scatterplot with ggplot
geom_point()
to ggplot, add color and size of dots
ggplot(dataset, aes(aesthetic mapping of variables, color = …, size = …)) + type of graph
divide one plot into multiple smaller plots
faceting: facet_wrap(~…)
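A sketch that puts the scatterplot, color, size and faceting cards together. It assumes ggplot2 is installed and uses the built-in mtcars dataset, which is a choice made for this example only.

```r
library(ggplot2)

# Scatterplot of weight against fuel economy
scatter <- ggplot(mtcars, aes(x = wt, y = mpg,
                              color = factor(cyl),   # color by cylinder count
                              size = hp)) +          # dot size by horsepower
  geom_point() +
  facet_wrap(~ gear)                                 # one panel per gear count
```

Note that the geom is added after the closing parenthesis of ggplot(). Typing scatter in the console, or calling print(scatter) in a script, draws the plot.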
group rows so that summarize() returns one row per group
group_by()
after specifying type of graph, also specify log scale
+ scale_x_log10()
turn many rows into one with pipe
… %>% summarize(… = mean(…))
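A minimal sketch of summarize() with and without group_by(), assuming dplyr is installed. The dataframe and column names are invented.

```r
library(dplyr)

toy_df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 3, 2, 4, 6)
)

# Without grouping: all rows collapse into one
overall <- toy_df %>%
  summarize(mean_value = mean(value))

# With grouping: one row per group
by_group <- toy_df %>%
  group_by(group) %>%
  summarize(mean_value = mean(value))
```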
make a line plot
geom_line()
make a bar plot
geom_col()
make a histogram
geom_histogram(binwidth = … or bins = …)
make a boxplot
geom_boxplot()
what are the lines extending up and down from the box of a boxplot called?
“whiskers”
add title to ggplot
+ ggtitle("…")
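The histogram, boxplot and title cards in one sketch. It assumes ggplot2 is installed; the mtcars dataset and the title strings are choices made for this example.

```r
library(ggplot2)

# Histogram: choose either bins or binwidth
hist_plot <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10) +
  ggtitle("Distribution of mpg")

# Boxplot: one box, with whiskers, per category on the x axis
box_plot <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  ggtitle("mpg by number of cylinders")
```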
Data visualization points of consideration
Add a smooth geom to the plot
geom_smooth()