R Flashcards

1
Q

argument

A

(r) information that a function needs in order to run

2
Q

variable

A

representation of a value in R that can be stored for use later during programming (can also be called OBJECT)

3
Q

vector

A

a group of data elements of the same type stored in a sequence in R

4
Q

Pipe

A

a tool in R for expressing a sequence of multiple operations, represented with “%>%”; takes the output of one statement and makes it the input of the next statement
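A minimal sketch of the same calculation written nested and piped. It assumes the magrittr package is installed (it supplies `%>%`, which dplyr re-exports); the `values` vector is made up for illustration.

```r
library(magrittr)  # supplies %>%; dplyr re-exports the same operator

values <- c(1, 4, 9)  # hypothetical input

# Nested form: read from the inside out.
nested <- sum(sqrt(values))

# Piped form: read left to right; each result becomes the next input.
piped <- values %>% sqrt() %>% sum()

print(piped)
```

Both forms compute the same value; the piped form simply lists the steps in the order they happen.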

5
Q

The 4 types of Vectors

A

logical (TRUE, FALSE), character ("words"), integer (1L, 2L, 3L), double (2.5, 4.561); complex and raw vectors also exist but are rarely needed in data analysis
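A short base R sketch of the four primary types, using `typeof()` to check each one. The variable names are illustrative only.

```r
# One small vector of each primary type.
lgl <- c(TRUE, FALSE, TRUE)
chr <- c("a", "b", "c")
int <- c(1L, 2L, 3L)   # the L suffix makes the values integers
dbl <- c(2.5, 4.561)

# typeof() reports how R stores each vector.
types <- c(typeof(lgl), typeof(chr), typeof(int), typeof(dbl))
print(types)

# A vector holds one type only, so mixing types coerces every element.
mixed <- c(1L, 2.5, "three")
print(typeof(mixed))
```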

6
Q

create a data frame

A

data.frame(x = c(1, 2, 3), y = c(1.4, 5.4, 10.4))

7
Q

create a new folder

A

dir.create("destination_folder")

8
Q

create a file

A

file.create("new_word_file.docx")

9
Q

copy a file

A

file.copy("new_text_file.txt", "destination_folder")
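The folder and file cards can be tried together. This is a sketch that works inside a temporary directory (an assumption made here so nothing is left behind in your project); the folder and file names follow the cards.

```r
# Work inside a fresh temporary directory.
base <- tempfile("flashcards_")
dir.create(base)

dest <- file.path(base, "destination_folder")
src  <- file.path(base, "new_text_file.txt")

dir.create(dest)                # create the folder
file.create(src)                # create an empty file
copied <- file.copy(src, dest)  # copy the file into the folder

print(list.files(dest))
```

`file.copy()` returns TRUE or FALSE for each file, so the result can be checked before moving on.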

10
Q

OR operator

A

| (element-wise) or || (single values)

11
Q

NOT operator

A

!

12
Q

common function to preview data (1st 6 rows)

A

head()

13
Q

these functions return a high-level summary of each column in your data, arranged horizontally

A

str() and glimpse()

14
Q

function for returning a list of column names from dataset

A

colnames()

15
Q

renaming a column

A

rename(diamonds, carat_new = carat, cut_new = cut)

16
Q

summarizing your data

A

summarize(diamonds, mean_carat = mean(carat))

17
Q

separates plots by a characteristic

A

+ facet_wrap(~cut)

18
Q

code using the diamonds dataset: a scatter plot with carat on the x axis and price on the y axis, points colored by cut, and a separate plot for each cut

A

ggplot(data = diamonds, aes(x = carat, y = price, color = cut)) +
  geom_point() +
  facet_wrap(~cut)

19
Q

packages (R)

A

units of reproducible R code

20
Q

vignette

A

documentation that acts as a guide to an R package

browseVignettes()

21
Q

filter by vitamin c dose 0.5

A

filtered_tg <- filter(ToothGrowth, dose == 0.5)

22
Q

sort by tooth length (after a filter)

A

arrange(filtered_tg, len)

23
Q

Pipe operator shortcut

A

Ctrl + Shift + M (Cmd + Shift + M on a Mac) in RStudio

24
Q

convert a date-time to a date

A

as_date() (in the lubridate package)

25
Q

data frame

A

collection of columns

26
Q

tibbles

A

data frames in the tidyverse; they never change the data type of your inputs (for example, number to string)

27
Q

how to add a column to a dataframe

A

mutate(dataframe, column_new = column*100)

28
Q

install tidyverse

A

install.packages("tidyverse")

29
Q

after you’re done installing tidyverse, what is the next step?

A

load it: library(tidyverse)

30
Q

Tibbles

A

print only the first 10 rows of a dataset (and as many columns as fit on screen).
Never change the names of your variables
or the data types of your inputs.
Part of the tidyverse

31
Q

how to read a csv file

A

read_csv()

32
Q

import “hotel_bookings.csv” into R and save it as a data frame titled ‘bookings_df’

A

bookings_df <- read_csv("hotel_bookings.csv")

33
Q

create another (smaller) data frame from an existing data frame (for example, with only the "adr" and "adults" columns of the bookings_df data frame)

A

new_df <- select(bookings_df, adr, adults)

34
Q

add a column to the dataframe: total = adr/adults

A

mutate(new_df, total = adr / adults)

35
Q

skimr package

A

makes summarizing data really easy, lets you skim through it more quickly

36
Q

janitor package

A

has functions for cleaning data

37
Q

functions to get summaries of our dataframes

A

skim_without_charts(), glimpse(), head(), str(), select()

38
Q

packages that simplify data cleaning tasks

A

skimr and janitor

39
Q

select()

A

specifies certain columns or excludes columns

40
Q

if you want all the columns in the penguins dataset EXCEPT the species column

A

penguins %>%
  select(-species)

41
Q

rename a column (in penguins dataset)

A

penguins %>%

rename(island_new = island)

42
Q

make all columns uppercase (or lowercase)

A

rename_with(penguins, toupper) (or tolower)

43
Q

clean_names()

A

ensures column names contain only characters, numbers, and underscores

44
Q

%%

A

returns remainder after division

45
Q

%/%

A

integer division: returns the whole-number quotient and discards the remainder (5 %/% 2 = 2)
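A small base R sketch showing how `%/%` and `%%` fit together: the quotient times the divisor, plus the remainder, gives back the original number. The values are arbitrary.

```r
a <- 17
b <- 5

quotient  <- a %/% b   # whole-number part of 17 / 5
remainder <- a %% b    # what is left over

# The two results always rebuild the original value.
rebuilt <- quotient * b + remainder
print(c(quotient, remainder, rebuilt))

# With a negative number, %/% rounds down and %% takes the divisor's sign.
neg_quotient  <- (-7) %/% 2
neg_remainder <- (-7) %% 2
print(c(neg_quotient, neg_remainder))
```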

46
Q

4 kinds of operators

A

arithmetic, relational, logical, assignment

47
Q
A

exponent

48
Q

equal to

A

==

49
Q

not equal to

A

!=

50
Q

&&

A

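logical AND for single values; older versions of R compared only the first element of each vector, but since R 4.3.0 an operand longer than one element is an error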
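A base R sketch contrasting the element-wise operators with `&&`. The vectors and the value of `n` are made up for illustration.

```r
x <- c(TRUE, TRUE, FALSE)
y <- c(TRUE, FALSE, FALSE)

# & , | and ! work position by position and return a vector.
elementwise <- x & y
either      <- x | y
negated     <- !x
print(elementwise)

# && and || are meant for single TRUE/FALSE values, as in if() conditions.
n <- 7
in_range <- (n > 0) && (n < 10)
print(in_range)
```

Use `&` and `|` when filtering rows, and `&&` and `||` when a condition must be a single TRUE or FALSE.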

51
Q

!

A

logical NOT

52
Q

arrange()

A

sorts the rows of a data frame by the variable(s) you choose

53
Q

sort by bill length (penguins) in descending order

A

penguins %>%
  arrange(-bill_length_mm)

54
Q

create a dataframe

A

assigning a name to something

55
Q

view dataframe

A

View()

56
Q

putting similar values together in a column

A

group_by()

57
Q

leave the missing values out

A

drop_na( )

58
Q

Get averages (or max values) of bill length per island penguins

A

penguins %>% group_by(island) %>% summarize(mean_bill_length_mm = mean(bill_length_mm))

(or replace mean with max)

59
Q

get max and mean bill length for each species by island.

A

penguins %>% group_by(species, island) %>%
  summarize(max_bl = max(bill_length_mm), mean_bl = mean(bill_length_mm))
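A runnable sketch of grouping by two variables, assuming dplyr is installed. The `toy` data frame is hypothetical and only stands in for the penguins dataset; `na.rm = TRUE` is used here so that a missing value does not turn a group's result into NA.

```r
library(dplyr)

# Hypothetical toy data standing in for the penguins dataset.
toy <- data.frame(
  species = c("Adelie", "Adelie", "Gentoo", "Gentoo", "Adelie"),
  island  = c("Dream", "Dream", "Biscoe", "Biscoe", "Biscoe"),
  bill_length_mm = c(39.1, NA, 46.1, 50.0, 37.8)
)

# One output row per species/island combination.
result <- toy %>%
  group_by(species, island) %>%
  summarize(
    max_bl  = max(bill_length_mm, na.rm = TRUE),
    mean_bl = mean(bill_length_mm, na.rm = TRUE),
    .groups = "drop"
  )

print(result)
```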

60
Q

only view Adelie penguins

A

penguins %>%
  filter(species == "Adelie")

61
Q

data cleaning packages

A

install.packages(c("tidyverse", "skimr", "janitor"))

62
Q

import and save csv file “hotel bookings” as a dataframe

A

bookings_df <- read_csv("hotel_bookings.csv")

63
Q

view only certain columns from a dataframe

A

trimmed_df

64
Q

cleaning functions

A
  1. rename: (renames columns)

dataframe %>%
rename(column_new = column)

  2. unite: (combines columns)

dataframe %>%
unite(column1_2, c("column1", "column2"), sep = " ")

  3. mutate: (adds a column)

dataframe %>%
mutate(guests = babies + children + adults)

  4. summarize(newcolumn = mean(column),
    newcolumn1 = sum(column1))
65
Q

transform data with these functions

A

separate( )
unite ( )
mutate ( )

66
Q

separate( ) syntax

A

separate(dataframe, column, into = c("newcolumn1", "newcolumn2"), sep = " ")

67
Q

unite( ) syntax

A

unite(dataframe, "newcolumn", column1, column2, sep = " ")
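A sketch showing `separate()` and `unite()` as opposites, assuming tidyr is installed. The `people` data frame and its column names are invented for illustration.

```r
library(tidyr)

# Hypothetical data: one column holding two pieces of information.
people <- data.frame(full_name = c("Ada Lovelace", "Alan Turing"))

# separate() splits one column into several at the separator.
split_up <- separate(people, full_name, into = c("first", "last"), sep = " ")
print(split_up)

# unite() pastes several columns back into one.
rejoined <- unite(split_up, "full_name", first, last, sep = " ")
print(rejoined)
```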

68
Q

mutate( ) syntax

A

dataframe %>%

mutate(new_column = column/1000, new_column2 = column2/1000)

69
Q

Convert data from wide to long or long to wide

A

pivot_longer( ), pivot_wider( )
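A round-trip sketch of both functions, assuming tidyr is installed. The `wide` data frame and the names `quarter` and `sales` are hypothetical.

```r
library(tidyr)

# Hypothetical wide data: one column per quarter.
wide <- data.frame(
  id = c(1, 2),
  q1 = c(10, 30),
  q2 = c(20, 40)
)

# Wide to long: the quarter columns become name/value pairs.
long <- pivot_longer(wide, cols = c(q1, q2),
                     names_to = "quarter", values_to = "sales")
print(long)

# Long to wide: the pairs spread back out into columns.
back <- pivot_wider(long, names_from = quarter, values_from = sales)
print(back)
```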

70
Q

makes sure column names are unique and consistent

A

clean_names( )

71
Q

bias function (package, syntax)

A

SimDesign package, bias(actual, predicted)

72
Q

sort hotel_bookings rows by lead time (most to least)

A

arrange(hotel_bookings, desc(lead_time))

73
Q

how to find max & min lead time in hotel_bookings

A

max(hotel_bookings$lead_time)

min(hotel_bookings$lead_time)

74
Q

average lead time in hotel_bookings

A

mean(hotel_bookings$lead_time)

75
Q

Filter syntax into a “new_hotel_dataframe”

A

new_hotel_dataframe

76
Q

find min/max/mean lead times at the two hotels, call it “hotel_summary”

A

hotel_summary <- hotel_bookings %>%
  group_by(hotel) %>%
  summarise(average_lead_time = mean(lead_time),
            max_lead_time = max(lead_time),
            min_lead_time = min(lead_time))

77
Q

functions that let you change your data

A

arrange( ), group_by( ), filter( )

78
Q

making columns lower (or upper)case

A

rename_with(dataframe, tolower)

79
Q

core concepts in ggplot2

A

aesthetics, geoms, facets, labels, and annotations

80
Q

view palmerpenguins dataset

A

install.packages("palmerpenguins")
library("palmerpenguins")
data(penguins)
View(penguins)

81
Q

two different geoms

A

geom_point and geom_bar

82
Q

geom_point argument for flipper length as xaxis, and body mass g as yaxis

A

ggplot(data=penguins) +
geom_point(mapping = aes(x=flipper_length_mm,
y=body_mass_g))

83
Q

geom

A

a geometric object used to represent your data (points, bars, lines and more)

84
Q

aesthetic

A

a visual property of an object in your plot (position, color, shape or size)

85
Q

mapping

A

matching up a specific variable in your dataset with a specific aesthetic

86
Q

3 steps to plot a graph

A
  1. start with ggplot function and choose a dataset
  2. add a geom_ function to display your data
  3. map the variables you want to plot in the arguments of the aes( ) function
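The three steps above can be sketched as follows, assuming ggplot2 is installed. The built-in `mtcars` dataset is used here only as a stand-in for your own data, and the column choices are illustrative.

```r
library(ggplot2)

# mtcars ships with R; it stands in for your own dataset.
p <- ggplot(data = mtcars) +          # step 1: ggplot() and a dataset
  geom_point(                         # step 2: a geom to display the data
    mapping = aes(x = wt, y = mpg,    # step 3: map variables inside aes()
                  color = factor(cyl))
  )

print(p)  # draws the plot
```

Storing the plot in `p` makes it easy to add more layers later, such as a facet or labels.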
87
Q

what other aesthetics can you add to variables

A

x,y, color, shape, size, alpha (transparency)

88
Q

this geom shows general trends in data

A

geom_smooth

89
Q

this aesthetic breaks out geom_smooth into pieces

A

linetype = species

90
Q

this geom creates a little noise around each point

A

geom_jitter

91
Q

When using geom_bar, the color aesthetic will…

A

only outline the bars in color; the "fill" aesthetic fills the bars with color

92
Q

data smoothing for plots with less than 1000 points

A

ggplot(data, aes(x= , y= )) +
geom_point() +
geom_smooth(method = "loess")

93
Q

data smoothing for plots with more than 1000 points

A

ggplot(data, aes(x= , y= )) +
geom_point() +
geom_smooth(method = "gam", …)

94
Q

facets

A

let you display smaller groups, or subsets, of your data

95
Q

2 types of facets

A

facet_wrap, facet_grid

96
Q

facet_wrap(~species)

A

lets us create a separate plot for each species

97
Q

allows you to facet your plot with two variables;

A

facet_grid

vertically by the first variable, and horizontally by the second variable

98
Q

~

A

tilde symbol

99
Q

what rotates text 45 degrees to make it easier to read?

A

theme(axis.text.x = element_text(angle = 45))

100
Q

how to add a label

A

labs(title = "Palmer Penguins", subtitle = "3 Species", caption = "collected by Dr.")

101
Q

text INSIDE the grid of the plot

A

annotate function

102
Q

“annotate” function syntax with font, size and tilt

A

annotate("text", x = 50, y = 50, label = "The largest", fontface = "bold", size = 4.5, angle = 25)

103
Q

how to save a plot (2 ways)

A
  1. Export (from the Plots tab in RStudio)

  2. ggsave("—.png")

104
Q

find earliest year in hotel_bookings

A

min(hotel_bookings$arrival_date_year)

105
Q

paste0

A

joins strings together with no separator, for example: labs(subtitle = paste0("Data from: ", mindate, " to ", maxdate))
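A base R sketch of building a subtitle string with `paste0()`. The year values assigned to `mindate` and `maxdate` are made up for illustration.

```r
# Hypothetical values; in practice these might come from min() and max().
mindate <- 2015
maxdate <- 2017

# paste0() joins its arguments with no separator, so spaces must be
# written inside the quoted strings.
subtitle <- paste0("Data from: ", mindate, " to ", maxdate)
print(subtitle)
```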

106
Q

ggsave syntax

A

ggsave("—.png", width = 7, height = 7)

107
Q

R Markdown

A

file format for making dynamic documents with R

108
Q

Markdown

A

a syntax for formatting plain text files

109
Q

R Notebook

A

lets users run your code and shows the graphs and charts that the code produces

110
Q

HTML

A

The set of markup symbols or codes used to create a webpage