exam 2 Flashcards

1
Q

combines rows from two datasets where there is a match between the specified key variables

  • rows with no matching values are excluded
  • returns results if the keys are matched in both tables
A

inner join

2
Q

includes all rows from the left dataset and the matching rows from the right dataset. If there is no match, the columns from the right dataset are filled with NA. Here the rows of the first table are always returned, regardless of whether there is a match in the second table

A

left join

3
Q

opposite of left join. Includes all rows from the right dataset and the matching rows from the left dataset

A

right join

4
Q

includes all rows from both datasets. Where there is no match, the columns from the dataset with the missing row are filled with NA

A

full join
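
A minimal sketch of the four joins described in the cards above, assuming the dplyr join functions. The two small tables `band` and `instrument` are made up for illustration and are not from the course material.

```r
library(dplyr)

# Hypothetical example tables sharing the key column "name"
band <- data.frame(name = c("Mick", "John", "Paul"),
                   band = c("Stones", "Beatles", "Beatles"))
instrument <- data.frame(name = c("John", "Paul", "Keith"),
                         plays = c("guitar", "bass", "guitar"))

inner <- inner_join(band, instrument, by = "name")  # only names found in both tables
left  <- left_join(band, instrument, by = "name")   # every row of band; NA in plays for Mick
right <- right_join(band, instrument, by = "name")  # every row of instrument; NA in band for Keith
full  <- full_join(band, instrument, by = "name")   # every row from both tables
```

Here the inner join keeps 2 rows, the left and right joins keep 3 rows each, and the full join keeps 4.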

5
Q

refers to a specific way of organizing data tables in a tabular format to facilitate data analysis. In tidy data:

  • each variable forms a column
  • each observation forms a row
A

tidy data

6
Q

data often comes in various formats, and its structure might not be ideal for the task at hand. Pivoting helps you reorganize your data into a format that makes it easier to analyze, visualize, or model

A

pivots

7
Q

used to convert data from a wide format into a long format. It's particularly useful when you have variables spread across different columns and you want to stack them into a single column, often with corresponding values

A

pivot_longer

8
Q

the input data frame (the wide-format table passed to pivot_longer)

A

wide_dataframe
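
A small sketch of going from wide to long, assuming tidyr's `pivot_longer()`. The name `wide_dataframe` follows the card above; its columns and values are invented for illustration.

```r
library(tidyr)

# Hypothetical wide table: one column per year
wide_dataframe <- data.frame(country = c("A", "B"),
                             y2019 = c(10, 20),
                             y2020 = c(11, 21))

# Stack the year columns into name/value pairs
long_dataframe <- pivot_longer(wide_dataframe,
                               cols = c(y2019, y2020),
                               names_to = "year",
                               values_to = "value")
```

The result has one row per country and year (4 rows) with the columns `country`, `year`, and `value`.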

9
Q

used to convert data from a long format to a wide format. useful when you want to take distinct values from a column and spread them across new columns

A

pivot_wider
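
A matching sketch for the opposite direction, assuming tidyr's `pivot_wider()`. The table `long_dataframe` and its values are hypothetical.

```r
library(tidyr)

# Hypothetical long table: one row per country and year
long_dataframe <- data.frame(country = c("A", "A", "B", "B"),
                             year = c("2019", "2020", "2019", "2020"),
                             value = c(10, 11, 20, 21))

# Spread the distinct values of `year` across new columns
wide_again <- pivot_wider(long_dataframe,
                          names_from = year,
                          values_from = value)
```

Each distinct value of `year` becomes its own column, leaving one row per country.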

10
Q

how data variables map to plot aesthetics such as position, color, and shape

A

aes

11
Q

visual elements to represent the data (lines, points, bars, etc)

A

geometric objects

12
Q

splitting data into subplots based on a variable

A

facets

13
Q

controlling the overall appearance of the plot

A

theme
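
The four ggplot2 pieces from the cards above (aes, geometric objects, facets, theme) can be seen together in one plot. This is only an illustrative sketch using the `mpg` dataset that ships with ggplot2; the choice of variables is arbitrary.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +  # aes: map variables to position and color
  geom_point() +                                            # geometric object: points
  facet_wrap(~ drv) +                                       # facets: one subplot per drive type
  theme_minimal()                                           # theme: overall appearance
```

Printing `p` draws the plot.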

14
Q

the smaller the bandwidth the ____ the peaks (more or less)

A

more

15
Q

what does position = "dodge" do?

A

places the bars for each group side by side instead of stacking them
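
A short sketch of dodged bars, assuming ggplot2's `geom_bar()` and the built-in `mpg` dataset; the variables are chosen only for illustration.

```r
library(ggplot2)

# Bars for each drive type sit next to each other within a vehicle class
p_dodge <- ggplot(mpg, aes(x = class, fill = drv)) +
  geom_bar(position = "dodge")
```

Without `position = "dodge"`, `geom_bar()` stacks the bars by default.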

16
Q

adds a smoothed trend line to a graph

A

geom_smooth

17
Q

is color or fill used in scatterplots

18
Q

is color or fill used in bar graphs

19
Q

deals with one variable

A

facet_wrap

20
Q

deals with 2 variables

A

facet_grid
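
A brief sketch contrasting the two faceting functions, again assuming ggplot2 and its built-in `mpg` dataset; the faceting variables are arbitrary choices.

```r
library(ggplot2)

base_plot <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

wrapped <- base_plot + facet_wrap(~ class)    # one variable: panels wrap across rows and columns
gridded <- base_plot + facet_grid(drv ~ cyl)  # two variables: rows by drv, columns by cyl
```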

21
Q

detects patterns: checks if a string contains a specific pattern. Output: TRUE/FALSE

A

str_detect

22
Q

finds the length (number of characters) of a string

A

str_length

23
Q

trims white spaces from the beginning and the end of the string

24
Q

removes extra whitespace within a string, as well as at the beginning and end

A

str_squish

25
Q

extracts substrings from a string

26
Q

replaces a pattern with another string

A

str_replace
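
A quick sketch of the stringr functions named in the cards above. The example string is made up.

```r
library(stringr)

x <- "  the   quick brown fox  "

has_quick <- str_detect(x, "quick")                        # TRUE: the pattern is present
n_chars   <- str_length("fox")                             # 3
squished  <- str_squish(x)                                 # "the quick brown fox"
replaced  <- str_replace("brown fox", "brown", "red")      # "red fox" (first match only)
```

Note that `str_replace()` changes only the first match; `str_replace_all()` changes every match.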

27
Q

concatenates strings together (brings them together)

28
Q

splits a string into a character vector using a specified delimiter

29
Q

groups data into rows with all the variables

30
Q

is a function that works like $. It calls a specific column of your dataframe