dplyr Flashcards

1
Q

A special type of data frame that makes data easier to work with

A

tbl

2
Q

Function to create a tbl

A

tbl_df(mydata) (deprecated; current dplyr uses as_tibble(mydata))
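
A minimal sketch of creating a tbl, using as_tibble(), the current replacement for the deprecated tbl_df(). The built-in mtcars data frame stands in for mydata, and the name cars_tbl is only illustrative.

```r
library(dplyr)

# mtcars is a built-in base R data frame, used here in place of mydata.
cars_tbl <- as_tibble(mtcars)

# A tbl is still a data frame, with extra classes layered on top.
class(cars_tbl)   # "tbl_df" "tbl" "data.frame"

# Printing a tbl shows only the first rows and the columns that fit the screen.
print(cars_tbl)
```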

3
Q

Utilize lookup tables to clean data (similar to case when)

A

two

4
Q

glimpse

A

shows some data from every column in a tbl
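
As a quick illustration, glimpse() can be called on any data frame or tbl. The built-in iris data set is an assumption chosen for the example.

```r
library(dplyr)

# One line per column: the column name, its type, and the first few values.
glimpse(iris)
```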

5
Q

What are the 5 main data manipulation functions in dplyr?

A

select, mutate, arrange, filter, summarise
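
A rough sketch of the five verbs chained together with the pipe. The built-in mtcars data set, the weight cutoff, and the new column names are assumptions made for the example.

```r
library(dplyr)

result <- mtcars %>%
  select(mpg, cyl, wt) %>%                    # keep three columns
  filter(wt < 4) %>%                          # keep rows for the lighter cars
  mutate(wt_lbs = wt * 1000) %>%              # wt is recorded in 1000 lbs
  arrange(desc(mpg)) %>%                      # sort from highest to lowest mpg
  summarise(mean_mpg = mean(mpg), n_cars = n())  # collapse to one summary row

result
```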

6
Q

arrange()

A

Reorders the rows according to one or more variables
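
For instance, sorting by two variables might look like the sketch below. It uses the built-in mtcars data set as an assumed example; desc() reverses the order for one variable.

```r
library(dplyr)

# Sort by cyl ascending, then by mpg descending within each cyl value.
by_cyl_mpg <- arrange(mtcars, cyl, desc(mpg))

head(by_cyl_mpg)
```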

7
Q

summarise()

A

Reduces each group to a single row by calculating aggregate measures
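
A small sketch of the idea, assuming the groups are defined with group_by() (not covered on this card) and using the built-in mtcars data set.

```r
library(dplyr)

per_cyl <- mtcars %>%
  group_by(cyl) %>%                 # one group per distinct cyl value
  summarise(
    mean_mpg = mean(mpg),           # aggregate measure 1
    max_hp   = max(hp)              # aggregate measure 2
  )

per_cyl   # one row per group
```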

8
Q

select()

A

select(data, column1, column2, ...)
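
As a concrete instance of that call shape, here is a sketch on the built-in mtcars data set (an assumed example). A minus sign drops a column instead of keeping it.

```r
library(dplyr)

# Keep only the named columns, in the order given.
small <- select(mtcars, mpg, cyl, wt)

# Keep everything except wt.
no_wt <- select(mtcars, -wt)

names(small)
names(no_wt)
```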

9
Q

starts_with("X")

A

returns every column whose name starts with "X"

10
Q

ends_with("X")

A

returns every column whose name ends with "X"

11
Q

contains("X")

A

returns every column whose name contains the literal string "X"

12
Q

matches("X")

A

returns every column whose name matches the regular expression "X"
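
The four selection helpers can be compared side by side inside select(). This sketch uses the built-in iris data set as an assumed example; by default the helpers ignore case.

```r
library(dplyr)

# Column names of iris:
# Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species

sepal_cols  <- names(select(iris, starts_with("Sepal")))   # Sepal.Length, Sepal.Width
width_cols  <- names(select(iris, ends_with("Width")))     # Sepal.Width, Petal.Width
etal_cols   <- names(select(iris, contains("etal")))       # Petal.Length, Petal.Width
length_cols <- names(select(iris, matches("\\.Length$")))  # Sepal.Length, Petal.Length
```

contains() treats its argument as a literal string, while matches() treats it as a regular expression, which is why the dot is escaped in the last call.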

13
Q

“Not equal to” operator

A

!=

14
Q

“Equal to” operator

A

==

15
Q

%in%

A

Used in a filter() condition as a logical operator to test whether each value of a variable is found in a vector.

Ex. filter(grades, grade %in% c("a", "b")) returns only the rows whose grade is "a" or "b" (grade is the column being tested).
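
A runnable version of that example is sketched below. The grades data frame, its student and grade columns, and the values in them are all hypothetical.

```r
library(dplyr)

# Hypothetical data: one row per student with a letter grade.
grades <- data.frame(
  student = c("Ann", "Bob", "Cy", "Dee"),
  grade   = c("a", "c", "b", "d"),
  stringsAsFactors = FALSE
)

# Keep only rows whose grade is found in the vector c("a", "b").
top <- filter(grades, grade %in% c("a", "b"))

top$student   # "Ann" "Cy"
```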

16
Q

Which operators can you combine in a filter() condition?

A

& (and) and | (or), joining comparisons such as != and ==
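
A short sketch of combined conditions, using the built-in mtcars data set and arbitrary cutoffs as assumptions.

```r
library(dplyr)

# Both conditions must hold: four cylinders AND more than 30 mpg.
efficient_small <- filter(mtcars, cyl == 4 & mpg > 30)

# Either condition may hold: not eight cylinders OR more than 300 hp.
either <- filter(mtcars, cyl != 8 | hp > 300)

# Conditions separated by commas are combined with AND, like &.
same_as_and <- filter(mtcars, cyl == 4, mpg > 30)
```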

17
Q

dplyr function: the number of rows in the data frame or group of observations that summarise() describes.

A

n()

18
Q

dplyr function: the number of unique values in vector x.

A

n_distinct()
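
Both counting helpers, n() and n_distinct(), are typically used inside summarise(). A sketch, assuming the built-in mtcars data set and grouping with group_by():

```r
library(dplyr)

counts <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    n_cars  = n(),               # number of rows in each cyl group
    n_gears = n_distinct(gear)   # number of unique gear values in each group
  )

counts
```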

19
Q

How can you use dplyr to calculate the proportion of observations that pass a logical test?

A

First apply the logical test, which returns a vector of TRUE and FALSE values, then call mean() on that vector. R coerces TRUE to 1 and FALSE to 0, so the mean of the vector is the proportion of TRUE observations.
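
The same idea sketched in code. The built-in mtcars data set and the 25 mpg cutoff are assumptions chosen for the example.

```r
library(dplyr)

# mpg > 25 yields TRUE/FALSE per row; mean() treats these as 1/0.
prop <- mtcars %>%
  summarise(prop_over_25 = mean(mpg > 25))

prop$prop_over_25   # share of cars with mpg above 25
```

Adding group_by() before summarise() would give the proportion within each group instead of for the whole data frame.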