Five Key dplyr Verbs Flashcards

1
Q

filter()

A

selects rows based on logical conditions.

Filter "Badu, Erykah" and "Nas" in billboard
billboard[1:5] |>
  filter(artist == "Badu, Erykah" | artist == "Nas")

2
Q

select()

A

selects columns based on their names.

Subset “artist” and “track” columns
billboard |>
select(artist, track) |>
head()

3
Q

mutate()

A

generates new columns by applying functions to existing columns.

Compute weeks on Billboard chart
billboard <- billboard |>
  rowwise() |>
  mutate(n_notNA = sum(!is.na(c_across(contains("wk"))))) |>
  ungroup()
hist(billboard$n_notNA)

4
Q

summarize()

A

collapses rows into summary values; combine with group_by() to get one summary row per group.

Median weeks on Billboard chart
billboard |> summarize(median_wks = median(n_notNA))

Artists with multiple Billboard songs
billboard[1:3] |>
  group_by(artist) |>
  filter(n() > 1) |>
  count(sort = TRUE)

5
Q

arrange()

A

sorts rows by column values, ascending by default. Wrap a column in desc() for descending order.

Sort by first entry date
billboard[1:3] |> arrange(date.entered)

Longest staying artist and song
billboard[c(1, 2, 80)] |> arrange(desc(n_notNA))
billboard[c(1, 2, 80)] |> filter(n_notNA == max(n_notNA))

6
Q

pivot_longer()

A

Converts data from wide to long format by stacking multiple columns into two: one holding the former column names (the variable) and one holding their values.
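
A minimal sketch of the wide-to-long reshape, assuming the billboard data used in the earlier cards is the dataset that ships with tidyr. The new column names "week" and "rank" are illustrative choices, not names from the original cards.

```r
library(dplyr)
library(tidyr)

# tidyr::billboard has one row per song and one column per chart week (wk1, wk2, ...)
billboard_long <- tidyr::billboard |>
  pivot_longer(
    cols = starts_with("wk"),   # the wide columns to stack
    names_to = "week",          # new column holding the old column names
    values_to = "rank",         # new column holding the cell values
    values_drop_na = TRUE       # drop weeks in which the song was not on the chart
  )

head(billboard_long)
```

Each song now occupies one row per charting week instead of a single very wide row.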

7
Q

pivot_wider()

A

Converts data from long to wide format by spreading variable-value pairs into separate columns.
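
A small sketch of the long-to-wide reshape. The `long` table and its values are hypothetical, made up only to keep the example self-contained.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data: one row per (track, week) pair
long <- tibble(
  track = c("Song A", "Song A", "Song B", "Song B"),
  week  = c("wk1", "wk2", "wk1", "wk2"),
  rank  = c(87, 82, 91, 87)
)

wide <- long |>
  pivot_wider(
    names_from = week,    # values in this column become the new column names
    values_from = rank    # values in this column fill the new columns
  )

wide
```

The result has one row per track and one column per week, the inverse of what pivot_longer() produces.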

8
Q

|> or %>%

A

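Pipe operators: each passes the result on its left as the first argument of the function on its right, so data manipulation reads as a step-by-step flow. |> is the native R pipe (R 4.1 and later); %>% comes from magrittr.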
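
To illustrate the step-by-step flow, the sketch below writes the same two steps three ways. It assumes the billboard dataset from tidyr; the object names `nas_native` and `nas_magrittr` are illustrative.

```r
library(dplyr)

# Nested calls: must be read from the inside out
head(select(filter(tidyr::billboard, artist == "Nas"), artist, track))

# Native pipe (R >= 4.1): reads top to bottom
nas_native <- tidyr::billboard |>
  filter(artist == "Nas") |>
  select(artist, track)

# magrittr pipe, which dplyr re-exports: same steps, same result
nas_magrittr <- tidyr::billboard %>%
  filter(artist == "Nas") %>%
  select(artist, track)

all.equal(nas_native, nas_magrittr)
```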

9
Q

group_by()

A

Creates a grouped copy of a table, grouped by one or more columns. The data are unchanged, but later verbs operate per group.
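
A brief sketch of what grouping changes, assuming the billboard dataset from tidyr. The summary column names `best_wk1` and `n_songs` are illustrative.

```r
library(dplyr)

by_artist <- tidyr::billboard |>
  group_by(artist)

# The rows are untouched; only grouping metadata is attached
is_grouped_df(by_artist)
n_groups(by_artist)

# Verbs now work per group: one summary row per artist
by_artist |>
  summarize(
    best_wk1 = min(wk1, na.rm = TRUE),  # best (lowest) first-week rank
    n_songs  = n()                      # number of songs by this artist
  ) |>
  head()

# ungroup() removes the grouping again
by_artist |> ungroup() |> is_grouped_df()
```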

10
Q

count()

A

Counts the number of rows in each group defined by the given variables.
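
A short sketch, assuming the billboard dataset from tidyr, showing count() next to the longer group_by() and summarize() form it abbreviates. The object names are illustrative.

```r
library(dplyr)

# count() groups by the given columns and returns the group sizes in column n
songs_per_artist <- tidyr::billboard |>
  count(artist, sort = TRUE)   # sort = TRUE puts the largest groups first

head(songs_per_artist)

# The same counts written out in long form
long_form <- tidyr::billboard |>
  group_by(artist) |>
  summarize(n = n()) |>
  arrange(desc(n))
```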

11
Q

distinct()

A

Keeps only unique rows, removing rows with duplicate values. Uniqueness can be judged on all columns or on selected ones.
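
A minimal sketch of the three common ways to call distinct(). The `songs` table is hypothetical data made up for the example.

```r
library(dplyr)

# Hypothetical data: one fully duplicated row and one repeated artist
songs <- tibble(
  artist = c("Artist A", "Artist A", "Artist A", "Artist B"),
  track  = c("Track X", "Track X", "Track Y", "Track Z")
)

unique_rows    <- songs |> distinct()                          # drops the exact duplicate row
unique_artists <- songs |> distinct(artist)                    # unique artists, artist column only
first_per_art  <- songs |> distinct(artist, .keep_all = TRUE)  # first row per artist, all columns kept

unique_rows
unique_artists
first_per_art
```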
