Reshaping, joining and visualising data Flashcards
week 2
Function used to summarise and filter by subgroups of the data.
group_by()
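A minimal sketch of group_by() in use, assuming the dplyr package is installed. The scores table and its column names are hypothetical and only serve to show summarising and filtering within subgroups.

```r
library(dplyr)

# Hypothetical toy data: exam scores for two classes
scores <- data.frame(
  class = c("A", "A", "B", "B", "B"),
  score = c(60, 80, 50, 70, 90)
)

# Summarise within each subgroup: one row per class
class_means <- scores %>%
  group_by(class) %>%
  summarise(mean_score = mean(score), n = n())

# Filter within each subgroup: keep rows above their own class mean
above_mean <- scores %>%
  group_by(class) %>%
  filter(score > mean(score)) %>%
  ungroup()
```

Calling ungroup() at the end drops the grouping so later steps work on the whole table again.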
There are two main types of join
Mutating joins add new columns from the additional dataset.
Filtering joins filter existing rows based on whether they have a match in the additional dataset. They do not add any columns.
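The difference between the two types can be sketched as follows, assuming dplyr is installed. semi_join() and anti_join() are dplyr's filtering joins; the students and enrolled tables are hypothetical.

```r
library(dplyr)

# Hypothetical toy data
students <- data.frame(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
enrolled <- data.frame(id = c(2, 3, 4), course = c("Stats", "Maths", "Art"))

# Mutating join: adds the course column to students
with_course <- left_join(students, enrolled, by = "id")

# Filtering joins: same columns as students, possibly fewer rows
has_match <- semi_join(students, enrolled, by = "id")  # students with a match in enrolled
no_match  <- anti_join(students, enrolled, by = "id")  # students without a match in enrolled
```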
There are four separate mutating joins
left_join(x, y) returns all rows in x, together with the columns from y for rows that match. Where a row in x has no match in y, the y columns are filled with NA.
right_join(x, y) returns all rows in y, together with the columns from x for rows that match. Where a row in y has no match in x, the x columns are filled with NA.
inner_join(x, y) returns only the rows that have a match in both x and y.
full_join(x, y) returns all rows in x and all rows in y. Any row in x or y that doesn’t have a match is filled with NA in the columns from the other table.
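A small side-by-side sketch of the four mutating joins, assuming dplyr is installed. The tables x and y and the column names key, x_val and y_val are hypothetical; the two tables share keys 2 and 3 only.

```r
library(dplyr)

# Hypothetical toy tables that overlap on keys 2 and 3
x <- data.frame(key = c(1, 2, 3), x_val = c("x1", "x2", "x3"))
y <- data.frame(key = c(2, 3, 4), y_val = c("y2", "y3", "y4"))

left  <- left_join(x, y, by = "key")   # keys 1, 2, 3; y_val is NA for key 1
right <- right_join(x, y, by = "key")  # keys 2, 3, 4; x_val is NA for key 4
inner <- inner_join(x, y, by = "key")  # keys 2 and 3 only, no NA
full  <- full_join(x, y, by = "key")   # keys 1, 2, 3, 4; NA on whichever side is missing
```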
There are three interrelated rules which make a dataset tidy
- Each variable should be a column.
- Each observation should be a row.
- Each value should have its own cell.
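As an illustration of the three rules, here is a sketch that reshapes an untidy table with tidyr's pivot_longer(), assuming the tidyr package is installed. The untidy table and the names year and cases are hypothetical: the variable "year" starts out spread across column names instead of sitting in its own column.

```r
library(tidyr)

# Hypothetical untidy table: years are stored as column names
untidy <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(15, 25),
  check.names = FALSE
)

# Gather every column except country into year/cases pairs,
# giving one row per country-year observation
tidy <- pivot_longer(
  untidy,
  cols = -country,
  names_to = "year",
  values_to = "cases"
)
```

After reshaping, country, year and cases are each a column, each country-year pair is a row, and each count sits in its own cell.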