CSE 6040 - 7.0 - Tidy Data Concepts Flashcards
What is Tidy Data?
A variable forms a what?
Each observation forms a what?
Each observational unit forms a what?
Tidy data is a standard way of mapping the meaning of a dataset to its structure. In tidy data:
- Each variable forms a column.
- Each observation forms a row.
- Each type of observational unit forms a table.
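As a rough illustration of the three rules above, the sketch below reshapes a "wide" table, where a variable (the year) is hidden in the column headers, into a tidy one. The table contents, the column names, and the helper `melt` are all hypothetical and written with the standard library only; in pandas the analogous operation is `DataFrame.melt`.

```python
# Hypothetical "wide" table with made-up values: one row per country,
# one column per year. The year is a variable stored in the headers.
wide = [
    {"country": "A", "1999": 10, "2000": 20},
    {"country": "B", "1999": 30, "2000": 40},
]

def melt(rows, id_vars, var_name, value_name):
    """Turn every non-identifier column into its own (variable, value) row."""
    tidy_rows = []
    for row in rows:
        for key, value in row.items():
            if key in id_vars:
                continue
            record = {k: row[k] for k in id_vars}  # copy the identifier columns
            record[var_name] = key                 # old column header becomes a value
            record[value_name] = value             # old cell becomes a value
            tidy_rows.append(record)
    return tidy_rows

# After melting: columns are variables (country, year, cases),
# and each row is one observation (one country in one year).
tidy = melt(wide, id_vars=["country"], var_name="year", value_name="cases")
for record in tidy:
    print(record)
```

The tidy result has four rows, one per (country, year) pair, and every column holds exactly one variable.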
A Data Frame representation is better suited for __________.
A Data Frame representation is better suited for regression.
Wickham defines a tidy data set as one that can be organized into a 2-D table such that
each column represents a variable;
each row represents an observation;
each entry of the table represents a single value, which may come from either _______ (discrete) or ________ spaces.
Wickham defines a tidy data set as one that can be organized into a 2-D table such that
each column represents a variable;
each row represents an observation;
each entry of the table represents a single value, which may come from either categorical (discrete) or continuous spaces.
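To make the categorical versus continuous distinction concrete, here is a minimal sketch that labels each column of a small tidy table. The table, the column names, and the helper `kind_of` are hypothetical, and the rule it uses (float-valued columns are continuous, everything else is categorical) is a crude assumption for illustration, not part of Wickham's definition.

```python
# Hypothetical tidy table with made-up values: each dict is one observation.
observations = [
    {"species": "cat", "weight_kg": 4.2},
    {"species": "dog", "weight_kg": 11.5},
    {"species": "cat", "weight_kg": 3.9},
]

def kind_of(column, rows):
    """Crude heuristic: all-float columns are continuous, others categorical."""
    values = [row[column] for row in rows]
    if all(isinstance(v, float) for v in values):
        return "continuous"
    return "categorical"

# Label every column (variable) of the table.
kinds = {column: kind_of(column, observations) for column in observations[0]}
print(kinds)
```

Here `species` takes values from a small discrete set, while `weight_kg` can take any value in a range.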
If a table is tidy, we will call it a ____ _____, or ______, for short.
If a table is tidy, we will call it a tidy table, or tibble, for short.
Identify how a computer scientist with a machine learning outlook might refer to this picture.
Columns as features and rows as data points, especially when all values are numerical (ordinal or continuous).
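The machine learning view can be sketched by turning an all-numerical tidy table into a matrix whose rows are data points and whose columns are features. The table, the feature names, and the variable `X` below are hypothetical; with NumPy or pandas one would typically use an array or a data frame instead of nested lists.

```python
# Hypothetical tidy table with made-up numerical values.
tidy_rows = [
    {"height_cm": 170.0, "weight_kg": 65.0},
    {"height_cm": 160.0, "weight_kg": 55.0},
    {"height_cm": 180.0, "weight_kg": 80.0},
]

# Fix a column order so every data point lists its features the same way.
feature_names = ["height_cm", "weight_kg"]

# X[i][j] is the value of feature j for data point i.
X = [[row[name] for name in feature_names] for row in tidy_rows]

n_points = len(X)       # number of rows, i.e. data points
n_features = len(X[0])  # number of columns, i.e. features
print(n_points, n_features)
```

Because the table is tidy, no reshaping is needed: each row already is one data point and each column already is one feature.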