Data Flashcards
1
Q
TIDY data
A
Filter, Transform, Aggregate, Sort, Join/Merge (Inner/Outer)
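As a rough illustration of the five operations above, here is a minimal sketch in plain Python. The `orders` and `customers` data, the field names, and the threshold of 10 are all made up for the example; in practice a library such as pandas would usually do this work.

```python
# Hypothetical sample data: names and values are invented for illustration.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 20.0},
    {"order_id": 2, "customer_id": "c2", "amount": 5.0},
    {"order_id": 3, "customer_id": "c1", "amount": 12.5},
    {"order_id": 4, "customer_id": "c3", "amount": 40.0},
]
customers = {"c1": "Ana", "c2": "Ben"}  # "c3" is deliberately absent

# Filter: keep only orders of at least 10
filtered = [o for o in orders if o["amount"] >= 10]

# Transform: add a derived field (amount in cents)
transformed = [{**o, "amount_cents": int(round(o["amount"] * 100))} for o in filtered]

# Aggregate: total amount per customer
totals = {}
for o in transformed:
    totals[o["customer_id"]] = totals.get(o["customer_id"], 0.0) + o["amount"]

# Sort: largest total first
ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

# Join/Merge (inner): keep only customers found in both tables
inner = [(customers[cid], total) for cid, total in ranked if cid in customers]

# Join/Merge (left outer): keep every total, with None where no name matches
left_outer = [(customers.get(cid), total) for cid, total in ranked]

print(inner)       # [('Ana', 32.5)]
print(left_outer)  # [(None, 40.0), ('Ana', 32.5)]
```

The inner join drops customer "c3" because it has no matching name, while the outer join keeps it with a missing value.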
2
Q
Common Data Problems
A
Data sits in different source systems
Messy data:
-Missing values
-Invalid values
-Errors
-Different levels
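One way to make these problems concrete is a small check that labels each value. This is only a sketch: the `rows` data, the `classify_age` helper, and the 0 to 120 plausibility range are assumptions chosen for the example, not rules from the card.

```python
# Hypothetical raw records as they might arrive from a text export.
rows = [
    {"id": "1", "age": "34"},
    {"id": "2", "age": ""},     # missing value
    {"id": "3", "age": "abc"},  # invalid: not a number
    {"id": "4", "age": "-5"},   # error: a number, but not a plausible age
]

def classify_age(raw):
    """Label a raw age string as ok, missing, invalid, or out_of_range."""
    if raw is None or raw.strip() == "":
        return "missing"
    try:
        age = int(raw)
    except ValueError:
        return "invalid"
    if not 0 <= age <= 120:  # assumed plausibility range
        return "out_of_range"
    return "ok"

report = {r["id"]: classify_age(r["age"]) for r in rows}
print(report)
# {'1': 'ok', '2': 'missing', '3': 'invalid', '4': 'out_of_range'}
```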
3
Q
Common Data Types
A
Flat File
-Not used as much now
-Each field is placed in a fixed position (e.g. the first 5 bytes of each record)
CSV
-Values are separated by commas (very common, as many systems can export a CSV)
Delimited File
-Pipe (e.g. Airbnb)
-Tab, etc.
Proprietary (e.g. SAS, SPSS, Workday)
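The three text-based formats above can be read with the Python standard library alone. The sketch below uses tiny in-memory strings as stand-ins for real files; the column layout (a 5-character id followed by a 5-character name) and the sample values are assumptions for illustration. Proprietary formats generally need vendor tools or dedicated libraries and are not shown.

```python
import csv
import io

# Flat (fixed-width) file: each field occupies a fixed position in every record.
# Assumed layout: characters 0-4 are the id, characters 5-9 are the name.
fixed_text = "00001Ana  \n00002Ben  \n"
fixed_rows = [(line[0:5], line[5:10].strip()) for line in fixed_text.splitlines()]

# CSV: values separated by commas, with a header row.
csv_text = "id,name\n1,Ana\n2,Ben\n"
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))

# Delimited file: same reader, different delimiter (pipe here; use "\t" for tab).
pipe_text = "id|name\n1|Ana\n2|Ben\n"
pipe_rows = list(csv.DictReader(io.StringIO(pipe_text), delimiter="|"))

print(fixed_rows)            # [('00001', 'Ana'), ('00002', 'Ben')]
print(csv_rows[0]["name"])   # Ana
print(pipe_rows[1]["name"])  # Ben
```

For a file on disk, `open(path, newline="")` would take the place of `io.StringIO(...)`.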