Extract-Transform-Load Flashcards
1
Q
In Extract-Transform-Load, what does extract refer to?
A
Taking data from the format you find it in and reading it so you can work with it
2
Q
In Extract-Transform-Load, what does transform refer to?
A
Any steps required to get the data into a useful format
3
Q
In Extract-Transform-Load, what does load refer to?
A
Saving the workable data, then loading it into the next pipeline step
4
Q
What is Extract-Transform-Load?
A
The process of reading data from wherever you find it, getting it into a useful format, and saving it so the next pipeline step can use it
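Example (a minimal sketch, not a definitive pipeline): the three stages can be written as three small functions chained together. The sample data, the column names, and the function names extract, transform, and load are all made up for illustration, and only the Python standard library is assumed.

```python
import csv
import io
import json

# Hypothetical raw input; in practice this would come from a file or an API.
RAW_CSV = "name,age\nAda,36\nGrace,45\n"


def extract(raw_text):
    """Read the raw CSV text into a list of dicts (every value is a string)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Get the data into a useful format: here, convert age from str to int."""
    return [{"name": r["name"], "age": int(r["age"])} for r in rows]


def load(rows):
    """Serialize the cleaned rows as JSON, ready for the next pipeline step."""
    return json.dumps(rows)


result = load(transform(extract(RAW_CSV)))
```

Keeping each stage as its own function makes it easy to swap one out, for example reading from a database instead of CSV, without touching the other two.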
5
Q
What are some things you might do as part of the transform step?
A
Fix types, split strings, aggregate data, pivot data, filter data, join data, de-identify data, clean data
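Example (a rough sketch using plain Python): a few of the transform steps above applied to a tiny made-up data set. The records, field names, and the salary threshold are assumptions for illustration only. Pivoting and joining are left out to keep it short.

```python
from collections import defaultdict

# Hypothetical records as they might look right after extraction.
rows = [
    {"name": "Ada Lovelace", "dept": "eng", "salary": "100"},
    {"name": "Grace Hopper", "dept": "eng", "salary": "120"},
    {"name": "Alan Turing", "dept": "math", "salary": "90"},
]

# Fix types: salary arrives as a string, so convert it to an int.
typed = [{**r, "salary": int(r["salary"])} for r in rows]

# Split strings: break the full name into first and last name.
split = []
for r in typed:
    first, last = r["name"].split(" ", 1)
    split.append(
        {"first": first, "last": last, "dept": r["dept"], "salary": r["salary"]}
    )

# Filter data: keep only rows with a salary of at least 100.
filtered = [r for r in split if r["salary"] >= 100]

# Aggregate data: total salary per department (over all rows, not the filtered ones).
totals = defaultdict(int)
for r in split:
    totals[r["dept"]] += r["salary"]

# De-identification: drop the name fields entirely.
deidentified = [{"dept": r["dept"], "salary": r["salary"]} for r in split]
```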
6
Q
What are some options to save your data in?
A
CSV, JSON (simple text formats, but not efficient)
Databases
HDF5 (optimized for large data sets)
Parquet
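Example (a sketch under stated assumptions): the same made-up records saved as CSV, as JSON, and into a database, using only the Python standard library. SQLite stands in for "a database" here, and the table name people is hypothetical. HDF5 and Parquet are not shown because they need third-party libraries.

```python
import csv
import io
import json
import sqlite3

# Hypothetical cleaned records, ready to be saved.
rows = [{"name": "Ada", "age": 36}, {"name": "Grace", "age": 45}]

# CSV: written to an in-memory buffer; swap in open("people.csv", "w", newline="")
# to write a real file.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "age"])
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()

# JSON: keeps the int type for age, unlike CSV where everything is text.
json_text = json.dumps(rows)

# Database: an in-memory SQLite table with typed columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (:name, :age)", rows)
conn.commit()

# Read the ages back to check that the load worked.
ages = [row[0] for row in conn.execute("SELECT age FROM people ORDER BY age")]
```

Note that reading the CSV back gives every value as a string, so the "fix types" transform has to be repeated, while JSON and the database hand back age as a number.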