Chapter 16 Flashcards
Does data transformation also include data cleansing?
Yes
What is the ETL cycle?
The order in which data moves from source to target: it is extracted from the sources, then transformed, then loaded into the target.
What is load?
The process of writing the data into the target database.
What is transform?
The process of transforming the extracted data from its original state into a consistent state so that it can be placed into another database.
What is extract?
The process of reading data from different sources
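The three definitions above can be tied together in a minimal sketch. Everything here is an assumption made for illustration: the `sales` table, the record fields `name` and `amount`, and the function names are hypothetical, and an in-memory list stands in for a real source system.

```python
import sqlite3


def extract(source_rows):
    """Extract: read raw records from a source (here, an in-memory list)."""
    return list(source_rows)


def transform(rows):
    """Transform: bring records into a consistent state for the target."""
    cleaned = []
    for row in rows:
        name = row["name"].strip().title()  # normalize whitespace and casing
        amount = float(row["amount"])       # convert text to a numeric type
        cleaned.append((name, amount))
    return cleaned


def load(conn, rows):
    """Load: write the transformed records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


def run_etl(source_rows, conn):
    """Run the cycle in order: extract, then transform, then load."""
    load(conn, transform(extract(source_rows)))
```

Calling `run_etl(rows, sqlite3.connect(":memory:"))` runs the whole cycle against a throwaway database.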
What is the first step in ETL?
Data extraction
What are the types of data extraction?
- Logical extraction
- Physical extraction
What are the logical extraction types?
- Full extraction
- Incremental extraction
What are the physical extraction types?
- Online extraction
- Offline extraction
- Legacy vs OLTP
What is full extraction?
Extract all data from the source system in one pass, without needing any extra information (such as when the last extraction ran).
What is incremental extraction?
Does not pull all data at once; extracts only the data that has changed since the last extraction, so data arrives in smaller chunks.
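The difference between the two logical extraction types can be sketched as follows. This is one common approach, not the only one: it assumes each source row carries a hypothetical `updated_at` timestamp, and that the extractor remembers when it last ran.

```python
from datetime import datetime


def full_extraction(rows):
    """Return every row; no bookkeeping about earlier runs is needed."""
    return list(rows)


def incremental_extraction(rows, last_extracted_at):
    """Return only rows changed after the previous extraction."""
    return [r for r in rows if r["updated_at"] > last_extracted_at]


# Hypothetical source data used only for illustration.
SOURCE = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]
```

With a previous run on `datetime(2024, 1, 4)`, the incremental version would return only rows 2 and 3, while the full version always returns all three.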
What is online extraction?
Keep the source system running while data is extracted, and use an intermediary system for data transformation.
What is offline extraction?
Data is not extracted directly from the source. It is first saved in dump files, databases, or other staging structures, and then moved to the destination.
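A minimal sketch of offline extraction, assuming the dump is a CSV file with hypothetical `id` and `name` columns: the source side writes the dump once, and the extraction step later reads the file without touching the source system.

```python
import csv
import os
import tempfile


def dump_to_file(rows, path):
    """Source side: save the data to a dump file outside the source system."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "name"])
        writer.writeheader()
        writer.writerows(rows)


def extract_from_dump(path):
    """ETL side: read from the dump file instead of the live source."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


# Example dump location in a temporary directory (illustrative only).
dump_path = os.path.join(tempfile.mkdtemp(), "dump.csv")
```

CSV stores everything as text, so values read back from the dump are strings and still need conversion during the transform step.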
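What is the difference between OLTP and legacy systems?
In a legacy system, data may exist only on handwritten sheets. It must first be entered into a system from those sheets and then loaded into the destination. In an OLTP system, the data is already stored electronically.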
What are the basic steps of data transformation?
1- Selection
2- Splitting/joining
3- Conversion
4- Summarization
5- Enrichment (gather data in a unified format and insert missing data chunks)
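The five steps can be sketched on a small record set. The field names (`full_name`, `country`, `amount`, `status`), the sample rows, and the `UNKNOWN` default are all hypothetical; real pipelines would apply their own rules at each step.

```python
from collections import defaultdict

# Hypothetical raw records used only for illustration.
RAW = [
    {"full_name": "Ada Lovelace", "country": "uk", "amount": "120.50", "status": "ok"},
    {"full_name": "Alan Turing", "country": "UK", "amount": "79.50", "status": "ok"},
    {"full_name": "Grace Hopper", "country": None, "amount": "50", "status": "void"},
]


def transform(rows, default_country="UNKNOWN"):
    out = []
    for row in rows:
        # 1- Selection: keep only the records the target needs.
        if row["status"] != "ok":
            continue
        # 2- Splitting: break one source field into two target fields.
        first, last = row["full_name"].split(" ", 1)
        # 3- Conversion: change the data type from text to number.
        amount = float(row["amount"])
        # 5- Enrichment: unify the format and fill in missing values.
        country = (row["country"] or default_country).upper()
        out.append({"first": first, "last": last, "country": country, "amount": amount})
    return out


def summarize(rows):
    # 4- Summarization: aggregate detail rows into totals per country.
    totals = defaultdict(float)
    for row in rows:
        totals[row["country"]] += row["amount"]
    return dict(totals)
```

Summarization is kept in its own function here because it changes the grain of the data (one row per country instead of one row per record); the other steps work row by row.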
What factors determine the choice of data loading strategy?
- Data freshness
- System performance (bulk loading is more efficient than frequent loading)
- Data volatility
What are the three data loading strategies after transformation?
- Full data refresh (load data into an empty data warehouse; loading is faster)
- Incremental data refresh (load new data into a data warehouse that is already populated)
- Continuous feed (data is loaded as it arrives, so events such as hacking or a disaster are reported immediately)
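The first two strategies can be sketched against SQLite. This is an illustration under assumptions: the `facts` table and its columns are hypothetical, and `INSERT OR REPLACE` is SQLite-specific syntax; other databases use their own upsert or merge statements.

```python
import sqlite3


def full_refresh(conn, rows):
    """Full data refresh: start from an empty table and bulk-load everything."""
    conn.execute("DROP TABLE IF EXISTS facts")
    conn.execute("CREATE TABLE facts (id INTEGER PRIMARY KEY, value TEXT)")
    conn.executemany("INSERT INTO facts VALUES (?, ?)", rows)
    conn.commit()


def incremental_refresh(conn, rows):
    """Incremental data refresh: add or update rows in an already loaded table."""
    conn.executemany("INSERT OR REPLACE INTO facts VALUES (?, ?)", rows)
    conn.commit()
```

A continuous feed could be approximated by calling `incremental_refresh` with each record as soon as it arrives, rather than waiting for a scheduled batch.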