Data Warehous Flashcards
Types of Data Warehousing architecture
1) Centralized (default)
2) Component - Based
Centralized Data Warehouse
Single Database
One stop shopping
2 types of data marts
Dependent - Depend on a datawarehouse
Independent - draws information from a source database
2 variations of ETL
Initial
Incremental
- Normally one time only
- Created right before DW goes live
- All relevant data necessary for BI & analytic
- Redo if DW “blows up”
Initial ETL
What to bring into the DW
- Data definitely needed for BI & analytics
- Data probably needed for BI & analytics
- Historical Data
- periodically “refreshed” the DW
- Adds new data
- Modifies Data
- Special handling for deleted data
Incremental ETL
4 major Incremental ETL patters
- Append (adds new data to end)
- In place update (updates current data with new info)
- Complete Replacement
- Rolling Append (when you add new data is deleted the oldest data)
2 Current MOST USED incremental ETL patters
- Append
- In place Update
2 Data Transformation overarching goals
Uniformity
Restructuring
6 common transformation models
- Data Value Unification
- Data type and size unification
- De-duplication
- Dropping columns (vertical slicing)
- Value-based row filtering (horizontal slicing)
- Correcting Known Errors
In DW
Measurements = ?
Dimensional Context = ?
Facts
Dimensions
4 types of Fact Tables
- transaction
- periodic snapshot
- accumulating snapshot
- factless
Record facts (measurement) from transactions
Transaction Fact Table
Track a given measurement at regular intervals
Periodic Snapshot Table