Data Modeling Flashcards
Consolidated Fact Tables
Combined Fact Tables that have a similar grain.
ex. Sales Actual & Sales Forecast
Star Schema
Normalized fact table surrounded by denormalized dimension tables
5 Way to handle Early Arriving Fact / Late Arriving Dimension
- Never process
- Send record to error log
- Park and retry later
- Insert dummy value
- Insert dimension and update fact later
SCD Type 0
Fixed, never changes
ex. Social Security Number, Hire Date
Slightly Ragged / Variable Depth Hierarchies
Doesn’t have a fixed number of levels, but range should be shallow
ex. Geography hierarchies range from 3 to 6 levels
Degenerate Dimensions
Dimensions that have no context except for it’s primary key
Placed in fact table, not it’s own dimension table
Ex. Invoice line item numbers
Ragged / Variable Depth Hierarchies
Unknown depth
Use bridge table for every possible path to join to proper hierarchy
SCD Type 5
- The type 5 technique builds on the type 4 mini-dimension by embedding a “current profile” mini-dimension key in the base dimension that’s overwritten as a type 1 attribute
- This approach, called type 5 because 4 + 1 equals 5, allows the currently-assigned mini-dimension attribute values to be accessed along with the base dimension’s others without linking through a fact table.
Customer Key (PK)
Customer ID (NK)
Customer Name
…
Currrent Profile Key (FK)
Current Profile Key (PK)
Current Age Band
Current Score
Current Income Level
Accumulating Snapshot Fact
Stores what happened over period of time
Ex. Sales Order that updates columns (ship date, received date) that updates columns as it progresses
Three Types of Fact Tables
- Transactions
- Periodic Snapshot
- Accumulating Snapshot
SCD Type 3
Add a previous column value to the dimension table
Slowly changing dimension type 3 changes add a new attribute in the dimension to preserve the old attribute value; the new value overwrites the main attribute as in a type 1 change. This kind of type 3 change is sometimes called an alternate reality. A business user can group and filter fact data by either the current value or alternate reality. Rarely used.
Multiple Hierarchies in Dimensions
Multiple grouping levels
Ex. Dates: year, month, day
geography: Country. region, city
category: subcategory, line item
Conceptual Data Model
What the system will contain
Typically created by business stakeholders
Entities, attributes and relationships
Logical Data Model
How the system is implemented
Done by architects and analysts
Date Dimensions
Static dimensions that are generally loaded into the system to be used to create variations of the date
Can also be boolean is_holiday, is_weekend
Outrigger Dimensions
Dimension table that is reference by another dimension table
Happens with SCD Type 4 (dimension -> mini dimension current)
Conformed Dimensions
Single dimension table used in multiple fact tables
ex. date dimension, product dimension, store dimension (address dim used for store location + where customer order shipped to)