Data Warehouses Flashcards
OLTP
Online Transaction Processing
- ACID properties required
- Details about every transaction
OLAP
Online Analytical Processing
- Data redundancy allowed
- Historical data
- Consistency before and after loads is required
Decision Support Systems (DSS)
Huge operational database
Operational Database vs. Data Warehouse
Operational stores every transaction while data warehouse stores aggregations of events through ETL
ETL
Extraction: Acquires data from sources
(Cleaning): Improves data quality
Transformation: Transforms data to DW format
Loading: Propagates changes
(Loads data to DW)
Data Mart
Small warehouse that focuses on one specific target (ex: one single store)
Dimensional Fact Model
Graphical model supporting conceptual design.
Fact schema modelling:
- Dimensions
- Hierarchies
- Measures
DRAW IT
DFM: Dimension
Attributes of a fact, typically categorical.
DFM: Fact
Event of interest.
Evolves with time.
DFM: Measure
A numerical property of a fact, typically aggregated.
Additive vs. Non Additive
Non additive cannot be aggregated along a given hierarchy by means of the SUM operator
Distributive vs. Non DIstributive (Algebraic)
Algebraic can compute higher level aggregations from more detailed data only when supplementary support measures are available, such as AVG.
Star Schema
Draw it.
Snowflake Schema vs. Star Schema
- Snowflake usually not recommended
- Partitions dimension tables in multiple extra-tables.
SQL: Range vs. Rows
Range counts on the past record, the rows past values (not ideal for data with gaps)