Article: Introduction to Data warehousing Flashcards
Decision Support System
A set of expandable, interactive IT techniques and tools designed for processing and analyzing data and for supporting managers in decision making. In practice the system provided value estimates for each alternative. Different forms of DSS: - Passive - Active - Collaborative - Model-driven - Etc.
Data warehouse
A collection of data that supports decision-making processes. A Data-warehouse is:
- Subject-oriented
- Integrated
- Consistent
- Shows evolution over time
- Not vollatile
IT systems can be divided into transactional (OLTP) and analytical (OLAP). The type of query required in data warehouses is OLAP.
Data warehousing architectures
- Single layer architecture
- Two-layer architecture (Primary DW or Corporate DW)
- Three-layer architecture
Single layer architecture
Not often used in practice. Goal is to minimize the amount of data stored by removing data redundancies. Composed of a source layer, data warehouse and analytics.
Two-layer architecture
Composed of a source layer, data staging, data warehouse (with meta-data and data marts), and analysis layer.
Three-layer architecture
The third layer is reconciled data layer (or operational data store). This materalizes operational data obtained after integrating and cleansing source data. Main advantage: creates a common reference data model for the whole enterprise. Simultaneously, it sharply separates the problems of source data extraction and integration from those of data warehouse population. However, reconciled data leads to more redundancy of operational source data. This architecture is composed of a source layer, data staging, reconciled layer, loading, DW layer and analysis.
Central architecture vs Federated architecture
- The amount of interdependent information exchanged between business units. When there is a lot of interdependent data, people tend to implement an Enterprise-wide architecture.
- Time available for the implementation. When there is little time, people tend to use a Federated architecture.
- Minor role of data-warehouse played in enterprise strategies. When the role is small, people tend to choose independent data marts.
- The frequent need for integrating pre-existing data-warehouses can require a federated architecture type.
Extract, Transform, Load (ETL)
The ETL process takes place in the staging layer. Process takes place when a data-warehouse is populated for the first time, after that it occurs every time the data-warehouse is regularly updated. 4 different steps:
- Extract
- Cleanse - rectifying data values with rectification and homogenization.
- Transform - managing data formats.
- Load
Multi-dimensional modelling
This concept looks at the measures and dimensions of values of a ‘fact’, and is the basis for the OLAP cube. Measure is a value that is numeric and can be used for calculating. Dimension is associated with hierarchy of aggregation levels, such as date. Using these dimensions and measures, you can perform restrictions (selections) and aggregations.
Meta-data
Data to define other data. Can be classified in two categories:
- Internal meta-data
- External meta-data
Internal meta-data
The users that are interested in internal meta-data are system administrators, because it contains information about data sources, transformation processes, population policies, logical and physical schemata, constraints and user profiles.
External meta-data
The users that are interested in external meta-data are end-users.
After data-warehouse
- Reports
- OLAP
- Dashboards
Reports
A query and a layout -> a query generally implies a restriction and an aggregation of multidimensional data. Data-warehouses add value to reports, because they have one version of the truth (aggregated values are reliable and correct), and performance is improved by separating transaction processes and analysis.
OLAP
A way to explore and analyze the data in the data-warehouse by means of multi-dimensional modelling. The most common operators are roll-up, drill-down, slice-and-dice, pivot, drill-across and drill-through.