Question 4 Flashcards
What is a Star Schema?
The star schema consists of a central fact table that stores transactional or measured data. Surrounding the fact table are one or more dimension tables, which are linked to the fact table through ER (Entity-Relationship) connections. The fact table contains the measurements or numerical data, while the dimension tables describe these facts by answering key questions such as “Who,” “What,” “Where,” “When,” and “How.”
What are 4 performance techniques used in a Star Schema?
- Normalizing Dimensional Tables: Simplifies semantic representation and user navigation through the data by normalizing the dimension tables.
- Maintaining Multiple Fact Tables: Stores fact tables at different aggregation levels (e.g., region, state, city) to speed up query operations.
- Denormalizing Fact Tables: Improves performance by storing data more efficiently, reducing the number of records needed to be accessed during queries.
- Partitioning and Replicating Tables: Improves data access by splitting tables into subsets and replicating them across different locations.
What is OLAP?
Online Analytical Processing (OLAP): Provides tools for multidimensional data analysis, supporting decision making, business modelling, and operations research.
What are the 3 OLAP characteristics?
- Multidimensional Data Analysis: Advanced data presentation, aggregation, computation, and modelling functions for business analysis.
- Advanced Database Support: Provides access to various DBMSs, aggregated data, and rapid query responses with advanced navigation.
- User-Friendly Interfaces: Offers an analytical interface to simplify and accelerate decision-making processes.
What is business intelligence?
A comprehensive, cohesive, and integrated set of tools and processes used to capture, collect, integrate, store, and analyse data to generate and present information to support business decision-making.
Differentiate between operational data and decision support data.
- Time span: Operational data (OD) covers a short time frame, while decision support data (DSD) tends to cover a longer time frame.
- Granularity: Describes the levels of aggregation. Decision support data must be presented at various levels of aggregation, from highly summarized (DSD) to nearly atomic (OD).
- Dimensionality: Operational data focuses on representing individual transactions, rather than the effect of those transactions over time (DSD). Decision support data includes multiple dimensions to show how data relates across those dimensions.