05 Conceptual (and logical) multidimensional modeling Flashcards
List the main steps involved in schema design process for DWH Environments:
-
User requirements and OLTP Schemas are used as inputs into a Conceptual data model
- (M E/R, YAM2, ADAPT).
- A Logical data model follows
- ROLAP, MOLAP, HOLAP, Star Schema, Snowflake Schema
- Finally, a Physical data model is created.
What are the steps to design a Dimensional Model?
4 steps are need to design a Dimensional Model:
-
Choose the business process
- Usually done by business department, business processes are the fundamental building block of the dimensional DWH, each process equals at least one fact table.
-
Declare the grain
- Declare level of detail in the fact table, what does a fact table measurement represents? (a purchase, a transaction, a rental..)
-
Identify the dimensions
- Determine applicable dimensions, dimensions usually come from grain, originally planned 1-dimensions may be two or three at the end.
- Reference + Ratio -> Fact
-
Identify the facts
- Identify facts and measures from the business process.
- Quantity and amount as fundamental facts in transaction-oriented processes.
What is the purpose/mission of a conceptual schema?
- Multidimensional data modeling can play a key role in the design of the dimensional model.
- Assure the flexibility and re-usability of the schema after the user requirements have changed.
- Not assume any facts that are the results of further design steps (DB technology used).
- Semantic relativism: model can accomodate not only one, but many different concepts.
- Enabling the discussion between business and IT about the requirements of information analysis (example: identification and elimination of weaknesses).
Mention key characteristics of OLAP Databases:
-
Dimension with (complex) hierarchies.
- Different levels correspond to different data granularities
- Multiple hierarchies on a single dimension
- Alternative hierarchy paths
-
Multiple cubes (cube architecture).
- Superior cubes
- Analysis of different facts with different set of dimensions
- Summarizability constraints
- Calculated measures, Ratio systems
Why the ER model is not appropiate for modeling Multidimensional Databases? (MDB)
- Distinct from ER, dimensional modeling divides databases into 2 types of objects: fact and dimension tables.
- Dimension tables are de-normalized.
- THe inherent separation of qualifying and quantifying data cannot be expressed in an E/R Model.
- Semantincs of the complex dimension structures are to specific for E/R models.
- ER are closely aligned with a physical implementation that reflects a relational DB approach.
- Therefore: an enhanced ERM (Multidimensional ERM ME/R is proposed)
Mention the graphical elements used to enhance the traditional ER language into a Multidimensiona ERM:
- Idea: slightly enhance the ER language to ensure the flexibility and the simplicity of the ER notification, but allow the definition of hierarchies.
-
Elements added:
- A fact relationship set
- A dimension level set
- A classification relationship set
What are the goals of the Object-Oriented Multidimensional Modeling?
(With YAM2)
- Paradigm of object orientation is the fundament for software engineering.
- The long-term use of UML as modeling and design language has been proven as highly useful.
- As UML is widespread in science and practice, it can be assumed, that there is less adoption effort.
- UML-based multidimensional models can be homogeneously integrated into software engineering.
- UML schema can be enhanced easily by applying specialization techniques.
What are some critics and evaluations of YAM2?
- YAM2 modelling captures all aspects but its criticized as too cumbersome and complex.
- The language is a complete extension of UML for multidimensional modeling.
- UML analytical methods can be used for analyzing purposes.
- Avoid definitions from scratch.
- All elements are specializations of UML constructs.
- YAM2 fulfills almost all requirements for multi-dimensional models
- Explicit separation of structure and contents
- Measure sets, Multi-star schemas, many-to-many relationships
Define ADAPT and describe its 8 basic modeling objects:
- ADAPT = Application Design for Analytical Processing Technologies.
- ADAPT enables the modeler to build manifold dimensional and hierarchical structures.
- Basic modeling objects:
-
Hypercube (cube)
- N-dimensional arrary, in an OLAP DB a hypercube is the basic unit of storage for business data
- Context as a portion of a hypercube that provides a context for analysis.
-
Dimension
- Axis or index of an hypercube
-
Hiearchy
- Set of parent/child member combinations that define aggregation.
-
Level
- Collection of members used to define hierarchical precedence (calendar types).
-
Model
- Algebraic process to calculate derived data. Document the source and target data of a calculation.
-
Member
- An individual dimension value (used as “examples” in models).
-
Attribute
- Information about a dimension member. Applied to dimensions/levels (as “additional information”)
-
Scope
- A (sub-set) collection of dimension members. Difference between calculated and enumerated scopes (examples: “current month”, “year-to-date”).
-
Hypercube (cube)
What are the (5) main ADAPT Connecting Elements:
ADAPT Connecting elements:
- Loose precedence (there are objects not associated to the superior group)
- Strict precedence (every object is associated to exactly one superior object)
- Self precedence
- Used by
- Connector
# **Define the 2 Core ADAPT Objects** and describe its graphical representation: **Hypercube and Dimension.**
-
Hypercube:
- N-dimensional array
- In an OLAP DB, a hypercube is the basic unit of storage for business data.
-
Dimension
- Axis or index of an hypercube
# Define the 2 **ADAPT Hierarchy Objects** and describe its graphical representation: **Hierarchy and Level.**
-
Hierarchy
- Set of parent/child member combinations that define aggregation.
-
Level
- Collection of members used to define hierarchical precedence.
# Define the 2 **ADAPT Dimension Objects** and describe its graphical representation: **Member and Attribute.**
-
Member
- An individual dimension value (“examples” in the model).
-
Attribute
- Information about a dimension member (“additional information”).
- Can apply to both dimension and level.
Define the ADAPT Scope object and describe its graphical representation:
-
Scope
- A collection of dimension members
- Difference between calculated and enumerated scopes
- Other examples: current month, year-to-date.
# Define the 2 **ADAPT Dimension Objects** and describe its graphical representation: **Model and Context.**
-
Model
- An algebraic process to calculate derived data.
- Document the source and target data of a calculation.
-
Context
- A portion of a hypercube that provides a context for analysis.