Kimball's 4 step Modeling Process Flashcards
What is Kimball 4 step modeling process
1: Select the business process
2: Declare the grain
3: Identify the dimensions
4: Identify the facts
Select business Is described as:
- Business processes are the operational activities performed by your organization
- Business process events generate or capture performance metrics that translate into facts in a fact table.
- Most fact tables focus on the results of a single business process. Choosing the process is important because it defines a specific design target and allows
the grain, dimensions, and facts to be declared. - Each business process corresponds to a row in the enterprise data warehouse
bus matrix.
Declare Grain can be described as:
- The pivotal step in a dimensional design.
- The grain establishes exactly what a single fact table row represents.
- The grain declaration becomes a binding contract on the design.
- The grain must be declared before choosing dimensions or facts because every candidate dimension or fact must be consistent with the grain.
- This consistency enforces a uniformity on all dimensional designs that is critical to BI application performance and ease of use.
- Atomic grain refers to the lowest level at which data is captured by a given business process.
- Strongly encourage you to start by focusing on atomic-grained data because it withstands the assault of unpredictable user queries; rolled-up summary grains are important for performance
tuning, but they pre-suppose the business’s common questions. - Each proposed fact table grain results in a separate physical table; different grains must not be
mixed in the same fact table.
Dimesons for descriptive product can be described as:
- Dimensions provide the “who, what, where, when, why, and how” context surrounding a business process event.
- Dimension tables contain the descriptive attributes used by BI applications for filtering and grouping the facts. With the grain of a fact table firmly in mind, all the possible dimensions can be identified. Whenever possible, a dimension should be single valued when associated with a given fact row.
- Dimension tables are sometimes called the “soul” of the data warehouse because they contain the entry points and descriptive labels that enable the DW/BI system to be leveraged for business analysis. Disproportionate
amount of effort is put into the data governance and development of
dimension tables because they are the drivers of the user’s BI experience.
Facts for Measurements includes:
- Facts are the measurements that result from a business process event
and are almost always numeric. - A single fact table row has a one-to-one relationship to a
measurement event as described by the fact table’s grain. Thus a fact
table corresponds to a physical observable event, and not to the
demands of a particular report. - Within a fact table, only facts consistent with the declared grain are allowed.
What are the steps for building a data warehouse from a normalized database?
- Develop a normalized entity-relationship business model of the data warehouse.
- Translate this into a dimensional model. This step reflects the
information and analytical characteristics of the data warehouse. - Translate this into the physical model. This reflects the changes
necessary to reach the stated performance objectives.
What are the steps for dimensional modeling?
- Select an associative entity for a fact table
- Determine granularity
- Replace operational keys with surrogate keys
- Promote the keys from all hierarchies to the fact table
- Add date dimension
- Split all compound attributes
- Add necessary categorical dimensions
- Fact (varies with time) / Attribute (constant)
How do you convert an E-R Diagram
- Determine the purpose of the mart
- Identify an association table as the central fact table
- Determine facts to be included
- Replace all keys with surrogate keys
- Promote foreign keys in related tables to the fact table
- Add time dimension
- Refine the dimension tables
Describe fact Tables
Represent a process or reporting environment that is of value to the
organization
- It is important to determine the identity of the fact table and specify exactly what it represents.
- Typically correspond to an associative entity in the E-R model
Describe Grain (unit of analysis)
The grain determines what each fact record represents: the level of
detail.
- For example:
- Individual transactions
- Snapshots (points in time)
- Line items on a document
- Generally better to focus on the smallest grain
Describe Facts
Measurements associated with fact table records at fact table granularity
- Normally numeric and additive
- Non-key attributes in the fact table
- Attributes in dimension tables are constants. Facts vary with the granularity of
the fact table
Describe Dimensions
A table (or hierarchy of tables) connected with the fact table with keys
and foreign keys
- Preferably single valued for each fact record (1:m)
- Connected with surrogate (generated) keys, not operational keys
- Dimension tables contain text or numeric attributes