Data Warehousing - Data Modeling Flashcards

Question

What are the primary functions of a dimension?

Answer 1

The primary functions of dimensions are to provide: 1) filtering, 2) grouping, and 3) labeling. These functions are also known as "slice and dice": slicing is the filtering of data, and dicing is the grouping of data.
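As a minimal sketch of filtering, grouping, and labeling, assuming a hypothetical in-memory product dimension and sales fact (all names and values here are invented for illustration):

```python
from collections import defaultdict

# Hypothetical product dimension: surrogate key -> descriptive attributes.
product_dim = {
    1: {"name": "Espresso", "category": "Coffee"},
    2: {"name": "Latte", "category": "Coffee"},
    3: {"name": "Green Tea", "category": "Tea"},
}

# Hypothetical fact rows: (product_key, sales_amount).
sales_fact = [(1, 10.0), (2, 12.5), (3, 7.0), (1, 5.0)]


def total_by(attribute, only_category=None):
    """Group sales by a dimension attribute, optionally filtering by category."""
    totals = defaultdict(float)
    for product_key, amount in sales_fact:
        row = product_dim[product_key]
        # Filtering (slicing): keep only facts whose dimension row matches.
        if only_category is not None and row["category"] != only_category:
            continue
        # Grouping (dicing) and labeling: the attribute value is the row header.
        totals[row[attribute]] += amount
    return dict(totals)


by_category = total_by("category")
coffee_by_name = total_by("name", only_category="Coffee")
```

The fact rows carry only keys and measures; every filter value and every row header in the result comes from the dimension.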

Answer 2

A slowly changing dimension (SCD) is a set of data attributes that change slowly over a period of time rather than changing regularly, e.g. an address or a name. Slowly changing dimensions can be classified into several types:
* Type 0 (Retain original): Attributes never change. No history.
* Type 1 (Overwrite): Old values are overwritten with new values for the attribute. No history.
* Type 2 (Add new row): For a new value, a new row is created with either a start date / end date or a version number. This creates a history.
* Type 3 (Add new attribute): For a new value, a new column is created. History is limited to the number of columns designated for storing historical data.
* Type 4 (Add history table): One table keeps the current value, whereas the history is saved in a second table. This creates a history.
* Type 5 (Combined Approach 1 + 4): Combination of type 1 and type 4. History is created through a second history table.
* Type 6 (Combined Approach 1 + 2 + 3): Combination of type 1, type 2, and type 3. History is created through separate rows and attributes.
* Type 7 (Hybrid Approach): Both surrogate and natural keys are used.[4]
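The Type 2 behavior (add a new row, keep the old one as history) can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the column names, the `apply_type2` helper, and the use of 9999-12-31 as the "still current" end date are all hypothetical conventions chosen for the example.

```python
from datetime import date

HIGH_DATE = date(9999, 12, 31)  # assumed convention for "still current"


def apply_type2(rows, natural_key, new_attrs, change_date):
    """Close the current row for natural_key and append a new version."""
    current = next(
        (r for r in rows
         if r["natural_key"] == natural_key and r["end_date"] == HIGH_DATE),
        None,
    )
    if current is not None:
        if all(current[k] == v for k, v in new_attrs.items()):
            return rows  # nothing changed, keep history as is
        current["end_date"] = change_date  # expire the old version
    next_key = max((r["surrogate_key"] for r in rows), default=0) + 1
    rows.append({
        "surrogate_key": next_key,
        "natural_key": natural_key,
        **new_attrs,
        "start_date": change_date,
        "end_date": HIGH_DATE,
    })
    return rows


customer_dim = []
apply_type2(customer_dim, "C-001", {"city": "Lyon"}, date(2020, 1, 1))
apply_type2(customer_dim, "C-001", {"city": "Paris"}, date(2023, 6, 1))
```

After the second call the dimension holds two rows for the same natural key, each with its own surrogate key. A Type 1 change would instead overwrite `city` on the existing row and lose the earlier value.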

Answer 3

A conformed dimension is a dimension that has the same meaning for every fact to which it relates. It is a set of data attributes that is physically referenced in multiple database tables using the same key values to refer to the same structure, attributes, domain values, definitions, and concepts. A conformed dimension cuts across many facts. Dimensions are conformed when they are either exactly the same (including keys) or one is a proper subset of the other; that is, conformed dimensions are either identical to, or strict mathematical subsets of, the most granular, detailed dimension. Most importantly, the row headers produced in two different answer sets from the same conformed dimension(s) must match perfectly. Dimension tables are not conformed if their attributes are labeled differently or contain different values. Conformed dimensions come in several flavors. At the most basic level, a conformed dimension means exactly the same thing for every fact table to which it is joined: for example, the date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.[5]
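The requirement that row headers from two answer sets match can be illustrated with a small sketch. The product dimension, both fact tables, and the `summarize` helper below are hypothetical and exist only to show the idea.

```python
# Hypothetical conformed product dimension shared by both fact tables.
product_dim = {1: "Coffee", 2: "Tea"}

sales_fact = [(1, 100.0), (2, 40.0), (1, 50.0)]  # (product_key, sales_amount)
inventory_fact = [(1, 30), (2, 80)]              # (product_key, units_on_hand)


def summarize(fact_rows):
    """Aggregate a fact table by the conformed product label."""
    totals = {}
    for product_key, value in fact_rows:
        label = product_dim[product_key]
        totals[label] = totals.get(label, 0) + value
    return totals


sales = summarize(sales_fact)
inventory = summarize(inventory_fact)

# Merge the two answer sets on their shared row headers.
combined = {
    label: (sales[label], inventory[label])
    for label in sorted(sales.keys() & inventory.keys())
}
```

Because both facts use the same keys and the same labels, the two result sets line up row by row. If each fact table had its own differently labeled product dimension, the merge step would have nothing reliable to match on.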

Answer 4

A junk dimension is a dimension table consisting of attributes that do not belong in the fact table or in any of the existing dimension tables. These attributes are usually text or various flags, e.g. non-generic comments or simple yes/no or true/false indicators. Such attributes are typically left over once all the obvious dimensions in the business process have been identified, leaving the designer with the question of where to put attributes that do not belong in any of the other dimensions.
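One common way to build a junk dimension, not stated on the card itself, is to generate one row per combination of the leftover flag values and store a single key in the fact table. The sketch below assumes three hypothetical flags on an order process.

```python
from itertools import product

# Hypothetical low-cardinality flags left over after modeling an order process.
flag_domains = {
    "is_gift": (False, True),
    "is_rush": (False, True),
    "payment_type": ("card", "cash"),
}

# One junk-dimension row per combination of flag values.
junk_dim = [
    {"junk_key": key, **dict(zip(flag_domains, combo))}
    for key, combo in enumerate(product(*flag_domains.values()), start=1)
]

# Lookup used at load time to swap the raw flags for a single foreign key.
junk_lookup = {
    tuple(row[name] for name in flag_domains): row["junk_key"]
    for row in junk_dim
}

order_flags = (True, False, "cash")  # (is_gift, is_rush, payment_type)
junk_key = junk_lookup[order_flags]
```

The fact table then carries one `junk_key` column instead of three separate flag columns. Pre-building every combination only works when the flags have small domains; free-text comments would need a different treatment.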

Answer 5

Degenerate dimensions, also called fact dimensions, are standard dimensions that are constructed from attribute columns in fact tables instead of from attribute columns in dimension tables. This is because useful dimensional data is sometimes stored in a fact table to reduce duplication, especially when you have a very large fact table. A degenerate dimension:
- is a key (like an invoice number) that has no attributes and does not join to a dimension table
- is common when the grain of a fact table represents a single transaction item or line item, because the degenerate dimension represents the unique identifier of the parent
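A small sketch of how a degenerate dimension is used, assuming a hypothetical line-item fact table in which `invoice_number` is the degenerate key:

```python
from collections import defaultdict

# Hypothetical line-item fact rows. invoice_number is a degenerate dimension:
# it lives in the fact table and has no dimension table to join to.
line_item_fact = [
    {"invoice_number": "INV-100", "product_key": 1, "amount": 20.0},
    {"invoice_number": "INV-100", "product_key": 2, "amount": 5.0},
    {"invoice_number": "INV-101", "product_key": 1, "amount": 12.0},
]

# Group line items back into their parent invoice using the degenerate key.
totals = defaultdict(float)
for row in line_item_fact:
    totals[row["invoice_number"]] += row["amount"]
invoice_totals = dict(totals)
```

The invoice number still filters and groups like any other dimension attribute; it simply has no descriptive columns of its own that would justify a separate table.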

Answer 6

Role-playing dimensions are dimensions that are reused for multiple purposes within the same database. For instance, a "Date" dimension can be used for "Date of Sale" as well as "Date of Delivery" or "Date of Hire". Each role can be implemented as a view over the same underlying dimension table.
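The view-based implementation can be sketched with Python's built-in `sqlite3` module. The table, view, and column names below are hypothetical; only the pattern (one physical date table, one view per role) comes from the card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, full_date TEXT);
    INSERT INTO date_dim VALUES (20240105, '2024-01-05'), (20240109, '2024-01-09');

    -- One view per role, each with role-specific column names.
    CREATE VIEW sale_date AS
        SELECT date_key AS sale_date_key, full_date AS sale_date FROM date_dim;
    CREATE VIEW delivery_date AS
        SELECT date_key AS delivery_date_key, full_date AS delivery_date FROM date_dim;

    CREATE TABLE order_fact (order_id INTEGER, sale_date_key INTEGER,
                             delivery_date_key INTEGER);
    INSERT INTO order_fact VALUES (1, 20240105, 20240109);
""")

# The fact table joins to the same physical dimension twice, once per role.
rows = conn.execute("""
    SELECT f.order_id, s.sale_date, d.delivery_date
    FROM order_fact AS f
    JOIN sale_date AS s ON s.sale_date_key = f.sale_date_key
    JOIN delivery_date AS d ON d.delivery_date_key = f.delivery_date_key
""").fetchall()
```

Renaming the columns in each view keeps report labels unambiguous when both roles appear in the same query.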

Answer 7

As per Kimball, when a dimension in the DWH is linked to another dimension table, the secondary table is called an outrigger dimension. For example, a bank account dimension can reference a separate dimension representing the date the account was opened. Outrigger dimensions are permissible, but should be used sparingly. In most cases, the correlations between dimensions should be demoted to a fact table, where both dimensions are represented as separate foreign keys.
- Usually, dimension tables do not reference other dimensions via foreign keys. When this happens, the referenced dimension is called an outrigger dimension.
- Outrigger dimensions should be considered a data warehouse anti-pattern: it is considered better practice to use a fact table that relates the two dimensions.

Answer 8

A shrunken dimension entity is a perfect subset of a more detailed, granular dimension entity. In this case, the attributes that are common to both the detailed dimension and the shrunken subset dimension have the same attribute names, definitions, and domain values.
- A conformed dimension is said to be a shrunken dimension when it includes a subset of the rows and/or columns of the original dimension.
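As an illustration, a month-grain dimension can be derived from a day-grain date dimension by keeping only the shared columns and removing duplicates. The rows and column names below are hypothetical.

```python
# Hypothetical day-grain date dimension rows.
date_dim = [
    {"date_key": 20240130, "month_key": 202401, "month_name": "January", "year": 2024},
    {"date_key": 20240131, "month_key": 202401, "month_name": "January", "year": 2024},
    {"date_key": 20240201, "month_key": 202402, "month_name": "February", "year": 2024},
]

# Columns shared with the month-grain dimension; names and values are unchanged.
MONTH_COLUMNS = ("month_key", "month_name", "year")

# Project the shared columns and de-duplicate to get one row per month.
seen = {}
for row in date_dim:
    seen.setdefault(row["month_key"], {c: row[c] for c in MONTH_COLUMNS})
month_dim = list(seen.values())
```

Because the month rows are copied from the detailed dimension rather than maintained separately, the attribute names and domain values stay identical, which is what keeps the two dimensions conformed.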

Answer 9

A special type of dimension can be used to represent dates at a granularity of one day. Dates are referenced in a fact table as foreign keys to the date dimension. Calendar date dimensions are attached to virtually every fact table to allow navigation of the fact table through familiar dates, months, fiscal periods, and special days on the calendar. The date dimension typically has many attributes, such as week of the year, month name, fiscal period, and flags for work days and national holidays. You would never want to compute Easter in SQL, but rather look it up in the calendar date dimension.

The primary key of the date dimension can be a sequentially assigned surrogate key, but to facilitate partitioning it can be more meaningful, such as an integer in the format YYYYMMDD. If such a smart date key is used, filtering and grouping should still be based on the dimension table's attributes, not on the smart key. The date dimension also needs special rows to represent unknown or to-be-determined dates. The table should be initialized with all the required dates, say the next 10 years of dates, or more if required, plus past dates if events in the past are handled.

Time of day is usually best represented as a timestamp in the fact table. When further precision is needed, a separate date/time stamp can be added to the fact table; it is not a foreign key to a dimension table, but rather a standalone column. If business users constrain or group on time-of-day attributes, such as day part grouping or shift number, then you would add a separate time-of-day dimension foreign key to the fact table.
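Populating a date dimension ahead of time can be sketched as below. The `build_date_dim` helper, the attribute names, and the choice of 0 as the key for the unknown row are assumptions made for this example; real date dimensions usually carry many more attributes (fiscal periods, holidays, and so on).

```python
from datetime import date, timedelta

UNKNOWN_DATE_KEY = 0  # assumed key for the "unknown / to be determined" row


def build_date_dim(start, end):
    """Return one row per calendar day in [start, end], plus an unknown row."""
    rows = [{
        "date_key": UNKNOWN_DATE_KEY, "full_date": None, "month_name": "Unknown",
        "iso_week": None, "is_weekday": None,
    }]
    day = start
    while day <= end:
        rows.append({
            "date_key": int(day.strftime("%Y%m%d")),  # smart key, YYYYMMDD
            "full_date": day,
            "month_name": day.strftime("%B"),  # locale-dependent month name
            "iso_week": day.isocalendar()[1],
            "is_weekday": day.weekday() < 5,  # Monday=0 ... Sunday=6
        })
        day += timedelta(days=1)
    return rows


date_dim = build_date_dim(date(2024, 1, 1), date(2024, 12, 31))
```

Queries should filter on attributes such as `month_name` or `is_weekday` rather than doing arithmetic on `date_key`, in line with the advice above about smart keys.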

Answer 10

Fact tables are the core tables of a data warehouse. They contain quantitative information, commonly associated with points in time. They are used in trends, comparisons, aggregations, and groupings. They feed analysis and visualization tools to allow insights to be discovered about the functional area. Dimensions, on the other hand, are collections of reference information about the facts in a data warehouse. Dimensions categorize and describe the facts recorded in a data warehouse to provide meaningful, categorized, and descriptive answers to business questions.

(34 cards)