Basic Data Warehousing and Architectures Flashcards
-a central goal you are trying to achieve
theme
-a group of data elements that are central to achieving that goal
critical success factor
-a specific question that can be tied to data to identify whether the critical success factor is being met or not.
business questions
1. Select the business process to model
2. Declare the grain of the business process
3. Choose the dimensions that apply to each fact table row
4. Identify the facts
-is part science, part art
Dimensional Modeling Process
select business process to model
-specify exactly what an individual fact table row represents
-conveys the level of detail associated with fact table measurements.
-it is highly recommended to choose the most granular or atomic information captured by the business process.
declare the grain of the business process
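As a minimal sketch of why the atomic grain is recommended, the Python below starts from hypothetical rows at a grain of one row per product per sales ticket and rolls them up to coarser levels. The column names and the `roll_up` helper are illustrative assumptions, not part of any particular tool.

```python
from collections import defaultdict

# Hypothetical atomic-grain rows: one row per product per sales ticket.
atomic_rows = [
    {"date": "2024-01-01", "store": "S1", "product": "P1", "ticket": 1, "sales_dollars": 10.0},
    {"date": "2024-01-01", "store": "S1", "product": "P2", "ticket": 1, "sales_dollars": 15.0},
    {"date": "2024-01-01", "store": "S2", "product": "P1", "ticket": 2, "sales_dollars": 7.5},
    {"date": "2024-01-02", "store": "S1", "product": "P1", "ticket": 3, "sales_dollars": 20.0},
]


def roll_up(rows, keys, measure):
    """Sum one measure over the rows, grouped by the given key columns."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row[measure]
    return dict(totals)


# The same atomic rows answer questions at several coarser grains.
daily_by_store = roll_up(atomic_rows, ("date", "store"), "sales_dollars")
by_product = roll_up(atomic_rows, ("product",), "sales_dollars")
```

Rolling up from atomic rows is always possible; recovering ticket-level detail from the daily totals is not.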
-determine the ways the data will be aggregated or filtered.
-identify the level of hierarchy associated with each part of the grain
choose the dimensions
-determine the measurements that are available at the chosen grain
-identify any consolidations, calculations or conversions to be done.
identify the facts
-all measurements must be at the same grain
-contains two or more foreign keys to dimension tables
-expresses the many-to-many relationships between dimensions in dimensional models.
-primary table which stores the performance measurements of the business.
-each measurement is taken at the intersection of all relevant dimensions
fact table
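A fact table of this kind might be sketched with Python's built-in `sqlite3` module as follows. The table and column names (`sales_fact`, `date_dim`, and so on) and the daily product-by-store grain are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE date_dim    (date_key    INTEGER PRIMARY KEY, full_date    TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE store_dim   (store_key   INTEGER PRIMARY KEY, store_name   TEXT);

-- Assumed grain: one row per product per store per day.
CREATE TABLE sales_fact (
    date_key      INTEGER NOT NULL REFERENCES date_dim(date_key),
    product_key   INTEGER NOT NULL REFERENCES product_dim(product_key),
    store_key     INTEGER NOT NULL REFERENCES store_dim(store_key),
    quantity_sold INTEGER NOT NULL,
    sales_dollars REAL    NOT NULL,
    PRIMARY KEY (date_key, product_key, store_key)
);

INSERT INTO date_dim    VALUES (1, '2024-01-01');
INSERT INTO product_dim VALUES (1, 'Widget');
INSERT INTO store_dim   VALUES (1, 'Downtown');
INSERT INTO sales_fact  VALUES (1, 1, 1, 3, 29.97);
""")

# A fact row must sit at the intersection of existing dimension rows:
# product_key 99 does not exist, so this insert is rejected.
try:
    conn.execute("INSERT INTO sales_fact VALUES (1, 99, 1, 1, 9.99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

fact_rows = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
```

Each measurement (`quantity_sold`, `sales_dollars`) is stored once, keyed by the combination of dimension keys that describes it.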
-look at the OLTP schema (or available extracts) to identify possible measures.
-determine the lowest grain possible
design guidelines
-contain the textual descriptors of the business
-usually low in cardinality, but very wide
-dimension attributes are used as query constraints, groupings, and report labels.
-the more descriptive the dimension attributes, the better
-often contain hierarchical relationships
dimension tables
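The three roles of dimension attributes (constraint, grouping, report label) can be seen in a single query. This is a hedged sketch using `sqlite3`; the product hierarchy (product, brand, category) and all sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    brand        TEXT,   -- hierarchy: product -> brand -> category
    category     TEXT
);
CREATE TABLE sales_fact (product_key INTEGER, sales_dollars REAL);

INSERT INTO product_dim VALUES
    (1, 'Cola 12oz',    'FizzCo',  'Beverages'),
    (2, 'Cola 2L',      'FizzCo',  'Beverages'),
    (3, 'Lemonade 1L',  'SunSip',  'Beverages'),
    (4, 'Potato Chips', 'Crunchy', 'Snacks');
INSERT INTO sales_fact VALUES (1, 10.0), (2, 20.0), (3, 5.0), (4, 8.0), (1, 2.5);
""")

rows = conn.execute("""
    SELECT p.brand,                          -- report label
           SUM(f.sales_dollars) AS total_sales
    FROM sales_fact AS f
    JOIN product_dim AS p ON p.product_key = f.product_key
    WHERE p.category = 'Beverages'           -- query constraint
    GROUP BY p.brand                         -- grouping
    ORDER BY p.brand
""").fetchall()
```

Descriptive text columns such as `brand` and `category` are what make the result readable without a lookup of codes.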
(benefits of dimensional model)
-easy for business users to understand.
-improved query performance
simplicity
(benefits of dimensional model)
-easily accommodates change
extensibility
gross profit / sales dollar amount
gross margin
sales dollar amount - cost dollar amount
gross profit
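The two formulas above, written as small Python functions. The function names and sample amounts are illustrative assumptions.

```python
def gross_profit(sales_dollars, cost_dollars):
    """Gross profit = sales dollar amount - cost dollar amount."""
    return sales_dollars - cost_dollars


def gross_margin(sales_dollars, cost_dollars):
    """Gross margin = gross profit / sales dollar amount."""
    if sales_dollars == 0:
        raise ValueError("gross margin is undefined when sales are zero")
    return gross_profit(sales_dollars, cost_dollars) / sales_dollars


profit = gross_profit(200.0, 150.0)
margin = gross_margin(200.0, 150.0)
```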
-a fact is … if we can sum the fact across all dimensions and obtain a valid and correct number
additive
-a fact is … if the summation of the fact across any dimension results in a meaningless, nonsensical number
nonadditive
-a fact is … if it is additive across some dimensions and nonadditive across other dimensions
semiadditive
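A small worked example of the three kinds of facts, assuming hypothetical daily store rows: sales dollars are treated as additive, gross margin as nonadditive, and on-hand inventory as semiadditive (summable across stores but not across days).

```python
# Hypothetical fact rows: one row per store per day.
rows = [
    {"day": 1, "store": "A", "sales": 100.0, "cost": 50.0,  "on_hand": 10},
    {"day": 1, "store": "B", "sales": 300.0, "cost": 240.0, "on_hand": 20},
    {"day": 2, "store": "A", "sales": 200.0, "cost": 160.0, "on_hand": 8},
    {"day": 2, "store": "B", "sales": 400.0, "cost": 350.0, "on_hand": 15},
]

# Additive: sales dollars can be summed across every dimension.
total_sales = sum(r["sales"] for r in rows)
total_cost = sum(r["cost"] for r in rows)

# Nonadditive: adding per-row margins gives a meaningless number.
summed_margins = sum((r["sales"] - r["cost"]) / r["sales"] for r in rows)
# Instead, sum the additive components first and then take the ratio.
overall_margin = (total_sales - total_cost) / total_sales

# Semiadditive: inventory can be summed across stores for a single day...
on_hand_day2 = sum(r["on_hand"] for r in rows if r["day"] == 2)
# ...but across days an average (or the latest value) is used, not a sum.
store_a = [r["on_hand"] for r in rows if r["store"] == "A"]
avg_on_hand_store_a = sum(store_a) / len(store_a)
```

Here the summed margins exceed 100%, which no single row or total could justify, while the ratio of the summed components gives the overall margin.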
-it is highly recommended to use this for dimension table keys.
-are simply integers assigned sequentially to a particular dimension row.
surrogate keys
-are frequently the natural keys from the source system and are used to determine which surrogate key to use.
-are also retained for analysis purposes
operational codes
-buffer the data warehouse from changes in operational codes.
-can save space due to their small size compared to operational codes.
-allow recording of conditions which do not have an operational code
-allow handling of changes to dimension table attributes
benefits of surrogate keys
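These benefits can be illustrated with a hedged in-memory sketch. The `ProductDimension` class, its method names, and the sample codes are hypothetical; adding a new row when an attribute changes is one common way to handle such changes, not the only one.

```python
class ProductDimension:
    """Hypothetical dimension that assigns sequential integer surrogate keys."""

    def __init__(self):
        self.rows = {}      # surrogate key -> row (operational code plus attributes)
        self.current = {}   # operational code -> current surrogate key
        self._next_key = 1
        # Reserve a row for facts that have no operational code.
        self.unknown_key = self._add_row(None, {"name": "Unknown"})

    def _add_row(self, code, attributes):
        key = self._next_key
        self._next_key += 1
        self.rows[key] = {"operational_code": code, **attributes}
        if code is not None:
            self.current[code] = key
        return key

    def upsert(self, code, attributes):
        """Return the current key, adding a new row if the attributes changed."""
        key = self.current.get(code)
        if key is not None and all(
            self.rows[key].get(name) == value for name, value in attributes.items()
        ):
            return key
        return self._add_row(code, attributes)

    def lookup(self, code):
        """Map an operational code to its surrogate key (or the unknown row)."""
        return self.current.get(code, self.unknown_key)


dim = ProductDimension()
k1 = dim.upsert("SKU-100", {"name": "Widget", "category": "Tools"})
k1_again = dim.upsert("SKU-100", {"name": "Widget", "category": "Tools"})
k2 = dim.upsert("SKU-100", {"name": "Widget", "category": "Hardware"})  # attribute changed
missing = dim.lookup(None)  # no operational code available
```

The operational code stays on each row as an attribute, so old facts keep pointing at the old description while new facts pick up the new key.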
-software that extracts data from legacy systems and external sources, consolidates and summarizes it, and loads it into the data warehouse.
data acquisition (back-end) software
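A toy sketch of the extract, consolidate, summarize, and load steps in plain Python. The two source layouts, the function names, and the list standing in for the warehouse are all assumptions for illustration; real back-end tools are far more involved.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical extracts: a legacy system and an external source with different layouts.
legacy_orders = [("2024-01-01", "SKU-1", "19.99"), ("2024-01-01", "SKU-2", "5.01")]
external_orders = [{"order_date": "2024-01-02", "item": "SKU-1", "amount_cents": 1250}]


def extract_and_consolidate():
    """Bring both sources into one common record layout (amounts in cents)."""
    records = []
    for order_date, sku, amount in legacy_orders:
        cents = int(Decimal(amount) * 100)  # Decimal avoids float rounding on dollars
        records.append({"date": order_date, "sku": sku, "cents": cents})
    for order in external_orders:
        records.append(
            {"date": order["order_date"], "sku": order["item"], "cents": order["amount_cents"]}
        )
    return records


def summarize(records):
    """Total sales per day."""
    totals = defaultdict(int)
    for record in records:
        totals[record["date"]] += record["cents"]
    return dict(totals)


def load(warehouse, daily_totals):
    """Append the summarized rows to the (stand-in) warehouse table."""
    for date, cents in sorted(daily_totals.items()):
        warehouse.append({"date": date, "sales_cents": cents})


warehouse = []
load(warehouse, summarize(extract_and_consolidate()))
```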
-software that allows users to access and analyze data from the warehouse
client (front-end) software