Chapter 9 Flashcards
What is a data warehouse?
A subject-oriented, integrated, time-variant, non-updateable collection of data used in support of management decision-making processes.
(Subject-oriented = customers, patients, etc)
(Integrated = consistent naming conventions formats, encoding sturctures; from multiple data sources)
(Time-variant = can study trends and changes)
(Non-updatable = read-only, periodically refreshed)
What is a data mart?
A data warehouse that is limited in scope
Why is there a need for data warehousing?
- For an integrated, company-wide view of high-quality information (from disparate databases)
- For seperation of operational and informational systems and data (for improved performance)
What are some issues with company-wide view?
- Inconsistent key structures
- Synonyms
- Free-form vs. structured fields
- Inconsistent data values
- Missing data
What organizational trends drive data warehouses?
- There is no single system of records
- They have multiple systems that are not synchronized
- They have an organizational need to analyze activities in a balanced way
- Customer relationship management
- Supplier relationship management
What is an operational system?
A system that is used to run a business in real time, based on current data; also called a system of record
What is an informational system?
A system designed to support decision making based on historical point-in-time and prediction data for complex queries or data-mining applications
Comparison of operational and inofrmational systems
What are the data warehouse architectures?
- Independent data mart
- Dependent data mart and operational data store
- Logical data mart with real-time data warehouse
- three-layer architecture
(all involve some form of extract, transform and load (ETL)
Independent Data Mart visual
What are the limitations of an independent data mart?
- Seperate ETL process for each data mart -> redundant data and processing
- Inconsistency between data marts
- Difficult to drill down for related facts between data marts
- Excessive scaling costs as more applications are built
- High cost for obtaining consistency between marts
Dependent data mart visual
logical data mart visual
Differences in Data warehouse and Data mart
Visual of three-layer data architecture for a data warehouse
What is an event?
a database action (creat/update/delete) that results from a transaction
Status vs event visual
What happens with transient data?
Changes to existing records are written over previous records, thus destroying the previous data content.
What happens with periodic data?
They are never physically altered or delected once they have been added to the store.
Besides transient data coverting to periodic data what other six changes must happen when going from a data mart to data warehouse?
- New descriptive attributes
- New business activity attributes
- New classes of descriptive attributes
- Descriptive attributes become more refined
- Descriptive data are related to one another
- New source of data
What is derived data?
Data that have been selected, formatted, and aggregate for end-user decision support applications