Chapter 4 Test Deck Flashcards
Data Warehouse
A subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process.
Data warehousing
The process of constructing and using data warehouses
Subject-Oriented
Organized around major subjects such as customer, product, sales
Focuses on modeling and analysis of data for decision makers
Not focused on daily operations or transaction processing (OLTP - Online Transaction Processing)
Integrated
Constructed by integrating multiple, heterogeneous data sources
Relational databases, flat files, online transaction records
Data cleaning and data integration techniques are applied
Time-variant
Time horizon is significantly longer than that of operational systems
Operational database: current value data
Data warehouse data: provides information from a historical perspective (e.g., the past 5-10 years)
Summarized and historical records, patterns
Contains an element of time, explicitly or implicitly
Nonvolatile
A physically separate store of data from the operational environment
Update of data does not occur in the data warehouse environment
Does not require transaction processing, recovery, and concurrency control mechanisms
Requires only initial loading of data and access of data
OLAP - Online Analytical Processing
OLTP
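Users: clerk, IT professional
Function: day-to-day operations
DB design: application-oriented, E-R model
Data: current, up-to-date, detailed
Usage: repetitive
Access: read/write
Unit of work: short, simple transaction
Records accessed: tens
Number of users: thousands
DB size: 100 MB-GB
Metric: transaction throughput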
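OLAP
Users: knowledge worker
Function: decision support
DB design: subject-oriented, star schema
Data: historical, multidimensional
Usage: ad hoc
Access: lots of scans
Unit of work: complex query
Records accessed: millions
Number of users: hundreds
DB size: 100 GB-TB
Metric: query throughput, response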
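The contrast between the two workloads can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a benchmark: the sales table, its columns, and the sample rows are all made up for the example. The OLTP-style step is a short read/write transaction that touches one record by key, while the OLAP-style step is a read-only query that scans and aggregates every row.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, product TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (product, year, amount) VALUES (?, ?, ?)",
    [("pen", 2022, 10.0), ("pen", 2023, 12.0), ("ink", 2022, 5.0), ("ink", 2023, 7.5)],
)

# OLTP-style: a short, simple read/write transaction on a single record.
with conn:  # commits on success, rolls back on error
    conn.execute("UPDATE sales SET amount = ? WHERE id = ?", (11.0, 1))

# OLAP-style: a read-only query that scans all rows and summarizes them.
totals = conn.execute(
    "SELECT product, SUM(amount) FROM sales GROUP BY product ORDER BY product"
).fetchall()
print(totals)
```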
DBMS
Tuned for OLTP: access methods, indexing, concurrency control, recovery
Warehouse
Tuned for OLAP: complex OLAP queries, multidimensional view, consolidation
Enterprise warehouse
Collects all of the information about subjects spanning the entire organization
Data Mart
A subset of corporate-wide data that is of value to a specific group of users. Its scope is confined to specific, selected groups, such as a marketing data mart. Data marts can be independent or dependent.
Virtual warehouse
A set of views over operational databases
Only some of the possible summary views may be materialized
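A minimal sketch of the difference between a plain view and a materialized summary, using Python's built-in sqlite3 module. The orders table and the view names are hypothetical. SQLite has no native materialized views, so the materialized summary is emulated here by copying the view's result into a table; that substitution is an assumption of this sketch, not how a warehouse product would necessarily do it.

```python
import sqlite3

# Hypothetical operational table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 30.0)],
)

# Virtual warehouse: a view over the operational table, recomputed on every query.
conn.execute(
    "CREATE VIEW sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
)

# Emulated materialized summary: a stored snapshot of the view's current result.
conn.execute("CREATE TABLE sales_by_region_mat AS SELECT * FROM sales_by_region")

# A later operational insert shows up in the view but not in the snapshot.
conn.execute("INSERT INTO orders (region, amount) VALUES ('west', 20.0)")

live = dict(conn.execute("SELECT region, total FROM sales_by_region").fetchall())
snapshot = dict(conn.execute("SELECT region, total FROM sales_by_region_mat").fetchall())
print(live, snapshot)
```

The view always reflects the operational data but pays the aggregation cost on each query; the snapshot is cheap to read but goes stale until it is refreshed.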
Data extraction
Get data from multiple, heterogeneous, and external sources
Data cleaning
Detect errors in the data and rectify them when possible
Data Transformation
Convert data from legacy or host format to warehouse format
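The extraction, cleaning, and transformation steps can be sketched as three small Python functions. Everything here is hypothetical: the two sample sources (a CSV export and a list of records), the field names, and the assumption that the legacy date format is DD/MM/YYYY while the warehouse format is ISO 8601.

```python
import csv
import io
from datetime import datetime

# Hypothetical sources: a legacy CSV export and records from another system.
LEGACY_CSV = "customer,amount,date\nAlice ,10.50,03/01/2023\nbob,abc,04/01/2023\n"
API_ROWS = [{"customer": "CAROL", "amount": "7", "date": "05/01/2023"}]


def extract():
    """Gather rows from multiple, heterogeneous sources."""
    return list(csv.DictReader(io.StringIO(LEGACY_CSV))) + API_ROWS


def clean(rows):
    """Detect errors; rectify them when possible, otherwise drop the row."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # amount is not a number and cannot be rectified
        name = row["customer"].strip().lower()  # fixable: whitespace and case
        cleaned.append({**row, "customer": name, "amount": amount})
    return cleaned


def transform(rows):
    """Convert the assumed legacy date format (DD/MM/YYYY) to ISO 8601."""
    return [
        {**row, "date": datetime.strptime(row["date"], "%d/%m/%Y").date().isoformat()}
        for row in rows
    ]


warehouse_rows = transform(clean(extract()))
print(warehouse_rows)
```

In this sketch the row with the non-numeric amount is dropped during cleaning, and the remaining rows reach the warehouse with normalized names and dates.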