Data Warehouse Flashcards
Learn Data Warehouse and components & tools
What are the 12 rules of identifying a Data Warehouse?
• Dw and Operational environments differ
• It’s data is:
- Integrated
- Subject-oriented
- Read only with periodic updates from op. data
- Captured at given time
• Has historical data in long term
• It’s development cycle and classical systems development differ. One is data driven, one process driven
• Has data in levels of detail..
• Environment characterized by read only transaction data into large data sets
• Has a system that traces data sources, transformations and storage
• it’s metadata is a critical component as it identifies and defines data elements, and provides integration, usage, relationships, history of each data element
• has Chargeback Mechanism for resource usage to enforce optimal data usage by end users
What is a Star Schema?
It maps multidimensional decision support data into a relational database.
It is an easily implemented model for data analysis while preserving relational structures
What are the 4 Star Schema Components?
FACTS
- numeric measurements showing business activities
- stored in Fact Table which contains facts linked through their dimensions
DIMENSIONS
- qualifies characteristics to provide perspectives to a given fact
- decision support data viewed with other related data
- studies facts through dimensions
- stored in Dimension Table
ATTRIBUTES
- search, filter, classify facts
- used by dimensions to provide a facts’ description
- no limit to no. of dimensions
- focus: slice of data cube for detailed analysis
ATTRIBUTE HIERARCHY
- has Drop Down Data organization
- purpose: aggregation, drill down/roll up data analysis
- determines data extraction, representation
- stored in DBMS’s Data Dictionary
- used by OLAP tool for accessing warehouse
What is Data Analytics?
- examines data sets to illustrate conclusions on contained information
- uses specialized systems and software
- it’s techniques & technology used to allow companies to create informed business decisions
- it’s scientists & researchers verify/disprove scientific models, theories, hypotheses
What are the tools of Data Analytics?
Explanatory/Exploratory Analysis
- ensures full understanding of data
- may begin with questions, hypothesis, delving into data to determine what stands out in key issues
- highlights specific details within issues
Predictive Analysis
- advanced analytics
- for new & historical data to forecast activity, behaviour, trends
- includes statical analysis, analytical queries, automated machine which learns algorithms to dtaa sets for creating predictive models
- models: places numerical values and s ore on likelihood of an event
- used in customer relationships, service, fraud detection, retention, optimized pricing
What is OLAP?
Online Analytical Processing
software that allows users to analyze information from multiple database systems and extract and view business data from different perspectives
(explain further by OLAP data process diagram)
What are the OLAP Characteristics?
Multidimensional Data Analysis Techniques
- data processed and viewed as part of multidimensional structure
- augmented by advanced functions:
• presentation
• aggregation, consolidation, classification
• computation
• modelling
Advanced Database Support
- access number of DBMS’s flat files, internal&external data sources
- access to aggregated DW data
- advanced data navigation
- rapid & consistent as query response times
- maps end user requests to appropriate data source and proper data access language
- support for large DB
Easy to use, end user Interface
- features borrowed from previous generations of DA tools