8/28 Chapter 1 Flashcards
Business Intelligence
a set of technologies and processes that use data to understand, analyze, and improve business performance
BI Levels
Access and reporting
Analytics
BI 1. Access and reporting
Ex: key performance indicators (KPIs), corporate war room
Enabling Technologies: data warehouse
BI 2. Analytics
Target marketing, Recommender systems
Enabling Technologies: data mining
BI Enablers
Organizations have accumulated huge amounts of data due to the extensive use of IT for years
Rapid advancement of data processing capabilities of modern computers and DBMS
Studies
MIT study: companies that use data-driven decision making are 5% more productive and 6% more profitable
Why database
Excel becomes too confusing when you have large amounts of data
Why Data Warehouse
Knowledge management problems (drowning in data, starving for knowledge)
- Can’t access data (easily)
- Give me only what’s important(knowledge)
- I need to reduce data to what’s important by slicing and dicing
- Data inconsistency and poor data utility
- Need to improve the practice of making informed decisions
- Hard and slow to query the database
Can’t access data (easily) why?
Isolated databases distributed in an enterprise
Give me only what’s important
historical data is archived in offline storage systems
Cause 3: the database is designed to process transactions, not to answer decision support queries
Complex queries
bad query performance
Solution: in a data warehouse, organize data in a subject-oriented way rather than a process-oriented way (dimensional modeling)
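To make the subject-oriented idea concrete, here is a minimal sketch of a dimensional model (a star schema) together with simple slice and dice queries. It is an illustration only, not material from the lecture: the table names `sales_fact`, `product_dim`, and `date_dim`, the sample rows, and the helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical star schema: one fact table (sales) linked to dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE date_dim (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE sales_fact (
    product_id INTEGER REFERENCES product_dim(product_id),
    date_id INTEGER REFERENCES date_dim(date_id),
    amount REAL
);
""")
conn.executemany("INSERT INTO product_dim VALUES (?, ?, ?)",
                 [(1, "Laptop", "Electronics"), (2, "Desk", "Furniture")])
conn.executemany("INSERT INTO date_dim VALUES (?, ?, ?)",
                 [(1, 2023, "Q1"), (2, 2023, "Q2")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)",
                 [(1, 1, 1000.0), (1, 2, 1500.0), (2, 1, 300.0), (2, 2, 200.0)])

JOINED = """FROM sales_fact
            JOIN product_dim USING (product_id)
            JOIN date_dim USING (date_id)"""

def sales_by(dimension_column):
    """Dice: total sales grouped by one dimension attribute."""
    if dimension_column not in {"category", "quarter"}:
        raise ValueError("unknown dimension attribute")
    query = (f"SELECT {dimension_column}, SUM(amount) {JOINED} "
             f"GROUP BY {dimension_column} ORDER BY {dimension_column}")
    return conn.execute(query).fetchall()

def slice_quarter(quarter):
    """Slice: fix one quarter, then total sales per category within it."""
    query = (f"SELECT category, SUM(amount) {JOINED} "
             "WHERE quarter = ? GROUP BY category ORDER BY category")
    return conn.execute(query, (quarter,)).fetchall()
```

Calling `sales_by("category")` or `sales_by("quarter")` answers a business question about a subject (sales) without walking through the order-placing process tables.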
Data warehouse
a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision making process
Subject Oriented
means the data warehouse focuses on high-level entities of the business such as sales, products, and customers. This is in contrast to database systems, which deal with processes such as placing an order
Integrated
data is integrated from distributed data sources and historical data sources and stored in a consistent format
Time-variant
means the data is associated with a point in time
Non-volatile
means the data doesn’t change once it gets into the warehouse
Does data warehouse data change?
No, once it is in the warehouse it doesn’t change
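A small sketch may help tie time-variant and non-volatile together: every row carries a date, and rows are only ever appended, never edited. The `WarehouseTable` class, its method names, and the sample customer data are hypothetical and exist only for illustration.

```python
from datetime import date

class WarehouseTable:
    """Append-only store: each row is stamped with a date and never updated."""

    def __init__(self):
        self._rows = []  # (customer, city, as_of) tuples

    def load(self, customer, city, as_of):
        # Non-volatile: new facts are appended; existing rows stay untouched.
        self._rows.append((customer, city, as_of))

    def city_as_of(self, customer, when):
        # Time-variant: answer the question for a chosen point in time.
        matches = [r for r in self._rows if r[0] == customer and r[2] <= when]
        if not matches:
            return None
        return max(matches, key=lambda r: r[2])[1]

table = WarehouseTable()
table.load("Ann", "Austin", date(2022, 1, 1))
table.load("Ann", "Boston", date(2023, 6, 1))  # a move is a new row, not an update
```

Because the old row is kept, `table.city_as_of("Ann", date(2022, 12, 31))` still reports the earlier city, which an operational database that overwrites the address could not do.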
Data warehouse
- Purpose: decision support
- Data organization: subject oriented
- Data model: dimensional modeling
- Time span: historical and current data
- Query processing: scan a substantial subset of data
- Operation: read-only
Database
- Purpose: transaction processing
- Data organization: process oriented
- Data model: ER modeling
- Time span: current data
- Query processing: scan a small set of data
- Operation: read & update
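The difference in query processing and operation can be sketched with two queries. This is an assumption-laden illustration: the `orders` table, its rows, and both helper functions are hypothetical, and a single in-memory SQLite table stands in for what would normally be two separate systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, "
    "status TEXT, amount REAL)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "Ann", "placed", 120.0),
    (2, "Bob", "placed", 80.0),
    (3, "Ann", "shipped", 50.0),
])

def ship_order(order_id):
    """Transaction-style work: touch one row by key (read & update)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE orders SET status = 'shipped' WHERE order_id = ?",
            (order_id,),
        )

def revenue_by_customer():
    """Decision-support style work: scan many rows, read-only aggregate."""
    return conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
```

`ship_order` is the kind of small, frequent update a database is tuned for, while `revenue_by_customer` reads a large share of the table and changes nothing, which is the warehouse workload.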
Database purpose is geared towards
operations (transaction processing)
Data warehouse
is for decision support
Exams:
two in class on Sep 25 and Oct 30
conceptual and problem solving
is primarily from the lecture
Lecture 2 Planning and Requirements Analysis
learn
Data Warehouse Architecture
Operational Source System, Data Staging Area, Data Warehouse, End User Data Analysis
Operational Source System
can be located anywhere; this is where we extract data from
Data Staging Area
the area where we transform the extracted data
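The flow from source systems through staging into the warehouse can be sketched as three small functions. Everything here is a hypothetical illustration: the two sources, their field names, and the cleaning rules are assumptions, not part of the lecture.

```python
# Hypothetical operational sources that store the same facts in different formats.
crm_source = [{"customer": "ann", "amount": "120.50"}]
store_source = [{"CUSTOMER": "Bob ", "AMT": 80}]

def extract():
    """Pull raw rows from each operational source, tagged with their origin."""
    staged = [("crm", row) for row in crm_source]
    staged += [("store", row) for row in store_source]
    return staged

def transform(staged):
    """Staging area work: map every source onto one consistent format."""
    clean = []
    for origin, row in staged:
        if origin == "crm":
            name, amount = row["customer"], row["amount"]
        else:
            name, amount = row["CUSTOMER"], row["AMT"]
        clean.append({"customer": name.strip().title(), "amount": float(amount)})
    return clean

warehouse = []  # stands in for the data warehouse

def load(rows):
    """Append the cleaned rows to the warehouse."""
    warehouse.extend(rows)

load(transform(extract()))
```

After the run, both rows in `warehouse` share the same keys and types, regardless of which source they came from.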