8/28 Chapter 1 Flashcards

1
Q

Business Intelligence

A

a set of technologies and processes that use data to understand, analyze, and improve business performance

2
Q

BI Levels

A

Access and reporting

Analytics

3
Q

BI 1. Access and reporting

A

Examples: key performance indicators (KPIs), corporate war room

Enabling Technologies: data warehouse

4
Q

BI 2. Analytics

A

Target marketing, Recommender systems

Enabling Technologies: data mining

5
Q

BI Enablers

A

Organizations have accumulated huge amounts of data due to the extensive use of IT for years
Rapid advancement of data processing capabilities of modern computers and DBMS

6
Q

Studies

A

MIT study: companies that use data-driven decision making are 5% more productive and 6% more profitable

7
Q

Why database

A

a database makes sense because Excel becomes too confusing when you have large amounts of data

8
Q

Why Data Warehouse

A

Knowledge Management Problems (drowning in data, starving for knowledge)

  1. Can’t access data (easily)
  2. Give me only what’s important(knowledge)
  3. I need to reduce data to what’s important by slicing and dicing
  4. Data inconsistency and poor data utility
  5. Need to improve the practice of making informed decisions
  6. Hard and slow to query the database
9
Q

Can’t access data (easily) why?

A

Isolated databases distributed in an enterprise

10
Q

Give me only what’s important

A

historical data is archived in offline storage systems

11
Q

Cause 3: the database is designed to process transactions, not to answer decision support queries

A

Complex queries
bad query performance
Solution: in the data warehouse, organize data in a subject-oriented way rather than a process-oriented way (dimensional modeling)

12
Q

Data warehouse

A

a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management’s decision-making process

13
Q

Subject Oriented

A

means the data warehouse focuses on high-level business entities such as sales, products, and customers. This is in contrast to database systems, which deal with processes such as placing an order

14
Q

Integrated

A

data is integrated from distributed data sources and historical data sources and stored in a consistent format

15
Q

Time-variant

A

means the data is associated with a point in time

16
Q

Non-volatile

A

means the data doesn’t change once it gets into the warehouse

17
Q

Does data warehouse data change?

A

once it is in the warehouse it doesn’t change

18
Q

Data warehouse

A
purpose: Decision Support
data organization: subject oriented
Data model: Dimensional modeling
Time span: historical and current data
Query processing: scan a substantial subset of data
Operation: Read-Only
19
Q

Database

A
Purpose: Transaction Processing 
Data Organization: Process Oriented
Data Model: ER Modeling
Time Span: Current Data
Query Processing: Scan a small set of data
Operation: Read & update
20
Q

Database purpose is geared towards

A

operation

21
Q

Data warehouse

A

is for decision support

22
Q

Exams:

A

two in class on Sep 25 and Oct 30
conceptual and problem solving
is primarily from the lecture

23
Q

Lecture 2 Planning and Requirements Analysis

A

learn

24
Q

Data Warehouse Architecture

A

Operational Source System, Data Staging Area, Data Warehouse, End User Data Analysis

25
Q

Operational Source System

A

can be anywhere, we extract from this

26
Q

Data Staging Area

A

area where we Transform

27
Q

Transformation

A

Clean; Combine; Remove duplicates; Transform
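
A minimal sketch of these four steps in Python, assuming two hypothetical source extracts of customer rows. The field names (`name`, `state`) and the sample values are made up for illustration and do not come from the lecture.

```python
def transform(rows):
    """Clean and de-duplicate customer rows, returning them in one consistent format."""
    seen = set()
    result = []
    for row in rows:
        # Clean: trim stray whitespace and normalize letter case
        name = row["name"].strip().title()
        state = row["state"].strip().upper()
        # Remove duplicates: skip a customer we have already kept
        key = (name, state)
        if key in seen:
            continue
        seen.add(key)
        # Transform: emit the row in the consistent target format
        result.append({"name": name, "state": state})
    return result


# Hypothetical extracts from two operational source systems
source_a = [{"name": " alice smith ", "state": "tx"}]
source_b = [{"name": "Alice Smith", "state": "TX"},
            {"name": "bob lee", "state": "ca "}]

# Combine: concatenate the sources before cleaning
cleaned = transform(source_a + source_b)
print(cleaned)
```

The two spellings of the same customer collapse into one row once they are cleaned into the same format.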

28
Q

Data Warehouse process

A

We extract, load, and feed data to users

29
Q

Data Warehouse: Data Mart #1

A

smaller data warehouse

30
Q

End User Data Analysis

A
slide 3 of lecture 2 
Ad hoc query tools
Report Writers
End user applications
Models:
forecasting, optimizing, data mining, etc.
31
Q

Data Warehouse Lifecycle

A
Project Planning
Requirements Analysis
Logical Design
Physical Design
Data Staging
Implementation and Maintenance
Data Analysis
32
Q

Logical Design

A

ER Modeling-Dimensional Modeling

Design an appropriate table structure and primary key/foreign key relationships

33
Q

Physical design

A
Database selection
Storage Selection
Web based or not
performance
index plan 
aggregation plan
34
Q

Data Staging

A
ETL
Extraction
Transformation
Load
Tool (Pentaho)
35
Q

Data Analysis

A

Reporting
Ad hoc query
Graphical Analysis

36
Q

Query types

A

drill-up and drill-down queries
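
One way to picture the two query types is as the same aggregate run at a coarser or finer grouping level. The sketch below uses Python's built-in `sqlite3` module with a hypothetical `sales` table; the table, columns, and numbers are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (year INTEGER, month INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(2023, 1, 100.0), (2023, 1, 50.0), (2023, 2, 70.0), (2024, 1, 30.0)],
)

# Drill up: summarize at a coarser level (year only)
by_year = conn.execute(
    "SELECT year, SUM(amount) FROM sales GROUP BY year ORDER BY year"
).fetchall()

# Drill down: add a finer level (month within year) to the grouping
by_month = conn.execute(
    "SELECT year, month, SUM(amount) FROM sales "
    "GROUP BY year, month ORDER BY year, month"
).fetchall()

print(by_year)
print(by_month)
```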

37
Q

Planning

A

Valuable IT projects should be
aligned with the organization’s business strategy and organizational structure
driven by business needs

38
Q

Planning Step 2 Feasibility Analysis

A

is it possible

39
Q

Technical Feasibility

A

less familiarity with the technology creates more risk; project size and compatibility can also be issues

40
Q

Economic Feasibility

A

Development costs, annual operating costs, tangible benefits, intangible benefits

41
Q

Organizational Feasibility

A

is it aligned with our business?

critical success factors for DW projects

42
Q

Requirement Analysis

A

Joint application development

Questionnaire, Interview

43
Q

joint application development

A

allows the project team, users, and management to work together to identify requirements for the DW project
often the most useful method for collecting requirements from users

44
Q

Questionnaire

A

written questions to gather data

45
Q

Document Analysis

A
provides clues about existing databases
typical documents
    forms
    reports
    policy manuals
    models of current databases
46
Q

Dimensional Modeling (Star Schema)

A

a DW logical design technique that seeks to present data in a framework that is intuitive for data access and allows for high-performance data access
intuitive: easy to write SQL
High performance: high performance SQL

47
Q

Analytical Report

A

A query that lists sales in January by customer state and product category
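
A sketch of such a report against a small star schema, using Python's built-in `sqlite3` module. The table names (`sales_fact`, `customer_dim`, `product_dim`, `time_dim`), their columns, and the sample rows are all hypothetical and chosen only to show the shape of the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE product_dim  (product_key  INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE time_dim     (time_key     INTEGER PRIMARY KEY, month INTEGER, year INTEGER);
    CREATE TABLE sales_fact   (customer_key INTEGER, product_key INTEGER,
                               time_key INTEGER, amount REAL);
    INSERT INTO customer_dim VALUES (1, 'TX'), (2, 'CA');
    INSERT INTO product_dim  VALUES (1, 'Books'), (2, 'Toys');
    INSERT INTO time_dim     VALUES (1, 1, 2024), (2, 2, 2024);
    INSERT INTO sales_fact   VALUES (1, 1, 1, 10.0), (1, 1, 1, 5.0),
                                    (2, 2, 1, 20.0), (1, 2, 2, 99.0);
""")

# Dimension attributes (state, category) become the report headers and
# the month constraint; the fact attribute (amount) is the report content.
report = conn.execute("""
    SELECT c.state, p.category, SUM(f.amount)
    FROM sales_fact f
    JOIN customer_dim c ON c.customer_key = f.customer_key
    JOIN product_dim  p ON p.product_key  = f.product_key
    JOIN time_dim     t ON t.time_key     = f.time_key
    WHERE t.month = 1
    GROUP BY c.state, p.category
    ORDER BY c.state, p.category
""").fetchall()

print(report)
```

The February sale is filtered out by the time dimension, and the two January sales to the Texas customer are summed into one row.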

48
Q

Parts of a dimensional model

A

fact table - ex: sales

dimension table: ex: something we want to analyze the fact by

49
Q

fact table

A

attributes in fact tables are measurements for analysis or contents in reports

50
Q

dimension table

A

attributes in dimension tables are constraints for the measurements or headers in reports

51
Q

Fact Attributes

A

purpose: measurements for analysis
reporting use: report content
data type: most facts are numeric and additive; there are also semi-additive and non-additive facts
size: larger number of records
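
To illustrate the additive versus semi-additive distinction, here is a small Python sketch. The account-balance example is a common textbook illustration rather than something stated on this card, and all rows are hypothetical.

```python
# Hypothetical sales rows: (day, product, amount)
sales = [("Mon", "X", 10), ("Mon", "Y", 5), ("Tue", "X", 7)]

# Hypothetical daily balance snapshots: (day, account, balance)
balances = [("Mon", "A", 100), ("Mon", "B", 50),
            ("Tue", "A", 120), ("Tue", "B", 50)]

# Additive fact: sales amount can be summed across every dimension
total_sales = sum(amount for _, _, amount in sales)

# Semi-additive fact: balances can be summed across accounts for one day...
tuesday_total = sum(bal for day, _, bal in balances if day == "Tue")

# ...but summing across days counts the same money more than once
misleading_sum = sum(bal for _, _, bal in balances)

print(total_sales, tuesday_total, misleading_sum)
```

A non-additive fact, such as a ratio or a unit price, would not give a meaningful total along any dimension.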

52
Q

Dimension Attributes

A

Purpose: constraints for the measurements
reporting use: row or column headers
data type: textual descriptive
size: smaller number of records

53
Q

How do you identify facts and dimensions

A

requirements analysis

ER Model

54
Q

F1 Calculation

A

items are calculated in a data warehouse

55
Q

F:

A

refers to special considerations for fact table or special types of fact table

56
Q

Database, should we have calculated items in it?

A

no: calculated (derived) items should generally not be stored in an operational database

57
Q

D1: Slowly changing dimension

A

values of attributes in dimension tables may evolve over time, e.g., a customer moves
Option 1: Overwrite; you lose historical data
Option 2: Add a new attribute to record the current value of the changing attribute; you don’t have the in-between data
Option 3: Add a record whenever a dimension attribute changes
Option 4: warehouse key + Option 3
a warehouse key is a sequence of non-negative integers that serve as primary keys of tables in the data warehouse
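
A minimal Python sketch of Option 4, assuming a hypothetical customer dimension held in a list. The names `customer_dim`, `add_version`, and `state_for`, and the sample customer, are invented for illustration.

```python
from itertools import count

warehouse_keys = count(1)  # assumed generator of integer warehouse keys
customer_dim = []          # hypothetical customer dimension table


def add_version(customer_id, state):
    """Add a new dimension record (Option 3) under a fresh warehouse key."""
    row = {"warehouse_key": next(warehouse_keys),
           "customer_id": customer_id,
           "state": state}
    customer_dim.append(row)
    return row["warehouse_key"]


def state_for(sale):
    """Look up the customer state that was current when the sale was recorded."""
    return next(r["state"] for r in customer_dim
                if r["warehouse_key"] == sale["customer_key"])


k1 = add_version("C100", "TX")
sale_1 = {"customer_key": k1, "amount": 40}

k2 = add_version("C100", "CA")  # the customer moved: new record, new key
sale_2 = {"customer_key": k2, "amount": 25}

print(state_for(sale_1), state_for(sale_2))
```

Because fact rows point at the warehouse key rather than the operational customer ID, each sale stays attached to the version of the customer that was current at the time.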

58
Q

D:

A

refers to special considerations for dimension table or special types of dimension table

59
Q

Adding a key in data warehouse

A

helps us allocate data correctly and allows us to account for change

60
Q

Notation for primary key

A

a dark rectangle

61
Q

composite primary keys

A

the primary key of a fact table can consist of several attributes

62
Q

D2: Time Dimension

A

one of the dimension tables; a data warehouse needs explicit dimensions such as time to differentiate between what we are analyzing

63
Q

Primary purpose of data warehouse

A

to query items

64
Q

D3: Snowflake

A

should normally be avoided in a data warehouse
advantage of avoiding: improved query performance
disadvantage of avoiding: requires more storage space

65
Q

look at database review slides on his page

A

do it