Week 13 UAS Flashcards

1
Q

What is Data Warehousing?

A

A subject-oriented, integrated, time- variant, and non-volatile collection of data in support of management’s decision- making process.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Characteristics of Data Warehouse?

A
  1. Subject-oriented
    The warehouse is organized around the major subjects of the enterprise (e.g. customers, products, and sales) rather than the major application areas (e.g. customer invoicing, stock control, and product sales).
  2. Integrated
    The integrated data source must be made consistent to present a unified view of the data to users.
  3. Time-variant
    Data in the warehouse is accurate and valid only at some point in time or over some time interval.
  4. Nonvolatile
    Data is not updated in real time but is refreshed from operational systems on a regular basis.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Benefits of Data Warehousing?

A
  1. Potential high returns on investment
  2. Competitive advantage
  3. Increased productivity of corporate decision- makers
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Comparison of OLTP Systems?

A

Main Purpose: support operational processing
Data age: current
data latency: real-time
data granularity: detailed data
data processing: predictable pattern of data operations and queries. High level of transaction throughput.
reporting: predictable, one-dimensional, static reporting
users: serves large number of operational users

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Comparison of Data warehousing?

A

Main Purpose: support analytical processing
Data age: historic
data latency: time-variant
data granularity: detailed data, lightly and highly summarized data
data processing: less predictable pattern; medium to low level of transaction throughput.
reporting: unpredictable, multidimensional, dynamic reporting
users: serves lower number of managerial users

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Problems of Data Warehousing?

A
  1. Underestimation of resources for data ETL
  2. hidden problems with source systems
  3. required data not captured
  4. increased end-user demands
  5. data homogenization
  6. high deman for resources
  7. data ownership
  8. high maintenance
  9. long-duration projects
    10 complexity of integration
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Data Warehouse Architecture?

A
  1. Operational Data
  2. Operational Data Store
  3. ETL Manager
  4. Warehouse Manager
  5. Query Manager
  6. Detailed Data
  7. Lightly and Highly Summarized Data
  8. Archive/Backup Data
  9. Metadata
  10. End-User Access Tools
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

End-User Access Tools?

A
  1. reporting and query tools
  2. application developments tools
  3. online analytical processing (OLAP) tools
  4. data mining tools
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is Data Mart?

A

A database that contains a subset of corporate data to support the analytical requirements of a particular business unit (such as the Sales department) or to support users who share the same requirements to analyse a particular business process (such as property sales).

Benefits:

  1. To give users access to the data they need to analyze most often.
  2. To improve end-user response time due to the reduction in the volume of data to be accessed.
  3. To provide appropriately structured data as dictated by the requirements of the end- user access tools.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Data Warehousing Tools and Technologies?

A
  1. Extraction
  2. Transformation
  3. Loading
  4. ETL Tools
  5. Data profiling and data quality control
  6. Metadata Management
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

The requirements for a data warehouse DBMS?

A
  1. Load performance
  2. Load processing
  3. Data quality management
  4. Query performance
  5. Terabyte scalability
  6. Mass user scalability
  7. Networked data warehouse
  8. Warehouse administration
  9. Integrated dimensional analysis
  10. Advanced query functionality
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

ETL (Extraction, Transformation, Loading) processes?

A
  1. Extraction
  2. Transformation
  3. Loading

The data destined for an data warehouse must first be extracted from one or more data sources, transformed into a form that is easy to analyze and consistent with data already in the warehouse, and then finally loaded into the data warehouse.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

4 main operations in Data Mining?

A
  1. Predictive modelling
    a. Classification
    b. Value prediction
  2. Database segmentation
    a. Demographic clustering
    b. Neural clustering
  3. Link analysis
    a. Association discovery
    b. Sequential pattern discovery
    c. Similar time sequence discovery
  4. Deviation detection
    a. Statistics
    b. Visualization

example of application:
Retail / Marketing
- Identifying buying patterns of customers
- Finding associations among customer demographic characteristics
- Predicting response to mailing campaigns
- Market basket analysis

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

what is OLAP?

A

online analytical processing (OLAP) is the dynamic synthesis, analysis, and consolidation of large volumes of multidimensional data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Phases of the CRISP-DM Model?

A
  1. Business understanding
  2. Data understanding
  3. Data preparation
  4. Modeling
  5. Evaluation
  6. Deployment
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Data mining and data warehousing?

A

a. major challenge = identifying suitable data to mine
b. data mining requires single, separate, clean, integrated and self-consistent source of data
c. data warehouse is well equipped for providing data for mining.