W03 Data Ware Houses and OLAP Flashcards

1
Q

Data Munging
3 steps
need to know?

A

Selection
Pre-Processing
Transformation

How and Why data is stored?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Business Intelligence

3 steps

A

1 Collect and Store in database system

2 Extract, transform and load into central data warehouse

3 Analyze and present data for business intelligence

  • > consolidate, analyse and present data for decision support
  • > descriptive analytics
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Database Systems

A

1 Store large data sets

  • all addresses
  • all orders
  • all current contracts
  • stored in relational tables

2 Analyse data sets

  • vis SQL
  • list of all customers in berlin with revenue higher 10k
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Databases in IT Systems (layers)

A

Presentation
-user interface in app or web browser

Logic
-computed as implemented in a programming language

Storage
-Data Base System

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

A Data Base System (equation)

A

Data Base + Data Base Management System =Data Base System

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Tasks of the Database Management system

A

1 interpret queries (log layer (sql) to low level)

2 optimise queries for given sql query

3 ensure integrity
no disintegrating manipulations

4 control access: user authorization

5 manage simultaneous access

6 manage back-ups

7 manage files: memory allocation etc

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

The Relational Data Model

A

primary keys (unique identifiers)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Entity Relationship Model

A

One employee (with unique ID) attends to one (or more) customer with unique ID who each have only one employee….

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Data Warehouses

A
  • database with reporting and query functions
  • stores operative and historical data
  • data extracted form diverse operational systems
  • processes to support management reports and analysis (OLAP)
  • multi-dimensional
  • long-term
  • historical, cleaned, validated, synthetic, operative data
  • internal and external sources
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Operational Database vs Data Warehouse

A

operational data (day to day transactions) VS strategic data

complete data VS historical data

detailed data VS aggregate data

non-redundant data VS more read than edited

dynamic data VS ad-hoc queries

complex data model VS polished user interface

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

ETL Process

A

Extract
relevant data from diverse sources

Transform
data into the format of the central database system

Load
data into the central database systen

Data Warehouse is an organisation-wide concept aiming to provide

  • complete, central, consistent data basis
  • independent of operational databases
  • support diverse analytics projects
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

working with a data warehouse

A

1 select well-suited attributes form operative data bases

2 add selected data from external sources

3 transform and load data

4 store data in dimensions

5 manage database

6 search and analyze via reports or OLAP

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Components of a Data-Warehouse-System

A

1 Data Marts

2 Central Data Warehouse

3 Enterprise Data Warehouse

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

1 Data Marts

A

-databases targeted to analytics requirements of user group
-managed by a decentral team
simple data model and development
-mostly self-contained

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

2 Central Data Warehouse

A
  • analytics database that feeds data into local data sets

- can still provide information for just a part of the organisation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

3 Enterprise Data Warehouse

A

provides data to support analyitcs across organization

17
Q

Hierarchical Data-Warehouse structure

A

central data warehouse (CDWH) feeds and coordinates local data marts

18
Q

Exemplary Queries

-personal data mart

A

analyse customers according to dimension “location” in hierarchies “country” ; “region” ; “city”

19
Q

CRM (Customer Relationship Management)

  • what?
  • how?
  • used with what tools?
  • used with what goals?
A

CRM Analytics: examine customer data to support decisions on improving products, services, and the market interface

  • consistent view of customer by integrating all data sources in data warehouse
  • OLAP and data mining
  • identify preferences
  • decide on offers (cross-selling)
  • classify by profitability and potential for marketing
  • determine degree of service
20
Q

OLAP (online analytical processing)

what?
reports?
query languages?
OLAP?

A

enables decision makers to flexibly access data to support complex analyses

Static Reports

  • can be parametrized
  • structural changes have to be implemented by programmers

Query Languages

  • standardised and powerful
  • complex to learn
  • SQL, QBE

OLAP enables flexible ad-hoc queries without requiring high technical expertise

21
Q

OLAP cube

A
to present data from warehouse
clear arrangement
dimensions:
time period
area
product
22
Q

OLTP

A

OnLine Transactional Processing (from operational databases)

23
Q

OLAP

A

OnLine Analyticsl Processing (lets users quickyl access multiple dimensions of data for interactive analysis of data from data warehouse)

24
Q

Functions of OLAP

  • representation approaches
  • cube operations
A
  • absolute vs relative
  • 3d analysis, hierarchical levels
  • diverse indicator computations
  • drilling (aggregation level)
  • pivoting (switching rows and columns)
  • slicing (reduce number of dimensions)
  • dicing (cut out parts of current cube, “filter”)
25
Q

OLAP Slice

A

horizontal slice of OLAP cube

e.g. ONE product variant across areas and time

26
Q

Pivoting

A

Results in better representation

27
Q

Drill-Down and Dice

A

subset of data (a dice)

28
Q

Limits of OLAP

A

machine-supported manual search

+limited number of hypotheses
+limited scaled applicability

“What drives cancellation probabilities?”
->need for manually-supported machine search -> Data Mining

29
Q

Summary: Business Intelligence

A

1 Support analytics for managers
2 OLAP, statistics and datamining
3 data marts provide function-specific EXTRACTS from data warehouse
4 data warehouse provides enterprise-wide database for analytics
5 automated ETL process connects databases to datawarehouse

30
Q

NoSQL

A

+availability
+partition tolerance
+speed
+direkt input for datamining

-can sacrifice consistency and integrity