Chapter 4 Test Deck Flashcards

1
Q

Data Warehouse

A

A subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process.

2
Q

Data warehousing

A

The process of constructing and using data warehouses

3
Q

Subject-Oriented

A

Organized around major subjects, such as customer, product, sales
Focuses on modeling and analysis of data for decision makers, not on daily operations or transaction processing (OLTP: Online Transaction Processing)

4
Q

Integrated

A

Constructed by integrating multiple, heterogeneous data sources
e.g., relational databases, flat files, online transaction records
Data cleaning and data integration techniques are applied

5
Q

Time-variant

A

Time horizon is significantly longer than that of operational systems
Operational database: current-value data
Data warehouse data: provides information from a historical perspective (e.g., the past 5-10 years)
Summarized and historical records, patterns
Contains an element of time, explicitly or implicitly

6
Q

Nonvolatile

A

A physically separate store of data from the operational environment
Operational update of data does not occur in the data warehouse environment
Does not require transaction processing, recovery, and concurrency control mechanisms
Requires only two operations: initial loading of data and access of data
(OLAP: Online Analytical Processing)

7
Q

OLTP

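A

Users: clerk, IT professional
Function: day-to-day operations
DB design: application-oriented, E-R model
Data: current, up-to-date, detailed
Usage: repetitive
Access: read/write
Unit of work: short, simple transaction
Records accessed: tens
Number of users: thousands
DB size: 100 MB to GB
Metric: transaction throughput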
8
Q

OLAP

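A

Users: knowledge worker
Function: decision support
DB design: subject-oriented, star schema
Data: historical, multidimensional
Usage: ad hoc
Access: lots of scans
Unit of work: complex query
Records accessed: millions
Number of users: hundreds
DB size: 100 GB to TB
Metric: query throughput, response time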
9
Q

DBMS

A

Tuned for OLTP: access methods, indexing, concurrency control, recovery

10
Q

warehouse

A

tuned for OLAP: complex OLAP queries, multidimensional view, consolidation

11
Q

Enterprise warehouse

A

collects all of the information about subjects spanning the entire organization

12
Q

Data Mart

A

A subset of corporate-wide data that is of value to a specific group of users. Its scope is confined to specific, selected groups, such as a marketing data mart. Data marts can be independent or dependent.

13
Q

virtual warehouse

A

A set of views over operational databases

only some of the possible summary views may be materialized
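
A virtual warehouse can be sketched with Python's built-in `sqlite3` module. The `orders` table, the `sales_by_region` view, and the rows are hypothetical and only illustrate the idea that a view stores no summary data of its own.

```python
import sqlite3

# Hypothetical operational table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [("east", 10.0), ("east", 5.0), ("west", 7.5)],
)

# The "virtual warehouse" here is just a view: no summary rows are stored.
conn.execute(
    "CREATE VIEW sales_by_region AS "
    "SELECT region, SUM(amount) AS total FROM orders GROUP BY region"
)

def region_totals(connection):
    """Query the view; totals are computed from the operational table on demand."""
    rows = connection.execute("SELECT region, total FROM sales_by_region ORDER BY region")
    return dict(rows.fetchall())

before = region_totals(conn)
conn.execute("INSERT INTO orders (region, amount) VALUES ('west', 2.5)")
after = region_totals(conn)  # reflects the new operational row immediately
```

Because the view is recomputed on every query, it adds load to the operational database, which is one reason a system might materialize some summary views instead.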

14
Q

Data extraction

A

Get data from multiple, heterogeneous, and external sources

15
Q

Data cleaning

A

detect errors in the data and rectify them when possible

16
Q

Data Transformation

A

Convert data from legacy or host format to warehouse format

17
Q

Load

A

Sort, summarize, consolidate, compute views, check integrity, and build indices and partitions

18
Q

Refresh

A

propagate the updates from the data sources to the warehouse
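
The extraction, cleaning, transformation, load, and refresh steps can be sketched as one small pipeline. This is a minimal illustration, not a real ETL tool: the two source formats, the field names, and the full-reload refresh strategy are all assumptions.

```python
from datetime import date

# Hypothetical source records in two different "legacy" formats.
crm_rows = [{"cust": " Alice ", "joined": "2020-01-15"}, {"cust": "", "joined": "2021-03-01"}]
billing_rows = [("BOB", "15/02/2019")]

def extract():
    """Extraction: gather records from multiple heterogeneous sources."""
    for row in crm_rows:
        yield {"source": "crm", "name": row["cust"], "joined": row["joined"]}
    for name, joined in billing_rows:
        yield {"source": "billing", "name": name, "joined": joined}

def clean(records):
    """Cleaning: detect errors and rectify them when possible, else drop the record."""
    for rec in records:
        name = rec["name"].strip()
        if name:  # a missing name cannot be rectified here, so the record is dropped
            yield {**rec, "name": name}

def transform(records):
    """Transformation: convert each source format to one warehouse format."""
    for rec in records:
        if rec["source"] == "billing":  # assumed DD/MM/YYYY
            day, month, year = (int(part) for part in rec["joined"].split("/"))
            joined = date(year, month, day)
        else:  # assumed ISO YYYY-MM-DD
            joined = date.fromisoformat(rec["joined"])
        yield {"name": rec["name"].title(), "joined": joined}

def load(records):
    """Load: sort the rows and build a simple index by name."""
    rows = sorted(records, key=lambda r: r["joined"])
    index = {r["name"]: position for position, r in enumerate(rows)}
    return rows, index

def refresh():
    """Refresh: propagate source updates by re-running the pipeline (full reload)."""
    return load(transform(clean(extract())))

warehouse_rows, name_index = refresh()
```

A production refresh would more likely propagate only the changed records; the full reload keeps the sketch short.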

19
Q

Metadata

A

The data defining warehouse objects. It stores:
A description of the structure of the data warehouse: schema, views, dimensions, etc.
The algorithms used for summarization
The mapping from the operational environment to the data warehouse
Data related to system performance
Operational metadata
Business data: business terms, ownership of data, charging policies

20
Q

Operational metadata

A

Data lineage (history of migrated data and transformation path)
Currency of data (active, archived, or purged)
Monitoring information (warehouse usage statistics, error reports, audit trails)

21
Q

A data cube

A

Allows data to be modeled and viewed in multiple dimensions
Built from dimension tables and fact tables
It is n-dimensional; a 4-D cube can be viewed as a series of 3-D cubes
physical storage may differ from its logical representation

22
Q

Cuboid

A

Each level of summarization (group-by) of a data cube is referred to as a cuboid

The lattice of cuboids forms a data cube

23
Q

base cuboid

A

The n-D base cube, which holds the lowest level of summarization

24
Q

apex cuboid

A

The topmost 0-D cuboid, which holds the highest level of summarization
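
The lattice of cuboids can be enumerated directly: every subset of the dimensions is one cuboid. The sketch below assumes three hypothetical dimensions with no concept hierarchies, so the lattice has 2^3 = 8 cuboids.

```python
from itertools import combinations

def cuboid_lattice(dimensions):
    """Return every cuboid (subset of dimensions), level by level from base to apex."""
    n = len(dimensions)
    return [list(combinations(dimensions, k)) for k in range(n, -1, -1)]

lattice = cuboid_lattice(["time", "item", "location"])
base_cuboid = lattice[0][0]    # the n-D cuboid: all dimensions, lowest summarization
apex_cuboid = lattice[-1][0]   # the 0-D cuboid: no dimensions, a single grand total
total_cuboids = sum(len(level) for level in lattice)
```

With hierarchies on the dimensions the number of cuboids grows beyond 2^n, since each dimension can also be summarized at several levels.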

25
Q

Star schema

A

A fact table in the middle connected, via foreign key to primary key relationships, to a set of dimension tables
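
A minimal star schema can be sketched with `sqlite3`. The table names (`fact_sales`, `dim_time`, `dim_item`), columns, and rows are hypothetical; the point is that the fact table holds foreign keys to each dimension table plus the measures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_time (time_key INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_item (item_key INTEGER PRIMARY KEY, item_name TEXT, category TEXT);
-- Fact table: one foreign key per dimension, plus the numeric measures.
CREATE TABLE fact_sales (
    time_key INTEGER REFERENCES dim_time(time_key),
    item_key INTEGER REFERENCES dim_item(item_key),
    units_sold INTEGER,
    dollars_sold REAL
);
INSERT INTO dim_time VALUES (1, 2023, 'Q1'), (2, 2023, 'Q2');
INSERT INTO dim_item VALUES (1, 'laptop', 'computers'), (2, 'phone', 'phones');
INSERT INTO fact_sales VALUES (1, 1, 2, 2000.0), (1, 2, 5, 2500.0), (2, 1, 1, 1000.0);
""")

# A typical star query: join the fact table to its dimensions, then aggregate.
star_query = """
SELECT t.quarter, i.category, SUM(f.dollars_sold)
FROM fact_sales AS f
JOIN dim_time AS t ON f.time_key = t.time_key
JOIN dim_item AS i ON f.item_key = i.item_key
GROUP BY t.quarter, i.category
ORDER BY t.quarter, i.category
"""
sales_by_quarter_category = conn.execute(star_query).fetchall()
```

Each dimension is reached with a single join from the fact table, which is why star queries tend to need few joins.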

26
Q

Snowflake schema

A

A refinement of the star schema where some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake
Reduce redundancy
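
The normalization step can be sketched in plain Python. The location dimension and its column names are hypothetical; the function splits the repeated city and country attributes into their own smaller table.

```python
# Hypothetical denormalized (star-style) location dimension.
star_location = [
    {"location_key": 1, "street": "1 Main St", "city": "Vancouver", "country": "Canada"},
    {"location_key": 2, "street": "9 Oak Ave", "city": "Vancouver", "country": "Canada"},
    {"location_key": 3, "street": "5 Pine Rd", "city": "Chicago", "country": "USA"},
]

def snowflake(rows):
    """Move city attributes into their own table, referenced by city_key."""
    city_keys = {}
    dim_city, dim_location = [], []
    for row in rows:
        city = (row["city"], row["country"])
        if city not in city_keys:  # store each distinct city only once
            city_keys[city] = len(city_keys) + 1
            dim_city.append({"city_key": city_keys[city], "city": city[0], "country": city[1]})
        dim_location.append(
            {"location_key": row["location_key"], "street": row["street"], "city_key": city_keys[city]}
        )
    return dim_location, dim_city

dim_location, dim_city = snowflake(star_location)
```

The city and country strings are now stored once per city rather than once per location, at the cost of an extra join when querying by city.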

27
Q

Fact Constellations

A

Multiple fact tables share dimension tables; viewed as a collection of stars, and therefore called a galaxy schema or fact constellation

28
Q

Data warehouses vs datamart

A

A data warehouse collects information about subjects that span the entire organization; it is enterprise-wide
For a data warehouse, the fact constellation schema is commonly used
A data mart is a departmental subset of the data warehouse; it is department-wide
For a data mart, the star or snowflake schema is commonly used
The star schema is more popular and efficient because it needs fewer joins

29
Q

Roll up or drill up

A

Summarize data by climbing up a hierarchy or by dimension reduction

30
Q

drill down or roll down

A

the reverse of roll up

from higher-level summary to lower level summary or detailed data or introducing new dimensions
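
Roll-up and drill-down can be sketched as re-aggregating the same facts at different levels. The fact tuples and the city-to-country hierarchy below are hypothetical.

```python
from collections import defaultdict

# Hypothetical detailed facts: (city, country, quarter, sales).
facts = [
    ("Vancouver", "Canada", "Q1", 100),
    ("Toronto", "Canada", "Q1", 150),
    ("Chicago", "USA", "Q1", 200),
    ("Chicago", "USA", "Q2", 50),
]

def aggregate(rows, key):
    """Sum the sales measure grouped by key(row)."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)

by_city_quarter = aggregate(facts, lambda r: (r[0], r[2]))     # detailed level
by_country_quarter = aggregate(facts, lambda r: (r[1], r[2]))  # roll-up: climb city -> country
by_country = aggregate(facts, lambda r: r[1])                  # roll-up: drop the time dimension
```

Drill-down is the reverse direction: moving from `by_country` back to `by_country_quarter` or `by_city_quarter`, which is only possible because the detailed facts are still available.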

31
Q

slice and dice

A

Slice: select on one dimension of the cube, producing a subcube
Dice: select on two or more dimensions, producing a subcube
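
Slice and dice can be sketched as filters over cube cells. The cube contents, the dimension order in `DIMS`, and the helper name `select_cells` are hypothetical.

```python
# Hypothetical cube cells keyed by (time, item, location).
cube = {
    ("Q1", "laptop", "Vancouver"): 10,
    ("Q1", "phone", "Vancouver"): 20,
    ("Q2", "laptop", "Chicago"): 30,
    ("Q2", "phone", "Vancouver"): 40,
}
DIMS = ("time", "item", "location")

def select_cells(cells, **allowed):
    """Keep cells whose coordinate on each named dimension is in the allowed set."""
    def keep(coords):
        return all(coords[DIMS.index(dim)] in values for dim, values in allowed.items())
    return {coords: value for coords, value in cells.items() if keep(coords)}

# Slice: a selection on a single dimension.
slice_q1 = select_cells(cube, time={"Q1"})
# Dice: a selection on two or more dimensions.
dice = select_cells(cube, time={"Q1", "Q2"}, item={"phone"}, location={"Vancouver"})
```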

32
Q

pivot or rotate

A

Reorient the cube for visualization, e.g., view a 3-D cube as a series of 2-D planes
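
On a single 2-D plane, a pivot amounts to swapping the row and column axes. The item and quarter labels and the values are hypothetical.

```python
# Hypothetical 2-D plane of a cube: rows are items, columns are quarters.
items = ["laptop", "phone"]
quarters = ["Q1", "Q2", "Q3"]
plane = [
    [10, 30, 50],   # laptop
    [20, 40, 60],   # phone
]

def pivot(table):
    """Swap the row and column axes of a 2-D table."""
    return [list(column) for column in zip(*table)]

pivoted = pivot(plane)  # rows are now quarters, columns are items
```

The data is unchanged; only the presentation rotates, and pivoting twice returns the original layout.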

33
Q

drill across

A

involving more than one fact table
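
Drill-across can be sketched as combining measures from two fact tables over the dimension keys they share. Both fact tables and their (quarter, item) keys are hypothetical.

```python
# Hypothetical fact tables that share the same (quarter, item) dimensions.
fact_sales = {("Q1", "laptop"): 2000.0, ("Q1", "phone"): 2500.0}
fact_shipping = {("Q1", "laptop"): 120.0, ("Q2", "laptop"): 80.0}

def drill_across(left, right):
    """Combine measures from two fact tables on their shared dimension keys."""
    shared = sorted(left.keys() & right.keys())
    return {key: (left[key], right[key]) for key in shared}

# Each result pairs (dollars sold, shipping cost) for a shared key.
combined = drill_across(fact_sales, fact_shipping)
```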

34
Q

drill through

A

Through the bottom level of the cube to its back-end relational tables, using SQL

35
Q

Information processing

A

Supports querying, basic statistical analysis, and reporting using crosstabs, tables, charts, and graphs

36
Q

analytical processing

A

multidimensional analysis of data warehouse data

Supports basic OLAP operations: slice-dice, drilling, pivoting

37
Q

Data mining

A

knowledge discovery from hidden patterns
supports associations, constructing analytical models, performing classification and prediction, and presenting the mining results using visualization tools

38
Q

Relational OLAP or ROLAP

A

Uses a relational or extended-relational DBMS to store and manage warehouse data, with OLAP middleware supporting the client front end
greater scalability than MOLAP technology

39
Q

Multidimensional OLAP or MOLAP

A

sparse array-based multidimensional storage engine

fast indexing to precomputed summarized data
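
The two MOLAP ideas above, sparse storage and fast access to precomputed summaries, can be sketched with a small class. `SparseCube` is a hypothetical illustration, not how any particular MOLAP engine is implemented.

```python
from collections import defaultdict

class SparseCube:
    """Store only non-empty cells, plus precomputed totals per dimension member."""

    def __init__(self, n_dims):
        self.n_dims = n_dims
        self.cells = {}  # coordinates -> measure; empty cells take no space
        self.totals = [defaultdict(float) for _ in range(n_dims)]

    def add(self, coords, value):
        self.cells[coords] = self.cells.get(coords, 0.0) + value
        for axis, member in enumerate(coords):
            self.totals[axis][member] += value  # keep summaries up to date on load

    def total(self, axis, member):
        """Look up a precomputed summary instead of scanning the cells."""
        return self.totals[axis][member]

molap = SparseCube(n_dims=2)
molap.add(("Q1", "laptop"), 10.0)
molap.add(("Q1", "phone"), 20.0)
molap.add(("Q2", "laptop"), 30.0)
```

Here `molap.total(0, "Q1")` is a dictionary lookup, so the cost of summarization is paid once at load time rather than at query time.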

40
Q

Hybrid OLAP or HOLAP

A

e.g., Microsoft SQL Server
Combines ROLAP and MOLAP technology
Flexible, e.g., low level: relational; high level: array

41
Q

Specialized SQL servers

A

e.g., Red Brick

Specialized support for SQL queries over star/snowflake schemas