Big Data Lecture 14 Data Cubes Flashcards

1
Q

What is OLTP?

A

Online transaction processing, both read and write intensive system in real time, example is relational database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is OLAP?

A

Online analytical processing, we want to agreggate data, can be not fully online, example are Data Cubes.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are the main differences of OLTP and OLAP in IO?

A

OLTP writes mostly, OLAP reads often.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is the difference of OLAP and OLTP in detail of information it provides about data?

A

OLTP has detailed individual records, OLAP historical summarized consolidated data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is the difference of OLTP and OLAP in terms of size of transactions?

A

OLTP has small transactions, OLAP has few huge queries.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is the difference of OLAP and OLTP in execution time?

A

OLTP is interactive, OLAP takes its time.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is the difference of OLAP and OLTP in terms of duplication and denormalization?

A

OLTP is consistent, OLAP has redundant data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What 4 adjectives define OLAP?

A

<ul><li>Subject oriented: usually for one use case (web analytics, sales...),</li><li>time-variant: the data tends to be historical,</li><li>integrated: loads data from many different sources,</li><li>non-volatile: the data is locked once it is loaded.</li></ul>

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is the main data shape in OLAP?

A

Data cube! Multidimensional object with many dimensions, which are indexed by members.<br></br>E.g. YEAR x PRODUCT x COUNTRY identifies a unique product sold.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is a fact table?

A

Table that stores for each coordinate combination a value.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What is a measurement?

A

Value stored in the table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What is slicing?

A

Taking a slice of the cube over some dimensions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is dicing?

A

Reformatting the table such that the axis of the table become the dimensions, this is called the cross-tabulated view.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

How is aggregation from the POV of the cube?

A

It is like squeezing a cube along one dimension.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What is a roll-up?

A

We combine two categories in aggregation together, to create a lower dimensional view of the data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is a drill-down?

A

We make the data more granular, by considering more subvariables in each category, making the view more precise, less aggregated.

17
Q

What is the difference of ROLAP and MOLAP?

A

Relational OLAP (built on top of RDBMS) and Multidimensional OLAP (natively deals with multiple dimensions).

18
Q

What is pivoting (and unpivoting)?

A

Pivoting takes dimensions and makes the associated values into a different measurements. Unpivoting reverses that.

19
Q

What is a star schema of tables?

A

We keep one main table with all dimensions, and then we have <i>satellite tables </i>which reference the remaining data.

20
Q

What is snowflake schema?

A

Within each table of the star schema, we further normalize the data into more tables.

21
Q

What is MDX?

A

Stands for Multi-Dimensional eXpressions. The query language for Data Cubes.

22
Q

How to roll up in SQL?

A

Use group by, and the dimensions that you want to keep insert into the groupings.

23
Q

How to cube in SQL?

A

You can ‘GROUP BY GROUPING SETS’ and then denote all the combinations, or ‘GROUP BY CUBE’ and just input the attributes, it will produce the 2^n different sets we need to group on.

24
Q

How to roll up in SQL?

A

You can use ‘GROUP BY ROLLUP’ that produces all the attributes for (n+1) types that we need.

25
Q

How do authorities deal with fancy reports from companies to make the readeable to the system?

A

They let them produce the reports in XHTML, and make them annotate the important attributes for parsing. Later, data cubes can be used for simple aggregation online.