Data Warehouses Flashcards

1
Q

OLTP

A

Online Transaction Processing
- ACID properties required
- Details about every transaction

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

OLAP

A

Online Analytical Processing
- Data redundancy allowed
- Historical data
- Consistency before and after loads is required

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Decision Support Systems (DSS)

A

Huge operational database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Operational Database vs. Data Warehouse

A

Operational stores every transaction while data warehouse stores aggregations of events through ETL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

ETL

A

Extraction: Acquires data from sources
(Cleaning): Improves data quality
Transformation: Transforms data to DW format
Loading: Propagates changes
(Loads data to DW)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Data Mart

A

Small warehouse that focuses on one specific target (ex: one single store)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Dimensional Fact Model

A

Graphical model supporting conceptual design.
Fact schema modelling:
- Dimensions
- Hierarchies
- Measures
DRAW IT

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

DFM: Dimension

A

Attributes of a fact, typically categorical.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

DFM: Fact

A

Event of interest.
Evolves with time.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

DFM: Measure

A

A numerical property of a fact, typically aggregated.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Additive vs. Non Additive

A

Non additive cannot be aggregated along a given hierarchy by means of the SUM operator

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Distributive vs. Non DIstributive (Algebraic)

A

Algebraic can compute higher level aggregations from more detailed data only when supplementary support measures are available, such as AVG.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Star Schema

A

Draw it.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Snowflake Schema vs. Star Schema

A
  • Snowflake usually not recommended
  • Partitions dimension tables in multiple extra-tables.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

SQL: Range vs. Rows

A

Range counts on the past record, the rows past values (not ideal for data with gaps)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

ROWS option1 option2

A

Option 1:
- BETWEEN
- UNBOUNDED

Option 2:
N PRECEDING
M FOLLOWING

17
Q

ROWS N PRECEDING = how many rows?

A

N + current row = N+1

18
Q

Materialized View

A

Precomputed summary of the fact table

19
Q

SQL: Divide a partition in N subgroups

A

NTILE(N)

20
Q

SQL: Add index to a partition

A

ROW_NUMBER

21
Q

SQL: Add 0-1 indexes

A

CUME_DIST

22
Q

SQL: Group by all possible combinations

A

CUBE

23
Q

SQL: Group by -> (C, M, P) (C, M) (C)

A

ROLLUP(C, M ,P)

24
Q

SQL: Group by using different (specified) combinations

A

GROUPING SETS

25
Q

ETL Techniques (types of triggering)

A
  • Trigger: New or modified data
  • Periodically
  • Application Assisted
26
Q

Create a materialized view

A

CREATE MATERIALIZED VIEW MyView
BUILD ________
REFRESH ________
AS
SQL QUERY

27
Q

Create Trigger

A

CREATE TRIGGER TriggerName Mode Event ON TargetTable [ REFERENCING ReferenceName ]
FOR EACH Level
WHEN Predicate
DECLARE
BEGIN
Procedural SQL Statements
END

28
Q

Create materializer view log

A

CREATE MATERIALIZED VIEW LOG ON ________
WITH SEQUENCE, ROWID
(ForeignKeys, Measures)
INCLUDING NEW VALUES;

29
Q

SQL: BUILD _________

A
  • Immediate
  • Differed
30
Q

SQL: REFRESH _________

A
  • COMPLETE: All data
  • FAST: Only changes
  • FORCE: Try FAST, else COMPLETE
  • NEVER ON COMMIT: “ “
  • ON COMMIT: When SQL ops affect content
  • ON DEMAND: When asked
31
Q

SQL: Delete DB

A

DROP