Chapter 18 Flashcards

1
Q

What is incremental refresh?

A

Loading into the data warehouse (DWH) only the data that has changed in the operational system since the previous load, instead of reloading everything.

2
Q

What are the two sources for a DWH?

A
  • Modern systems
  • Legacy systems

3
Q

What is CDC?

A

Change data capture (CDC) is the process of capturing changes made at the data source and applying them throughout the enterprise.
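
As a minimal sketch of "capturing changes and applying them", the example below replays a list of change records against a target keyed by primary key. The record layout (operation, key, row) and the names `changes` and `apply_changes` are assumptions made for illustration, not part of the chapter.

```python
# Hypothetical change records captured at the source: (operation, key, row).
changes = [
    ("insert", 1, {"name": "Ann", "city": "Oslo"}),
    ("insert", 2, {"name": "Bob", "city": "Rome"}),
    ("update", 1, {"name": "Ann", "city": "Paris"}),
    ("delete", 2, None),
]


def apply_changes(target, changes):
    """Apply captured changes, in order, to a target keyed by primary key."""
    for op, key, row in changes:
        if op in ("insert", "update"):
            target[key] = row
        elif op == "delete":
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return target


warehouse = apply_changes({}, changes)
print(warehouse)
```

Only the changed rows travel to the target; the rest of the source is never re-read.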

4
Q

What is a legacy system?

A

A system that was in use in the 1990s and is still in use to some extent.

5
Q

What are the CDC techniques in modern systems?

A
  • Time stamps
  • Triggers
  • Partitioning
6
Q

What is time stamp CDC?

A

Whenever a DML operation occurs, the date and time of the transaction are stored in a separate column. Extraction can then pick up only the rows whose time stamp is later than the previous load.
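
A minimal sketch of time stamp based extraction, using Python's built-in `sqlite3` module. The table `orders`, the column `last_modified`, and the function `extract_changed` are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, last_modified TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 10.0, "2024-01-01 09:00:00"),
        (2, 25.0, "2024-01-02 14:30:00"),
        (3, 40.0, "2024-01-03 08:15:00"),
    ],
)


def extract_changed(conn, last_load):
    """Return rows modified after the previous load.

    ISO-formatted time stamps sort correctly when compared as text.
    """
    cur = conn.execute(
        "SELECT id, amount, last_modified FROM orders "
        "WHERE last_modified > ? ORDER BY id",
        (last_load,),
    )
    return cur.fetchall()


changed = extract_changed(conn, "2024-01-02 00:00:00")
print(changed)
```

One limitation of this approach is that a row deleted at the source leaves no time stamp behind, so deletes need separate handling.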

7
Q

What is a trigger (in CDC)?

A

Whenever a DML operation occurs, a trigger writes a record with a time stamp to a separate file, which can then be used during extraction.
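
A minimal sketch of trigger based capture with Python's built-in `sqlite3` module. Here a separate change table stands in for the separate file mentioned in the answer; the table and trigger names (`customer`, `customer_changes`, `customer_ins`, and so on) are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);

-- Change table: one row per DML operation, stamped with the time.
CREATE TABLE customer_changes (
    id INTEGER,
    operation TEXT,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER customer_ins AFTER INSERT ON customer
BEGIN
    INSERT INTO customer_changes (id, operation) VALUES (NEW.id, 'INSERT');
END;

CREATE TRIGGER customer_upd AFTER UPDATE ON customer
BEGIN
    INSERT INTO customer_changes (id, operation) VALUES (NEW.id, 'UPDATE');
END;

CREATE TRIGGER customer_del AFTER DELETE ON customer
BEGIN
    INSERT INTO customer_changes (id, operation) VALUES (OLD.id, 'DELETE');
END;
""")

# Ordinary DML against the source table; the triggers record each change.
conn.execute("INSERT INTO customer VALUES (1, 'Ann')")
conn.execute("UPDATE customer SET name = 'Anna' WHERE id = 1")
conn.execute("DELETE FROM customer WHERE id = 1")

change_log = conn.execute(
    "SELECT id, operation FROM customer_changes ORDER BY rowid"
).fetchall()
print(change_log)
```

Unlike a time stamp column, the change table also records deletes, at the cost of extra work on every DML statement.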

8
Q

What is partitioning?

A

Data is logically divided into partitions, for example by time period. Whenever we need the data for some period, we target only that period's table.
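
A minimal sketch of the idea, assuming monthly partitions held in a dictionary. The sample `sales` rows and the function `extract_period` are hypothetical; a real database would use its own partitioning feature.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales rows: (sale date, amount).
sales = [
    (date(2024, 1, 15), 100),
    (date(2024, 1, 28), 50),
    (date(2024, 2, 3), 75),
    (date(2024, 3, 9), 20),
]

# Divide the data into one partition per (year, month).
partitions = defaultdict(list)
for sale_date, amount in sales:
    partitions[(sale_date.year, sale_date.month)].append((sale_date, amount))


def extract_period(partitions, year, month):
    """Read only the partition for the requested period."""
    return list(partitions.get((year, month), []))


january = extract_period(partitions, 2024, 1)
print(january)
```

Extraction for January touches only the January partition and never scans the other months.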

9
Q

What is CDC in legacy systems?

A
  • Changes are recorded on tapes
  • Changes are read and removed from the tapes
  • Reading a journal tape poses many problems
    (All operations are recorded on tapes instead of in tables)
10
Q

What are the advantages of CDC in legacy systems?

A
  • No incremental online I/O is required
  • The log tape captures all update processing
  • Log tape processing can be taken off-line
  • No haste to make waste
11
Q

What are the major transformation types?

A
  • Format revision
  • Decoding of fields
  • Calculated and derived values
  • Splitting of single fields
  • Character set conversion
  • Unit of measurement conversion
  • Date/time conversion
  • Summarization
  • Key restructuring
  • Deduplication
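
Several of the types listed above can be sketched on a single record. The field names, the `GENDER_CODES` lookup, and the `transform` function are hypothetical and only illustrate decoding, splitting, unit conversion, date conversion, and a derived value.

```python
from datetime import datetime

# Hypothetical code table used for decoding of fields.
GENDER_CODES = {"M": "Male", "F": "Female"}

LB_TO_KG = 0.45359237  # exact definition of the pound in kilograms


def transform(record):
    """Apply a handful of common transformations to one source record."""
    # Splitting of single fields: full name -> first and last name.
    first, _, last = record["full_name"].partition(" ")
    return {
        "first_name": first,
        "last_name": last,
        # Decoding of fields: cryptic code -> readable value.
        "gender": GENDER_CODES.get(record["gender"], "Unknown"),
        # Unit of measurement conversion: pounds -> kilograms.
        "weight_kg": round(record["weight_lb"] * LB_TO_KG, 2),
        # Date/time conversion: US-style date -> ISO 8601.
        "order_date": datetime.strptime(record["order_date"], "%m/%d/%Y")
        .date()
        .isoformat(),
        # Calculated and derived value.
        "total": record["quantity"] * record["unit_price"],
    }


source = {
    "full_name": "Ann Lee",
    "gender": "F",
    "weight_lb": 150,
    "order_date": "03/15/2024",
    "quantity": 3,
    "unit_price": 2.5,
}
result = transform(source)
print(result)
```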
12
Q

What is merging?

A

Collecting information from different columns and bringing it together in one place.
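
A minimal sketch, assuming the information to merge is an address spread over three columns. The column names and the function `merge_address` are hypothetical.

```python
def merge_address(row):
    """Merge separate address columns into one field, skipping empty parts."""
    parts = [row.get("street"), row.get("city"), row.get("postcode")]
    return ", ".join(p.strip() for p in parts if p and p.strip())


row = {"street": "12 High St ", "city": "Leeds", "postcode": None}
address = merge_address(row)
print(address)
```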

13
Q

What is aggregation?

A

Suppose we already have some calculated values. We combine those predefined calculations to get a new result, and that result is called an aggregation.
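
A minimal sketch, assuming the existing calculations are daily sales totals that are combined into monthly totals. The sample figures and the function `monthly_totals` are hypothetical.

```python
from collections import defaultdict

# Already-calculated daily totals, keyed by ISO date.
daily_totals = {"2024-01-15": 100, "2024-01-28": 50, "2024-02-03": 75}


def monthly_totals(daily):
    """Combine daily totals into one total per month (YYYY-MM)."""
    out = defaultdict(int)
    for day, total in daily.items():
        out[day[:7]] += total  # first 7 characters are the year and month
    return dict(out)


monthly = monthly_totals(daily_totals)
print(monthly)
```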

14
Q

What is deduplication?

A

Removing duplicate records during the transformation process.
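
A minimal sketch, assuming two records are duplicates when their key fields match after trimming whitespace and ignoring case. The function `deduplicate` and the sample rows are hypothetical; real matching rules are usually more involved.

```python
def deduplicate(rows, key_fields):
    """Keep the first row seen for each normalized key; drop later duplicates."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(str(row[f]).strip().lower() for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


customers = [
    {"name": "Ann Lee", "email": "ann@example.com"},
    {"name": "ann lee ", "email": "ANN@example.com"},
    {"name": "Bob Ray", "email": "bob@example.com"},
]
unique_customers = deduplicate(customers, ["name", "email"])
print(unique_customers)
```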
