Chapter 22 Flashcards
What does it mean to quantify
To measure something and express it as a quantity
Is data quantified subjectively or objectively
Both
What are the methods for quantifying data objectively
- Dependent
- Independent
What is independent objective data quantification
The quantification is independent: it is not affected by the organization's rules
What is dependent objective data quantification
The quantification is dependent: it is affected by the organization's rules
What are some other data quality dimensions
- Believability
- Appropriate amount of data
- Timeliness
- Accessibility
- Objectivity
- Interpretability
- Uniqueness
What are data quality assessment techniques
- Min-max
- Simple ratio
- Weighted average
What is positive simple ratio
It is the ratio of desirable records to the total number of records
What is negative simple ratio
It is the ratio of undesirable records to the total number of records, subtracted from 1
Which simple ratio is used in longitudinal analysis
Positive simple ratio
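The two forms of the simple ratio can be sketched as below. This is a minimal illustration, assuming the positive form is desirable / total and the negative form is 1 - undesirable / total; the function names and record counts are hypothetical.

```python
def positive_simple_ratio(desirable: int, total: int) -> float:
    """Share of desirable records among all records."""
    if total <= 0:
        raise ValueError("total must be positive")
    return desirable / total


def negative_simple_ratio(undesirable: int, total: int) -> float:
    """1 minus the share of undesirable records."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 1 - undesirable / total


# Hypothetical counts: 1000 records, 50 of them in error.
free_of_error = negative_simple_ratio(undesirable=50, total=1000)
same_score = positive_simple_ratio(desirable=950, total=1000)
```

Both forms give the same score on a 0 to 1 scale, where 1 is best; they differ only in whether the exceptions or the good records are counted.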
What is the min-max data quality assessment technique
It applies to a set of data values and takes their minimum or maximum. First the data attributes are normalized, then the min or max is taken. Taking the minimum value is conservative; taking the maximum is liberal.
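A small sketch of the min-max technique, assuming the individual indicators are ratings on a 1 to 10 scale that are rescaled to [0, 1] before aggregation. The indicator names and ratings are hypothetical.

```python
def normalize(value: float, low: float, high: float) -> float:
    """Rescale a value from the range [low, high] to [0, 1]."""
    return (value - low) / (high - low)


# Hypothetical ratings of one data set on a 1-10 scale.
ratings = {"source": 8, "agreement_with_standard": 6, "past_experience": 9}

scores = [normalize(r, 1, 10) for r in ratings.values()]

conservative = min(scores)  # the weakest indicator decides the score
liberal = max(scores)       # the strongest indicator decides the score
```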
What is the free-of-error ratio (simple ratio)
A negative simple ratio: the number of records in error divided by the total number of records, subtracted from 1
What is completeness of data (simple ratio)
Completeness of data can be measured through 3 aspects.
1- Schema completeness
2- Column completeness
3- Population completeness
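As an illustration of one of these aspects, column completeness can be computed as a simple ratio. This is a sketch under the assumption that a value counts as missing when it is None or an empty string; the customer rows are hypothetical.

```python
def column_completeness(rows: list, column: str) -> float:
    """1 minus the share of rows whose value in `column` is missing."""
    if not rows:
        raise ValueError("rows must not be empty")
    # Assumption: None and the empty string both count as missing.
    missing = sum(1 for row in rows if row.get(column) in (None, ""))
    return 1 - missing / len(rows)


# Hypothetical customer rows.
customers = [
    {"id": 1, "city": "Karachi"},
    {"id": 2, "city": None},
    {"id": 3, "city": ""},
    {"id": 4, "city": "Lahore"},
]

city_completeness = column_completeness(customers, "city")
id_completeness = column_completeness(customers, "id")
```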
What is consistency (simple ratio)
There are 2 types of consistency:
1- Variation (e.g. karachi, khi, KHI, Karachi)
2- Functional integrity
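The variation type can be measured with a simple ratio and then repaired by mapping each variant to one spelling. This is a rough sketch; the choice of "Karachi" as the canonical form and the CANONICAL mapping are assumptions made for illustration.

```python
# Hypothetical mapping from lower-cased variants to one canonical spelling.
CANONICAL = {"karachi": "Karachi", "khi": "Karachi"}


def consistency_ratio(values: list, canonical_form: str) -> float:
    """1 minus the share of values not written in the canonical form."""
    inconsistent = sum(1 for v in values if v != canonical_form)
    return 1 - inconsistent / len(values)


cities = ["karachi", "khi", "KHI", "Karachi"]

before = consistency_ratio(cities, "Karachi")

# Standardize every variant, leaving unknown values unchanged.
cleaned = [CANONICAL.get(c.lower(), c) for c in cities]
after = consistency_ratio(cleaned, "Karachi")
```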
Which dimensions are measured with min-max
- Believability
- Appropriate amount of data
- Timeliness
- Accessibility
Is believability a minimum (conservative) value
Yes
What is the formula for timeliness
Max {0, 1 - C/V}
C = A + Dt - It
(C = currency, V = volatility, A = age, Dt = delivery time, It = input time)
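A minimal sketch of the timeliness calculation, assuming the commonly cited form max{0, 1 - currency / volatility}, where volatility is the length of time the data remains valid. The volatility term and all sample values (in days) are assumptions for illustration.

```python
def currency(age: float, delivery_time: float, input_time: float) -> float:
    """C = A + Dt - It."""
    return age + delivery_time - input_time


def timeliness(currency_value: float, volatility: float) -> float:
    """max{0, 1 - C/V}; volatility is assumed to be positive."""
    return max(0.0, 1 - currency_value / volatility)


# Hypothetical values, all in days.
c = currency(age=1, delivery_time=10, input_time=8)
fresh = timeliness(c, volatility=12)
stale = timeliness(20, volatility=12)  # older than it stays valid
```

The max with 0 keeps the score from going negative once the data is older than its useful life.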
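What is the formula for accessibility
Max {0, 1 - trd/tru}
(trd = time from request to delivery, tru = time from request until the data is no longer useful)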
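The accessibility formula can be sketched the same way, assuming trd is the time from request to delivery and tru is the time from request until the data is no longer useful. The sample durations (in hours) are hypothetical.

```python
def accessibility(trd: float, tru: float) -> float:
    """max{0, 1 - trd/tru}; tru is assumed to be positive."""
    return max(0.0, 1 - trd / tru)


# Hypothetical durations in hours.
on_time = accessibility(trd=2, tru=8)    # delivered well before it goes stale
too_late = accessibility(trd=10, tru=8)  # delivered after it stopped being useful
```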
Which comes first, assessment or validation
Assessment
What are methods for validation
1- Referential integrity validation (records without reference)
2- Attribute domain validation
3- Business rules
4- Analyze data
What is data profiling
Data profiling is the process of examining the data available from an existing information source (e.g. a database or a file) and collecting statistics about that data.
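A very small profiling sketch for a single column, assuming that the statistics of interest are the row count, null count, distinct count, minimum and maximum. The function name and the sample ages are hypothetical.

```python
def profile_column(values: list) -> dict:
    """Collect basic statistics about one column of data."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Hypothetical column of customer ages with one missing value.
ages = [34, 27, None, 45, 27]
profile = profile_column(ages)
```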
What are orphan records
Records without a reference: they point to a parent record that does not exist
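Finding orphan records amounts to checking each child record's key against the set of parent keys. This is a sketch with hypothetical customers and orders tables held as lists of dictionaries.

```python
def find_orphans(children: list, parents: list, key: str) -> list:
    """Return child records whose key has no matching parent record."""
    parent_keys = {p[key] for p in parents}
    return [c for c in children if c[key] not in parent_keys]


# Hypothetical parent and child tables.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},  # no customer 3 exists
    {"order_id": 12, "customer_id": 2},
]

orphans = find_orphans(orders, customers, "customer_id")
```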
What are 3 steps for attribute domain validation
Step 1- Capture and quantify
Step 2- Compare
Step 3- Investigate
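The three steps can be sketched as follows, assuming a hypothetical gender attribute whose valid domain is {"M", "F"}. The sample values are invented for illustration.

```python
from collections import Counter

# Hypothetical valid domain for the attribute.
VALID_GENDERS = {"M", "F"}

values = ["M", "F", "F", "X", "M", "", "F"]

# Step 1: capture and quantify how often each value occurs.
counts = Counter(values)

# Step 2: compare the observed values with the valid domain.
invalid = {v: n for v, n in counts.items() if v not in VALID_GENDERS}

# Step 3: investigate the values that fall outside the domain.
for value, n in invalid.items():
    print(f"investigate {value!r}: {n} record(s)")
```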
What is a histogram
A graph showing how often each value (or range of values) occurs in the data in the DWH.