DAT Data Quality Flashcards
Define Data Profiling
The process of carrying out an initial assessment of a data set.
What are the steps in Data Profiling?
- Individual fields - Check content against data/domain definitions. Count violations and nulls.
- Tables - Check inter-field relationships and keys. Check inter-table relationships.
- Whole data set - Check that business rules are fulfilled.
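The three levels of profiling can be sketched in Python. This is a minimal illustration, not a definitive profiler: the `orders` and `customers` tables, the field names, the allowed status values, and the business rule (order totals must be positive) are all hypothetical.

```python
# Hypothetical sample data; every name and rule here is an assumption.
customers = [{"id": 1}, {"id": 2}]
orders = [
    {"id": 10, "customer_id": 1, "status": "open",
     "ordered": "2024-01-05", "shipped": "2024-01-07", "total": 50.0},
    {"id": 11, "customer_id": 3, "status": "lost",
     "ordered": "2024-01-09", "shipped": "2024-01-08", "total": 20.0},
    {"id": 12, "customer_id": 2, "status": None,
     "ordered": "2024-01-10", "shipped": None, "total": -5.0},
]
ALLOWED_STATUS = {"open", "shipped", "closed"}  # assumed domain definition

# 1. Individual fields: count nulls and domain violations.
status_nulls = sum(1 for o in orders if o["status"] is None)
status_violations = sum(
    1 for o in orders
    if o["status"] is not None and o["status"] not in ALLOWED_STATUS
)

# 2. Tables: inter-field relationship (an order cannot ship before it
#    is placed; ISO date strings compare correctly as text) ...
date_violations = sum(
    1 for o in orders
    if o["shipped"] is not None and o["shipped"] < o["ordered"]
)
# ... and inter-table relationship (every order refers to a known customer).
customer_ids = {c["id"] for c in customers}
orphan_orders = sum(1 for o in orders if o["customer_id"] not in customer_ids)

# 3. Whole data set: assumed business rule that totals are positive.
rule_violations = sum(1 for o in orders if o["total"] <= 0)

print(status_nulls, status_violations, date_violations,
      orphan_orders, rule_violations)
```

Each counter reports how many records fail one check, which is the kind of summary a profiling pass would hand on for a decision about how to treat the errors.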
What is Data Quality?
Data quality is not absolute: data is of good quality when it is fit for the purpose it is being used for. The level of quality required therefore depends on how the data is used.
How can errors be handled?
Accept
Reject
Correct (rectify)
Estimate (Interpolate/extrapolate)
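The four options can be compared on one small series. The sketch below is only an illustration under stated assumptions: the `readings` list, the `VALID_RANGE` rule, and the `corrections` lookup (standing in for a trusted source of verified values) are hypothetical, and the estimate step uses simple linear interpolation between the nearest valid neighbours.

```python
# Hypothetical sensor readings; None marks a missing value.
readings = [20.0, 21.0, None, 23.0, 999.0, 24.0]
VALID_RANGE = (-50.0, 60.0)  # assumed validity rule


def is_valid(v):
    """A reading is valid if it is present and inside the assumed range."""
    return v is not None and VALID_RANGE[0] <= v <= VALID_RANGE[1]


# Accept: keep the data unchanged, errors included.
accepted = list(readings)

# Reject: drop every reading that fails the check.
rejected = [v for v in readings if is_valid(v)]

# Correct (rectify): replace a bad value with a verified one.
corrections = {4: 23.5}  # index -> verified value (hypothetical source)
corrected = [corrections.get(i, v) for i, v in enumerate(readings)]


# Estimate: linearly interpolate between the nearest valid neighbours.
def interpolate(values):
    result = list(values)
    for i, v in enumerate(values):
        if is_valid(v):
            continue
        prev = next((j for j in range(i - 1, -1, -1) if is_valid(values[j])), None)
        nxt = next((j for j in range(i + 1, len(values)) if is_valid(values[j])), None)
        if prev is not None and nxt is not None:
            frac = (i - prev) / (nxt - prev)
            result[i] = values[prev] + frac * (values[nxt] - values[prev])
    return result


estimated = interpolate(readings)
print(rejected)
print(corrected)
print(estimated)
```

In this sketch a bad value at either end of the series is left unchanged, because filling it would require extrapolation rather than interpolation.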
What are the Data Quality dimensions?
AACCTUV
Accuracy
Appropriateness (correct data to solve problem)
Completeness (relevant data, not missing)
Consistency (compatible sources)
Timeliness (suitable temporal period)
Uniqueness (each real-world item is represented by exactly one record, no duplicates)
Validity (data follows rules)
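Some of these dimensions can be expressed as simple ratios. The following is a rough sketch rather than a standard scoring method: the `records` list, the `AS_OF` reference date, and the 90-day freshness threshold are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical customer records; names and values are assumptions.
records = [
    {"customer_id": 1, "email": "a@example.com", "updated": date(2024, 6, 1)},
    {"customer_id": 2, "email": None, "updated": date(2024, 2, 1)},
    {"customer_id": 2, "email": "b@example.com", "updated": date(2023, 1, 15)},
    {"customer_id": 3, "email": "c@example.com", "updated": date(2024, 6, 10)},
]
AS_OF = date(2024, 6, 30)  # assumed reference date
MAX_AGE_DAYS = 90          # assumed freshness threshold

n = len(records)

# Completeness: share of records with the email field filled in.
completeness = sum(1 for r in records if r["email"] is not None) / n

# Uniqueness: distinct customer ids relative to the number of records.
uniqueness = len({r["customer_id"] for r in records}) / n

# Timeliness: share of records updated within the assumed threshold.
timeliness = sum(
    1 for r in records if (AS_OF - r["updated"]).days <= MAX_AGE_DAYS
) / n

print(completeness, uniqueness, timeliness)
```

Accuracy and appropriateness are left out on purpose: judging them generally needs something outside the data set itself, such as a trusted reference or knowledge of the problem being solved.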
What is the acronym relating to Data Security?
CIA
Confidentiality (authorized access only)
Integrity (authorized changes only)
Availability (accessible when needed; may be specified in an SLA [Service Level Agreement])