Data Quality Flashcards
are numbers, words, or images that have yet to be organized or analyzed to answer a specific question
Data
produced through processing, manipulating, and organizing data to answer questions, adding to the knowledge of the receiver
Information
it is the overall utility of one or more datasets as a function of how easily they can be processed and analyzed for a database, data warehouse, or data analytics system
Data Quality
Aspects of Data Quality
Accessibility
Accuracy
Presentability
Completeness
Consistency
Reliability
Relevance
considers the extent to which data is collected using the same process and procedures by everyone doing the collecting, in all locations, every time
Consistency
indicates whether the data is free from significant errors and whether the numbers seem to make sense
Accuracy
indicates whether there is enough information to draw a conclusion about the data and whether enough individuals responded to it to ensure representativeness
Completeness
refers to the degree to which data are important to users and their needs
Relevance
is determined by the degree to which measurements are similar (consistent) on repeated measurements
Reliability
degree to which the data is easily understood and well organized
Presentability
These (aspects of data quality) are the parameters of data quality to determine whether the data collected are of quality or not
Accessibility
is a tool that allows the use of small random samples to distinguish between different groups of data elements with high and low data quality
Lot Quality Assurance Sampling (LQAS)
Steps in Applying LQAS
- Define the service to be assessed (ex: data quality assurance of district HIS)
- Identify the unit of interest (e.g. supervisory area, facility, hospital, or district)
- Define the higher and lower thresholds of performance (based on prior information about the expected performance of the region of interest)
- Determine the level of acceptable error
- Determine the sample size and decision rule for acceptable errors (especially in declaring areas as performing below expectations)
- Identify the number of errors observed (mismatched data elements that will reliably determine if the facility is performing above or below expectations)
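The sample size and decision rule steps above can be sketched with the binomial distribution. This is a minimal sketch, not taken from any LQAS tool: the function names, the thresholds (80% upper, 50% lower), and the 10% acceptable error are all assumptions chosen for illustration. It counts correct (matching) data elements, so the maximum number of tolerated errors is the sample size minus the decision rule.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def lqas_rule(n, p_upper, p_lower):
    """Return (d, alpha, beta) for sample size n (hypothetical helper).

    d     : minimum number of correct items to classify a unit as acceptable
    alpha : chance a unit performing at p_upper scores fewer than d correct
    beta  : chance a unit performing at p_lower scores d or more correct
    The d chosen is the one that minimizes the larger of the two errors.
    """
    best = None
    for d in range(n + 1):
        below_d_upper = binom_cdf(d - 1, n, p_upper) if d > 0 else 0.0
        below_d_lower = binom_cdf(d - 1, n, p_lower) if d > 0 else 0.0
        alpha, beta = below_d_upper, 1.0 - below_d_lower
        if best is None or max(alpha, beta) < max(best[1], best[2]):
            best = (d, alpha, beta)
    return best

def smallest_sample(p_upper, p_lower, max_error, n_max=200):
    """Smallest n whose best decision rule keeps both errors <= max_error."""
    for n in range(1, n_max + 1):
        d, alpha, beta = lqas_rule(n, p_upper, p_lower)
        if alpha <= max_error and beta <= max_error:
            return n, d, alpha, beta
    return None

# Assumed inputs: upper threshold 80%, lower threshold 50%, sample of 19.
d, alpha, beta = lqas_rule(19, 0.8, 0.5)
```

With these assumed inputs, a sample of 19 gives a decision rule of 13 correct items (at most 6 errors), with both misclassification errors below 10%.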
it is a simplified version of the Data Quality Audit tool which allows programs and projects to verify and assess the quality of their reported data
Routine Data Quality Assessment
goal of Routine Data Quality Assessment
strengthen data management and reporting systems
Objectives of RDQA
- Rapidly verify the quality of reported data for key indicators at selected sites
- Implement corrective measures with action plans for strengthening data management and reporting system and improving data quality
- Monitor capacity improvements and performance of data management and reporting system to produce quality data
it is a project management tool that illustrates how a project is expected to progress at a high level
Implementation Plan
Steps in Developing an Implementation Plan
- Define Goals/Objectives
  - you address the question: “what do you want to accomplish?”
  - it should be SMART (specific, measurable, attainable, realistic, time-bound)
- Schedule Milestones
  - outline deadlines and timelines in the implementation phase
  - ex: Gantt chart
- Allocate Resources
  - determine whether resources are sufficient and decide how to procure the missing resources
- Designate team members' responsibilities
  - create a general team plan with the overall roles that each team member will play
- Define metrics for success
  - how will you determine if you have achieved your goal or not?
it analyzes information and identifies incomplete or incorrect data
Data Quality Tool
it follows the complete profiling of data concerns and can range anywhere from removing abnormalities to merging repeated information
Data Cleansing
the decomposition of fields into component parts and the formatting of values into consistent layouts based on industry standards and patterns and user-defined business rules
Parsing and Standardization
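As an illustration of parsing and standardization, the sketch below splits a name field into components and rewrites dates into one layout (ISO 8601). The accepted input formats, the "Last, First" name convention, and the function names are assumptions made for this example, not rules from any particular tool.

```python
from datetime import datetime

# Assumed input layouts; a real rule set would come from the organization.
ASSUMED_DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")

def standardize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no assumed format matches."""
    text = raw.strip()
    for fmt in ASSUMED_DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None

def parse_name(raw):
    """Split a 'Last, First' field into title-cased components."""
    last, _, first = raw.partition(",")
    return {
        "last_name": " ".join(last.split()).title(),
        "first_name": " ".join(first.split()).title(),
    }

iso_date = standardize_date("03/15/2021")
name_parts = parse_name("dela cruz,  JUAN")
```

Values that match none of the assumed formats come back as None, so they can be flagged for review instead of being silently guessed.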
defines these data quality tools as being used to address problems in data quality:
Gartner (2017)
the modification of data values to meet domain restrictions, constraints on integrity, or other rules that define data quality as sufficient for the organization
Generalized “Cleansing”
the identification and merging of related entries within or across datasets
Matching
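A minimal sketch of matching, assuming two entries are "related" when they share a normalized name and a birth date. The field names and the merge policy (keep the first non-empty value seen for each field) are hypothetical choices for illustration; real tools often use fuzzier comparisons.

```python
def match_key(record):
    """Build a comparison key: lowercased, whitespace-collapsed name + birth date."""
    name = " ".join(record["name"].lower().split())
    return (name, record["birth_date"])

def merge_matches(records):
    """Merge records that share a key, keeping the first non-empty value per field."""
    merged = {}
    for record in records:
        target = merged.setdefault(match_key(record), {})
        for field, value in record.items():
            if value not in (None, "") and field not in target:
                target[field] = value
    return list(merged.values())

patients = [
    {"name": "Ana  Reyes", "birth_date": "1990-01-01", "phone": ""},
    {"name": "ana reyes", "birth_date": "1990-01-01", "phone": "555-0100"},
    {"name": "Ben Cruz", "birth_date": "1985-06-30", "phone": "555-0199"},
]
deduplicated = merge_matches(patients)
```

Here the two Ana Reyes entries collapse into one record that carries the phone number from the second entry.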
the analysis of data that captures the statistics or metadata to determine the quality of data and identify data quality issues
Profiling
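Profiling can be sketched as collecting simple statistics per field. The example below is an assumption about what such statistics might include (row count, missing values, distinct values, completeness ratio); the function name and sample rows are made up for illustration.

```python
def profile(rows):
    """Return per-column statistics for a list of dict records."""
    columns = sorted({col for row in rows for col in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing (an assumed convention).
        present = [v for v in values if v not in (None, "")]
        report[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "completeness": len(present) / len(values) if values else 0.0,
        }
    return report

sample_rows = [
    {"id": 1, "sex": "F"},
    {"id": 2, "sex": ""},
    {"id": 2, "sex": "M"},
    {"id": 4},
]
stats = profile(sample_rows)
```

In this sample, the report shows a repeated id (three distinct values in four rows) and a sex field that is only half complete, both of which are data quality issues worth investigating.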
the deployment of controls to ensure conformity of data to business rules set by the organization
Monitoring
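One way to picture monitoring is as a set of named business rules that every record is checked against. The rules, field names, and limits below are hypothetical examples, not rules prescribed by any organization or tool.

```python
# Hypothetical business rules: each maps a rule name to a pass/fail check.
ASSUMED_RULES = {
    "age_in_range": lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
    "sex_coded": lambda r: r.get("sex") in {"F", "M"},
}

def find_violations(records, rules=ASSUMED_RULES):
    """Return (record index, rule name) for every failed rule."""
    violations = []
    for index, record in enumerate(records):
        for name, rule in rules.items():
            if not rule(record):
                violations.append((index, name))
    return violations

incoming = [
    {"age": 34, "sex": "F"},
    {"age": 150, "sex": "M"},
    {"age": 20, "sex": "x"},
]
problems = find_violations(incoming)
```

Running the check on each incoming batch gives a list of nonconforming records that can be corrected or sent back to the source.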
is enhancing the value of the data by using related attributes from external sources such as consumer demographic attributes or geographic descriptors
Enrichment
it is a problem-solving method that identifies the root cause of problems or events instead of simply addressing the obvious symptoms
Root Cause Analysis
Root Cause Analysis Techniques
● Five Whys Analysis
● Failure Mode and Effects Analysis (FMEA)
● Pareto Analysis
● Fault Tree Analysis
● Current Reality Tree (CRT)
● Fishbone or Ishikawa or Cause-and-Effect Diagrams
● Kepner-Tregoe Technique
● RPR Problem Diagnosis
aims to find the various modes of failure within a system. It is used when there is a new product or process, or when a problem is reported through customer feedback
Failure Mode and Effects Analysis
uses the Pareto Principle (20% of work produces 80% of result)
Pareto Analysis
used when there are multiple potential causes to a problem
Pareto Analysis
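A Pareto analysis of several potential causes can be sketched by ranking causes by frequency and keeping those that together account for about 80% of occurrences. The cause names and counts below are invented for illustration, and the 0.8 cutoff is the conventional figure rather than a fixed rule.

```python
def vital_few(cause_counts, threshold=0.8):
    """Return the top causes whose cumulative share first reaches the threshold."""
    total = sum(cause_counts.values())
    ranked = sorted(cause_counts.items(), key=lambda item: item[1], reverse=True)
    selected, running = [], 0
    for cause, count in ranked:
        # Stop once the causes already selected cover the threshold share.
        if total == 0 or running / total >= threshold:
            break
        selected.append(cause)
        running += count
    return selected

# Hypothetical error tallies from a data quality review.
error_counts = {
    "missing fields": 50,
    "typos": 30,
    "late reports": 10,
    "duplicates": 6,
    "other": 4,
}
top_causes = vital_few(error_counts)
```

In this made-up tally, two of the five causes (missing fields and typos) account for 80 of the 100 errors, so they are the ones to address first.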
used in risk and safety analysis
Fault Tree Analysis
- uses boolean logic to determine the root causes of an undesirable event
Fault Tree Analysis
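The boolean logic in a fault tree can be sketched with AND and OR gates over basic events. The tree below (an incorrect value reaching a report) and its event names are hypothetical; a real fault tree would come from the system being analyzed.

```python
def evaluate(node, events):
    """Return True if the undesirable event at this node occurs.

    A node is either a basic event name (str) or a (gate, children) pair
    where gate is "AND" or "OR".
    """
    if isinstance(node, str):
        return events.get(node, False)
    gate, children = node
    results = [evaluate(child, events) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate: {gate}")

# Hypothetical top event: "incorrect value appears in the report".
ASSUMED_TREE = ("OR", [
    "form_not_submitted",
    ("AND", ["entry_typo", "no_validation_check"]),
])

typo_only = evaluate(ASSUMED_TREE, {"entry_typo": True})
typo_unchecked = evaluate(
    ASSUMED_TREE, {"entry_typo": True, "no_validation_check": True}
)
```

In this sketch a typo alone does not trigger the top event; it only does so when the validation check is also missing, which is the kind of combined cause an AND gate captures.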
used when the root causes of multiple problems need to be analyzed all at once
Current Reality Tree
also called the “Ishikawa” or “cause-and-effect” diagram
Fishbone Diagram
- categorizes the causes and sub-causes of a problem
- useful in grouping causes into categories
Fishbone Diagram
it breaks a problem down to its root cause by assessing a situation using priorities and orders of concern for specific issues
Kepner-Tregoe Technique
Rapid Problem Resolution (RPR) Diagnosis diagnoses problems by:
- Discover: data gathering and analysis of the findings
- Investigate: creation of a diagnostic plan and identification of the root cause through careful analysis of the diagnostic data
- Fix: fixing the problem and monitoring to confirm and validate that the correct root cause was identified
meaning of RPR Diagnosis
Rapid Problem Resolution (RPR) Diagnosis