Data Quality Flashcards
Signifies the data’s appropriateness to serve its purpose in a given context
Data Quality
Is the overall utility of a dataset (or datasets) as a function of its ability to be easily processed and analyzed for a database, data warehouse, or data analytics system
Data Quality
Used in the areas of:
• Customer relationship management (CRM)
• Data integration
• Regulation requirements
Data Quality
Generates costs and affects customer satisfaction, company reputation, and even the strategic decisions of management
Poor data quality
A tool that allows the use of small random samples (19 sxs) to distinguish between different groups of data elements (or lots) with high and low data quality
Has been widely applied in the health care industry for decades and has been used for quality assurance of products
Lot Quality Assurance Sampling (LQAS)
Smallest sample size that can be used and still be statistically accurate. Samples larger than ____ are more expensive, while samples smaller than ____ are not accurate.
19 sxs
Is adopted in the context of District Health Information System (DHIS) data quality assurance (DQA)
Lot Quality Assurance Sampling (LQAS)
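The sampling step described on the LQAS cards can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not an official LQAS procedure: the names `classify_lot`, `is_correct`, and `decision_rule` are hypothetical, and the threshold value would in practice come from an LQAS decision table for the chosen quality target.

```python
import random

SAMPLE_SIZE = 19  # sample size named on the cards


def classify_lot(records, is_correct, decision_rule, seed=None):
    """Draw SAMPLE_SIZE records at random from one lot and classify the lot.

    is_correct    -- hypothetical check applied to each sampled record
    decision_rule -- assumed minimum number of correct records for the lot
                     to be classified as having acceptable data quality
    Returns (number of correct records in the sample, lot is acceptable).
    """
    if len(records) < SAMPLE_SIZE:
        raise ValueError("lot is smaller than the sample size")
    rng = random.Random(seed)
    sample = rng.sample(records, SAMPLE_SIZE)
    passed = sum(1 for record in sample if is_correct(record))
    return passed, passed >= decision_rule


# Illustrative lot only: 100 report records, each flagged as matching
# the source register or not. The threshold of 13 is an assumed value.
lot = [{"id": i, "matches_source": i % 10 != 0} for i in range(100)]
passed, acceptable = classify_lot(
    lot, lambda r: r["matches_source"], decision_rule=13, seed=1
)
```

Passing a fixed `seed` only makes the draw repeatable for checking; a real assessment would draw a fresh random sample for each lot.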
Formula for report timeliness rate
= (# of on-time reports / total # of reports for that section) x 100
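The timeliness formula can be written as a small function. This is a sketch; the function and parameter names are illustrative and the report counts in the usage line are made up.

```python
def timeliness_rate(on_time_reports, total_reports):
    """Return the report timeliness rate as a percentage."""
    if total_reports <= 0:
        raise ValueError("total_reports must be positive")
    if not 0 <= on_time_reports <= total_reports:
        raise ValueError("on_time_reports must be between 0 and total_reports")
    return on_time_reports / total_reports * 100


# Made-up example: 45 of 60 expected reports arrived on time.
rate = timeliness_rate(45, 60)
```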
Level of acceptable error =
70% +/-10% (60 - 80%)
• It is a simplified version of the Data Quality Audit (DQA) tool which allows programs and projects to verify and assess the quality of reported data
Routine Data Quality Assessment (RDQA)
Rapidly verify the quality of reported data
Implement corrective measures with action plans for strengthening data management and reporting system and improving data quality
RDQA
Example: Dengue Prevention and Control Program in Mindanao
• External auditors are important for checking for flaws in the system; their visits can be more frequent, more organized, and less resource-intensive, which ultimately benefits the institution
RDQA
• A project management tool that illustrates how a project is expected to progress at a high level
• Important in ensuring the efficient flow of communication between those involved in the project
• Minimizes issues that would delay delivery of the project
Development Implementation Plan
Tools that are crucial in maintaining accuracy and relevancy in health information
Data Quality Tools
• Analyzes information and identifies incomplete or incorrect data
Data Quality Tools
Removal of abnormalities in the data or of repeated information
Data cleansing
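A minimal sketch of data cleansing, assuming records are plain dictionaries. The `age` field and its valid range are hypothetical stand-ins for whatever counts as an abnormality in a given dataset.

```python
def cleanse(records, min_age=0, max_age=120):
    """Drop exact duplicate records and records with an abnormal age.

    The 'age' field and the [min_age, max_age] range are assumptions
    chosen for illustration.
    """
    seen = set()
    cleaned = []
    for record in records:
        key = tuple(sorted(record.items()))
        if key in seen:
            continue  # repeated information
        seen.add(key)
        age = record.get("age")
        if age is None or not (min_age <= age <= max_age):
            continue  # abnormal or missing value
        cleaned.append(record)
    return cleaned


# Made-up records: one duplicate and one impossible age.
raw = [
    {"id": 1, "age": 34},
    {"id": 1, "age": 34},
    {"id": 2, "age": -5},
    {"id": 3, "age": 61},
]
clean = cleanse(raw)
```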
By maintaining ________, the process enhances the reliability of the information used by an organization
data integrity
• Decomposition of fields into component parts and formatting the values into consistent layouts based on industry standards and patterns and user-defined business rules
Parsing and standardization
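One way to picture parsing and standardization: a name field is decomposed into parts and a date is rewritten in one consistent layout. This is only a sketch; the "Last, First" input convention and the list of accepted date layouts are assumptions, not rules from the cards.

```python
from datetime import datetime

# Assumed input layouts; a real rule set would come from the organization.
DATE_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%d.%m.%Y")


def parse_name(raw):
    """Split 'Last, First' into components and normalize spacing and case."""
    last, _, first = raw.partition(",")
    return {
        "last": " ".join(last.split()).title(),
        "first": " ".join(first.split()).title(),
    }


def standardize_date(raw):
    """Return the date in ISO 8601 form (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next assumed layout
    raise ValueError(f"unrecognized date layout: {raw!r}")
```

For instance, `parse_name("DELA CRUZ,  juan ")` yields separate, consistently cased `last` and `first` components.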
Is the modification of data values to meet domain restrictions, constraints on integrity, or other rules that define data quality as sufficient for the organization
Generalized cleansing
Identification and merging of related entries within or across data sets
Matching
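The identification half of matching can be sketched with Python's standard `difflib`. The similarity threshold of 0.8 and the normalization step are assumptions for illustration; merging the matched entries is left out.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(name):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(name.lower().split())


def find_matches(names, threshold=0.8):
    """Return pairs of entries whose normalized similarity meets the threshold.

    The threshold is an assumed cut-off, not a standard value.
    """
    pairs = []
    for a, b in combinations(names, 2):
        score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
        if score >= threshold:
            pairs.append((a, b))
    return pairs


# Made-up entries: the first two differ only in case and spacing.
entries = ["Juan Dela Cruz", "juan  dela cruz", "Maria Santos"]
related = find_matches(entries)
```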
Refers to the analysis of data to capture statistics or metadata to determine the quality of the data and identify data quality issues
Profiling
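A minimal profiling sketch for a single field, assuming records are dictionaries. The statistics chosen here (count, missing, distinct, min, max) are illustrative examples of what a profile might capture.

```python
def profile(records, field):
    """Collect simple statistics about one field across all records."""
    values = [record.get(field) for record in records]
    present = [v for v in values if v not in (None, "")]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


# Made-up records with two missing ages.
rows = [{"age": 34}, {"age": None}, {"age": 61}, {"age": 34}, {}]
age_profile = profile(rows, "age")
```

A high `missing` count or an implausible `min` or `max` is the kind of issue this step is meant to surface.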
Refers to the deployment of controls to ensure conformity of data to business rules set by the organization
Monitoring
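Monitoring can be pictured as running each record through a set of business rules and reporting the violations. The two rules below are hypothetical examples, not rules taken from the cards.

```python
# Hypothetical business rules: rule name -> check that returns True when met.
RULES = {
    "age is between 0 and 120": lambda r: isinstance(r.get("age"), int)
    and 0 <= r["age"] <= 120,
    "sex is coded M or F": lambda r: r.get("sex") in {"M", "F"},
}


def check_rules(records, rules=RULES):
    """Return (record index, rule name) for every rule a record violates."""
    violations = []
    for index, record in enumerate(records):
        for name, rule in rules.items():
            if not rule(record):
                violations.append((index, name))
    return violations


# Made-up records: the second and third each break one rule.
incoming = [
    {"age": 34, "sex": "F"},
    {"age": 150, "sex": "M"},
    {"age": 20, "sex": "x"},
]
problems = check_rules(incoming)
```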
Enhancement of the value of the data by using related attributes from external sources such as consumer demographic attributes or geographic descriptors
Enrichment
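Enrichment amounts to joining records with related attributes from another source. In this sketch the external source is a plain dictionary keyed by a hypothetical `postal_code` field, and all values are placeholders.

```python
def enrich(records, lookup, key="postal_code"):
    """Add attributes from an external lookup to each record.

    Original values win if the lookup contains a field with the same name.
    Records with no entry in the lookup are returned unchanged.
    """
    enriched = []
    for record in records:
        extra = lookup.get(record.get(key), {})
        enriched.append({**extra, **record})
    return enriched


# Placeholder external source of geographic descriptors.
geo_lookup = {"A1": {"region": "North"}, "B2": {"region": "South"}}
patients = [{"id": 1, "postal_code": "A1"}, {"id": 2, "postal_code": "Z9"}]
result = enrich(patients, geo_lookup)
```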
Problem-solving method that identifies the “root cause” of problems or events instead of simply addressing the obvious symptoms.
Root Cause Analysis
• Aims to find various modes of failure within a system and addresses the following questions for execution:
• What is the mode in which an observed failure occurs?
• How many times does a cause of failure occur?
• What actions are implemented to prevent this cause from occurring again?
• Are these actions effective and efficient?
Failure Mode and Effects Analysis (FMEA)
• Uses the Pareto principle (80/20 rule) which states that roughly 80% of the effects come from 20% of the causes
Pareto Analysis
• Used when there are multiple potential causes to a problem
• Can be created using Excel software
• Lays down potential causes in a bar graph
• Tracks the cumulative percentage in a line graph that rises to the top of the chart
Pareto Analysis
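The numbers behind a Pareto chart (bars sorted by frequency plus a cumulative-percentage line) can be computed as below. This is a sketch with made-up cause counts; the 80% cut-off follows the 80/20 rule from the cards, and treating the causes needed to reach that cut-off as the "vital few" is an assumption about how to read the chart.

```python
def pareto_table(cause_counts, cutoff=80.0):
    """Sort causes by frequency and track the cumulative percentage.

    Returns (rows, vital_few), where each row is
    (cause, count, cumulative percentage) and vital_few lists the causes
    needed to reach the cutoff.
    """
    total = sum(cause_counts.values())
    if total == 0:
        raise ValueError("no occurrences to analyze")
    rows = []
    vital_few = []
    running = 0
    ordered = sorted(cause_counts.items(), key=lambda item: item[1], reverse=True)
    for cause, count in ordered:
        reached_before = running * 100 / total
        running += count
        rows.append((cause, count, round(running * 100 / total, 1)))
        if reached_before < cutoff:
            vital_few.append(cause)
    return rows, vital_few


# Made-up counts of data quality problems by cause.
counts = {
    "missing fields": 50,
    "late submission": 30,
    "typos": 10,
    "duplicates": 6,
    "other": 4,
}
rows, vital_few = pareto_table(counts)
```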
Is a form of algebra centered on three simple words known as Boolean operators: “Or,” “And,” and “Not”.
Boolean Logic
• Uses Boolean logic to determine the root causes of an undesirable event
Fault Tree Analysis
• Idea that all values are either True or False
• Used in risk and safety analysis
• A single undesirable event is listed at the top of the tree
• All potential causes are listed down to form the shape of an upside down tree
Fault Tree Analysis
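A fault tree can be evaluated with plain Boolean logic. In this sketch a node is either the name of a basic event or a `(gate, children)` pair; that representation and the example tree are hypothetical.

```python
def evaluate(node, basic_events):
    """Return True if the event described by node occurs.

    node is either a basic event name (str) or a tuple (gate, children)
    where gate is "AND", "OR", or "NOT".
    """
    if isinstance(node, str):
        return basic_events[node]
    gate, children = node
    results = [evaluate(child, basic_events) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    if gate == "NOT":
        return not results[0]
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical tree: the top event (a late report) occurs if the form is
# missing OR (the encoder is absent AND no backup encoder is available).
TOP_EVENT = (
    "OR",
    ["form_missing", ("AND", ["encoder_absent", ("NOT", ["backup_available"])])],
)

occurs = evaluate(
    TOP_EVENT,
    {"form_missing": False, "encoder_absent": True, "backup_available": False},
)
```

The top event sits at the root and the basic events at the leaves, which mirrors the upside-down tree shape described on the cards.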
Used when the root causes of multiple problems need to be analyzed all at once
Problems (Undesirable Effects or UDEs) are listed first, followed by the potential causes of each problem
By doing so, a cause common to all problems will appear
Current Reality Tree (CRT)
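The idea that a common cause emerges once every problem's potential causes are listed can be sketched as a set intersection. This simplifies the CRT to a flat mapping; the problems and causes below are made up.

```python
def common_causes(problem_causes):
    """Return the causes shared by every listed problem (UDE)."""
    cause_sets = [set(causes) for causes in problem_causes.values()]
    if not cause_sets:
        return set()
    return set.intersection(*cause_sets)


# Made-up undesirable effects and their potential causes.
udes = {
    "late reports": ["no training", "staff shortage"],
    "incomplete forms": ["no training", "unclear form layout"],
    "duplicate entries": ["no training", "no unique patient ID"],
}
shared = common_causes(udes)
```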
• Also called the Ishikawa or cause-and-effect diagram
• Shows the categorized causes and sub-causes of a problem
• Useful in grouping causes (e.g. people, measurements, methods, materials, environment, machines) into categories
Fishbone Diagram
• Breaks a problem down to its root cause by assessing a situation using priorities and orders of concern for specific issues
• Various decisions to address the problem are outlined to ensure that the recommended actions are sustainable
Kepner-Tregoe Technique
• Diagnoses the causes of recurrent problems by following the 3 phases:
a. ________ - data gathering and analysis of findings
b. ________ - creation of diagnostic plan and identification of the root cause through careful analysis
c. ________ - fixing the problem and monitoring to confirm and validate if the root cause was correctly identified
Rapid Problem Resolution (RPR Problem Diagnosis)
Discover
Investigate
Fix