8 - Data quality Flashcards
What is data quality?
A measure of how well data represents real-world phenomena for business purposes
The four dimensions of quality are accuracy, validity, accessibility, and timeliness.
What are the four dimensions of data quality?
- Accurate
- Valid
- Accessible
- Timely
Why is data quality important for businesses?
Improves effectiveness and trust in data, leading to better decision-making and opportunities
Poor data quality can result in significant financial losses, as illustrated by past business failures.
What can poor data quality lead to in a business?
- Missed opportunities
- Poor decision-making
- Increased complaints
- Regulatory issues
What was one major consequence for British Gas due to poor data quality?
The company wrote off £200 million in 2008 due to customer complaints and lost a million customers
Complaints primarily involved billing issues.
What is a potential GDPR compliance risk of poor data quality?
Holding multiple records for the same customer, so that a data deletion request is only partly fulfilled.
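One way to make this risk visible is to look for customer records that share a contact detail. The sketch below is illustrative only: it assumes each record is a dictionary with hypothetical `id` and `email` keys, and it treats a matching email address (ignoring case and surrounding spaces) as a sign of a possible duplicate. Real matching would normally use more than one field.

```python
from collections import defaultdict


def find_duplicate_customers(records):
    """Group record IDs that share the same normalised email address.

    `records` is assumed to be a list of dicts with hypothetical
    'id' and 'email' keys; this is not any particular system's schema.
    """
    by_email = defaultdict(list)
    for record in records:
        # Normalise so that 'Jo@Example.com ' and 'jo@example.com' match.
        key = (record.get("email") or "").strip().lower()
        if key:
            by_email[key].append(record["id"])
    # Keep only the email addresses that appear on more than one record.
    return {email: ids for email, ids in by_email.items() if len(ids) > 1}


customers = [
    {"id": 1, "email": "jo@example.com"},
    {"id": 2, "email": "Jo@Example.com "},
    {"id": 3, "email": "sam@example.com"},
]
duplicates = find_duplicate_customers(customers)
```

A deletion request for the customer behind record 1 would also need to cover record 2.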
What does Principle (d) of GDPR state?
You should ensure personal data held is not incorrect or misleading
This principle emphasizes the importance of data accuracy.
What is one example of a failure due to poor data quality?
Marketing targeting the wrong customers, leading to low response rates
This can result in wasted resources and missed revenue opportunities.
What is the first step to improve data quality in a project?
Convincing the business that improving data quality is important and useful.
What are some benefits of high-quality data?
- Creates efficiency
- Eliminates errors
- Improves decision-making
- Enhances security
- Provides quality reporting
- Facilitates linking and sharing
- Allows honest appraisal
- Meets legal obligations
- Measures performance
- Controls budgets
What does the availability of data mean in the context of data quality?
Data users need relevant data to make decisions, which should be accessible as soon as it becomes available.
What is meant by the timeliness of data?
Data should be captured and available quickly enough to support effective performance management.
How can accuracy of data be achieved?
By capturing data as close to the point of service delivery as possible.
What does ‘COUNT’ stand for in the context of data accuracy?
Collect Once, Use Numerous Times.
Fill in the blank: Poor-quality data means that a business will miss potential opportunities to _______.
[grow]
True or False: Poor data quality can lead to prosecution.
True
What is a common issue with small cohorts of customer data?
They may be unbalanced and not representative of the larger population.
What is a potential consequence of complex or irrelevant performance indicators?
They may be misunderstood or misreported.
What can a limited data quality audit provide?
A useful quick win that can lead to more strategic initiatives.
What was discovered about a marketing list that had never been investigated?
20 percent of the customers were deceased, leading to wasted marketing resources.
What is essential for ensuring data quality across different business units?
Recognizing that information requirements vary, but the need for good-quality data does not.
What is the relationship between data consistency and real-world processes?
Data that is consistent is more likely to reflect the real-world process that generates it, and so can be used with higher confidence when you make decisions.
What must be balanced against the importance of the uses of data?
The cost and effort of collecting it.
Why is it important for users of data to know about compromises in data accuracy?
So they don’t assume that accuracy is greater than it is.
What critical data item does a business need in order to know a customer's age?
Date of birth, so that age can be calculated accurately.
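Storing date of birth rather than age means the age can be recalculated on any date. A minimal sketch, in which the function name `age_on` is a hypothetical choice:

```python
from datetime import date


def age_on(date_of_birth: date, on: date) -> int:
    """Return the number of completed years between date_of_birth and `on`."""
    if on < date_of_birth:
        raise ValueError("reference date is before date of birth")
    # Take off one year if the birthday has not yet happened in the reference year.
    had_birthday = (on.month, on.day) >= (date_of_birth.month, date_of_birth.day)
    return on.year - date_of_birth.year - (0 if had_birthday else 1)


age_day_before = age_on(date(1990, 6, 15), date(2024, 6, 14))
age_on_birthday = age_on(date(1990, 6, 15), date(2024, 6, 15))
```

A stored age would go out of date on the customer's next birthday; a stored date of birth does not.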
What does validity of data refer to?
Recording and reporting data in ways that meet compliance requirements or match internal standards.
How can organizations ensure data validity?
Through data governance policies.
What practical consideration must be taken into account when capturing data?
The method of acquisition, such as the document or system used.
What is a data quality strategy?
A plan to improve data quality by creating systems that ensure checks, validation, and automation.
What should a data quality strategy account for?
Differing opinions about data quality, depending on geographical location or business unit.
What might sales teams prioritize in data capture?
Only the data needed to make the sale.
What can regulation, like GDPR, limit?
What data can be collected and stored.
Why is it important for the entire business to understand data quality?
To meet targets and improve overall business performance.
What is necessary before improving data quality?
Establishing a baseline of current data quality.
What critical data items should be focused on for customer data?
- Name
- Address
- Email address
- Telephone number
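A baseline for these critical items could start with something as simple as the share of records in which each field is filled in. The sketch below assumes records are dictionaries with hypothetical keys `name`, `address`, `email`, and `telephone`. It measures completeness only, which is one possible metric and says nothing about whether the values are correct.

```python
CRITICAL_FIELDS = ("name", "address", "email", "telephone")


def completeness_baseline(records):
    """Return, per critical field, the percentage of records with a non-blank value."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in CRITICAL_FIELDS}
    baseline = {}
    for field in CRITICAL_FIELDS:
        # Missing keys, None, and whitespace-only strings all count as blank.
        filled = sum(1 for r in records if str(r.get(field) or "").strip())
        baseline[field] = 100.0 * filled / total
    return baseline


sample = [
    {"name": "Ann Lee", "address": "1 High St", "email": "ann@example.com", "telephone": "02079460000"},
    {"name": "Bo Chan", "address": "", "email": "bo@example.com", "telephone": None},
    {"name": "Cy Dunn", "address": "3 Mill Rd", "email": " ", "telephone": "02079460001"},
    {"name": "Di Ford", "address": "4 Park Ave", "email": None},
]
baseline = completeness_baseline(sample)
```

Re-running the same measurement after an improvement project gives a like-for-like comparison against the baseline.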
What does a quality score of 3 indicate about data?
The data is of high quality and meets the needs of all users.
What does a quality score of 1 indicate about data?
The data is widely known to be inaccurate and not trusted for decision-making.
What is the role of a data quality leader?
To analyze data quality and make recommendations for improvements.
What does the data quality improvement team do?
Identifies challenges damaging data quality and analyzes statistics on data quality.
What is the purpose of a data quality audit?
To measure data quality using specific metrics.
What is a useful device for communicating data quality success?
A dashboard.
What traffic-light measure can be used for data fields?
- Red for less than 40% correct data
- Amber for 40% to 70%
- Green for greater than 70%
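The thresholds above translate directly into a small function that a dashboard could call for each data field. The function name `traffic_light` is hypothetical, and the sketch assumes the input is the percentage of correct values in a field.

```python
def traffic_light(percent_correct: float) -> str:
    """Map a field's percentage of correct data to red, amber, or green."""
    if not 0 <= percent_correct <= 100:
        raise ValueError("percent_correct must be between 0 and 100")
    if percent_correct < 40:
        return "red"
    if percent_correct <= 70:
        # 40% and 70% themselves both fall in the amber band.
        return "amber"
    return "green"


statuses = {field: traffic_light(pct) for field, pct in
            {"name": 92.0, "address": 70.0, "email": 38.5}.items()}
```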
What should targets for data quality improvement be?
Stretch but achievable.
What is the purpose of external data in customer communication?
To determine whether customers are unresponsive because they have changed address, are choosing not to reply, or have passed away
External data helps clarify customer engagement issues.
What is data cleansing?
A quick fix to remove obvious errors, enhancing data quality
It involves correcting typing or cut-and-paste errors.
What can be captured from customers to improve data quality?
Email addresses, communication preferences, and relevant details
Contacting customers can provide valuable data insights.
What is the key task for improving data quality in the long term?
Address the processes that create poor-quality data
Continuous improvement is needed for data creation processes.
What are common reasons employees may not capture high-quality data?
Lack of training and understanding of data usage
Employees may view data capture as an inconvenience.
What is often not formalized as part of job descriptions regarding data?
Ensuring data quality
There is typically no recognition for data accuracy efforts.
What is a consequence of poor communication about data quality?
Continued capture of poor-quality data
Employees may not understand the importance of accurate data.
What is essential for machine learning and AI success?
High-quality data for training
Quality of data significantly impacts machine learning outcomes.
What are simple fixes to improve data entry accuracy?
- More data validation
- Address lookups using postcodes
Encouraging customers to enter their own data can also enhance accuracy.
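As an illustration of validation at the point of entry, the sketch below checks a record before it is saved and returns a list of problems. The field names, the deliberately loose email pattern, and the 7 to 15 digit rule for telephone numbers are all assumptions made for the example, not rules from the source. A postcode address lookup would need an external address service and is not shown.

```python
import re

# Deliberately loose: one "@", no spaces, and a dot somewhere in the domain part.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_entry(entry):
    """Return a list of human-readable problems found in a data-entry record."""
    problems = []
    if not (entry.get("name") or "").strip():
        problems.append("name is missing")
    if not EMAIL_PATTERN.match((entry.get("email") or "").strip()):
        problems.append("email is not in a recognisable format")
    # Strip common separators before counting digits.
    digits = re.sub(r"[\s\-()+]", "", entry.get("telephone") or "")
    if not (digits.isdigit() and 7 <= len(digits) <= 15):
        problems.append("telephone number is not 7 to 15 digits")
    return problems


good = validate_entry({"name": "Ann Lee", "email": "ann@example.com",
                       "telephone": "+44 20 7946 0000"})
bad = validate_entry({"name": " ", "email": "ann@example", "telephone": "12ab"})
```

Rejecting or flagging an entry while the customer or employee is still present is usually cheaper than cleansing the record later.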
What are the four dimensions of data quality?
- Accuracy
- Validity
- Timeliness
- Accessibility
Measuring these dimensions is crucial for assessing data quality.
What are the costs associated with poor-quality data?
Underperformance and elevated risk
The impact of poor-quality data may not be immediately apparent.
What can be a more effective approach to gain budget for data quality projects?
Selling the positive benefits of high-quality data
Demonstrating achievable goals with quality data can attract attention.
Fill in the blank: Data quality improvement needs its own _______.
[project]
Governance improvements alone are insufficient.
What significant example of a data error is mentioned?
Deutsche Bank accidentally transferred $35 billion to an outside account in 2018
This highlights the risks of data errors.
What should organizations aim for regarding data acquisition processes?
Higher-quality acquisition processes for long-term improvement
These processes should be sustainable and beneficial.