Chapter 10 Flashcards
What is data governance?
- High-level organizational groups and processes overseeing data stewardship across the organization
What is a data steward?
A person responsible for ensuring that organizational applications properly support the organization’s data quality goals
What are the requirements for data governance to be successful?
- Sponsorship from both senior management and business units
- A data steward manager to support, train, and coordinate data stewards
- Data stewards for different business units, subjects, and/or source systems
- A governance committee to provide data management guidelines and standards
Why is data quality important?
If the data are bad, the business fails. Period.
- GIGO - Garbage in, garbage out
- Sarbanes-Oxley (SOX) compliance: the law sets data and metadata quality standards
What is the purpose of data quality?
- Minimize IT project risk
- Make timely business decisions
- Ensure regulatory compliance
- Expand customer base
What are the characteristics of quality data?
- Uniqueness
- Accuracy
- Consistency
- Completeness
- Timeliness
- Currency
- Conformance
- Referential integrity
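Example: a minimal sketch of how three of these characteristics (uniqueness, completeness, referential integrity) might be checked. The sample rows, field names, and helper functions are hypothetical and serve only as illustration.

```python
# Hypothetical sample data: customer rows and the orders that reference them.
customers = [
    {"id": 1, "email": "ann@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": "bob@example.com"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},
]


def duplicate_keys(rows, key):
    """Uniqueness: return the key values that occur more than once."""
    seen, dupes = set(), set()
    for row in rows:
        value = row[key]
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return dupes


def missing_count(rows, field):
    """Completeness: count rows where the field is absent or None."""
    return sum(1 for row in rows if row.get(field) is None)


def orphaned(child_rows, fk, parent_rows, pk):
    """Referential integrity: child rows whose foreign key has no parent."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows if row[fk] not in parent_keys]


dupes = duplicate_keys(customers, "id")
missing = missing_count(customers, "email")
orphans = orphaned(orders, "customer_id", customers, "id")
```

Here the customer id 2 appears twice, one email is missing, and order 11 points to a customer that does not exist.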
What are some causes of poor data quality?
- External data sources (Lack of control over data quality)
- Redundant data storage and inconsistent metadata (Proliferation of databases with uncontrolled redundancy and metadata)
- Data entry (Poor data capture controls)
- Lack of organizational commitment (Not recognizing poor data quality as an organizational issue)
What are some steps that can be taken to improve data quality?
- Get business buy-in
- Perform data quality audit
- Establish data stewardship program
- Improve data capture processes
- Apply modern data management principles and technology
- Apply total quality management (TQM) practices
How can you create business buy-in?
- Executive sponsorship
- Building a business case
- Prove a return on investment (ROI)
- Avoidance of cost
- Avoidance of opportunity loss
What do you do in a data quality audit?
- Statistically profile all data files
- Document the set of values for all fields
- Analyze data patterns (distribution, outliers, frequencies)
- Verify whether controls and business rules are enforced
- Use specialized data profiling tools
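Example: a rough sketch of what profiling a single field could look like, using only the Python standard library. The function names, the sample values, and the outlier threshold of 2 standard deviations are assumptions chosen for illustration; dedicated profiling tools do far more.

```python
from collections import Counter
from statistics import mean, stdev


def profile(values):
    """Summarize one field: row count, missing values, value set, frequencies."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": sorted(set(present)),
        "frequencies": Counter(present),
    }


def outliers(numbers, threshold=2.0):
    """Return values more than `threshold` sample standard deviations from the mean."""
    if len(numbers) < 2:
        return []
    mu, sigma = mean(numbers), stdev(numbers)
    if sigma == 0:
        return []
    return [x for x in numbers if abs(x - mu) / sigma > threshold]


# Hypothetical field values.
status_profile = profile(["A", "B", None, "A"])
suspect_ages = outliers([34, 29, 41, 38, 31, 36, 33, 290])
```

The documented set of values (`distinct`) can then be compared against the values the business rules allow.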
What are the roles of a data steward?
- Oversight of data stewardship program
- Manage data subject area
- Oversee data definitions
- Oversee production of data
- Oversee use of data
How can you improve data capture processes?
- Automate data entry as much as possible
- Have manually entered data selected from preset options
- Use trained operators when possible
- Follow good user interface design principles
- Validate entered data immediately
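Example: a small sketch of validating an entry at capture time, combining preset options with an immediate format check. The field names, the allowed state codes, and the ZIP format are hypothetical.

```python
import re

# Hypothetical preset options offered to the operator.
ALLOWED_STATES = {"CA", "NY", "TX"}
# Assumed format: five digits, optionally followed by a four-digit extension.
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def validate_entry(entry):
    """Return a list of problems; an empty list means the entry is accepted."""
    errors = []
    if entry.get("state") not in ALLOWED_STATES:
        errors.append("state must be one of the preset options")
    if not ZIP_PATTERN.fullmatch(str(entry.get("zip") or "")):
        errors.append("zip must look like 12345 or 12345-6789")
    return errors


good = validate_entry({"state": "CA", "zip": "94105"})
bad = validate_entry({"state": "ZZ", "zip": "9410"})
```

Rejecting the entry before it is stored is what makes the validation "immediate": the operator can fix it while the source document is still at hand.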
What are some software tools for analyzing and correcting data quality problems?
- Pattern matching
- Fuzzy logic
- Expert systems
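Example: a sketch of two of these techniques. Pattern matching is shown with a regular expression that normalizes phone numbers. For the fuzzy side, approximate string matching from Python's `difflib` is used as a loose stand-in; it is not a fuzzy-logic engine. The phone format, the names, and the 0.85 threshold are assumptions.

```python
import re
from difflib import SequenceMatcher

# Assumed format: ten digits with optional parentheses, spaces, dots, or hyphens.
PHONE = re.compile(r"\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})")


def normalize_phone(raw):
    """Pattern matching: rewrite a phone number as 555-123-4567, or return None."""
    match = PHONE.fullmatch(raw.strip())
    return "-".join(match.groups()) if match else None


def similarity(a, b):
    """Approximate matching: ratio in [0, 1], where 1 means identical."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def likely_duplicates(names, threshold=0.85):
    """Return pairs of names similar enough to be reviewed as possible duplicates."""
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if similarity(a, b) >= threshold:
                pairs.append((a, b))
    return pairs


phone = normalize_phone("(555) 123-4567")
pairs = likely_duplicates(["Jon Smith", "John Smith", "Mary Jones"])
```

Flagged pairs are candidates for review, not confirmed duplicates; the threshold trades missed duplicates against false alarms.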
Besides software tools, what other modern tools can be applied to data management?
- Sound data modeling and database design
What does TQM stand for?
Total Quality Management
What are the TQM Principles?
- Defect prevention
- Continuous improvement
- Use of enterprise data standards
What are the components of a balanced focus?
- Customer
- Product/Service
- Strong foundation of measurement
What is master data management (MDM)?
Disciplines, technologies, and methods to ensure the currency, meaning, and quality of reference data within and across various subject areas
What are the three main architectures of MDM?
- Identity registry
- Integration hub
- Persistent
What is an identity registry in MDM?
Master data remains in source systems; registry provides applications with location
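Example: a toy sketch of the registry idea. The registry stores only where each master record lives, never the record itself. The class, method names, and system names are hypothetical.

```python
class IdentityRegistry:
    """Maps a master identifier to the source systems that hold the record."""

    def __init__(self):
        # master id -> list of (source system, local key) pairs
        self._locations = {}

    def register(self, master_id, system, local_key):
        self._locations.setdefault(master_id, []).append((system, local_key))

    def locate(self, master_id):
        """Tell an application where to find the data; return [] if unknown."""
        return list(self._locations.get(master_id, []))


registry = IdentityRegistry()
registry.register("CUST-001", "crm", 17)
registry.register("CUST-001", "billing", "B-9042")
locations = registry.locate("CUST-001")
```

An application would then query each listed source system itself to assemble the full record.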
What is an integration hub in MDM?
Data changes broadcast through central service to subscribing databases
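Example: a toy sketch of the hub idea as a publish/subscribe service. Plain dictionaries stand in for the subscribing databases; the class and function names are hypothetical.

```python
class IntegrationHub:
    """Central service that broadcasts each data change to every subscriber."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, change):
        for callback in self._subscribers:
            callback(change)


# Two dictionaries stand in for subscribing databases.
crm_db, billing_db = {}, {}


def make_writer(database):
    """Build a subscriber that applies a change to one database."""
    def apply_change(change):
        database[change["id"]] = dict(change["data"])
    return apply_change


hub = IntegrationHub()
hub.subscribe(make_writer(crm_db))
hub.subscribe(make_writer(billing_db))
hub.publish({"id": 42, "data": {"name": "Ann Lee"}})
```

After the single `publish` call, both stand-in databases hold the same customer data.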
What is the persistent approach in MDM?
Central “golden record” maintained; all applications have access. Requires applications to push data. Prone to data duplication.
What does data integration do?
Creates a unified view of business data
Other possibilities:
- Application integration
- Business process integration
- User interaction integration
In data integration, what does any approach require?
Change data capture (CDC)
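Example: a simplified sketch of change data capture done by comparing two snapshots of a table keyed by primary key. This is only one possible approach, chosen here for brevity; the function name and the sample rows are hypothetical.

```python
def capture_changes(old, new):
    """Compare two snapshots (dicts keyed by primary key) and list the changes."""
    changes = []
    for key, row in new.items():
        if key not in old:
            changes.append(("insert", key, row))
        elif old[key] != row:
            changes.append(("update", key, row))
    for key, row in old.items():
        if key not in new:
            changes.append(("delete", key, row))
    return changes


# Hypothetical snapshots taken before and after a day of activity.
yesterday = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
today = {1: {"name": "Ann Lee"}, 3: {"name": "Cy"}}
changes = capture_changes(yesterday, today)
```

Only the captured changes, not the whole table, need to be passed on to the integration step.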