M6 U3 - Data Management - Q2 Flashcards
What phases are involved in data understanding? (2)
- data acquisition (aka data gathering)
- data preparation
What is data acquisition?
Also known as data gathering, it involves gathering data from different sources and transforming the data into formats that are suitable for analytic solution development.
What happens when the requirements phase is completed?
The data science team begins data acquisition (data gathering).
What’s data wrangling? (3 actions, 3 results)
The process of gathering, selecting, and transforming data to ensure that it is usable, free of noise and has as little bias as possible to meet defined analytic objectives.
What steps are involved in data wrangling? (3)
- Checking for missing values
- Identifying outliers
- Formatting the data.
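The three steps above can be sketched in plain Python. This is only an illustrative sketch: the `records` data, the field names, and the helper functions are hypothetical, and the 1.5 × IQR rule is one common rule of thumb for outliers, not something the course material prescribes.

```python
from statistics import quantiles

# Hypothetical raw records; None marks a missing value.
records = [
    {"customer": " alice ", "spend": "120.50"},
    {"customer": "BOB", "spend": None},
    {"customer": "carol", "spend": "98.00"},
    {"customer": "dave", "spend": "110.25"},
    {"customer": "erin", "spend": "105.00"},
    {"customer": "frank", "spend": "5000.00"},
]

def find_missing(rows, field):
    """Step 1: return the indexes of rows whose field is missing."""
    return [i for i, row in enumerate(rows) if row.get(field) in (None, "")]

def format_rows(rows):
    """Step 3: tidy names and convert spend to float, dropping rows with no spend."""
    return [
        {"customer": row["customer"].strip().title(), "spend": float(row["spend"])}
        for row in rows
        if row.get("spend") not in (None, "")
    ]

def find_outliers(values):
    """Step 2: flag values more than 1.5 * IQR outside the quartiles (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

missing = find_missing(records, "spend")
clean = format_rows(records)
outliers = find_outliers([row["spend"] for row in clean])
```

Here formatting runs before the outlier check so that the check works on numbers rather than strings; in practice the order and the handling of missing rows depend on the analytic objectives.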
What is data management?
- It’s an organization’s way of ________________ (4) data.
- Makes sure that the data housed within an organization is ______________ (2)
- It’s an organization’s way of acquiring, storing, securing and processing data.
- Makes sure that the data housed within an organization is accessible and accurate
What group(s) are responsible for data management?
- Managed by the IT team in an organization.
- Business users will participate too
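Which organization defines the 11 knowledge areas of data management? List the areas.
- DAMA International, in its Data Management Body of Knowledge (DAMA-DMBOK)
- Areas: Data Governance; Data Architecture; Data Modeling & Design; Data Storage & Operations; Data Security; Data Integration & Interoperability; Documents & Content; Reference & Master Data; Data Warehousing & Business Intelligence; Metadata; Data Quality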
Who is involved in the data management process?
- Multiple departments
Who’s responsible for designing an organization’s data management framework?
Data architects
Data Governance
- Defines how data is accessed and managed within an organization.
- planning, oversight, and control over management of data and the use of data and data-related resources.
Data Architecture
the overall structure of data and data-related resources as an integral part of the enterprise architecture
Data Modeling & Design
analysis, design, building, testing, and maintenance (was Data Development in the DAMA-DMBOK 1st edition)
Data Storage & Operations
storage, deployment, and management of structured physical data assets (was Data Operations in the DAMA-DMBOK 1st edition)
Data Security
ensuring privacy, confidentiality and appropriate access
Data Integration & Interoperability
acquisition, extraction, transformation, movement, delivery, replication, federation, virtualization and operational support (a Knowledge Area new in DMBOK2)
Documents & Content
storing, protecting, indexing, and enabling access to data found in unstructured sources (electronic files and physical records), and making this data available for integration and interoperability with structured (database) data.
Describe Reference & Master Data (idea (1), how (2), and results (2))
Idea: Managing shared data
How:
- Standardizing data definitions
- Standardizing the use of data values
Results:
- Reduce redundancy
- Ensure better data quality
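A small sketch of how standardized data values can reduce redundancy. Everything here is hypothetical and for illustration only: the `COUNTRY_REFERENCE` table, the customer rows, and the helper names are assumptions, not part of DAMA-DMBOK.

```python
# Hypothetical reference data: agreed country codes and their known variants.
COUNTRY_REFERENCE = {
    "us": "US", "usa": "US", "united states": "US",
    "uk": "GB", "united kingdom": "GB", "gb": "GB",
}

def standardize_country(raw):
    """Map a free-text country value to its reference code, or None if unknown."""
    return COUNTRY_REFERENCE.get(raw.strip().lower())

def build_master_records(rows):
    """Standardize values, then keep one master record per (name, country) key."""
    master = {}
    for row in rows:
        country = standardize_country(row["country"])
        key = (row["name"].strip().lower(), country)
        # setdefault keeps the first record seen for each key, dropping duplicates.
        master.setdefault(key, {"name": row["name"].strip().title(), "country": country})
    return list(master.values())

# Hypothetical input: the same customer entered twice with different spellings.
customers = [
    {"name": "Acme Ltd", "country": "USA"},
    {"name": "acme ltd ", "country": "United States"},
    {"name": "Globex", "country": "uk"},
]
master = build_master_records(customers)
```

Once both spellings map to the same reference code, the two Acme rows collapse into a single master record, which is the redundancy reduction the card describes.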
Data Warehousing & Business Intelligence
managing analytical data processing and enabling access to decision support data for reporting and analysis.
Metadata
collecting, categorizing, maintaining, integrating, controlling, managing, and delivering metadata.
What does the Data Quality area involve? (3 actions on one item, 1 action on another)
- Defining, monitoring, and maintaining data integrity
- Improving data quality.
Your client has an established data management structure in place. This means that:
Your client regards their data as a resource that should be reliable, and should be kept secure.
Correct: Data management ensures that a company is mindful of the security, integrity, and overall quality of its data and data infrastructure.
When setting analytic objectives, it is good practice to define a data statement. This ensures that you have assessed a business’s data science readiness. Why is data gathering not conducted during the data science readiness assessment?
Data gathering is best conducted after all analytic requirements are gathered.
Correct: Although analytic objectives have been set prior to this stage, it is important to define the analytic and business requirements to ensure that the right questions are answered and the right data is gathered.
What does data governance impact?
- High level: Impacts the decisions that can be made from the data available to the organization.
- Has positive implications for the quality, security, and integrity of data