CC6 - midterms Flashcards

1
Q

is the development, execution, and supervision of plans, policies, programs, and practices that deliver, control, protect, and enhance the value of data and information assets throughout their lifecycles.

dm

A

Data Management

2
Q

is any person who works in any aspect of data management (from technical management of data throughout its lifecycle to ensuring that data is properly utilized and leveraged) to meet strategic organizational goals. Data management professionals fill numerous roles, from the highly technical (e.g., database administrators, network administrators, programmers) to strategic business roles (e.g., Data Stewards, Data Strategists, Chief Data Officers).

dmp

A

Data Management Professional

3
Q

are not just assets in the sense that organizations invest in them in order to derive future value. They are also vital to the day-to-day operations of most organizations. They have been called the ‘currency’, the ‘life blood’, and even the ‘new oil’ of the information economy. Whether or not an organization gets value from its analytics, it cannot even transact business without data.

di

A

Data and information

4
Q
  • In relation to information technology, it is also understood as information that has been stored in digital form (though data is not limited to information that has been digitized, and data management principles apply to data captured on paper as well as in databases). Still, because today we can capture so much information electronically, we call many things ______ that would not have been called ______ in earlier times: things like names, addresses, birthdates, what one ate for dinner on Saturday, or the most recent book one purchased.

d

A

1. Data

5
Q
  • the “raw material” of information
  • data in context

di

A

2. Data and Information

6
Q
  • An asset is an economic resource that can be owned or controlled and that holds or produces value. Assets can be converted to money. Data is widely recognized as an enterprise asset, though understanding of what it means to manage data as an asset is still evolving.

daaoa

A

3. Data as an Organizational Asset

7
Q
  • Data management shares characteristics with other forms of asset management. It involves knowing what data an organization has and what might be accomplished with it, then determining how best to use data assets to reach organizational goals. This balance can best be struck by following a set of principles that recognize salient features of data management and guide data management practice.

dmp

A

4. Data Management Principles

8
Q

Because data management has distinct characteristics derived from the properties of data itself, it also presents challenges in following these principles.

dmc

A

5. Data Management Challenges.

9
Q

Physical assets can be pointed to, touched, and moved around. They can be in only one place at a time.
Financial assets must be accounted for on a balance sheet. However, data is different.
Data is not tangible. Yet it is durable; it does not wear out, though the value of data often changes as it ages. Data is easy to copy and transport. But it is not easy to reproduce if it is lost or destroyed.

ddfoa

A

5.1. Data Differs from Other Assets

10
Q

Value is the difference between the cost of a thing and the benefit derived from that thing. For some assets, like stock, calculating value is easy. It is the difference between what the stock cost when it was purchased and what it was sold for. But for data, these calculations are more complicated, because neither the costs nor the benefits of data are standardized.

dv

A

5.2. Data Valuation
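
The stock example on this card reduces to simple arithmetic. A minimal sketch, assuming hypothetical prices and a hypothetical function name:

```python
def realized_value(purchase_cost: float, sale_price: float) -> float:
    """Value as benefit minus cost, following the stock example on this card."""
    return sale_price - purchase_cost

# Hypothetical figures, for illustration only.
stock_value = realized_value(purchase_cost=100.0, sale_price=130.0)
print(stock_value)
```

For data there is no equivalent one-line calculation, because neither side of the subtraction is standardized.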

11
Q

Ensuring that data is of high quality is central to data management. Organizations manage their data because they want to use it. If they cannot rely on it to meet business needs, then the effort to collect, store, secure, and enable access to it is wasted.

dq

A

5.3. Data Quality

12
Q

Deriving value from data does not happen by accident. It requires planning in many forms. It starts with the recognition that organizations can control how they obtain and create data. If they view data as a product that they create, they will make better decisions about it throughout its lifecycle.

pfbd

A

5.4. Planning for Better Data

13
Q

Metadata describes what data an organization has, what it represents, how it is classified, where it came from, how it moves within the organization, how it evolves through use, who can and cannot use it, and whether it is of high quality. Data is abstract. Definitions and other descriptions of context enable it to be understood. They make data, the data lifecycle, and the complex systems that contain data comprehensible.

md

A

5.5. Metadata and Data

14
Q

Data management is a complex process. Data is managed in different places within an organization by teams that have responsibility for different phases of the data lifecycle. Data management requires design skills to plan for systems, highly technical skills to administer hardware and build software, data analysis skills to understand issues and problems, analytic skills to interpret data, language skills to bring consensus to definitions and models, as well as strategic thinking to see opportunities to serve customers and meet goals.

dmicf

A

5.6. Data Management is Cross-functional

15
Q

Managing data requires understanding the scope and range of data within an organization. Data is one of the ‘horizontals’ of an organization. It moves across verticals, such as sales, marketing, and operations.

eaep

A

5.7. Establishing an Enterprise Perspective

16
Q

Today’s organizations use data that they create internally, as well as data that they acquire from external sources. They have to account for different legal and compliance requirements across national and industry lines.

afop

A

5.8. Accounting for Other Perspectives

17
Q

Like other assets, data has a lifecycle. To effectively manage data assets, organizations need to understand and plan for the data lifecycle. Well-managed data is managed strategically, with a vision of how the organization will use its data.

dl

A

5.9. The Data Lifecycle

18
Q

Managing data is made more complicated by the fact that there are different types of data that have different lifecycle management requirements. Any management system needs to classify the objects that are managed.

dtd

A

5.10. Different Types of Data

19
Q

Data not only represents value, it also represents risk. Low quality data (inaccurate, incomplete, or out-of-date) obviously represents risk because its information is not right. But data is also risky because it can be misunderstood and misused.

dr

A

5.11. Data and Risk

20
Q

Data management activities are wide-ranging and require both technical and business skills. Because almost all of today’s data is stored electronically, data management tactics are strongly influenced by technology. From its inception, the concept of data management has been deeply intertwined with management of technology.

dmt

A

5.12. Data Management and Technology

21
Q

The Leader’s Data Manifesto (2017) recognized that an “organization’s best opportunities for organic growth lie in data.” Although most organizations recognize their data as an asset, they are far from being data-driven.

edmrlc

A

5.13. Effective Data Management Requires Leadership and Commitment

22
Q

a set of choices and decisions that together chart a high-level course of action to achieve high-level goals. In the game of chess, a strategy is a sequenced set of moves to win by checkmate or to survive by stalemate.
A strategic plan is a high-level course of action to achieve high-level goals. Typically, a data strategy requires a supporting Data Management program strategy – a plan for maintaining and improving the quality of data, data integrity, access, and security while mitigating known and implied risks. The strategy must also address known challenges related to data management.

dms

A

6. Data Management Strategy

23
Q

is a high-level course of action to achieve high-level goals. Typically, a data strategy requires a supporting Data Management program strategy

sp

A

strategic plan

24
Q

– a plan for maintaining and improving the quality of data, data integrity, access, and security while mitigating known and implied risks. The strategy must also address known challenges related to data management.

dmps

A

Data Management program strategy

25
Q

is a formal document that outlines an organization's principles, guidelines, and framework for managing its data, defining roles, responsibilities, and processes to ensure data quality, security, compliance, and accessibility across the entire data lifecycle, aligning with the organization’s overall strategy and goals.

dmc

A

A Data Management Charter

26
Q
  • is a document that clearly defines the boundaries and parameters of a data management project, outlining what data will be included, the processes to be implemented, the expected deliverables, and any limitations or exclusions, ensuring all stakeholders have a shared understanding of what is and is not included within the project scope.

dmss

A

data management scope statement

27
Q
  • outlines a structured plan for an organization to effectively manage its data, including key phases like data assessment, governance establishment, data quality improvement, integration, storage, and security measures, with defined timelines and responsible parties to achieve optimal data utilization for informed decision-making.

dmir

A

Data Management Implementation Roadmap

28
Q
  • Data management involves a set of interdependent functions, each with its own goals, activities, and responsibilities.
  • Data management professionals must balance strategic and operational goals, business and technical requirements, risk and compliance, and various interpretations of data quality.
  • Different frameworks provide different perspectives to approach data management, clarifying strategy, roadmaps, team organization, and function alignment.

dmf

A

DATA MANAGEMENT FRAMEWORKS

29
Q
  • Developed by Henderson and Venkatraman (1999).
  • Focuses on the relationship between data and information within an organization.
  • Information is associated with business strategy and operational use of data.
  • Data is linked to IT processes that support physical data management and accessibility.

sam

A

1. Strategic Alignment Model

30
Q
  • Developed by Abcouwer, Maes, and Truijens (1997).
  • Also called the 9-cell model.
  • Recognizes a middle layer between business and IT that focuses on planning and architecture.
  • Helps align data management strategies with an organization’s tactical and operational needs.

aim

A

2. The Amsterdam Information Model

31
Q

expands on data management by defining Knowledge Areas that make up the scope of data management.

ddf

A

The DAMA-DMBOK Framework

32
Q

Three key visual representations describe DAMA-DMBOK Framework:

A

a. The DAMA Wheel – Places Data Governance at the center, surrounded by other Knowledge Areas (Data Architecture, Data Modeling, Data Quality, etc.).
b. The Environmental Factors Hexagon – Shows how people, processes, and technology interact.
c. The Knowledge Area Context Diagram – Details data management activities and their relationships using the SIPOC (Suppliers, Inputs, Processes, Outputs, Consumers) approach.

33
Q

– Places Data Governance at the center, surrounded by other Knowledge Areas (Data Architecture, Data Modeling, Data Quality, etc.).

dw

A

The DAMA Wheel

34
Q

– Shows how people, processes, and technology interact.

efh

A

The Environmental Factors Hexagon

35
Q

– Details data management activities and their relationships using the SIPOC (Suppliers, Inputs, Processes, Outputs, Consumers) approach.

kacd

A

The Knowledge Area Context Diagram

36
Q
  • Describes how organizations evolve in data management.
  • Outlines four phases for improving data maturity:
    Phase 1: The organization implements basic database capabilities through applications.
    Phase 2: They address data quality challenges, focusing on Metadata and Data Architecture.
    Phase 3: Establish Data Governance to structure and support data management.
    Phase 4: Organizations leverage well-managed data for analytics and business intelligence.

dp

A

4. The DMBOK Pyramid (Aiken)

37
Q
  • Another variation of the DAMA framework developed by Sue Geuens.
  • Highlights the dependencies between data management functions.
  • Shows that Business Intelligence and Analytics rely on all other Knowledge Areas (Data Architecture, Data Quality, Data Integration, etc.).
  • Positions Data Governance as essential for ensuring organizations extract value from their data

ddmfe

A

5. DAMA Data Management Framework Evolved

38
Q

DAMA AND THE DMBOK

A
  • DAMA (Data Management Association International) was founded to address data management challenges.
  • The DMBOK (Data Management Body of Knowledge) serves as an authoritative reference for data management professionals.
    Purpose of the DMBOK:
  • Provides a functional framework for enterprise data management practices.
  • Establishes a common vocabulary for data management concepts.
  • Serves as the fundamental reference for the CDMP (Certified Data Management Professional) exam.
39
Q
  • was founded to address data management challenges.

d….

A

DAMA (Data Management Association International)

40
Q
  • serves as an authoritative reference for data management professionals.

d….

A

DMBOK (Data Management Body of Knowledge)

41
Q
  • describes the central role that data ethics plays in making informed, socially responsible decisions about data and its uses. Awareness of the ethics of data collection, analysis, and use should guide all data management professionals.

dhe

A

Data Handling Ethics

42
Q
  • describes the technologies and business processes that emerge as our ability to collect and analyze large and diverse data sets increases.

bdds

A

Big Data and Data Science

43
Q
  • outlines an approach to evaluating and improving an organization’s data management capabilities.

dmma

A

Data Management Maturity Assessment

44
Q
  • provide best practices and considerations for organizing data management teams and enabling successful data management practices.

dmore

A

Data Management Organization and Role Expectations

45
Q
  • describes how to plan for and successfully move through the cultural changes that are necessary to embed effective data management practices within an organization.

dmocm

A

Data Management and Organizational Change Management

46
Q

manages computer databases. The role may include capacity planning, installation, configuration, database design, migration, performance monitoring, security, troubleshooting, as well as backup and data recovery.

da

A

Database administrator (DBA)

47
Q

is responsible for maintaining an organization's computer networks, including hardware and software. They ensure that networks are secure, efficient, and reliable.

na

A

network administrator

48
Q

are data governance employees who collect and maintain data for the organizations they work for while also protecting their data assets.

ds

A

Data stewards

49
Q

is a leader who uses data to help a company make strategic decisions. They are responsible for integrating data from various sources to create a unified view.

ds

A

data strategist

50
Q

is a senior executive who manages a company's data strategy and use. They are responsible for ensuring that data is used effectively to support business decisions.

cdo

A

Chief Data Officer (CDO)

51
Q
  • Ensure that data is accurate and of good quality

dq

A

Data quality

52
Q
  • Protect data from unauthorized access, theft, or corruption

ds

A

Data security

53
Q
  • Manage data governance strategies, practices, and requirements

dg

A

Data governance

54
Q
  • Lead the development of a data strategy that aligns with business objectives

ds

A

Data strategy

55
Q
  • Implement data analytics into business processes

da

A

Data analytics

56
Q
  • Promote data literacy and a data-driven culture

dl

A

Data literacy

57
Q
  • Ensure compliance with data protection and privacy regulations

rc

A

Regulatory compliance

58
Q

Some examples of basic metadata are:

A
  • author
  • date created
  • date modified
  • file size.
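
Two of these items can be read straight from the file system. A minimal sketch, assuming a throwaway temporary file; author and creation date are left out because the standard `os.stat` result does not expose them portably:

```python
import os
import tempfile
from datetime import datetime, timezone

# Create a throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello metadata")
    path = handle.name

info = os.stat(path)
basic_metadata = {
    "file_size_bytes": info.st_size,
    "date_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
}
print(basic_metadata)

os.remove(path)  # clean up the temporary file
```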
59
Q

is also used for unstructured data such as images, video, web pages, spreadsheets, etc. Web pages often include metadata in the form of meta tags.

m

A

Metadata
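
As an illustration of meta tags, the sketch below collects name/content pairs from a web page using Python's standard `html.parser` module. The page content and the class name `MetaTagCollector` are invented for this example:

```python
from html.parser import HTMLParser

class MetaTagCollector(HTMLParser):
    """Collects name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.metadata[name] = content

# Hypothetical page, invented for illustration.
page = """
<html><head>
  <meta name="author" content="Jane Doe">
  <meta name="description" content="Midterm study notes">
  <title>Notes</title>
</head><body></body></html>
"""

collector = MetaTagCollector()
collector.feed(page)
print(collector.metadata)
```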

60
Q

: Identify key business goals that data can support and prioritize data needs aligned with strategic initiatives.

dbo

A

Define Business Objectives

61
Q

: Conduct a comprehensive data audit to identify all data sources, their formats, locations, quality, and usage across the organization.

dim

A

Data Inventory and Mapping

62
Q

: Assess data accuracy, completeness, consistency, and relevance to identify areas for improvement.

dqa

A

Data Quality Analysis
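
Completeness is the easiest of these dimensions to measure. A minimal sketch, assuming hypothetical customer records in which `None` marks a missing value:

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"name": "Ana", "email": "ana@example.com", "birthdate": "1990-01-05"},
    {"name": "Ben", "email": None, "birthdate": "1988-11-30"},
    {"name": "Cruz", "email": "cruz@example.com", "birthdate": None},
    {"name": "Dee", "email": None, "birthdate": None},
]

def completeness(rows, field):
    """Share of rows in which `field` is present and not None."""
    filled = sum(1 for row in rows if row.get(field) is not None)
    return filled / len(rows)

report = {field: completeness(records, field) for field in ("name", "email", "birthdate")}
print(report)
```

Accuracy, consistency, and relevance need reference data or business rules to check against, so they are not covered by this sketch.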

63
Q

: Identify key stakeholders, their data requirements, and establish communication channels.

se

A

Stakeholder Engagement

64
Q

: Develop clear guidelines for data ownership, access control, data quality standards, retention policies, and privacy compliance.

edgp

A

Establish Data Governance Policies

65
Q

: Assign data stewards, data owners, and data custodians with defined accountability for data management.

dgrr

A

Data Governance Roles and Responsibilities

66
Q

: Implement processes to monitor and improve data quality through data cleansing, validation, and standardization.

dqmp

A

Data Quality Management Plan

67
Q

: Determine which data sources are critical for integration and prioritize based on business needs.

dss

A

Data Source Selection

68
Q

: Map data elements from different sources to a unified schema and transform data to ensure consistency.

dmt

A

Data Mapping and Transformation
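
A small sketch of what mapping to a unified schema can look like. The two source systems (`crm`, `billing`) and all field names are hypothetical:

```python
# Hypothetical field mappings from two source systems to one unified schema.
FIELD_MAPS = {
    "crm": {"cust_name": "customer_name", "mail": "email"},
    "billing": {"FullName": "customer_name", "EmailAddress": "email"},
}

def to_unified(source, record):
    """Rename source fields to the unified schema and normalize values."""
    mapping = FIELD_MAPS[source]
    unified = {mapping[key]: value for key, value in record.items() if key in mapping}
    # Transformation step: trim whitespace and lowercase emails for consistency.
    unified["customer_name"] = unified["customer_name"].strip()
    unified["email"] = unified["email"].strip().lower()
    return unified

crm_row = to_unified("crm", {"cust_name": " Ana Reyes ", "mail": "ANA@Example.com"})
billing_row = to_unified("billing", {"FullName": "Ana Reyes", "EmailAddress": "ana@example.com "})
print(crm_row == billing_row)
```

After mapping and transformation, the same customer looks identical regardless of which source the record came from.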

69
Q

: Select appropriate data integration tools to extract, transform, and load (ETL) data from disparate sources.

dit

A

Data Integration Tools

70
Q

: Choose appropriate data storage architecture (relational, dimensional, cloud-based) to facilitate analysis and reporting.

dw/ld

A

Data Warehouse/Lake Design

71
Q

: Implement robust data security controls including encryption, access controls, and data masking to protect sensitive information.

dsm

A

Data Security Measures
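
Of the three controls named on this card, data masking is the simplest to illustrate. A minimal sketch with a hypothetical helper, `mask_value`; encryption and access controls are not shown:

```python
def mask_value(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide all but the last `visible` characters of a sensitive string."""
    # Values too short to leave anything hidden are masked entirely.
    if visible <= 0 or len(value) <= visible:
        return mask_char * len(value)
    return mask_char * (len(value) - visible) + value[-visible:]

print(mask_value("4111111111111111"))
```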

72
Q

: Establish a reliable data backup and disaster recovery plan to mitigate data loss risks.

dbrs

A

Data Backup and Recovery Strategy

73
Q

: Choose appropriate BI tools to visualize and analyze data for decision-making.

bits

A

Business Intelligence (BI) Tool Selection

74
Q

: Create customized dashboards and reports aligned with key business metrics to provide actionable insights.

dd

A

Dashboard Development

75
Q

: Develop data models to enable efficient querying and analysis of data across different dimensions.

dma

A

Data Modeling and Analysis

76
Q

: Regularly monitor data quality metrics to identify and address data quality issues proactively.

dqm

A

Data Quality Monitoring

77
Q

: Track key performance indicators (KPIs) related to data management to assess the effectiveness of implemented strategies.

pe

A

Performance Evaluation

78
Q

: Review and update the data management strategy as business needs evolve and new technologies emerge.

ac

A

Adapting to Change

79
Q

: Ensure strong support from leadership and involve key stakeholders throughout the implementation process.

oa

A

Organizational Alignment

80
Q

: Communicate changes effectively and provide training to users to facilitate adoption of new data management practices.

cm

A

Change Management

81
Q

: Adhere to relevant data privacy regulations (e.g., GDPR, CCPA) when managing sensitive data.

cr

A

Compliance Requirements

82
Q

describe the purpose of the Knowledge Area and the fundamental principles that guide performance of activities within each Knowledge Area.

g

A

Goals

83
Q

are the actions and tasks required to meet the goals of the Knowledge Area. Some activities are described in terms of sub-activities, tasks, and steps.

a

A

Activities

84
Q
  • set the strategic and tactical course for meeting data management goals. They occur on a recurring basis.

pa

A

(P) Planning Activities

85
Q
  • are organized around the system development lifecycle (SDLC) (analysis, design, build, test, preparation, and deployment).

da

A

(D) Development Activities

86
Q
  • ensure the ongoing quality of data and the integrity, reliability, and security of systems through which data is accessed and used.

ca

A

(C) Control Activities

87
Q
  • support the use, maintenance, and enhancement of systems and processes through which data is accessed and used.

oa

A

(O) Operational Activities

88
Q

are the tangible things that each Knowledge Area requires to initiate its activities. Many activities require the same inputs. For example, many require knowledge of the Business Strategy as input.

i

A

Inputs

89
Q

are the outputs of the activities within the Knowledge Area, the tangible things that each function is responsible for producing. Deliverables may be ends in themselves or inputs into other activities. Several primary deliverables are created by multiple functions.

d

A

Deliverables

90
Q

describe how individuals and teams contribute to activities within the Knowledge Area. Roles are described conceptually, with a focus on groups of roles required in most organizations. Roles for individuals are defined in terms of skills and qualification requirements. Skills Framework for the Information Age (SFIA) was used to help align role titles. Many roles will be cross-functional.

rr

A

Roles and Responsibilities

91
Q

are the people responsible for providing or enabling access to inputs for the activities.

s

A

Suppliers

92
Q

are those that directly benefit from the primary deliverables created by the data management activities.

c

A

Consumers

93
Q

are the people that perform, manage the performance of, or approve the activities in the Knowledge Area.

p

A

Participants

94
Q

are the applications and other technologies that enable the goals of the Knowledge Area.

t

A

Tools

95
Q

are the methods and procedures used to perform activities and produce deliverables within a Knowledge Area. Techniques include common conventions, best practice recommendations, standards and protocols, and, where applicable, emerging alternative approaches.

t

A

Techniques

96
Q

are standards for measurement or evaluation of performance, progress, quality, efficiency, or other effect. The metrics sections identify measurable facets of the work that is done within each Knowledge Area. Metrics may also measure more abstract characteristics, like improvement or value.

m

A

Metrics

97
Q

is the process of discovering, analyzing, and scoping data requirements, and then representing and communicating these data requirements in a precise form called the data model. Data modeling is a critical component of data management.

dm

A

Data modeling

98
Q

is answering the question of “how”:
* How the data will be gathered
* How the data will be analysed
* How the data requirements will be grouped depending on their subset
* After these processes, a data model can be built by communicating the data requirements.

dm

A

Data modeling

99
Q

are critical to effective management of data. They:

Provide a common vocabulary around data
Capture and document explicit knowledge about an organization’s data and systems
Serve as a primary communications tool during projects
Provide the starting point for customization, integration, or even replacement of an application

dm

A

Data models

100
Q

Goals and Principles

A

Confirming and documenting understanding of different perspectives facilitates:

Formalization: A data model documents a concise definition of data structures and relationships. It enables assessment of how data is affected by implemented business rules, for current as-is states or desired target states.

Scope definition: A data model can help explain the boundaries for data context and implementation of purchased application packages, projects, initiatives, or existing systems.

Knowledge retention/documentation: A data model can preserve corporate memory regarding a system or project by capturing knowledge in an explicit form. It serves as documentation for future projects to use as the as-is version.

101
Q

is most frequently performed in the context of systems development and maintenance efforts, known as the system development lifecycle (SDLC).

dm

A

Data modeling

102
Q

is a representation of something that exists or a pattern for something to be made. A model can contain one or more diagrams.

m

A

Model

103
Q

describes an organization’s data as the organization understands it, or as the organization wants it to be. A data model contains a set of symbols with text labels that attempts visually to represent data requirements as communicated to the data modeler, for a specific set of data that can range in size from small, for a project, to large, for an organization.

dm

A

Data model

104
Q

: Data used to classify and assign types to things. For example, customers classified by market categories or business sectors; products classified by color, model, size, etc.; orders classified by whether they are open or closed.

ci

A

Category information

105
Q

: Basic profiles of resources needed to conduct operational processes such as Product, Customer, Supplier, Facility, Organization, and Account.

ri

A

Resource information

106
Q

: Data created while operational processes are in progress. Examples include Customer Orders, Supplier Invoices, Cash Withdrawal, and Business Meetings.

bei

A

Business event information

107
Q

: is often produced through point-of-sale systems (either in stores or online).

dti

A

Detail transaction information

108
Q
  • is a thing about which an organization collects information.
  • sometimes referred to as the nouns of an organization.
  • can be thought of as the answer to a fundamental question – who, what, when, where, why, or how – or to a combination of these questions.

e

A

Entity

109
Q

are the occurrences or values of a particular entity

ei

A

Entity instances

110
Q

Entity, entity type, and entity instance

A

Entity – Jane, Employee
Entity type – Employee
Entity instance – Jane

Entity – Raine, Lecturer
Entity type – Lecturer
Entity instance – Raine
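
The same distinction maps onto classes and objects in code. A sketch under that analogy, with hypothetical class definitions: the class plays the part of the entity type and each object is an entity instance.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Employee:  # entity type
    name: str

@dataclass(frozen=True)
class Lecturer:  # entity type
    name: str

jane = Employee(name="Jane")    # entity instance
raine = Lecturer(name="Raine")  # entity instance
print(type(jane).__name__, jane.name)
```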

111
Q

In _________ the term relationship is often used, in _________________ the term navigation path is often used, and in _____________ terms such as edge or link are used, for example. _______ can also vary based on level of detail. A relationship at the conceptual and logical levels is called a relationship, but a relationship at the physical level may be called by other names, such as constraint or reference, depending on the database technology.

rs ds ns ra

A

relational schemes
dimensional schemes
NoSQL schemes
Relationship aliases

112
Q

Relationships between two entities

c dr

A

Cardinality is represented by the symbols that appear on both ends of a relationship line.
Data rules are specified and enforced through cardinality.
* Without cardinality, the most one can say about a relationship is that two entities are connected in some way.

113
Q

The number of entities in a relationship is the __________________ of the relationship. The most common are unary, binary, and ternary relationships.

A

‘arity’

114
Q

relationship involves only one entity. A one-to-many recursive relationship describes a hierarchy, whereas a many-to-many relationship describes a network or graph. In a hierarchy, an entity instance has at most one parent (or higher-level entity). In relational modeling, child entities are on the many side of the relationship, with parent entities on the one side of the relationship. In a network, an entity instance can have more than one parent.

u

A

unary (also known as a recursive or self-referencing)
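
The hierarchy/network distinction can be sketched with two small mappings. This is an illustration only; the names (`manager_of`, `used_in`, `chain_of_command`) and the sample data are assumptions, not part of the card:

```python
# Hierarchy (one-to-many recursive): each employee has at most one parent.
manager_of = {"Ana": None, "Ben": "Ana", "Cara": "Ana", "Dev": "Ben"}


def chain_of_command(name):
    """Walk up the single-parent links until the root is reached."""
    chain = []
    parent = manager_of[name]
    while parent is not None:
        chain.append(parent)
        parent = manager_of[parent]
    return chain


# Network (many-to-many recursive): a part may have more than one parent.
used_in = {
    "bolt": {"wheel", "frame"},
    "wheel": {"bike"},
    "frame": {"bike"},
    "bike": set(),
}
```

Here `Dev` has exactly one parent (`Ben`), while `bolt` has two (`wheel` and `frame`), which is what makes the second structure a network rather than a hierarchy.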

115
Q

An arity of two is also known as _____________. A binary relationship, the most common on a traditional data model diagram, involves two entities.

b

116
Q

An arity of three, known as ________, is a relationship that includes three entities. An example in fact-based modeling (object-role notation) appears in Figure 35. Here Student can register for a particular Course in a given Semester.

t

117
Q
  • is used in physical and sometimes logical relational data modelling schemes to represent a relationship.
  • may be created implicitly when a relationship is defined between two entities, depending on the database technology or data modeling tool, and whether the two entities involved have mutual dependencies.

fk

A

foreign key
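
A minimal sketch of a foreign key at the physical level, using Python's built-in `sqlite3`. The `customer` and `customer_order` tables are hypothetical; note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` is set:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'Jane')")
conn.execute("INSERT INTO customer_order VALUES (100, 1)")  # parent row exists

# An order pointing at a customer that does not exist violates the relationship.
try:
    conn.execute("INSERT INTO customer_order VALUES (101, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```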

118
Q

(also called a key) is a set of one or more attributes that uniquely defines an instance of an entity. This section defines types of keys by construction
(simple, compound, composite, surrogate)
and function
(candidate, primary, alternate).

i

A

identifier

119
Q

is one attribute that uniquely identifies an entity instance.
* Ex. Universal Product Codes (UPCs) and Vehicle Identification Numbers (VINs).

sk

A

simple key

120
Q
  • is also an example of a simple key.
  • is a unique identifier for a table. Often a counter and always system-generated without intelligence, a surrogate key is an integer whose meaning is unrelated to its face value.

sk

A

surrogate key

121
Q

is a set of two or more attributes that together uniquely identify an entity instance. Ex. Phone number (area code + exchange + local number).

ck

A

compound key
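
The phone number example can be sketched as a three-column primary key. The `phone_line` table and its sample rows are assumptions for illustration; the point is that no single part is unique, only the combination:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE phone_line (
        area_code    TEXT NOT NULL,
        exchange     TEXT NOT NULL,
        local_number TEXT NOT NULL,
        subscriber   TEXT NOT NULL,
        PRIMARY KEY (area_code, exchange, local_number)  -- compound key
    )
    """
)
conn.execute("INSERT INTO phone_line VALUES ('212', '555', '0100', 'Jane')")
# Sharing two of the three parts is fine; only the full combination must be unique.
conn.execute("INSERT INTO phone_line VALUES ('212', '555', '0101', 'Raine')")

try:
    conn.execute("INSERT INTO phone_line VALUES ('212', '555', '0100', 'Ana')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

line_count = conn.execute("SELECT COUNT(*) FROM phone_line").fetchone()[0]
```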

122
Q

contains one compound key and at least one other simple or compound key or non-key attribute.

ck

A

composite key

123
Q

A is any set of attributes that uniquely identify an entity instance.

sk

124
Q

A is a minimal set of one or more attributes (i.e., a simple or compound key) that identifies the entity instance to which it belongs.

ck

A

candidate key

125
Q

is one or more attributes that a business professional would use to retrieve a single entity instance.

bk

A

business key

126
Q

is the candidate key that is chosen to be the unique identifier for an entity.

pk

A

primary key

127
Q

can still be used to find specific entity instances. Often the primary key is a surrogate key and the ____________________ are business keys.

ak

A

alternate key
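
A hedged sketch of the common pattern described above: a surrogate primary key plus a business key kept as an alternate key. The `product` table and the sample UPC value are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate primary key
        upc        TEXT NOT NULL UNIQUE,               -- business key as alternate key
        name       TEXT NOT NULL
    )
    """
)
conn.execute("INSERT INTO product (upc, name) VALUES ('036000291452', 'Tissues')")

# The alternate key can still be used to find a specific entity instance.
row = conn.execute(
    "SELECT product_id, name FROM product WHERE upc = ?", ("036000291452",)
).fetchone()

# Uniqueness is enforced on the alternate key as well as on the primary key.
try:
    conn.execute("INSERT INTO product (upc, name) VALUES ('036000291452', 'Copy')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```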

128
Q

is one where the primary key contains only attributes that belong to that entity.

ie

A

independent entity

129
Q

is one where the primary key contains at least one attribute from another entity.

de

A

dependent entity

130
Q

: Domains that specify the standard types of data one can have in an attribute assigned to that domain. For example, Integer, Character(30), and Date are all data type domains.

dt

131
Q

: Domains that use patterns including templates and masks, such as are found in postal codes and phone numbers, and character limitations (alphanumeric only, alphanumeric with certain special characters allowed, etc.) to define valid values.

df

A

Data Format

132
Q

: Domains that contain a finite set of values. These are familiar to many people from functionality like dropdown lists.
* For example, the list domain for OrderStatusCode can restrict values to only {Open, Shipped, Closed, Returned}.

l

133
Q

: Domains that allow all values of the same data type that are between one or more minimum and/or maximum values. Some ranges can be open-ended.
* For example, OrderDeliveryDate must be between OrderDate and three months in the future.

r

134
Q

: Domains defined by the rules that values must comply with in order to be valid. These include rules comparing values to calculated values or other attribute values in a relation or set.
* For example, ItemPrice must be greater than ItemCost.

rb

A

Rule-based
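
The format, list, range, and rule-based domains from the cards above can be sketched as small validation functions. Everything here is illustrative: the function names, the US-style postal code mask, and the reading of "three months" as 90 days after the order date are assumptions, not definitions from the source:

```python
import re
from datetime import date, timedelta

# List domain: a finite set of allowed values (values taken from the card).
ORDER_STATUS_CODES = {"Open", "Shipped", "Closed", "Returned"}

# Format domain: a mask for valid values (assumed US ZIP / ZIP+4 pattern).
POSTAL_CODE_MASK = re.compile(r"\d{5}(-\d{4})?")


def in_list_domain(code):
    return code in ORDER_STATUS_CODES


def matches_format(value):
    return POSTAL_CODE_MASK.fullmatch(value) is not None


def in_range(order_date, delivery_date, window_days=90):
    """Range domain: delivery falls between the order date and an upper bound.

    Assumption: "three months" is approximated as 90 days after the order date.
    """
    return order_date <= delivery_date <= order_date + timedelta(days=window_days)


def satisfies_rule(item_price, item_cost):
    """Rule-based domain: ItemPrice must be greater than ItemCost."""
    return item_price > item_cost
```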

135
Q

The use of schemes depends in part on the database being built, as some are suited to particular technologies

dms

A

Data Model Schemes

136
Q

CDM, LDM, PDM

A
  • In a CDM, you can define data items and entity attributes. In an LDM, you can only define entity attributes.
  • In the CDM, the foreign attribute migration does not occur until you generate an LDM or PDM.
  • In the LDM, the foreign attribute migrates immediately.
  • Conceptual (CDM), logical (LDM), and physical (PDM) data models
137
Q

First articulated by Dr. Edgar F. Codd in 1970, _______________ provides a systematic way to organize data so that they reflect their meaning (Codd, 1970). This approach had the additional effect of reducing redundancy in data storage.

rt

A

relational theory

138
Q

The concept of _________ started from a joint research project conducted by General Mills and Dartmouth College in the 1960s. In dimensional models, data is structured to optimize the query and analysis of large amounts of data. In contrast, operational systems that support transaction processing are optimized for fast processing of individual transactions.

dm

A

dimensional modeling

139
Q

The three main types of change are sometimes known by ORC.

A
  • Overwrite (Type 1): The new value overwrites the old value in place.
  • New Row (Type 2): The new values are written in a new row, and the old row is marked as not current.
  • New Column (Type 3): Multiple instances of a value are listed in columns on the same row, and a new value means writing the values in the series one spot down to make space at the front for the new value. The last value is discarded.
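
The three change types above can be sketched on plain dictionaries standing in for dimension rows. This is a simplified illustration under stated assumptions: the function names, the `key` and `is_current` fields, and the newest-first column ordering are all hypothetical choices, not a standard API:

```python
def overwrite(row, attr, new_value):
    """Type 1: the new value replaces the old value in place; no history kept."""
    row[attr] = new_value
    return row


def new_row(rows, key, attr, new_value):
    """Type 2: append a new current row and mark the old row as not current."""
    current = next(r for r in rows if r["key"] == key and r["is_current"])
    current["is_current"] = False
    rows.append({**current, attr: new_value, "is_current": True})
    return rows


def new_column(row, columns, new_value):
    """Type 3: shift values one spot down the series, newest first.

    `columns` is ordered newest to oldest; the oldest value is discarded.
    """
    for i in range(len(columns) - 1, 0, -1):
        row[columns[i]] = row[columns[i - 1]]
    row[columns[0]] = new_value
    return row
```

Type 2 keeps full history at the cost of more rows, while Type 3 keeps only as much history as there are columns.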
140
Q

is the term given to normalizing the flat, single-table, dimensional structure in a star schema into the respective component hierarchical or network structures.

A

Snowflaking

141
Q

stands for the meaning or description of a single row of data in a fact table; this is the most detail any row will have.

g

142
Q

are built with the entire organization in mind instead of just a particular project; this allows these dimensions to be shared across dimensional models, due to containing consistent terminology and values.

A

Conformed dimensions

143
Q

use standardized definitions of terms across individual marts. Different business users may use the same term in different ways.
‘Customer additions’ may be different from ‘gross additions’ or ‘adjusted additions.’

cf

A

Conformed facts

144
Q
  • is a graphical language for modeling software.
  • has a variety of notations of which one (the class model) concerns databases.
  • class model specifies classes (entity types) and their relationship types (Blaha, 2013).

uml

A

Unified Modeling Language (UML)

145
Q

has Operations or Methods (also called its “behavior”). Class behavior is only loosely connected to business logic because it still needs to be sequenced and timed. In ER terms, the table has stored procedures/triggers. Class Operations can be:

c

146
Q

, a family of conceptual modeling languages, originated in the late 1970s. Fact-based languages view the world in terms of objects, the facts that relate or characterize those objects, and each role that each object plays in each fact.

fbm

A

Fact-Based Modeling

147
Q

do not use attributes, reducing the need for intuitive or expert judgment by expressing the exact relationships between objects (both entities and values).

fbm

A

Fact-based models

148
Q

is a model-driven engineering approach that starts with typical examples of required information or queries presented in any external formulation familiar to users, and then verbalizes these examples at the conceptual level, in terms of simple facts expressed in a controlled natural language.

orm

A

Object-Role Modeling (ORM)

149
Q

is similar in notation and approach to ORM. The numbers in Figure 43 are references to verbalizations of facts.

fcom

A

Fully Communication Oriented Modeling (FCO-IM)

150
Q

are used when data values must be associated in chronological order and with specific time values.

tbp

A

Time-based patterns

151
Q

is a detail-oriented, time-based, and uniquely linked set of normalized tables that support one or more functional areas of business. It is a hybrid approach, encompassing the best of breed between third normal form and star schema. Data Vaults are designed specifically to meet the needs of enterprise data warehouses.

dv

A

Data Vault

152
Q

is a technique suited for information that changes over time in both structure and content. It provides graphical notation used for conceptual modeling similar to traditional data modeling, with extensions for working with temporal data.

am

A

Anchor Modeling

153
Q

is a name for the category of databases built on non-relational technology.

n

154
Q

Instead of taking a business subject and breaking it up into multiple relational structures, document databases frequently store the business subject in one structure called a ___________.

d

155
Q

databases allow an application to store its data in only two columns (‘key’ and ‘value’), with both simple (e.g., dates, numbers, codes) and complex information (unformatted text, video, music, documents, photos) stored within the ‘value’ column.

kv

156
Q

Out of the four types of NoSQL databases, _________________ is closest to the RDBMS. Both have a similar way of looking at data as rows and values.

co

A

column-oriented

157
Q

A ____________ database is designed for data whose relations are well represented as a set of nodes with an undetermined number of connections between these nodes.

g

158
Q

: This embodies the ‘real world’ view of the enterprise being modeled in the database. It represents the current ‘best model’ or ‘way of doing business’ for the enterprise.

c

A

Conceptual

159
Q

: The various users of the database management system operate on subsets of the total enterprise model that are relevant to their particular needs. These subsets are represented as ‘external schemas’.

e

160
Q

: The ‘machine view’ of the data is described by the internal schema. This schema describes the stored representation of the enterprise’s information

i

161
Q

This section provides an overview of conceptual, logical, and physical data modeling.

dm

A

Data Model

162
Q

A ____________ captures the high-level data requirements as a collection of related concepts. It contains only the basic and critical business entities within a given realm and function, with a description of each entity and the relationships between entities.

cdm

A

conceptual data model

163
Q

A _____________________ is a detailed representation of data requirements, usually in support of a specific usage context, such as application requirements. Logical data models are still independent of any technology or specific implementation constraints.
* often begins as an extension of a conceptual data model

ldm

A

logical data model

164
Q

A is in many cases a fully-attributed perspective of the dimensional conceptual data model, as illustrated in Figure 49.

dldm

A

dimensional logical data model

165
Q

A ___________ represents a detailed technical solution, often using the logical data model as a starting point and then adapted to work within a set of hardware, software, and network tools. Physical data models are built for a particular technology.

pdm

A

physical data model (PDM)

166
Q

A variant of a physical scheme is a ____________, used for data in motion between systems.
This model describes the structure of data being passed between systems as packets or messages. When sending data through web services, an Enterprise Service Bus (ESB), or through Enterprise Application Integration (EAI), the canonical model describes what data structure the sending service and any receiving services should use.

cm

A

Canonical Model

167
Q

is a virtual table.
provide a means to look at data from one or many tables that contain or reference the actual attributes. A standard view runs SQL to retrieve data at the point when an attribute in the view is requested. An instantiated (often called ‘materialized’) view runs at a predetermined time. Views are used to simplify queries, control data access, and rename columns, without the redundancy and loss of referential integrity due to denormalization.

v
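
A minimal sketch of a standard (non-materialized) view, using `sqlite3`. The `customer` table and `east_customer` view are hypothetical; the view renames columns, hides `credit_limit`, and runs its query each time it is read, so later inserts show up without refreshing anything:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, name TEXT, region TEXT, credit_limit REAL)"
)
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [(1, "Jane", "East", 5000.0), (2, "Raine", "West", 7500.0)],
)

# The view simplifies the query, restricts access, and renames columns.
conn.execute(
    "CREATE VIEW east_customer AS "
    "SELECT customer_id AS id, name AS customer_name "
    "FROM customer WHERE region = 'East'"
)
before = conn.execute("SELECT id, customer_name FROM east_customer").fetchall()

# A standard view is evaluated at read time, so new base rows appear immediately.
conn.execute("INSERT INTO customer VALUES (3, 'Ana', 'East', 1000.0)")
after = conn.execute(
    "SELECT id, customer_name FROM east_customer ORDER BY id"
).fetchall()
```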

168
Q

refers to the process of splitting a table. It is performed to facilitate archiving and to improve retrieval performance

p

A

Partitioning

169
Q

Vertically vs. Horizontally split

A
  • Vertically split: To reduce query sets, create subset tables that contain subsets of columns.
    For example, split a customer table in two based on whether the fields are mostly static or mostly volatile (to improve load / index performance), or based on whether the fields are commonly or uncommonly included in queries (to improve table scan performance).
  • Horizontally split: To reduce query sets, create subset tables using the value of a column as the differentiator.
    For example, create regional customer tables that contain only customers in a specific region.
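
The two kinds of split can be sketched on a small in-memory table. The sample rows and the helper names `vertical_split` and `horizontal_split` are assumptions for illustration; a real database would use its own partitioning features:

```python
customers = [
    {"id": 1, "name": "Jane", "region": "East", "last_login": "2024-01-05"},
    {"id": 2, "name": "Raine", "region": "West", "last_login": "2024-02-11"},
    {"id": 3, "name": "Ana", "region": "East", "last_login": "2024-03-20"},
]


def vertical_split(rows, key, column_groups):
    """One table per column group; every table keeps the key so rows can be rejoined."""
    return [
        [{key: r[key], **{c: r[c] for c in cols}} for r in rows]
        for cols in column_groups
    ]


def horizontal_split(rows, column):
    """One table per distinct value of the differentiating column."""
    parts = {}
    for r in rows:
        parts.setdefault(r[column], []).append(r)
    return parts


# Mostly static columns vs. mostly volatile columns.
static_part, volatile_part = vertical_split(
    customers, "id", [["name", "region"], ["last_login"]]
)
# Regional customer tables.
by_region = horizontal_split(customers, "region")
```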
170
Q
  • is the deliberate transformation of normalized logical data model entities into physical tables with redundant or duplicate data structures. There are several reasons to denormalize data.
  • can also be used to enforce user security by segregating data into multiple views or copies of tables according to access needs. This process does introduce a risk of data errors due to duplication.

d

A

Denormalization

171
Q

In dimensional data modeling, is called collapsing or combining. If each dimension is collapsed into a single structure, the resulting data model is called a Star Schema (see Figure 51). If the dimensions are not collapsed, the resulting data model is called a Snowflake (See Figure 49).

d

A

denormalization

172
Q

is the process of applying rules in order to organize business complexity into stable data structures. The basic goal of normalization is to keep each attribute in only one place to eliminate redundancy and the inconsistencies that can result from redundancy.

n

A

Normalization

173
Q

: Ensures each entity has a valid primary key, and every attribute depends on the primary key; removes repeating groups, and ensures each attribute is atomic (not multi-valued). 1NF includes the resolution of many-to-many relationships with an additional entity often called an associative entity.

fnf

A

First normal form (1NF)
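
A small sketch of the atomicity part of 1NF: a multi-valued `items` attribute is moved into its own set of rows. The data and the function name `to_first_normal_form` are hypothetical:

```python
unnormalized = [
    {"order_id": 1, "customer": "Jane", "items": "pen, ink"},  # multi-valued attribute
    {"order_id": 2, "customer": "Raine", "items": "paper"},
]


def to_first_normal_form(rows):
    """Split the repeating group into one atomic value per row."""
    orders = [{"order_id": r["order_id"], "customer": r["customer"]} for r in rows]
    order_items = [
        {"order_id": r["order_id"], "item": item.strip()}
        for r in rows
        for item in r["items"].split(",")
    ]
    return orders, order_items


orders, order_items = to_first_normal_form(unnormalized)
```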

174
Q

: Ensures each entity has the minimal primary key and that every attribute depends on the complete primary key.

snf

A

Second normal form (2NF)

175
Q

: Ensures each entity has no hidden primary keys and that each attribute depends on no attributes outside the key (“the key, the whole key and nothing but the key”).

tnf

A

Third normal form (3NF)
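
A sketch of removing a dependency on a non-key attribute: `dept_name` depends on `dept_id`, not on the employee key, so it moves to its own structure. The sample rows and the name `to_third_normal_form` are assumptions for illustration:

```python
employees = [
    {"employee_id": 1, "name": "Jane", "dept_id": 10, "dept_name": "Sales"},
    {"employee_id": 2, "name": "Raine", "dept_id": 10, "dept_name": "Sales"},
    {"employee_id": 3, "name": "Ana", "dept_id": 20, "dept_name": "Finance"},
]


def to_third_normal_form(rows):
    """Move dept_name (which depends on dept_id) out of the employee rows."""
    departments = {r["dept_id"]: r["dept_name"] for r in rows}
    slim_rows = [{k: v for k, v in r.items() if k != "dept_name"} for r in rows]
    return slim_rows, departments


slim_employees, departments = to_third_normal_form(employees)
```

After the split, each department name is stored once, so renaming a department touches a single value instead of every employee row.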

176
Q

: Resolves overlapping composite candidate keys. A candidate key is either a primary or an alternate key. ‘Composite’ means more than one (i.e., two or more attributes in an entity’s primary or alternate keys), and ‘overlapping’ means there are hidden business rules between the keys.

b/cnf

A

Boyce / Codd normal form (BCNF)

177
Q

: Resolves all many-to-many-to-many relationships (and beyond) in pairs until they cannot be broken down into any smaller pieces.

fnf

A

Fourth normal form (4NF)

178
Q

: Resolves inter-entity dependencies into basic pairs, and all join dependencies use parts of primary keys.

fnf

A

Fifth normal form (5NF)

179
Q

is the removal of details in such a way as to broaden applicability to a wide class of situations while preserving the important properties and essential nature from concepts or subjects.

includes generalization and specialization.
* Generalization groups the common attributes and relationships of entities into supertype entities, while specialization separates distinguishing attributes within an entity into subtype entities.
* This specialization is usually based on attribute values within an entity instance.

a

A

Abstraction
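
Generalization and specialization can be sketched with class inheritance. The `Party`, `Person`, and `Organization` names and their attributes are hypothetical examples, not taken from the card:

```python
from dataclasses import dataclass


@dataclass
class Party:
    """Supertype: attributes common to every kind of party (generalization)."""
    party_id: int
    name: str


@dataclass
class Person(Party):
    """Subtype: adds an attribute that only applies to people (specialization)."""
    birth_date: str


@dataclass
class Organization(Party):
    """Subtype: adds an attribute that only applies to organizations."""
    tax_id: str


jane = Person(party_id=1, name="Jane", birth_date="1990-01-01")
acme = Organization(party_id=2, name="Acme", tax_id="00-0000000")
```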

180
Q

is the concept of exposing only the required essential characteristics and behavior with respect to a context.

a

A

Abstraction