CC6 - midterms Flashcards

1
Q

is the development, execution, and supervision of plans, policies, programs, and practices that deliver, control, protect, and enhance the value of data and information assets throughout their lifecycles.

dm

A

Data Management

2
Q

is any person who works in any aspect of data management (from technical management of data throughout its lifecycle to ensuring that data is properly utilized and leveraged) to meet strategic organizational goals. Data management professionals fill numerous roles, from the highly technical (e.g., database administrators, network administrators, programmers) to strategic business (e.g., Data Stewards, Data Strategists, Chief Data Officers).

dmp

A

Data Management Professional

3
Q

are not just assets in the sense that organizations invest in them in order to derive future value. They are also vital to the day-to-day operations of most organizations. They have been called the ‘currency’, the ‘life blood’, and even the ‘new oil’ of the information economy. Whether or not an organization gets value from its analytics, it cannot even transact business without data.

di

A

Data and information

4
Q
  • In relation to information technology, it is also understood as information that has been stored in digital form (though data is not limited to information that has been digitized, and data management principles apply to data captured on paper as well as in databases). Still, because today we can capture so much information electronically, we call many things ______ that would not have been called ______ in earlier times – things like names, addresses, birthdates, what one ate for dinner on Saturday, or the most recent book one purchased.

d

A

1. Data

5
Q
  • the “raw material of information”
  • data in context

di

A

2. Data and Information

6
Q
  • An asset is an economic resource that can be owned or controlled and that holds or produces value. Assets can be converted to money. Data is widely recognized as an enterprise asset, though understanding of what it means to manage data as an asset is still evolving.

daaoa

A

3. Data as an Organizational Asset

7
Q
  • Data management shares characteristics with other forms of asset management. It involves knowing what data an organization has and what might be accomplished with it, then determining how best to use data assets to reach organizational goals. This balance can best be struck by following a set of principles that recognize salient features of data management and guide data management practice.

dmp

A

4. Data Management Principles

8
Q

Because data management has distinct characteristics derived from the properties of data itself, it also presents challenges in following these principles.

dmc

A

5. Data Management Challenges.

9
Q

Physical assets can be pointed to, touched, and moved around. They can be in only one place at a time.
Financial assets must be accounted for on a balance sheet. However, data is different.
Data is not tangible. Yet it is durable; it does not wear out, though the value of data often changes as it ages. Data is easy to copy and transport. But it is not easy to reproduce if it is lost or destroyed.

ddfoa

A

5.1. Data Differs from Other Assets

10
Q

Value is the difference between the cost of a thing and the benefit derived from that thing. For some assets, like stock, calculating value is easy. It is the difference between what the stock cost when it was purchased and what it was sold for. But for data, these calculations are more complicated, because neither the costs nor the benefits of data are standardized.

dv

A

5.2. Data Valuation
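
The stock case described in this card is plain arithmetic. A minimal sketch follows; the function name `simple_value` and the prices are hypothetical and only illustrate the "benefit minus cost" definition. For data, neither input is standardized, so no such one-line formula is implied.

```python
def simple_value(cost: float, benefit: float) -> float:
    """Value as the difference between the benefit derived and the cost paid."""
    return benefit - cost


# Hypothetical stock trade: bought for 100.0, later sold for 130.0.
stock_value = simple_value(cost=100.0, benefit=130.0)
print(stock_value)
```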

11
Q

Data is not tangible. Yet it is durable; it does not wear out, though the value of data often changes as it ages. Data is easy to copy and transport. But it is not easy to reproduce if it is lost or destroyed.

dq

A

5.3. Data Quality

12
Q

Deriving value from data does not happen by accident. It requires planning in many forms. It starts with the recognition that organizations can control how they obtain and create data. If they view data as a product that they create, they will make better decisions about it throughout its lifecycle.

pfbd

A

5.4. Planning for Better Data

13
Q

Metadata describes what data an organization has, what it represents, how it is classified, where it came from, how it moves within the organization, how it evolves through use, who can and cannot use it, and whether it is of high quality. Data is abstract. Definitions and other descriptions of context enable it to be understood. They make data, the data lifecycle, and the complex systems that contain data comprehensible.

md

A

5.5. Metadata and Data

14
Q

Data management is a complex process. Data is managed in different places within an organization by teams that have responsibility for different phases of the data lifecycle. Data management requires design skills to plan for systems, highly technical skills to administer hardware and build software, data analysis skills to understand issues and problems, analytic skills to interpret data, language skills to bring consensus to definitions and models, as well as strategic thinking to see opportunities to serve customers and meet goals.

dmicf

A

5.6. Data Management is Cross-functional

15
Q

Managing data requires understanding the scope and range of data within an organization. Data is one of the ‘horizontals’ of an organization. It moves across verticals, such as sales, marketing, and operations.

eaep

A

5.7. Establishing an Enterprise Perspective

16
Q

Today’s organizations use data that they create internally, as well as data that they acquire from external sources. They have to account for different legal and compliance requirements across national and industry lines.

afop

A

5.8. Accounting for Other Perspectives

17
Q

Like other assets, data has a lifecycle. To effectively manage data assets, organizations need to understand and plan for the data lifecycle. Well-managed data is managed strategically, with a vision of how the organization will use its data.

dl

A

5.9. The Data Lifecycle

18
Q

Managing data is made more complicated by the fact that there are different types of data that have different lifecycle management requirements. Any management system needs to classify the objects that are managed.

dtd

A

5.10. Different Types of Data

19
Q

Data not only represents value, it also represents risk. Low quality data (inaccurate, incomplete, or out-of-date) obviously represents risk because its information is not right. But data is also risky because it can be misunderstood and misused.

dr

A

5.11. Data and Risk

20
Q

Data management activities are wide-ranging and require both technical and business skills. Because almost all of today’s data is stored electronically, data management tactics are strongly influenced by technology. From its inception, the concept of data management has been deeply intertwined with management of technology.

dmt

A

5.12. Data Management and Technology

21
Q

The Leader’s Data Manifesto (2017) recognized that an “organization’s best opportunities for organic growth lie in data.” Although most organizations recognize their data as an asset, they are far from being data-driven.

edmrlc

A

5.13. Effective Data Management Requires Leadership and Commitment

22
Q

a set of choices and decisions that together chart a high-level course of action to achieve high-level goals. In the game of chess, a strategy is a sequenced set of moves to win by checkmate or to survive by stalemate.
A strategic plan is a high-level course of action to achieve high-level goals. Typically, a data strategy requires a supporting Data Management program strategy – a plan for maintaining and improving the quality of data, data integrity, access, and security while mitigating known and implied risks. The strategy must also address known challenges related to data management.

dms

A

6. Data Management Strategy

23
Q

is a high-level course of action to achieve high-level goals. Typically, a data strategy requires a supporting Data Management program strategy

sp

A

strategic plan

24
Q

– a plan for maintaining and improving the quality of data, data integrity, access, and security while mitigating known and implied risks. The strategy must also address known challenges related to data management.

dmps

A

Data Management program strategy

25
is a `formal document that outlines an organization's principles, guidelines, and framework` for managing its data, defining roles, responsibilities, and processes to ensure data quality, security, compliance, and accessibility across the entire data lifecycle, aligning with the organization's overall strategy and goals. ## Footnote dmc
**A Data Management Charter**
26
* is a `document that clearly defines the boundaries and parameters of a data management project`, outlining what data will be included, the processes to be implemented, the expected deliverables, and any limitations or exclusions, ensuring all stakeholders have a shared understanding of what is and is not included within the project scope. ## Footnote dmss
**data management scope statement**
27
* `outlines a structured plan for an organization to effectively manage its data`, including key phases like data assessment, governance establishment, data quality improvement, integration, storage, and security measures, with defined timelines and responsible parties to achieve optimal data utilization for informed decision-making. ## Footnote dmir
**Data Management Implementation Roadmap**
28
* Data management involves a set of `interdependent functions`, each with its own goals, activities, and responsibilities.
* Data management professionals must balance `strategic and operational goals, business and technical requirements, risk and compliance`, and various interpretations of data quality.
* Different frameworks provide `different perspectives` to approach data management, clarifying `strategy, roadmaps, team organization, and function alignment`. ## Footnote dmf
**DATA MANAGEMENT FRAMEWORKS**
29
* Developed by `Henderson and Venkatraman (1999)`.
* Focuses on the `relationship between data and information` within an organization.
* Information is associated with `business strategy and operational use of data`.
* Data is linked to `IT processes` that support `physical data management and accessibility`. ## Footnote sam
**1. Strategic Alignment Model**
30
* Developed by `Abcouwer, Maes, and Truijens (1997)`.
* Also called the `9-cell model`.
* Recognizes a **`middle layer`** between business and IT that focuses on `planning and architecture`.
* Helps align `data management strategies` with an organization’s `tactical and operational needs`. ## Footnote aim
**2. The Amsterdam Information Model**
31
`expands on data management` by defining Knowledge Areas that make up the scope of data management. ## Footnote ddf
**The DAMA-DMBOK Framework**
32
**Three key visual representations describe DAMA-DMBOK Framework:**
a. **The DAMA Wheel** – Places Data Governance at the center, surrounded by other Knowledge Areas (Data Architecture, Data Modeling, Data Quality, etc.).
b. **The Environmental Factors Hexagon** – Shows how people, processes, and technology interact.
c. **The Knowledge Area Context Diagram** – Details data management activities and their relationships using the SIPOC (Suppliers, Inputs, Processes, Outputs, Consumers) approach.
33
– Places Data Governance at the center, surrounded by other Knowledge Areas (Data Architecture, Data Modeling, Data Quality, etc.). ## Footnote dw
**The DAMA Wheel**
34
– Shows how people, processes, and technology interact. ## Footnote efh
**The Environmental Factors Hexagon**
35
– Details data management activities and their relationships using the SIPOC (Suppliers, Inputs, Processes, Outputs, Consumers) approach. ## Footnote kacd
**The Knowledge Area Context Diagram**
36
* Describes how organizations evolve in data management.
* **Outlines four phases for improving data maturity:**
  * **Phase 1**: The organization implements basic database capabilities through applications.
  * **Phase 2**: They address data quality challenges, focusing on Metadata and Data Architecture.
  * **Phase 3**: Establish Data Governance to structure and support data management.
  * **Phase 4**: Organizations leverage well-managed data for analytics and business intelligence. ## Footnote dp
**4. The DMBOK Pyramid (Aiken)**
37
* Another variation of the `DAMA framework` developed by `Sue Geuens`.
* Highlights the `dependencies between data management functions`.
* Shows that `Business Intelligence and Analytics` rely on all `other Knowledge Areas` (Data Architecture, Data Quality, Data Integration, etc.).
* Positions `Data Governance` as essential for ensuring organizations `extract value from their data`. ## Footnote ddmfe
**5. DAMA Data Management Framework Evolved**
38
**DAMA AND THE DMBOK**
* **DAMA (Data Management Association International)** was founded to address data management challenges.
* The **DMBOK (Data Management Body of Knowledge)** serves as an authoritative reference for data management professionals.

**Purpose of the DMBOK:**
* Provides a functional framework for enterprise data management practices.
* Establishes a common vocabulary for data management concepts.
* Serves as the fundamental reference for the CDMP (Certified Data Management Professional) exam.
39
* was founded to `address data management challenges`. ## Footnote d....
**DAMA (Data Management Association International)**
40
* serves as an `authoritative reference` for data management professionals. ## Footnote d....
**DMBOK (Data Management Body of Knowledge)**
41
* describes the `central role that data ethics` plays in making informed, socially responsible decisions about data and its uses. Awareness of the ethics of data collection, analysis, and use should guide all data management professionals. ## Footnote dhe
**Data Handling Ethics**
42
* describes the `technologies and business processes` that emerge as our ability to collect and analyze large and diverse data sets increases. ## Footnote bdds
**Big Data and Data Science**
43
* outlines an `approach to evaluating and improving` an organization’s data management capabilities. ## Footnote dmma
**Data Management Maturity Assessment**
44
* provide `best practices and considerations` for organizing data management teams and enabling successful data management practices. ## Footnote dmore
**Data Management Organization and Role Expectations**
45
* describes `how to plan for and successfully move through` the cultural changes that are necessary to embed effective data management practices within an organization. ## Footnote dmocm
**Data Management and Organizational Change Management**
46
`manages computer databases`. The role may include capacity planning, installation, configuration, `database design, migration`, performance monitoring, security, troubleshooting, as well as backup and data recovery. ## Footnote da
**Database administrator (DBA)**
47
is responsible for `maintaining an organization's computer networks`, including hardware and software. They ensure that networks are secure, efficient, and reliable. ## Footnote na
**network administrator**
48
are data governance employees who `collect and maintain data for the organizations` they work for while also protecting their data assets. ## Footnote ds
**Data stewards**
49
is a leader who uses data to `help a company make strategic decisions`. They are responsible for integrating data from various sources to create a unified view. ## Footnote ds
**data strategist**
50
is a `senior executive` who `manages a company's data strategy and use`. They are responsible for ensuring that data is used effectively to support business decisions. ## Footnote cdo
**Chief Data Officer (CDO)**
51
* Ensure that data is `accurate` and of `good quality` ## Footnote dq
**Data quality**
52
* `Protect data` from unauthorized access, theft, or corruption ## Footnote ds
**Data security**
53
* `Manage data governance strategies`, practices, and requirements ## Footnote dg
**Data governance**
54
* Lead the `development of a data strategy` that aligns with business objectives ## Footnote ds
**Data strategy**
55
* `Implement data analytics` into business processes ## Footnote da
**Data analytics**
56
* `Promote data literacy` and a data-driven culture ## Footnote dl
**Data literacy**
57
* `Ensure compliance` with data protection and privacy regulations ## Footnote rc
**Regulatory compliance**
58
**`Some examples of basic metadata are:`**
* author
* date created
* date modified
* file size
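
Some of this basic metadata can be read straight from the file system. The sketch below uses Python's `os.stat`; the helper name `basic_file_metadata` and the throwaway demo file are hypothetical. Author and creation date are left out on purpose, because they are not portably available from `os.stat` and usually live inside the file format or a catalog.

```python
import os
import tempfile
from datetime import datetime, timezone


def basic_file_metadata(path: str) -> dict:
    """Return the file size and last-modified time recorded by the file system."""
    info = os.stat(path)
    return {
        "file_size_bytes": info.st_size,
        "date_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }


# Demo on a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello")
    demo_path = handle.name

metadata = basic_file_metadata(demo_path)
print(metadata["file_size_bytes"], metadata["date_modified"].isoformat())
os.remove(demo_path)
```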
59
is also used for `unstructured data` such as images, video, web pages, spreadsheets, etc. Web pages often include metadata in the form of meta tags. ## Footnote m
**Metadata**
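
As an illustration of metadata carried in meta tags, here is a minimal sketch that collects `<meta name=... content=...>` pairs with the standard-library `html.parser`. The class name `MetaTagCollector` and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name is not None:
            self.meta[name] = attributes.get("content", "")


# Hypothetical page used only for the demonstration.
page = """<html><head>
<meta name="author" content="Jane Doe">
<meta name="description" content="Sample page">
</head><body></body></html>"""

collector = MetaTagCollector()
collector.feed(page)
print(collector.meta)
```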
60
: `Identify key business goals` that data can support and prioritize data needs aligned with strategic initiatives. ## Footnote dbo
**Define Business Objectives**
61
: Conduct a `comprehensive data audit` to identify all data sources, their formats, locations, quality, and usage across the organization. ## Footnote dim
**Data Inventory and Mapping**
62
: Assess `data accuracy, completeness`, consistency, and relevance to `identify areas for improvement`. ## Footnote dqa
**Data Quality Analysis**
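
One of the dimensions named above, completeness, is easy to measure. A minimal sketch, assuming records are plain dictionaries and that `None` or an empty string counts as missing; the function name and sample rows are hypothetical.

```python
def completeness(records, field):
    """Share of records with a non-empty value for `field`, from 0.0 to 1.0."""
    if not records:
        return 0.0
    filled = sum(1 for record in records if record.get(field) not in (None, ""))
    return filled / len(records)


# Hypothetical customer records with some missing email values.
customers = [
    {"name": "Ana", "email": "ana@example.com"},
    {"name": "Ben", "email": ""},
    {"name": "Cy", "email": None},
    {"name": "Di", "email": "di@example.com"},
]
print(completeness(customers, "email"))
```

A low ratio for a field flags it as an area for improvement; accuracy, consistency, and relevance need their own checks.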
63
: Identify `key stakeholders`, their data requirements, and establish communication channels. ## Footnote se
**Stakeholder Engagement**
64
: `Develop clear guidelines` for data ownership, access control, data quality standards, retention policies, and `privacy compliance`. ## Footnote edgp
**Establish Data Governance Policies**
65
: `Assign data stewards, data owners, and data custodians` with defined accountability for data management. ## Footnote dgrr
**Data Governance Roles and Responsibilities**
66
: Implement processes to `monitor and improve data quality` through data cleansing, validation, and standardization. ## Footnote dqmp
**Data Quality Management Plan**
67
: Determine `which data sources are critical` for integration and prioritize based on business needs. ## Footnote dss
**Data Source Selection**
68
: `Map data elements` from different sources to a unified schema and transform data to ensure consistency. ## Footnote dmt
**Data Mapping and Transformation**
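
A minimal sketch of mapping two sources onto one unified schema. The source names (`crm`, `billing`), the field names, and the trimming rule are all hypothetical; real mappings are usually maintained as metadata rather than hard-coded.

```python
# Hypothetical mappings from each source's field names to the unified schema.
FIELD_MAPS = {
    "crm": {"cust_name": "customer_name", "dob": "birth_date"},
    "billing": {"FullName": "customer_name", "BirthDate": "birth_date"},
}


def to_unified(source, record):
    """Rename source fields to unified names, then normalize the values."""
    mapping = FIELD_MAPS[source]
    unified = {target: record[src] for src, target in mapping.items()}
    # Transform step: trim whitespace so values compare consistently.
    unified["customer_name"] = unified["customer_name"].strip()
    return unified


crm_row = {"cust_name": " Jane Cruz ", "dob": "1990-04-12"}
billing_row = {"FullName": "Jane Cruz", "BirthDate": "1990-04-12"}
print(to_unified("crm", crm_row) == to_unified("billing", billing_row))
```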
69
: `Select appropriate data integration tools` to extract, transform, and load (ETL) data from disparate sources. ## Footnote dit
**Data Integration Tools**
70
: Choose `appropriate data storage architecture` (relational, dimensional, cloud-based) to facilitate analysis and reporting. ## Footnote dw/ld
**Data Warehouse/Lake Design**
71
: Implement `robust data security controls` including encryption, access controls, and data masking to protect sensitive information. ## Footnote dsm
**Data Security Measures**
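
Of the controls listed, data masking is the simplest to sketch. The helper below is hypothetical and only hides characters for display. It is not encryption and does not replace access controls.

```python
def mask_value(value: str, visible: int = 4, mask_char: str = "*") -> str:
    """Hide all but the last `visible` characters of a sensitive string."""
    if visible <= 0:
        return mask_char * len(value)
    hidden = max(len(value) - visible, 0)
    return mask_char * hidden + value[hidden:]


# Hypothetical 16-digit account number.
print(mask_value("4111111111111111"))
```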
72
: Establish a `reliable data backup and disaster recovery plan` to mitigate data loss risks. ## Footnote dbrs
**Data Backup and Recovery Strategy**
73
: Choose `appropriate BI tools` to `visualize` and `analyze data` for decision-making. ## Footnote bits
**Business Intelligence (BI) Tool Selection**
74
: `Create customized dashboards` and reports aligned with key business metrics to provide actionable insights. ## Footnote dd
**Dashboard Development**
75
: `Develop data models` to enable efficient querying and `analysis of data` across different dimensions. ## Footnote dma
**Data Modeling and Analysis**
76
: `Regularly monitor data quality` metrics to identify and address data quality issues proactively. ## Footnote dqm
**Data Quality Monitoring**
77
: `Track key performance indicators` (KPIs) related to data management to assess the effectiveness of implemented strategies. ## Footnote pe
**Performance Evaluation**
78
: `Review and update the data management strategy` as business needs evolve and new technologies emerge. ## Footnote ac
**Adapting to Change**
79
: Ensure `strong support from leadership` and involve key stakeholders throughout the `implementation process.` ## Footnote oa
**Organizational Alignment**
80
: `Communicate changes effectively` and provide training to users to facilitate `adoption of new data management practices`. ## Footnote cm
**Change Management**
81
: Adhere to `relevant data privacy regulations` (e.g., GDPR, CCPA) when managing sensitive data. ## Footnote cr
**Compliance Requirements**
82
`describe the purpose of the Knowledge Area` and the fundamental principles that guide performance of activities within each Knowledge Area. ## Footnote g
**Goals**
83
are the `actions and tasks` required to meet the goals of the Knowledge Area. Some activities are described in terms of sub-activities, tasks, and steps. ## Footnote a
**Activities**
84
* set the `strategic and tactical course` for meeting data management goals. They occur on a `recurring basis`. ## Footnote pa
**(P)Planning Activities**
85
* are `organized around the system development lifecycle (SDLC)` (analysis, design, build, test, preparation, and deployment). ## Footnote da
**(D)Development Activities**
86
* `ensure the ongoing quality of data and the integrity`, reliability, and security of systems through **which** data `is accessed and used`. ## Footnote ca
**(C) Control Activities**
87
* support the `use, maintenance`, and enhancement of systems and processes through which data is accessed and used. ## Footnote oa
**(O)Operational Activities**
88
are the tangible things that each Knowledge Area requires to `initiate its activities`. Many activities require the same inputs. For example, many require knowledge of the Business Strategy as input. ## Footnote i
**Inputs**
89
are the `outputs of the activities` within the Knowledge Area, the tangible things that each function is responsible for producing. Deliverables may be ends in themselves or inputs into other activities. Several primary deliverables are created by multiple functions. ## Footnote d
**Deliverables**
90
`describe how individuals and teams contribute to activities within the Knowledge Area`. Roles are described conceptually, with a focus on groups of roles required in most organizations. Roles for individuals are defined in terms of skills and qualification requirements. Skills Framework for the Information Age (SFIA) was used to help align role titles. Many roles will be cross-functional. ## Footnote rr
**Roles and Responsibilities**
91
are the people responsible for `providing` or enabling `access to inputs` for the activities. ## Footnote s
**Suppliers**
92
those that `directly benefit` from the primary deliverables created by the data management activities. ## Footnote c
**Consumers**
93
are the `people that perform`, manage the performance of, or `approve the activities` in the Knowledge Area. ## Footnote p
**Participants**
94
are the `applications and other technologies` that enable the goals of the Knowledge Area. ## Footnote t
**Tools**
95
are the `methods and procedures` used to perform activities and produce deliverables within a Knowledge Area. Techniques include common conventions, best practice recommendations, standards and protocols, and, where applicable, emerging alternative approaches. ## Footnote t
**Techniques**
96
are `standards for measurement` or evaluation of performance, progress, quality, efficiency, or other effect. The metrics sections identify measurable facets of the work that is done within each Knowledge Area. Metrics may also measure more abstract characteristics, like improvement or value ## Footnote m
**Metrics**
97
is the `process of discovering, analyzing, and scoping data requirements`, and then representing and communicating these data requirements in a precise form called the data model. Data modeling is a critical component of data management. ## Footnote dm
**Data modeling**
98
is answering the question of “how”:
* How the data will be gathered
* How the data will be analysed
* How the data requirements will be grouped depending on their subset
* After these processes, a data model can be built by communicating the data requirements. ## Footnote dm
**Data modeling**
99
are critical to effective management of data. They:
* Provide a common vocabulary around data
* Capture and document explicit knowledge about an organization’s data and systems
* Serve as a primary communications tool during projects
* Provide the starting point for customization, integration, or even replacement of an application ## Footnote dm
**Data models**
100
**Goals and Principles**
Confirming and documenting understanding of different perspectives facilitates:
* **Formalization**: A data model documents a concise `definition of data structures and relationships`. It enables assessment of how data is affected by implemented business rules, for current as-is states or desired target states.
* **Scope definition**: A data model can help `explain the boundaries for data context and implementation` of purchased application packages, projects, initiatives, or existing systems.
* **Knowledge retention/documentation**: A data model can `preserve corporate memory` regarding a system or project by capturing knowledge in an explicit form. It serves as documentation for future projects to use as the as-is version.
101
is most frequently performed in the context of systems development and maintenance efforts, known as the `system development lifecycle (SDLC)`. ## Footnote dm
**Data modeling**
102
is a `representation of something that exists or a pattern` for something to be made. A model can contain one or more diagrams. ## Footnote m
**model**
103
describes an `organization’s data as the organization understands it`, or as the organization wants it to be. A data model contains a set of symbols with text labels that attempts visually to represent data requirements as communicated to the data modeler, for a specific set of data that can range in size from small, for a project, to large, for an organization. ## Footnote dm
**Data model**
104
: Data used to `classify and assign types` to things. For example, customers classified by market categories or business sectors; products classified by color, model, size, etc.; orders classified by whether they are open or closed. ## Footnote ci
**Category information**
105
: Basic profiles of resources needed to conduct operational processes such as Product, Customer, Supplier, Facility, Organization, and Account. ## Footnote ri
**Resource information**
106
: Data created while `operational processes` are in progress. Examples include Customer Orders, Supplier Invoices, Cash Withdrawal, and Business Meetings. ## Footnote bei
**Business event information**
107
: is often produced through `point-of-sale systems` (either in stores or online). ## Footnote dti
**Detail transaction information**
108
* is a thing about which an `organization collects information`.
* sometimes referred to as the `nouns of an organization`.
* can be thought of as the answer to a fundamental question – `who, what, when, where, why, or how` – or to a combination of these questions. ## Footnote e
**entity**
109
are the occurrences or values of a particular entity ## Footnote ei
**Entity instances**
110
**Entity: type, instance**
* Entity – Jane, Employee
  * Entity type – Employee
  * Entity instance – Jane
* Entity – Raine, Lecturer
  * Entity type – Lecturer
  * Entity instance – Raine
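
The type/instance distinction maps loosely onto classes and objects. A minimal sketch, assuming a class stands in for an entity type and an object for an entity instance; this is an analogy, not a data modeling notation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    """Entity type: describes what is recorded about every employee."""
    name: str


@dataclass(frozen=True)
class Lecturer:
    """Entity type: describes what is recorded about every lecturer."""
    name: str


jane = Employee(name="Jane")     # entity instance of the Employee type
raine = Lecturer(name="Raine")   # entity instance of the Lecturer type
print(type(jane).__name__, jane.name)
print(type(raine).__name__, raine.name)
```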
111
In _________ the term **`relationship`** is often used, in _________________ the term **`navigation path`** is often used, and in _____________ terms such as **`edge`** or **`link`** are used, for example. _______ can also vary **based on level of detail**. A relationship at the conceptual and logical levels is called a relationship, but a relationship at the physical level may be called by other names, such as constraint or reference, depending on the database technology. ## Footnote rs ds ns ra
**relational schemes dimensional schemes NoSQL schemes Relationship aliases**
112
**Relationships between two entities** ## Footnote c dr
**Cardinality** is represented by the symbols that appear on both ends of a relationship line. **Data rules** are specified and enforced through cardinality. * Without cardinality, the most one can say about a relationship is that two entities are connected in some way.
113
The `number of entities` in a relationship is the __________________ of the relationship. The most common are unary, binary, and ternary relationships
**‘arity’**
114
relationship involves only `one entity`. A `one-to-many` recursive relationship describes a **hierarchy**, whereas a `many-to-many` relationship describes a **network or graph**. In a hierarchy, an entity instance has at most one parent (or higher-level entity). In relational modeling, child entities are on the many side of the relationship, with parent entities on the one side of the relationship. In a network, an entity instance can have more than one parent. ## Footnote u
**unary (also known as a recursive or self-referencing)**
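
The difference between a hierarchy and a network can be sketched with two small mappings. The employee and part names are hypothetical; the point is that a hierarchy stores at most one parent per instance, while a network stores a set of parents.

```python
# One-to-many recursive relationship (hierarchy): at most one manager each.
manager_of = {"Ana": None, "Ben": "Ana", "Cy": "Ana", "Di": "Ben"}


def chain_of_command(employee):
    """Walk up the hierarchy from an employee to the top."""
    chain = []
    parent = manager_of[employee]
    while parent is not None:
        chain.append(parent)
        parent = manager_of[parent]
    return chain


# Many-to-many recursive relationship (network): a part can have several parents.
used_in = {
    "bolt": {"wheel", "frame"},
    "wheel": {"bike"},
    "frame": {"bike"},
    "bike": set(),
}

print(chain_of_command("Di"))
print(sorted(used_in["bolt"]))
```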
115
An arity of two is also known as _____________. A binary relationship, the most common on a traditional data model diagram, involves two entities. ## Footnote b
**binary**
116
An arity of three, known as ________, is a relationship that includes three entities. An example in fact-based modeling (object-role notation) appears in Figure 35. Here Student can register for a particular Course in a given Semester. ## Footnote t
**ternary**
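
The Student, Course, Semester example can be sketched as a set of three-part facts. The names and values below are hypothetical, and this is not object-role notation, just one way to hold a ternary relationship in code.

```python
from typing import NamedTuple


class Registration(NamedTuple):
    """One fact in a ternary relationship: all three parts are needed."""
    student: str
    course: str
    semester: str


# Hypothetical registrations.
registrations = {
    Registration("Jane", "CC6", "2024-1"),
    Registration("Jane", "CC6", "2024-2"),
    Registration("Raine", "CC6", "2024-1"),
}


def courses_for(student, semester):
    """Courses a student registered for in a given semester."""
    return sorted(
        r.course for r in registrations
        if r.student == student and r.semester == semester
    )


print(courses_for("Jane", "2024-1"))
```

Dropping any one of the three parts loses information: Jane taking CC6 says nothing about which semester.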
117
* is used in `physical and sometimes logical relational data` modelling schemes to represent a relationship. * may be `created implicitly when a relationship is defined between two entities`, depending on the database technology or data modeling tool, and whether the two entities involved have mutual dependencies. ## Footnote fk
**foreign key**
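
A minimal sketch of a foreign key in a physical relational model, using Python's built-in `sqlite3`. The `department` and `employee` tables are hypothetical. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other database technologies behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE employee (
           emp_id  INTEGER PRIMARY KEY,
           name    TEXT NOT NULL,
           dept_id INTEGER NOT NULL REFERENCES department(dept_id)
       )"""
)
conn.execute("INSERT INTO department VALUES (1, 'Finance')")
conn.execute("INSERT INTO employee VALUES (10, 'Jane', 1)")

# Department 99 does not exist, so the foreign key rejects this row.
try:
    conn.execute("INSERT INTO employee VALUES (11, 'Raine', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```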
118
(also called a key) is a set of one or more attributes that `uniquely defines an instance of an entity`. This section defines types of keys by construction (simple, compound, composite, surrogate) and function (candidate, primary, alternate). ## Footnote i
**identifier**
119
is one attribute that uniquely identifies an entity instance. * Ex. Universal Product Codes (UPCs) and Vehicle Identification Numbers (VINs). ## Footnote sk
**simple key**
120
* is also an example of a `simple key`. * is a `unique identifier for a table`. Often a counter and always system-generated without intelligence, a surrogate key is an integer whose meaning is unrelated to its face value. ## Footnote sk
**surrogate key**
121
is a set of `two or more` attributes that together uniquely identify an entity instance. Ex. Phone number (area code + exchange + local number). ## Footnote ck
**compound key**
122
contains `one compound key` and at least one other simple or compound key or non-key attribute. ## Footnote ck
**composite key**
123
A is any `set of attributes` that uniquely identify an entity instance. ## Footnote sk
**super key**
124
A is a `minimal set of one or more attributes` (i.e., a simple or compound key) that identifies the entity instance to which it belongs. ## Footnote ck
**candidate key**
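The difference between a super key and a candidate key can be sketched as follows. The sample rows and attribute names are assumptions for illustration. Note that uniqueness in sample data only suggests a key; in a real model the key is a business rule, not something inferred from a few rows.

```python
from itertools import combinations

# Hypothetical sample instances of a Student entity.
rows = [
    {"student_id": 1, "email": "a@x.edu", "dept": "CS"},
    {"student_id": 2, "email": "b@x.edu", "dept": "CS"},
    {"student_id": 3, "email": "c@x.edu", "dept": "IT"},
]
attributes = ["student_id", "email", "dept"]


def is_super_key(attrs):
    """A set of attributes is a super key if its values are unique per row."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def super_keys():
    """Every attribute combination that uniquely identifies each row."""
    return [
        set(combo)
        for size in range(1, len(attributes) + 1)
        for combo in combinations(attributes, size)
        if is_super_key(combo)
    ]


def candidate_keys():
    """Super keys that are minimal: no proper subset is also a super key."""
    supers = super_keys()
    return [key for key in supers if not any(other < key for other in supers)]
```

In this sample, `{"student_id", "dept"}` is a super key but not a candidate key, because `{"student_id"}` alone already identifies each row.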
125
is one or more `attributes that a business professional` would use to retrieve a single entity instance. ## Footnote bk
**business key**
126
is the candidate key that is chosen to be the `unique identifier for an entity`. ## Footnote pk
**primary key**
127
can still be `used to find specific entity instances`. Often the primary key is a surrogate key and the ____________________ are business keys. ## Footnote ak
**alternate key**
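A small sketch of a surrogate primary key alongside a business key kept as an alternate key, using Python's built-in `sqlite3`. The `product` table and the UPC value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,  -- surrogate key: system-generated
        upc TEXT NOT NULL UNIQUE,        -- business key kept as alternate key
        name TEXT NOT NULL
    )
    """
)
cur = conn.execute(
    "INSERT INTO product (upc, name) VALUES (?, ?)", ("012345678905", "Widget")
)
surrogate_id = cur.lastrowid  # value assigned by the database, not by the user

# The alternate key can still be used to find a specific entity instance.
row = conn.execute(
    "SELECT product_id, name FROM product WHERE upc = ?", ("012345678905",)
).fetchone()
```

The `UNIQUE` constraint is what lets the UPC act as an identifier even though it was not chosen as the primary key.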
128
is one where the primary key contains `only attributes that belong to that entity.` ## Footnote ie
**independent entity**
129
is one where the primary key contains `at least one attribute from another entity`. ## Footnote de
**dependent entity**
130
: Domains that specify the `standard types of data` one can have in an attribute assigned to that domain. For example, Integer, Character(30), and Date are all data type domains. ## Footnote dt
**Data Type**
131
: Domains that `use patterns` including templates and masks, such as are found in postal codes and phone numbers, and character limitations (alphanumeric only, alphanumeric with certain special characters allowed, etc.) to define valid values. ## Footnote df
**Data Format**
132
: Domains that contain a `finite set of values`. These are familiar to many people from functionality like dropdown lists. * For example, the list domain for **OrderStatusCode** can restrict values to only {Open, Shipped, Closed, Returned}. ## Footnote l
**List**
133
: Domains that allow `all values of the same data type` that are between one or more minimum and/or maximum values. Some ranges can be `open-ended`. * For example, **OrderDeliveryDate** must be between OrderDate and three months in the future. ## Footnote r
**Range**
134
: Domains `defined by the rules` that values must comply with in order to be valid. These include rules comparing values to calculated values or other attribute values in a relation or set. * For example, **ItemPrice** must be greater than ItemCost. ## Footnote rb
**Rule-based**
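The five domain types on the preceding cards can be sketched as simple validity checks. The function names, the 5-digit postal code pattern, and the use of 90 days to stand in for "three months" are assumptions made for illustration.

```python
import re
from datetime import date, timedelta

ORDER_STATUS_CODES = {"Open", "Shipped", "Closed", "Returned"}  # list domain
POSTAL_CODE_PATTERN = re.compile(r"\d{5}")  # assumed 5-digit format mask


def in_data_type_domain(value):
    """Data type domain: the value must be an Integer."""
    return isinstance(value, int) and not isinstance(value, bool)


def in_format_domain(value):
    """Data format domain: the value must match a pattern or mask."""
    return POSTAL_CODE_PATTERN.fullmatch(value) is not None


def in_list_domain(value):
    """List domain: the value must be one of a finite set."""
    return value in ORDER_STATUS_CODES


def in_range_domain(delivery_date, order_date, today):
    """Range domain: between OrderDate and about three months ahead (90 days)."""
    return order_date <= delivery_date <= today + timedelta(days=90)


def in_rule_domain(item_price, item_cost):
    """Rule-based domain: ItemPrice must be greater than ItemCost."""
    return item_price > item_cost
```

Each check answers one question only: is this value inside the domain assigned to the attribute?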
135
The use of schemes `depends in part on the database being built`, as some are suited to particular technologies ## Footnote dms
**Data Model Schemes**
136
**CDM, LDM, PDM**
* In a **CDM**, you can define data items and entity attributes. In an **LDM**, you can only define entity attributes. * In the **CDM**, the foreign attribute migration does not occur until you generate an **LDM** or **PDM**. * In the **LDM**, the foreign attribute migrates immediately. * Conceptual, logical, physical data models (**PDM**)
137
First articulated by **Dr. Edward Codd** in 1970, _______________ provides a `systematic way to organize data` so that they reflect their meaning (Codd, 1970). This approach had the additional effect of reducing redundancy in data storage. ## Footnote rt
**relational theory**
138
The concept of _________ started from a joint research project conducted by **General Mills** and **Dartmouth College** in the 1960s. In dimensional models, data is structured to `optimize the query and analysis of large amounts of data`. In contrast, operational systems that support transaction processing are optimized for fast processing of individual transactions. ## Footnote dm
**dimensional modeling**
139
**The three main types of change are sometimes known by ORC.**
* **Overwrite (Type 1)**: The `new value overwrites the old value` in place. * **New Row (Type 2)**: The `new values are written in a new row`, and the old row is marked as not current. * **New Column (Type 3)**: Multiple instances of a `value are listed in columns` on the same row, and a new value means writing the values in the series one spot down to make space at the front for the new value. The last value is discarded.
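The three change types can be sketched on plain dictionaries. The column names (`customer_key`, `city`, `is_current`, `city_current`, `city_previous`) are hypothetical, and the Type 3 sketch assumes a series of only two columns.

```python
def overwrite(row, new_city):
    """Type 1: the new value replaces the old one in place; history is lost."""
    return {**row, "city": new_city}


def new_row(rows, customer_key, new_city):
    """Type 2: mark the current row as not current and append a new row."""
    updated = [
        {**r, "is_current": False}
        if r["customer_key"] == customer_key and r["is_current"]
        else r
        for r in rows
    ]
    updated.append(
        {"customer_key": customer_key, "city": new_city, "is_current": True}
    )
    return updated


def new_column(row, new_city):
    """Type 3: shift values one spot down; the last value is discarded."""
    return {
        **row,
        "city_current": new_city,
        "city_previous": row["city_current"],
    }
```

Type 2 is the only one of the three that keeps the full history; Type 3 keeps as many prior values as there are columns in the series.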
140
is the term given to `normalizing the flat, single-table, dimensional structure` in a star schema into the respective component hierarchical or network structures.
**Snowflaking**
141
stands for the `meaning or description of a single row of data` in a fact table; this is the most detail any row will have. ## Footnote g
**grain**
142
are `built with the entire organization in mind` instead of just a particular project; this allows these dimensions to be shared across dimensional models, due to containing consistent terminology and values.
**Conformed dimensions**
143
use `standardized definitions of terms` across individual marts. Different business users `may use the same term in different ways`. ‘Customer additions’ may be different from ‘gross additions’ or ‘adjusted additions.’ ## Footnote cf
**Conformed facts**
144
* is a `graphical language` for modeling software. * has a `variety of notations` of which one (the class model) concerns databases. * class model specifies `classes` (entity types) and their `relationship` types (Blaha, 2013). ## Footnote uml
**Unified Modeling Language (UML)**
145
has Operations or Methods (also called its “behavior”). Class behavior is only loosely connected to business logic because it still needs to be sequenced and timed. In ER terms, the table has stored procedures/triggers. Class Operations can be: ## Footnote c
**class**
146
, a `family of conceptual modeling languages`, originated in the late 1970s. Fact-based languages view the world in terms of objects, the facts that relate or characterize those objects, and each role that each object plays in each fact. ## Footnote fbm
**Fact-Based Modeling**
147
`do not use attributes`, reducing the need for intuitive or expert judgment by expressing the exact relationships between objects (both entities and values). ## Footnote fbm
**Fact-based models**
148
is a `model-driven engineering approach` that starts with typical examples of required information or queries presented in any external formulation familiar to users, and then verbalizes these examples at the conceptual level, in terms of simple facts expressed in a controlled natural language. ## Footnote orm
**Object-Role Modeling (ORM)**
149
is `similar in notation and approach to ORM`. The numbers in Figure 43 are references to verbalizations of facts. ## Footnote fcom
**Fully Communication Oriented Modeling (FCO-IM)**
150
are used when data values must be associated in `chronological order and with specific time values`. ## Footnote tbp
**Time-based patterns**
151
is a `detail-oriented, time-based, and uniquely linked set of normalized tables` that support one or more functional areas of business. It is a hybrid approach, encompassing the best of breed between third normal form and star schema. Data Vaults are designed specifically to meet the needs of enterprise data warehouses. ## Footnote dv
**Data Vault**
152
is a `technique suited for information that changes over time in both structure and content`. It provides graphical notation used for conceptual modeling similar to traditional data modeling, with extensions for working with temporal data. ## Footnote am
**Anchor Modeling**
153
is a name for the category of `databases` built on `non-relational technology`. ## Footnote n
**NoSQL**
154
Instead of taking a business subject and breaking it up into multiple relational structures, document databases frequently store the business subject in one structure called a ___________. ## Footnote d
**document**
155
databases allow an `application to store its data in only two columns` (‘key’ and ‘value’), with the feature of storing both simple (e.g., dates, numbers, codes) and complex information (unformatted text, video, music, documents, photos) stored within the ‘value’ column. ## Footnote kv
**Key-value**
156
Out of the four types of NoSQL databases, _________________ is `closest to the RDBMS`. Both have a similar way of looking at data as rows and values. ## Footnote co
**column-oriented**
157
A ____________ database is designed `for data whose relations are well represented` as a set of nodes with an undetermined number of connections between these nodes. ## Footnote g
**graph**
158
: This `embodies the ‘real world’ view` of the enterprise being modeled in the database. It represents the current `‘best model’ or ‘way of doing business’` for the enterprise. ## Footnote c
**Conceptual**
159
: The various users of the database management system `operate on subsets` of the total enterprise model that are relevant to their particular needs. These subsets are represented as `‘external schemas’.` ## Footnote e
**External**
160
: The `‘machine view’ of the data` is described by the internal schema. This schema describes the stored representation of the enterprise’s information. ## Footnote i
**Internal**
161
This section provides an overview of conceptual, logical, and physical data modeling. ## Footnote dm
**Data Model**
162
A ____________ `captures the high-level data requirements` as a collection of related concepts. It contains only the basic and critical business entities within a given realm and function, with a description of each entity and the relationships between entities. ## Footnote cdm
**conceptual data model**
163
A _____________________ is a `detailed representation of data requirements,` usually in support of a specific usage context, such as application requirements. Logical data models are still independent of any technology or specific implementation constraints. * often begins as an `extension of a conceptual data model` ## Footnote ldm
**logical data model**
164
A is in many cases a `fully-attributed perspective` of the dimensional conceptual data model, as illustrated in Figure 49. ## Footnote dldm
**dimensional logical data model**
165
A ___________ represents a `detailed technical solution`, often using the logical data model as a starting point and then adapted to work within a set of hardware, software, and network tools. Physical data models are built for a particular technology. ## Footnote pdm
**physical data model (PDM)**
166
A variant of a physical scheme is a ____________, `used for data in motion between systems`. This model describes the `structure of data` being passed between systems as packets or messages. When sending data through web services, an Enterprise Service Bus (ESB), or through Enterprise Application Integration (EAI), the canonical model describes what data structure the sending service and any receiving services should use. ## Footnote cm
**Canonical Model**
167
is a `virtual table`. provide a means to look at data from one or many tables that contain or reference the actual attributes. A standard view runs SQL to retrieve data at the point when an attribute in the view is requested. An instantiated (often called ‘materialized’) view runs at a predetermined time. Views are used to simplify queries, control data access, and rename columns, without the redundancy and loss of referential integrity due to denormalization. ## Footnote v
**view**
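A minimal sketch of a standard view, using Python's built-in `sqlite3`. The `employee` table and `employee_directory` view are hypothetical. The view stores no data of its own: it hides the `salary` column, renames `full_name`, and reads the base table each time it is queried.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, full_name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Ana', 5000)")

# The view exposes a subset of columns and renames one of them.
conn.execute(
    "CREATE VIEW employee_directory AS "
    "SELECT employee_id, full_name AS name FROM employee"
)

before = conn.execute("SELECT * FROM employee_directory").fetchall()

# A row added to the base table appears in the view with no extra work,
# because a standard view runs its SQL when it is queried.
conn.execute("INSERT INTO employee VALUES (2, 'Ben', 4000)")
after = conn.execute(
    "SELECT name FROM employee_directory ORDER BY employee_id"
).fetchall()
```

SQLite has no built-in materialized views, so the instantiated variant described on the card is not shown here.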
168
refers to the `process of splitting a table`. It is performed to facilitate archiving and to improve retrieval performance ## Footnote p
**Partitioning**
169
**Vertically vs. Horizontally split**
* **Vertically split**: To reduce query sets, `create subset tables that contain subsets of columns`. For example, split a customer table in two based on whether the fields are mostly static or mostly volatile (to improve load / index performance), or based on whether the fields are commonly or uncommonly included in queries (to improve table scan performance). * **Horizontally split**: To reduce query sets, `create subset tables using the value of a column as the differentiator`. For example, create regional customer tables that contain only customers in a specific region.
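Both kinds of split can be sketched on an in-memory list of rows. The customer records and column names below are made up for illustration; a real database would partition physical tables rather than Python lists.

```python
customers = [
    {"id": 1, "name": "Ana", "region": "North", "last_login": "2024-01-05"},
    {"id": 2, "name": "Ben", "region": "South", "last_login": "2024-02-11"},
    {"id": 3, "name": "Cho", "region": "North", "last_login": "2024-03-02"},
]


def vertical_split(rows, columns):
    """Vertical split: keep every row but only a subset of the columns."""
    return [{column: row[column] for column in columns} for row in rows]


def horizontal_split(rows, column):
    """Horizontal split: group whole rows by the value of one column."""
    parts = {}
    for row in rows:
        parts.setdefault(row[column], []).append(row)
    return parts


static_part = vertical_split(customers, ["id", "name"])          # mostly static
volatile_part = vertical_split(customers, ["id", "last_login"])  # mostly volatile
by_region = horizontal_split(customers, "region")
```

Both vertical parts keep `id` so that the original row can be reassembled by joining on it.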
170
* is the deliberate `transformation of normalized logical data model entities into physical tables` with redundant or duplicate data structures. There are several reasons to denormalize data. * can also be used to `enforce user security` by segregating data into multiple views or copies of tables according to access needs. This process does introduce a risk of data errors due to duplication. ## Footnote d
**Denormalization**
171
In dimensional data modeling, is called **collapsing** or **combining**. If each dimension is `collapsed into a single structure`, the resulting data model is called a **Star Schema** (see Figure 51). If the `dimensions are not collapsed`, the resulting data model is called a **Snowflake** (See Figure 49). ## Footnote d
**denormalization**
172
is the `process of applying rules in order to organize business complexity into stable data structures`. The basic goal of normalization is to keep each attribute in only one place to eliminate redundancy and the inconsistencies that can result from redundancy. ## Footnote n
**Normalization**
173
: Ensures `each entity has a valid primary key`, and `every attribute depends on the primary key`; removes repeating groups, and ensures each attribute is atomic (not multi-valued). 1NF includes the resolution of many-to-many relationships with an additional entity often called an associative entity. ## Footnote fnf
**First normal form (1NF)**
174
: Ensures `each entity has the minimal primary key` and that every attribute `depends on the complete primary key.` ## Footnote snf
**Second normal form (2NF)**
175
: Ensures `each entity has no hidden primary keys` and that each attribute `depends on no attributes outside the key` (“the key, the whole key and nothing but the key”). ## Footnote tnf
**Third normal form (3NF)**
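A rough sketch of one third-normal-form fix, assuming a hypothetical flat order table. In the flat rows, `customer_city` depends on `customer_id` rather than on the key `order_id`, so the city is repeated for every order. Moving it to its own structure keeps each attribute in only one place.

```python
flat_orders = [
    {"order_id": 100, "customer_id": 1, "customer_city": "Cebu"},
    {"order_id": 101, "customer_id": 1, "customer_city": "Cebu"},
    {"order_id": 102, "customer_id": 2, "customer_city": "Davao"},
]


def normalize(rows):
    """Split out the attribute that depends on customer_id, not on order_id."""
    customers = {}
    orders = []
    for row in rows:
        # customer_city is stored once per customer instead of once per order.
        customers[row["customer_id"]] = {"customer_city": row["customer_city"]}
        orders.append(
            {"order_id": row["order_id"], "customer_id": row["customer_id"]}
        )
    return customers, orders


customers, orders = normalize(flat_orders)
```

After the split, changing a customer's city means updating one entry instead of every order row for that customer.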
176
: `Resolves overlapping composite candidate keys`. A candidate key is either a primary or an alternate key. ‘Composite’ means more than one (i.e., two or more attributes in an entity’s primary or alternate keys), and ‘overlapping’ means there are hidden business rules between the keys. ## Footnote b/cnf
**Boyce / Codd normal form (BCNF)**
177
: `Resolves all many-to-many-to-many relationships` (and beyond) in pairs until they cannot be broken down into any smaller pieces. ## Footnote fnf
**Fourth normal form (4NF)**
178
: `Resolves inter-entity dependencies` into basic pairs, and all join dependencies use parts of primary keys. ## Footnote fnf
**Fifth normal form (5NF)**
179
is the `removal of details` in such a way as to broaden applicability to a wide class of situations while preserving the important properties and essential nature from concepts or subjects. includes `generalization and specialization`. * **Generalization** groups the `common attributes and relationships` of entities into super type entities, while specialization separates distinguishing attributes within an entity into subtype entities. * This `specialization` is usually `based on attribute values` within an entity instance. ## Footnote a
**Abstraction**
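Generalization and specialization can be sketched with class inheritance. The `Party`, `Person`, and `Organization` names and the `party_type` attribute are hypothetical; a data model would express the same idea with supertype and subtype entities rather than classes.

```python
from dataclasses import dataclass


@dataclass
class Party:
    """Supertype: attributes common to every subtype (generalization)."""
    party_id: int
    name: str


@dataclass
class Person(Party):
    """Subtype: adds an attribute that only applies to people."""
    birth_date: str


@dataclass
class Organization(Party):
    """Subtype: adds an attribute that only applies to organizations."""
    tax_id: str


def specialize(record):
    """Choose the subtype from an attribute value in the instance."""
    common = {"party_id": record["party_id"], "name": record["name"]}
    if record["party_type"] == "person":
        return Person(**common, birth_date=record["birth_date"])
    return Organization(**common, tax_id=record["tax_id"])
```

The `party_type` value plays the role of the attribute that drives specialization.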
180
is the concept of `exposing only the required essential characteristics and behavior` with respect to a context. ## Footnote a
**Abstraction**