Data Modeling Flashcards
_________ is the process of discovering, analyzing, and scoping data
requirements, and then representing and communicating these data requirements in a precise form. This process is iterative and may include a conceptual, logical, and physical.
Data Modeling
What are the goals of data modeling?
To confirm and document an understanding of different perspectives, which leads to applications that more closely align with current and future business requirements, and creates a foundation to successfully complete broad-scoped initiatives such as master data management and data governance programs.
What are the business drivers for data modeling?
Provide a common vocabulary around data
Capture and document explicit knowledge about an organization’s data and systems
Serve as a primary communications tool during projects
Provide the starting point for customization, integration, or even replacement of an application
Data modeling is most frequently performed in the context of systems __________ and __________ efforts,
known as the system development lifecycle (SDLC).
development and maintenance
What is the main purpose of a data model?
To describe a business
Within data modeling, a(n) ______ is a thing about which an organization collects information. They are also sometimes referred
to as the nouns of an organization.
entity
A ___________ captures the high-level interactions between conceptual entities, the detailed interactions between logical entities, and the constraints between physical entities.
relationship
An __________ is a property that identifies, describes, or measures an entity. The physical correspondent of an ________ in an entity is a column, field, tag, or node in a table, view, document, graph, or file.
attribute
A _________ data model is a detailed representation of data requirements, usually in support of a specific usage context, such as application requirements. These types of data models are still independent of any technology or specific implementation constraints.
Logical
“Detailed, may lead to a physical design”
Logical Model captures detailed data requirements
— Analyze information requirements and existing documentation
— Add associative entities
— Add attributes
— Assign domains
— Assign keys
A _________ data model represents a detailed technical solution, often using the logical data model as a starting point and then adapted to work within a set of hardware, software, and network tools. These types of data models are built for a particular technology.
physical
“Optimized for specific technical environments”
Physical Model outlines how data will be stored in enterprise systems
— Resolve logical abstractions
— Add attributes
— Add reference data objects
— Assign surrogate keys
— Denormalize for performance
— Index for performance
— Partition for performance
— Create views
A __________ data model captures the high-level data requirements as a collection of related concepts. It contains only the basic and critical business entities within a given realm and function, with a description of each entity and the relationships between entities.
Conceptual
“Agree on basic concepts and rules”
Conceptual Model may start off looking like a glossary
— Conceptual Data Model components
— Concepts = entities
— Activities = relationships
— Incorporate enterprise terminology
— Build by connecting nouns (entities) and verbs (relationships)
____________ data model is a holistic, business-level, implementation-independent conceptual or logical data model providing a common consistent view of data across the enterprise. It is common to use the term to mean a high-level, simplified data model, but that is a question of abstraction for presentation.
Enterprise
“Big Picture”
A(n) _________ entity is one where the primary key contains only attributes that belong to that entity. Conversely, a________ entity is one where the primary key contains at least one attribute from another entity.
independent, dependent
In data modeling, a ________ is the complete set of possible values that an attribute can be assigned. It may be articulated in different ways (see points at the end of this section).
domain
examples
— Data type (e.g. Character(30) )
— Data format (e.g. phone number template)
— List
— Range
— Rule-based (e.g. ItemPrice > ItemCost)
This type of domain allows all values that are between a minimum and maximum value. For example, Order Delivery Date must be between Today’s Date and three months in the future.
Range
A(n) __________ relationship is one where the primary key of the parent (the entity on the one side of the relationship) is migrated as a foreign key to the child’s primary key.
identifying
“The primary key of the parent becomes part of the primary key of the child.”
Engineering Notation is a solid line.
In _____________ relationships, the primary key of the parent is migrated as a non-primary foreign key attribute to the child.
non-identifying
“The child does not depend on the parent to get its uniqueness”
Engineering Notation is a dotted line.
Which normal form ensures each entity has a valid primary key, and every attribute depends on the primary key; removes repeating groups, and ensures each attribute is atomic (not multi-valued)?
This form includes the resolution of many-to-many relationships with an additional entity often called an associative entity.
First normal form (1NF)
Which normal form ensures each entity has the minimal amount of attributes as the primary key and that every attribute depends on the complete primary key?
Second normal form (2NF)
Which normal form ensures each entity has no hidden primary keys and that each attribute depends on no attributes outside the key (“the key, the whole key and nothing but the key”).
Third normal form (3NF)
Reaching this form equates to the model being normalized.
In Chris Bradley Data Modeling video, the relationship between many to many entities represents what?
A real business concept.
An example was if you had many assets at many locations, those could be called “installations” with other properties \ attributes such as installation date etc. Another example would be many customers who have\get many products. This would be “orders”
_____________ is the removal of details in such a way as to broaden applicability to a wide class of situations while preserving the important properties and essential nature from concepts or subjects.
abstraction
An example of abstraction is the Party/Role structure, which can be used to capture how people and organizations play certain roles (e.g.,
employee and customer). Not all modelers or developers are comfortable with, or have the ability to work with abstraction. The modeler needs to weigh the cost of developing and maintaining an abstract structure versus the amount of rework required if the unabstracted structure needs to be modified in the future (Giles 2011).
A unary (also known as a recursive or self-referencing) relationship involves only one entity. A one-to-many recursive relationship describes a _______ , whereas a many-to-many relationship describes a ________ .
Which type has at most one parent (or higher-level entity).
hierarchy, network or graph
A _______ is a generic entity type that has a relationship with one or more subtypes. A _______ is a sub-grouping of the entities in an entity type that is meaningful to the organization and that shares common attributes or relationships distinct from other subgroups.
supertype, subtype
Example would be a person table that has first last name. The person table would be the super type. Example subtypes would be doctor, lawyer where we then gather and store additional info pertinent to those types.
A ________ key is the candidate key that is chosen to be the unique identifier for an entity. Even though an entity may contain more than one candidate key, only one candidate key can serve as the primary key for an entity. A(n) _______ key is a candidate key that although unique, was not chosen as the primary key.
primary, alternate