Chapter 4 Database Design Flashcards
What is the purpose of the Entity-Relationship Model (ERM) in database design?
ERM is a high-level representation of data requirements, ignoring implementation specifics.
In an airline reservation system, Passenger and Booking are _______.
Passenger and Booking are entities.
What would 1 & 2 represent?
- Holds is a relationship between Passenger and Booking.
- PassengerNumber, PassengerName, BookingCode, BookingCost are attributes.
What do entities represent in an Entity-Relationship Model?
Entities aren't the rows themselves; they define what kind of information those rows will hold.
What are relationships in the context of an Entity-Relationship Model?
Statements about two entities; can be reflexive (entity relates to itself).
What are attributes in an Entity-Relationship Model?
Descriptive properties of an entity.
In SQL implementation, what do entities typically become?
Entities typically become tables.
In SQL implementation, what do relationships become?
Relationships become foreign keys.
In SQL implementation, what do attributes become?
Attributes become columns.
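A minimal sketch of this mapping, using Python's built-in sqlite3 module and the airline entities from the earlier cards. The column types, the sample cost, and the choice to put the foreign key in Booking are assumptions made for illustration, not part of the source material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Entity Passenger -> table; attributes PassengerNumber and PassengerName -> columns.
conn.execute("""
    CREATE TABLE Passenger (
        PassengerNumber INTEGER PRIMARY KEY,
        PassengerName   TEXT NOT NULL
    )
""")

# Entity Booking -> table; relationship Passenger-Holds-Booking -> foreign key.
# Assumption: each booking is held by exactly one passenger.
conn.execute("""
    CREATE TABLE Booking (
        BookingCode     INTEGER PRIMARY KEY,
        BookingCost     REAL,
        PassengerNumber INTEGER NOT NULL REFERENCES Passenger (PassengerNumber)
    )
""")

# Instances become rows, foreign key values, and column values.
conn.execute("INSERT INTO Passenger VALUES (1, 'Muhammed Ali')")
conn.execute("INSERT INTO Booking VALUES (39240, 250.0, 1)")

rows = conn.execute("""
    SELECT PassengerName, BookingCode
    FROM Passenger JOIN Booking USING (PassengerNumber)
""").fetchall()
print(rows)
```

The join follows the foreign key back to the passenger, which is how the Holds relationship is read out of the tables.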
What is an ER diagram and what are its representations?
A visual representation showing:
- entities as rectangles
- relationships as lines
- attributes as additional text.
Consider the ER diagram. What would the entity-relationship-entity be read as?
What is a glossary document?
Glossary or data dictionary
Documents names, synonyms, and descriptions of entities, relationships, and attributes.
Define "Entity Type" in the context of ER modeling.
Set of all instances
(e.g., all employees).
Define "Relationship Type" in the context of ER modeling.
Set of related pairs
(e.g., Employee-Manages-Department).
Define Attribute Type in the context of ER modeling.
Set of values
(e.g., all employee salaries).
What is a supertype entity?
A general category that encompasses subtypes.
What is a subtype entity?
A specific subset of a supertype.
What does the "IsA" relationship refer to?
The relationship between a supertype and its subtypes.
A supertype may have _______ subtypes.
Multiple
In ER modeling, what notation is used to represent cardinality?
Crow's Foot Notation.
What does cardinality refer to in Entity-Relationship Modeling?
The maximum and minimum numbers of instances of one entity that can relate to a single instance of another entity.
What are the two maxima in a relationship?
One for each of the related entities, usually specified as either "one" or "many".
What are the typical options for relationship minima?
Zero (optional) or one (required).
What is a partition in the context of supertype and subtype entities?
A group of mutually exclusive subtypes within a supertype.
Why can't an entity instance be in two subtypes of the same partition?
By definition, the subtypes within a partition are mutually exclusive.
- A partition divides the instances of a supertype into distinct, non-overlapping subtypes.
- Within a given partition, an entity instance can belong to at most one subtype.
This ensures data integrity and avoids ambiguity.
How can an entity instance be in two subtypes of different partitions?
- Mutual exclusivity applies only among the subtypes within a single partition.
- Partitions are independent of one another, so an entity instance can belong to one subtype in each partition, and therefore to two subtypes that come from different partitions.
What are the main steps in the database design process?
Analysis, Logical Design, Physical Design.
What is the goal of the analysis phase in database design?
Often intertwined with conceptual logic
- Gather data requirements
- Develop the entity-relationship model
What does the logical design phase involve?
Converting ERM to tables, foreign keys, and columns.
What is the main purpose of the discovery phase in database design?
Identifying entities, relationships, and attributes through interviews with database users and managers.
What types of sources can be used to identify data requirements?
Interviews and written documents.
By convention, how should entities be named?
Singular nouns
(e.g., "Employee").
What format should relationships be named in?
Entity-Verb-Entity
(e.g., Division-Contains-Department).
What is a unique attribute?
An attribute where each value describes at most one entity instance, so no two instances of the entity share a value.
What is a singular attribute?
Each entity instance has at most one attribute value.
What is the relationship maximum for the Employee-Has-Telephone relationship if each employee can have at most three telephone numbers?
1-3
- A relationship has two maxima, one for each related entity, so the answer is a pair of maxima rather than a range.
- The 3 is the maximum on the Telephone side: an employee can have no more than three telephone numbers.
- The 1 is the maximum on the Employee side: each telephone number relates to at most one employee.
What is the relationship minimum for the Customer-Has-Identification relationship if two forms of identification may be required?
1-2
- 1 means at least one form of identification is always required.
- 2 means that in some cases, a second form of identification may also be necessary.
What is the main purpose of documenting cardinality in ER diagrams?
To reflect business rules.
What are the steps in the discovery process?
- The discovery process in design is not a rigid, step-by-step procedure.
- It's more of an iterative process where designers often move back and forth between different stages as they gather new information and insights.
In database design, when are standard attribute types determined?
ER diagram development.
Identify nouns as _______ in database design.
entities.
Recognize verbs as _______ in database design.
relationships.
Specify nouns as _______ in database design.
attributes.
What are the characteristics of good primary keys?
- Unique
- Not NULL
- Stable
- Simple
- Meaningless
What is the ideal primary key structure?
Single-column primary keys are ideal; a composite primary key can be used if no suitable single column is available.
A composite primary key uses multiple columns in a table to create a unique value for each row.
What is the primary key of a subtype table?
Matches the primary key of the supertype.
What do weak entities become in a database?
Weak tables.
How should foreign keys behave when changes are made to the primary key in the parent table?
Think of Customers and Orders tables. An order needs a customer, so Orders uses a foreign key to link to Customers.
- Cascade on delete: Deleting a customer also deletes their orders.
- Restrict on insert: You can't create an order for a non-existent customer.
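The two behaviors can be sketched with Python's built-in sqlite3 module. The column names, sample rows, and the added ON UPDATE CASCADE clause are assumptions for illustration; in SQLite, rejecting an insert that has no matching parent row is the default behavior once foreign key checking is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE Orders (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL
            REFERENCES Customers (CustomerID)
            ON UPDATE CASCADE
            ON DELETE CASCADE
    )
""")
conn.execute("INSERT INTO Customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO Orders VALUES (100, 1)")

# Restrict on insert: an order for a non-existent customer is rejected.
try:
    conn.execute("INSERT INTO Orders VALUES (101, 999)")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# Cascade on delete: deleting the customer also deletes their orders.
conn.execute("DELETE FROM Customers WHERE CustomerID = 1")
remaining_orders = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]

print(insert_rejected, remaining_orders)
```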
When implementing many-one relationships, where is the foreign key placed?
In the table on the many side of the relationship.
In a one-one relationship, where is the foreign key typically placed?
In the table with fewer rows.
How is a many-many relationship implemented?
Using a new weak table containing two foreign keys.
What is the primary key of the new table in a many-many relationship?
Composite of the two foreign keys.
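A minimal sketch of a many-many implementation, using Python's built-in sqlite3 module. Student, Course, and Enrollment are hypothetical names chosen for illustration; the point is the new table with two foreign keys whose combination forms the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, Name  TEXT NOT NULL);
    CREATE TABLE Course  (CourseID  INTEGER PRIMARY KEY, Title TEXT NOT NULL);

    -- New weak table: two foreign keys, composite primary key.
    CREATE TABLE Enrollment (
        StudentID INTEGER NOT NULL REFERENCES Student (StudentID) ON DELETE CASCADE,
        CourseID  INTEGER NOT NULL REFERENCES Course (CourseID) ON DELETE CASCADE,
        PRIMARY KEY (StudentID, CourseID)
    );

    INSERT INTO Student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO Course VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO Enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# The composite primary key rejects a repeated (student, course) pair.
try:
    conn.execute("INSERT INTO Enrollment VALUES (1, 10)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

courses_for_ana = [
    title
    for (title,) in conn.execute("""
        SELECT Title
        FROM Enrollment JOIN Course USING (CourseID)
        WHERE StudentID = 1
        ORDER BY Title
    """)
]
print(duplicate_rejected, courses_for_ana)
```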
What do plural attributes become in a database?
New weak tables
- A plural attribute moves into a new weak table that references the original table with a foreign key, so each attribute value gets its own row and data integrity is maintained.
What is the SQL data type for a Code attribute?
CHAR(3).
What are the types of attribute cardinality?
- Unique
- Required
- Optional
What SQL keyword enforces unique attributes?
UNIQUE.
What are the criteria for a table to be in First Normal Form (1NF)?
- Every cell contains exactly one value
- The table has a primary key
- No duplicate rows are allowed
What is the significance of functional dependence?
Reflects business rules.
What does Boyce-Codd Normal Form (BCNF) eliminate?
Redundancy arising from functional dependence.
What is a candidate key?
A simple or composite column that is unique and minimal.
What does normalization aim to achieve in database design?
Elimination of redundancy from database tables.
What is denormalization used for?
- Enhancing query performance by merging tables.
- Speeds up queries that would otherwise need a JOIN, because the merged table already holds the combined data.
What are the characteristics of denormalization?
- Reduces complexity
- Encourages redundancy
- Not common in frequently updated databases
What should be done when a column A depends on a non-unique column B?
- Remove column A from the original table and create a new table for A and B.
- This reduces redundancy and improves data integrity by ensuring updates are made in one place.
What is the formal definition of BCNF?
A table is in BCNF if, whenever column A depends on column B, then B is unique.
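The definition can be checked against sample rows with a short Python sketch. The helper names and the Employee data are hypothetical. Note that sample data can only refute a dependency; whether A really depends on B is a business rule, so a True result here is evidence, not proof.

```python
def depends_on(rows, a, b):
    """True if, in these rows, each value of column b relates to at most one value of column a."""
    seen = {}
    for row in rows:
        # Remember the first a value seen for this b value; any different one breaks the dependency.
        if seen.setdefault(row[b], row[a]) != row[a]:
            return False
    return True


def is_unique(rows, col):
    """True if no value of col appears in more than one row."""
    values = [row[col] for row in rows]
    return len(values) == len(set(values))


employees = [
    {"EmployeeID": 1, "DepartmentCode": "ENG", "DepartmentName": "Engineering"},
    {"EmployeeID": 2, "DepartmentCode": "ENG", "DepartmentName": "Engineering"},
    {"EmployeeID": 3, "DepartmentCode": "OPS", "DepartmentName": "Operations"},
]

# DepartmentName depends on DepartmentCode, but DepartmentCode is not unique,
# so this table is not in BCNF.
name_depends_on_code = depends_on(employees, "DepartmentName", "DepartmentCode")
code_is_unique = is_unique(employees, "DepartmentCode")
violates_bcnf = name_depends_on_code and not code_is_unique
print(violates_bcnf)
```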
What is the relationship between redundancy and functional dependence?
- Functional Dependency: One attribute determines the value of another.
- Redundancy: Storing the same information multiple times, often due to functional dependencies.
What is normalization?
The process of eliminating redundancy from database tables.
What is the goal of normalization?
To refine the design by transforming tables into higher normal forms.
What is Boyce-Codd normal form?
A higher normal form that aims to eliminate redundancy.
What happens to a table in first normal form during normalization?
A table in first normal form might be split into two (or more) organized tables during normalization.
What are unique columns in a table?
Define dependency in the context of database columns.
Column A depends on column B when each B value relates to at most one A value.
What must be true for a table to be in Boyce-Codd normal form?
If column A depends on column B, then B must be unique.
What is the first step in normalizing a table to Boyce-Codd normal form?
List all unique columns in the table.
What should be done if column A depends on a non-unique column B?
Column A should be removed from the original table and placed in a new table with B.
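A sketch of that split in plain Python, assuming a hypothetical Employee table in which DepartmentName (column A) depends on the non-unique DepartmentCode (column B). The function name and data are illustrative only.

```python
def decompose(rows, a, b):
    """Remove column a from the original rows and build a new table of distinct (b, a) pairs."""
    remaining = [{k: v for k, v in row.items() if k != a} for row in rows]
    new_table = []
    for row in rows:
        pair = {b: row[b], a: row[a]}
        if pair not in new_table:  # store each (b, a) pair once
            new_table.append(pair)
    return remaining, new_table


employees = [
    {"EmployeeID": 1, "DepartmentCode": "ENG", "DepartmentName": "Engineering"},
    {"EmployeeID": 2, "DepartmentCode": "ENG", "DepartmentName": "Engineering"},
    {"EmployeeID": 3, "DepartmentCode": "OPS", "DepartmentName": "Operations"},
]

employee_table, department_table = decompose(employees, "DepartmentName", "DepartmentCode")
print(employee_table)
print(department_table)
```

After the split, each department name is stored once, and DepartmentCode is unique in the new table.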
What does denormalization refer to?
The practice of merging tables and allowing some redundancy to enhance query performance.
When is denormalization typically used?
- In reporting databases where changes are infrequent.
- With the tables merged, a single query can retrieve all the needed information without a join.
- Infrequent here means the data is not changed often or regularly.
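A small sketch of the trade-off, using Python's built-in sqlite3 module. Customer, Purchase, and PurchaseReport are hypothetical tables: the first two are normalized, and PurchaseReport is a merged reporting table that repeats the customer name on every purchase row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (CustomerID INTEGER PRIMARY KEY, CustomerName TEXT NOT NULL);
    CREATE TABLE Purchase (
        PurchaseID INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customer (CustomerID),
        Amount     REAL NOT NULL
    );
    INSERT INTO Customer VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO Purchase VALUES (100, 1, 20.0), (101, 1, 35.5), (102, 2, 12.0);

    -- Denormalized reporting table: CustomerName is stored redundantly.
    CREATE TABLE PurchaseReport AS
        SELECT PurchaseID, CustomerName, Amount
        FROM Purchase JOIN Customer USING (CustomerID);
""")

# Normalized design: the report needs a join.
joined = conn.execute("""
    SELECT PurchaseID, CustomerName, Amount
    FROM Purchase JOIN Customer USING (CustomerID)
    ORDER BY PurchaseID
""").fetchall()

# Denormalized design: the same report is a single-table query.
merged = conn.execute("""
    SELECT PurchaseID, CustomerName, Amount
    FROM PurchaseReport
    ORDER BY PurchaseID
""").fetchall()

print(joined == merged)
```

The cost is that renaming a customer now means updating every matching row in PurchaseReport, which is why this suits data that changes infrequently.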
Why might a database designer choose to denormalize?
To reduce complexity and enhance query performance.
What are two characteristics of denormalization?
- It encourages redundancy in data storage.
- It speeds up queries that would otherwise need a JOIN.
What should be carefully considered before denormalizing a database?
The frequency of changes to the data.
What activities are involved in applying normal form?
Identifying dependencies, eliminating redundancy, and considering denormalization.
How do database designers assess tables for Boyce-Codd normal form?
By identifying dependencies and unique columns.
What is the relationship between normalization and performance?
- Normalization reduces redundancy
- Denormalization can enhance query performance.
What is the significance of E. F. Codd in relation to normalization?
He originally defined the relational model and the concept of normalization.
What kind of queries does denormalization help reduce?
Join queries.
What is the primary focus during the database design process?
Ensuring tables meet Boyce-Codd normal form characteristics.
What are conventions in data organization?
Conventions are rules for naming and organizing files, data sets, and other data.
Conventions help in standardizing how data is structured and accessed.
What does it mean for subtypes to be mutually exclusive?
A single entity can only belong to one subtype within a given supertype.
This ensures clarity in classification and prevents ambiguity.
What is the purpose of mutually exclusive subtypes?
To maintain data integrity and consistency.
This prevents overlap in how entities are classified within a supertype.
What is an entity type in entity-relationship modeling?
A set of unique individuals
Example: The entity type Passenger includes all passengers in the system.
What does the relationship type represent in entity-relationship modeling?
A set of related pairs
Example: The relationship type Passenger-Holds-Booking represents pairs of passengers and their associated bookings.
What is an attribute type?
A set of values related to an entity
Example: The attribute type BookingCode encompasses all possible booking codes.
Define entity instance
An individual member of the entity type
Example: Muhammed Ali is a specific entity instance representing a passenger.
What is a relationship instance?
A specific assertion about entities
Example: "Muhammed Ali holds 39240" describes the relationship between the passenger and their booking.
What is an attribute instance?
An individual value related to an attribute
Example: 39240 is a specific instance of the BookingCode attribute.
In entity-relationship modeling, what do entity, relationship, and attribute types correspond to in a database?
- Entity types: tables
- Relationship types: foreign keys
- Attribute types: columns
Their instances correspond to rows, foreign key values, and column values, respectively.
What is the first step in the analysis process?
Discover entities, relationships, and attributes
What is the second step in the analysis process?
Determine cardinality
What is the third step in the analysis process?
Distinguish strong and weak entities
What is the fourth step in the analysis process?
Create supertype and subtype entities
What is the fifth step in the logical design process?
Implement entities
What is the sixth step in the logical design process?
Implement relationships
What is the seventh step in the logical design process?
Implement attributes
What is the eighth step in the logical design process?
Apply normal form
What type of attribute is indicated by the phrase "at most one"?
Singular attribute
Indicates that there can be a maximum of one instance of this attribute per entity.
What type of attribute is indicated by the term "several"?
Plural attribute
Indicates that there can be multiple instances of this attribute per entity.
What does it mean when an attribute is described as optional?
- The attribute does not have to be provided for every record.
- This means that information may be left blank in the database
What type of attribute must every student have according to the text?
"All students enrolled in the school must have a unique student ID number."
Required attribute
An attribute that must be filled out for every entity.
What does the phrase "used to identify" signify about an attribute?
Unique attribute
Indicates that each instance of this attribute must be distinct across the database.
StudentNumber is a(n) _______ attribute of Student.
Unique
Each student is assigned a unique eight-digit number.
TelephoneNumber is a(n) _______ attribute of Student.
Optional
The telephone number is optional and may be left blank.
In database design, what does the cardinality of relationships depend on?
Business rules
This determines how entities relate to one another in terms of quantity.
MajorSubjectName is a(n) _______ attribute of Student.
plural
Indicates that students can major in multiple subjects.
What are normal forms in relational databases?
A set of rules and guidelines for structuring a relational database
What is the purpose of normal forms?
To address specific types of data redundancy and potential issues
List the most common normal forms.
- 1NF (First Normal Form)
- 2NF (Second Normal Form)
- 3NF (Third Normal Form)
- BCNF (Boyce-Codd Normal Form)
What is the first step in eliminating redundancy?
Identifying Redundancy by analyzing data to spot redundant information
What is decomposition in the context of normalization?
Breaking down tables into smaller, more focused tables to reduce redundancy
What does 1NF (First Normal Form) achieve?
Eliminates repeating groups of data within a single column
What is the goal of 2NF (Second Normal Form)?
Remove redundant data that depends on only part of the primary key
2NF aims to remove partial dependencies.
What does 3NF (Third Normal Form) eliminate?
Data that depends on non-key attributes (transitive dependencies)
What distinguishes BCNF (Boyce-Codd Normal Form) from 3NF?
BCNF is a more stringent version of 3NF, further reducing redundancy
To normalize a table called Customers, you might _______?
create a separate table that further divides the data
What are the benefits of normalization?
- Reduced Data Redundancy
- Improved Data Integrity
- Increased Flexibility
- Better Performance
While higher normal forms improve data integrity, they don't always mean better performance. Why?
- More tables & joins: This can slow down queries.
- Normalization overhead: Maintaining consistency across multiple tables takes extra work.
The best approach depends on balancing performance needs with data integrity.
Why is it important to consider trade-offs when aiming for higher normal forms?
Sometimes, slight redundancy might be acceptable to improve query performance
What does IMS stand for?
Information Management System
IMS is a hierarchical database management system used primarily on IBM mainframe systems.
What type of database is IDMS?
Network database management system
IDMS stands for Integrated Database Management System.
What is a key characteristic of a hierarchical database?
Data is organized in a tree-like structure
Hierarchical databases utilize parent-child relationships.
What type of database is Oracle Database?
Relational database management system
Oracle Database uses SQL for data manipulation.
What is Neo4j?
A graph database management system
Neo4j is designed to handle highly connected data.
What type of database is ObjectStore?
Object-oriented database management system
It allows for the storage of objects rather than just data.
What is MongoDB classified as?
Document database management system
MongoDB stores data in JSON-like format.
What are the primary components of a graph database?
Vertices represent entities, while edges represent relationships between them.
Define primary data structure.
The main format used to store data in a database
Examples include tables, trees, and linked lists.
What is a Binary tree an example of?
Tree data structure
A binary tree is a type of tree where each node has at most two children.
What is a linked list?
- A linear data structure where elements are stored in nodes
- Each node points to the next node in the sequence.
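A minimal singly linked list in Python, to make the node-and-pointer structure concrete. The class and function names are hypothetical.

```python
class Node:
    """One element of a singly linked list: a value plus a reference to the next node."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


def to_list(head):
    """Walk the chain from head, following each node's pointer until it runs out."""
    values = []
    node = head
    while node is not None:
        values.append(node.value)
        node = node.next_node
    return values


# Build a -> b -> c; the last node points to nothing.
head = Node("a", Node("b", Node("c")))
values = to_list(head)
print(values)
```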
What is XML?
- Extensible Markup Language
- XML is used to define rules for encoding documents in a format that is both human-readable and machine-readable.
What does JSON stand for?
JavaScript Object Notation
JSON is a lightweight format for data interchange that is easy for humans to read and write.
What is an advantage of document databases?
Flexible schema
Document databases allow for unstructured and semi-structured data.
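A flexible schema can be pictured with plain Python and the standard json module. This is not the API of MongoDB or any other document database; the two sample documents and their field names are made up to show that documents in the same collection need not share the same fields.

```python
import json

# Two documents in the same hypothetical collection, with different fields.
documents = [
    {"_id": 1, "name": "Ana", "email": "ana@example.com"},
    {"_id": 2, "name": "Ben", "phones": ["555-0100", "555-0101"]},
]

# Serialize each document to JSON text and read it back.
stored = [json.dumps(doc) for doc in documents]
loaded = [json.loads(text) for text in stored]

# Each document carries its own set of field names.
fields_per_document = [sorted(doc) for doc in loaded]
print(fields_per_document)
```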
What are the advantages of fast queries in a database?
Improved performance and user experience
Fast queries reduce wait times for data retrieval.
What do productivity and simplicity refer to in database management?
- Ease of use and efficiency in database operations
- Simplified database management enhances user productivity.
A flexible schema can adapt to _______?
- Evolving business requirements
- Flexible schemas are essential for accommodating changes in data needs.
What type of applications benefit from transactional databases?
Transactional applications
These applications require reliable and consistent data processing.
What visual representation difference is notable between Crow's Foot and Chen?
Lines in Crow's Foot, diamonds in Chen
This refers to how relationships are depicted in each model.
How does Crow's Foot symbolize cardinality?
Uses distinctive "crow's foot" symbols at the ends of relationship lines
This contrasts with Chen's method of using numbers and letters.
How does Chen represent cardinality in its diagrams?
Uses numbers and letters near the relationship diamond
This provides a different approach than Crow's Foot.
Where are attributes placed in Crow's Foot diagrams?
Sometimes inside the entity rectangle
This can lead to a cleaner look compared to Chen.
Where does Chen place attributes in its diagrams?
Always keeps them separate from the entity
This is a consistent feature of Chen's model.
What does it mean that each project has at most one code?
The ProjectCode attribute is singular, meaning each entity instance has at most one attribute value.
"At most one" value per entity instance: singular
What does it mean that each project code describes at most one project?
The ProjectCode attribute is unique, meaning each attribute value describes at most one entity instance.
"At most one" entity instance per value: unique
A project may have no code. The ProjectCode attribute is _______?
- Optional means an entity instance may have no attribute value.
- This is the case for ProjectCode, so ProjectCode is optional.
"May": optional
A project may have no code. Is ProjectCode an identifying attribute of the Project entity?
- No. ProjectCode is optional, not required, and therefore not an identifying attribute of Project.
- An identifying attribute is unique, singular, and required.
The foreign key implements the ________ identifying relationship.
IsA
Foreign keys that implement identifying relationships usually have the following referential integrity actions: __________ on primary key update and delete, and __________ on foreign key insert and update.
- Cascade on primary key update and delete
- Restrict on foreign key insert and update
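A sketch of an IsA implementation with these actions, using Python's built-in sqlite3 module. Employee as the supertype and Manager as the subtype are hypothetical choices for illustration. The subtype table's primary key is also the foreign key to the supertype table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Employee (
        EmployeeID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );

    -- Subtype table: its primary key matches the supertype's primary key.
    CREATE TABLE Manager (
        EmployeeID INTEGER PRIMARY KEY
            REFERENCES Employee (EmployeeID)
            ON UPDATE CASCADE
            ON DELETE CASCADE,
        Bonus      REAL
    );

    INSERT INTO Employee VALUES (1, 'Ana');
    INSERT INTO Manager VALUES (1, 500.0);
""")

# Restrict on foreign key insert: a manager row needs an existing employee row.
try:
    conn.execute("INSERT INTO Manager VALUES (2, 100.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Cascade on primary key delete: removing the employee removes the subtype row.
conn.execute("DELETE FROM Employee WHERE EmployeeID = 1")
managers_left = conn.execute("SELECT COUNT(*) FROM Manager").fetchone()[0]
print(orphan_rejected, managers_left)
```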
What is Block Storage?
- Data is stored in fixed-size blocks, typically with direct attached storage or over a Storage Area Network (SAN)
- Block storage offers low latency and high performance.
- Suitable for applications requiring fast access to data, such as databases and transactional systems.
What is File Storage?
Data is stored as files organized within a hierarchical structure of folders
- Accessed over a network using protocols like NFS or SMB.
- Well-suited for file sharing, general-purpose storage, and applications needing file locking capabilities.
What is Object Storage?
Data is stored as objects, each containing the data, metadata, and a unique identifier
- Accessed over a network using HTTP or REST APIs.
- Highly scalable and ideal for storing large amounts of unstructured data like images, videos, and backups.
List common use cases for Block Storage.
- Databases
- Virtual machines
- Transactional applications
List common use cases for File Storage.
- File sharing
- Web servers
- Content management systems
List common use cases for Object Storage.
- Cloud storage
- Backup and archiving
- Big data analytics
- Media storage
Which storage type is widely used for general-purpose storage and file sharing?
File Storage
Used in both on-premises and cloud environments.
Which storage type is increasingly popular for cloud-native applications?
Object Storage
Due to its scalability and cost-effectiveness.
Which storage type remains essential for applications requiring high performance and low latency?
Block Storage
Particularly in enterprise and database environments.
What is Hybrid Storage?
A combination of storage types to meet diverse needs
Many organizations utilize hybrid storage solutions.
What do cloud providers offer in terms of storage services?
Various storage services, including block, file, and object storage, with different performance and cost tiers.
What does the UNIQUE keyword enforce?
Unique attributes and relationships have maximum cardinality of one.
This ensures that no two records can have the same value for the unique attribute.
What does the NOT NULL keyword enforce?
Required attributes and relationships have minimum cardinality of one.
This means that a value must be present in the column.
What does the PRIMARY KEY keyword specify?
A primary key column must be unique and required
This corresponds to maximum and minimum cardinality of one.
What does minimum cardinality of zero indicate?
NULL values are allowed
This means that columns can contain no value when no keyword is specified.
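These keywords can be tried out with Python's built-in sqlite3 module. The Student table follows the attributes used elsewhere in this deck, with a hypothetical Email column added to show UNIQUE on its own; the column types and the helper function name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Student (
        StudentNumber   INTEGER PRIMARY KEY,  -- unique and required
        Email           TEXT UNIQUE,          -- maximum cardinality of one
        StudentName     TEXT NOT NULL,        -- minimum cardinality of one
        TelephoneNumber TEXT                  -- no keyword: NULL is allowed
    )
""")


def try_insert(values):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?)", values)
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert((1, "ana@example.com", "Ana", None))         # NULL telephone is fine
duplicate_email = try_insert((2, "ana@example.com", "Ben", None))  # violates UNIQUE
missing_name = try_insert((3, "cy@example.com", None, None))       # violates NOT NULL
duplicate_key = try_insert((1, "dee@example.com", "Dee", None))    # violates PRIMARY KEY
print(accepted, duplicate_email, missing_name, duplicate_key)
```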
What does a maximum cardinality of one mean?
Unique attributes and relationships have a maximum cardinality of one, which the UNIQUE keyword enforces.
For a relationship, each instance can relate to at most one instance of the other entity.
What is redundancy?
- Redundancy is the repetition of related values in a table.
- When related values are updated, all copies must be changed, which makes update queries slow and complex.
- If copies are not updated uniformly, the copies become inconsistent and the correct version is uncertain.
What does Fourth Normal Form eliminate?
Multivalued dependencies and associated redundancy
A multivalued dependency occurs when two independent plural attributes are implemented in the same table.
What must be done to achieve Fourth Normal Form?
Independent plural attributes must be implemented in separate tables.
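A small Python sketch of why the split helps, assuming one hypothetical employee with two independent plural attributes, skills and languages. Storing both in one table forces every combination to appear; separate tables store each value once, and joining them recreates the combined rows.

```python
from itertools import product

skills = ["SQL", "Python", "Go"]
languages = ["English", "Spanish"]

# One table holding both plural attributes: every skill pairs with every language.
combined = [("E1", skill, language) for skill, language in product(skills, languages)]

# Fourth normal form: each independent plural attribute gets its own table.
employee_skill = [("E1", skill) for skill in skills]
employee_language = [("E1", language) for language in languages]

# Joining the two smaller tables on the employee recreates the combined table.
rejoined = {
    (emp, skill, language)
    for (emp, skill) in employee_skill
    for (emp2, language) in employee_language
    if emp == emp2
}

print(len(combined), len(employee_skill) + len(employee_language))
```

The gap widens as the lists grow, since the combined table grows with the product of the two counts and the separate tables grow with their sum.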
What does Boyce-Codd Normal Form eliminate?
All redundancy arising from functional dependencies
Redundancy occurs when a column depends on a non-unique column.
What is the least restrictive normal form?
First Normal Form
What does First Normal Form allow?
The most types of redundancy
What does Fifth Normal Form eliminate?
Join dependencies and associated redundancy
A join dependency occurs when a table is the join of other tables with fewer columns.
What must be done to achieve Fifth Normal Form?
The larger table must be replaced by the smaller tables
What are the most general types of dependencies?
Join dependencies, which include functional and multivalued dependencies.
Which join dependencies are less common?
Join dependencies that are not functional dependencies are less common.