Lesson 2 Flashcards
Data modeling
The process of creating a specific data model for a determined problem domain
Problem domain
A clearly defined area within the real-world environment, with a well-defined scope and boundaries that will be systematically addressed
Data model
- A representation, usually graphic, of a complex “real-world” data structure.
- Data models are used in the database design phase of the Database Life Cycle
Entity
A person, place, thing, concept, or event for which data can be stored
Attribute
- A characteristic of an entity or object.
- An attribute has a name and a data type.
One-to-many (1:M or 1…*) relationship
- Associations among two or more entities that are used by data models.
- In a 1:M relationship, one entity instance is associated with many instances of the related entity.
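A minimal sketch of a 1:M relationship using Python's built-in sqlite3 module. The painter and painting tables, their columns, and the sample rows are hypothetical names chosen for illustration: one painter row can be referenced by many painting rows, while each painting row points to exactly one painter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# The "one" side of the relationship
conn.execute("CREATE TABLE painter (painter_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# The "many" side carries a foreign key back to the "one" side
conn.execute("""
    CREATE TABLE painting (
        painting_id INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        painter_id  INTEGER NOT NULL REFERENCES painter(painter_id)
    )
""")

conn.execute("INSERT INTO painter VALUES (1, 'Ana')")
conn.executemany(
    "INSERT INTO painting VALUES (?, ?, ?)",
    [(10, "Dawn", 1), (11, "Dusk", 1)],
)

# One painter instance is associated with many painting instances
painting_count = conn.execute(
    "SELECT COUNT(*) FROM painting WHERE painter_id = 1"
).fetchone()[0]
print(painting_count)
```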
Many-to-many (M:N or *…*) relationship
Association among two or more entities in which:
- one occurrence of an entity is associated with many occurrences of a related entity &
- one occurrence of the related entity is associated with many occurrences of the first entity.
One-to-one (1:1 or 1…1) relationship
1:1 means only two entities are involved, and each instance in one entity is related to only one instance in the other
Constraint
A restriction placed on data, usually expressed in the form of rules
- Each record in the first table is associated with many records in the second table.
- But each record in the second table is associated with only one record in the first table.
What type of relationship is it?
One-to-many (1:M) relationship
A single record in the first table is related to only one record in the second table, and vice versa.
What type of relationship is it?
One-to-one (1:1) relationship
Each record in the first table is associated with many records in the second table, and each record in the second table is associated with many records in the first table.
What type of relationship is it?
Many-to-many (M:N) relationship
Business rule
A description of a policy, procedure, or principle within an organization.
For example, a pilot cannot be on duty for more than 10 hours during a 24-hour period.
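A business rule such as the pilot duty limit can often be turned into a constraint that the database enforces. The sketch below uses Python's built-in sqlite3 module and a hypothetical duty_log table. It makes a simplifying assumption that each row records one pilot's total duty hours for one calendar day, so it does not model a rolling 24-hour window.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per pilot per day (a simplification of the rule)
conn.execute("""
    CREATE TABLE duty_log (
        pilot_id      INTEGER NOT NULL,
        duty_date     TEXT    NOT NULL,
        hours_on_duty REAL    NOT NULL CHECK (hours_on_duty <= 10)
    )
""")

# A row that satisfies the rule is accepted
conn.execute("INSERT INTO duty_log VALUES (1, '2024-01-01', 8.5)")

# A row that violates the rule is rejected by the CHECK constraint
try:
    conn.execute("INSERT INTO duty_log VALUES (1, '2024-01-02', 11)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```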
Hierarchical model
An early database model whose basic concepts and characteristics formed the basis for subsequent database development
Segment
In the hierarchical data model, the equivalent of a file system’s record type
Network model
An early data model that represented data as a collection of record types in 1:M relationships.
Schema
A logical grouping of database objects, such as tables, indexes, views, and queries that are related to each other.
Subschema
The portion of the database that interacts with application programs.
Data manipulation language (DML)
The set of commands that allows an end user to manipulate the data in the database, such as SELECT, INSERT, UPDATE, DELETE, COMMIT, and ROLLBACK.
Data definition language (DDL)
The language that allows a database administrator to define the database structure, schema, and subschema
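The difference between DDL and DML can be sketched with Python's built-in sqlite3 module. The customer table and its columns are hypothetical. CREATE TABLE is a DDL statement, while the remaining statements are the DML commands listed on the DML card.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the database structure
conn.execute("CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT NOT NULL)")

# DML: manipulate the data stored in that structure
conn.execute("INSERT INTO customer VALUES (1, 'Ramas')")
conn.execute("UPDATE customer SET cus_name = 'Ramas A.' WHERE cus_id = 1")
conn.commit()    # COMMIT makes the INSERT and UPDATE permanent

conn.execute("DELETE FROM customer WHERE cus_id = 1")
conn.rollback()  # ROLLBACK undoes the uncommitted DELETE

rows = conn.execute("SELECT cus_id, cus_name FROM customer").fetchall()
print(rows)
```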
Relational database
a collection of relations that contain the data describing a particular business environment.
Relational model
- Each relation (table) is conceptually represented as a two-dimensional structure of intersecting rows and columns.
- The relations are related to each other through the sharing of common entity characteristics (values in columns).
Relation
A logical construct perceived to be a two-dimensional structure composed of intersecting rows (entities) and columns (attributes) that represents an entity set in the relational model.
Tables are sometimes called ______
Relations
Tuple
In the relational model, a table row.
Relational database management system (RDBMS)
- A collection of programs that manages a relational database.
- The RDBMS software translates a user’s logical requests (queries) into commands that physically locate and retrieve the requested data
Relational diagram
A graphical representation:
- A relational database’s entities
- The attributes within those entities (fields)
- & the relationships among the entities.
Entity relationship (ER) Model
- A data model that describes relationships (1:1, 1:M, and M:N) among entities at the conceptual level with the help of ER diagrams.
- The model was developed by Peter Chen.
Entity relationship diagram (ERD)
A diagram that depicts an entity relationship model’s entities, attributes, and relations.
Entity instance (entity occurrence)
A row in a relational table
Entity set
A collection of like entities
Connectivity
The type of relationship between entities. Classifications include 1:1, 1:M, and M:N.
What are the 3 types of ER notations?
- Chen notation
- Crow’s foot notation
- class diagram notation
Crow’s Foot notation
A representation of the entity relationship diagram that uses a three-pronged symbol to represent the “many” sides of the relationship.
In relational database tables, a _____ describes a row and an ______ describes a column of that table.
Tuple; Attribute
OODM
- Object-oriented data model
- A data model whose basic modeling structure is an object
Object
An abstract representation of a real-world entity that has a unique identity, embedded properties, and the ability to interact with other objects and itself
Object-oriented database management system
- OODBMS
- Data management software used to manage data in an object-oriented database model.
Semantic data model
- The first of a series of data models that more closely represented the real world.
- Models both data and their relationships in a single structure known as an object.
Class
- A collection of similar objects with shared structure (attributes) and behavior (methods).
- A class encapsulates an object’s data representation and a method’s implementation.
Classes are organized in a class hierarchy.
Method
- In the object-oriented data model (OODM), a named set of instructions to perform an action.
- Methods represent real-world actions and are invoked through messages.
Class hierarchy
The organization of classes in a hierarchical tree.
- Each parent class is a superclass
- Each child class is a subclass.
Inheritance
In the object-oriented data model, the ability of an object to inherit the data structure and methods of the classes above it in the class hierarchy.
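The class, method, class hierarchy, and inheritance cards can be illustrated with a small Python sketch. Person and Employee are hypothetical classes. Python classes are used here only as an analogy for the object-oriented data model, not as an OODBMS.

```python
class Person:
    """Superclass (parent class) with shared structure and behavior."""

    def __init__(self, name):
        self.name = name              # attribute

    def describe(self):               # method
        return f"{self.name} is a person"


class Employee(Person):
    """Subclass (child class) that inherits attributes and methods from Person."""

    def __init__(self, name, salary):
        super().__init__(name)        # reuse the superclass's data structure
        self.salary = salary

    def raise_salary(self, percent):
        # Integer arithmetic keeps the example free of rounding surprises
        self.salary += self.salary * percent // 100


employee = Employee("Lee", 50000)
inherited_description = employee.describe()  # method inherited from Person
employee.raise_salary(10)
print(inherited_description, employee.salary)
```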
UML
- Unified Modeling Language
- A language based on object-oriented concepts that provides tools such as diagrams and symbols to graphically model a system.
Class diagram
A diagram used to represent data and their relationships in UML object notation.
_____ is a highly distributed, fault-tolerant file storage system designed to manage large amounts of data at high speeds.
HDFS (Hadoop Distributed File System)
ERDM
Extended Relational Data Model
- A model that includes the object-oriented model’s best features within an inherently simpler relational database structural environment.
Object/Relational DBMS (O/R DBMS)
- A DBMS based on the extended relational model (ERDM). The ERDM, championed by many relational database researchers, constitutes the relational model’s response to the OODM.
- This model includes many of the object-oriented model’s best features within an inherently simpler relational database structure.
Big Data
- A movement to find new and better ways to manage large amounts of web-generated data and derive business insight from it.
- Simultaneously providing high performance and scalability at a reasonable cost.
What are the 3 V’s in big data databases?
- Volume
- Velocity
- Variety
Hadoop
- A Java-based, open-source, high-speed, fault-tolerant distributed storage and computational framework.
- It uses low-cost hardware to create clusters of thousands of computer nodes to store and process data.
HDFS
- Hadoop Distributed File System
- A highly distributed, fault-tolerant file storage system designed to manage large amounts of data at high speeds.
What are the three types of nodes that HDFS uses?
- Name node
- Data node
- Client node
Name node
Stores all the metadata about the file system
Data node
The data node stores fixed-size data blocks
Client node
Acts as the interface between the user application and the HDFS
MapReduce
- An open-source application programming interface (API) that provides fast data analytics services.
- It allows organizations to process massive data stores.
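MapReduce jobs are commonly described as a map step that emits key-value pairs followed by a reduce step that summarizes the values for each key. The sketch below imitates that idea in plain Python on a single machine. It does not use the Hadoop API, and the function names and sample documents are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into one summary value."""
    return {key: sum(values) for key, values in groups.items()}


documents = ["big data big insight", "data at scale"]
mapped = [pair for doc in documents for pair in map_phase(doc)]
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

In a real cluster the map and reduce work would be spread across many nodes. Here everything runs in one process to keep the idea visible.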
NoSQL
A new generation of database management systems that is not based on the traditional relational database model.
Key-value (Data model)
A data model composed of two data elements:
- A key and a value.
- Every key has a corresponding value or set of values.
Also known as the associative or attribute-value data model.
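A Python dictionary is a convenient stand-in for the key-value idea: each key maps to a value or a set of values. The put and get helpers and the sample keys below are hypothetical, and a real key-value store would add persistence and distribution.

```python
store = {}


def put(key, value):
    """Associate a value (or a collection of values) with a key."""
    store[key] = value


def get(key, default=None):
    """Look up a value by key; return default when the key is absent."""
    return store.get(key, default)


put("cust:1001", {"name": "Ramas", "city": "Nashville"})  # one structured value
put("cust:1001:orders", [5001, 5002])                      # a set of values

customer_name = get("cust:1001")["name"]
order_ids = get("cust:1001:orders")
missing = get("cust:9999")
print(customer_name, order_ids, missing)
```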
Sparse data
A case in which:
- The number of table attributes is very large.
- However the number of actual data instances is low.
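A rough sketch of why sparse data suits a key-value style of storage: only the attributes that actually have values are stored, instead of a wide row that is mostly empty. The attribute names, row identifiers, and the count of 1,000 possible attributes are hypothetical.

```python
# 1,000 possible attributes, but each row uses only a handful of them
ALL_ATTRIBUTES = [f"attr_{i}" for i in range(1000)]

# Store only the attributes that are present for each row
rows = {
    "item1": {"attr_3": "red", "attr_42": 7},
    "item2": {"attr_999": True},
}


def get_value(row_id, attribute):
    """Return the stored value, or None when the attribute is not set."""
    return rows.get(row_id, {}).get(attribute)


stored_values = sum(len(values) for values in rows.values())
possible_values = len(rows) * len(ALL_ATTRIBUTES)
fill_ratio = stored_values / possible_values
print(stored_values, possible_values, fill_ratio)
```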
Eventual consistency
A model for database consistency in which updates to the database will propagate through the system so that all data copies will be consistent eventually
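Eventual consistency can be pictured with a toy simulation in which a write is applied to one copy immediately and reaches the other copies later. The Replica class, the pending queue, and the "stock" key are hypothetical, and real systems use far more involved replication protocols.

```python
from collections import deque


class Replica:
    """One copy of the data."""

    def __init__(self):
        self.data = {}


replicas = [Replica(), Replica(), Replica()]
pending = deque()  # updates that have not yet reached every copy


def write(key, value):
    """Apply the update to the first replica now and queue it for the rest."""
    replicas[0].data[key] = value
    for replica in replicas[1:]:
        pending.append((replica, key, value))


def propagate_one():
    """Deliver one queued update to its target replica."""
    replica, key, value = pending.popleft()
    replica.data[key] = value


write("stock", 5)
stale_read = replicas[2].data.get("stock")  # this copy has not seen the update yet

while pending:
    propagate_one()

converged = all(replica.data.get("stock") == 5 for replica in replicas)
print(stale_read, converged)
```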
In Chen notation, how must entities and relationships be oriented?
- In Chen notation, entities and relationships can be oriented either horizontally or vertically.
- There is no strict rule requiring a specific orientation.
A(n) _______ is anything about which data are to be collected and stored.
entity
Even when a good database blueprint is available, how should it be implemented?
A good database blueprint acts as a common language for everyone who interacts with the data, ensuring alignment and efficiency.
In the context of data models, an entity is _?
An entity is a person, place, thing, or event about which data will be collected and stored.
What is a disadvantage of the hierarchical data model?
It does not have standards.
In object-oriented terms, a(n) _____ defines an object’s behavior
Method
The object-oriented data model was developed in the _______ . (Which decade?)
1980s
A(n) _______ enables a database administrator to describe schema components.
Data definition language (DDL)
A data model is usually graphical (T/F)
True
Each row in a relation is called a ____
Tuple
Which of the following types of HDFS nodes stores all the metadata about a file system?
Name node
A _____ is a collection of similar objects with a shared structure and behavior.
Class
A _____ defines the environment in which data can be managed and is used to work with the data in the database.
Data manipulation language (DML)
The hierarchical data model was developed in the _______.
1960s-1970s
In a SQL-based relational database, why are tables not dependent on every other table?
- Relational databases use relationships, not dependencies.
- Tables are linked by shared columns (keys) to combine data, but they don’t depend on each other to exist.
Business rules apply to not only businesses, but what else?
- Businesses & government groups
- Religious groups
- Research laboratories.
From a database point of view, the collection of data becomes meaningful only when it reflects properly defined _______
Business rules
Today, most relational database products can be classified _?
As object/relational
_______ are important because they help to ensure data integrity.
Constraints
Students and classes have a _______ relationship.
Many-to-many relationship
In essence, database designers may be experts in data management and optimization, but still rely on _ ?
- Collaboration (Key)
- Efficiency and Performance (their focus)
- Business Analysts
A _____ ____ is a brief, precise, and unambiguous description of a policy, procedure, or principle within a specific organization.
Business Rule
In _____ a three pronged symbol represents the “many” side of the relationship.
Crow’s Foot notation
In the _______ model, each parent can have many children, but each child has only one parent.
Hierarchical
An implementation-ready data model should contain __ ?
A description of the data structure that will store the end-user data.
Why must business rules be updated?
Updates reflect any change in the organization’s operational environment.
Each column in a relation represents a _____
Attribute
A verb associating two nouns in a business rule translates to a(n) _______ in the data model.
Relationship
A(n) _______ is bidirectional.
Relationship
The _______ model was developed to allow designers to use a graphical tool to examine structures rather than describing them with text.
Entity Relationship
A(n) _______ is a restriction placed on the data.
Constraint
Each row in the relational table is known as __?
An entity instance or entity occurrence
A ______ is a relatively simple representation of more complex real-world data structures.
Data Model
A(n) _______ represents a particular type of object in the real world.
Entity
Oracle 12c is an example of the _______.
XML/Hybrid data model
The relational data model was developed in the _______.
1970s
Within the database environment, what does a Data model do?
Represents data structures with the purpose of supporting a specific problem domain.
A _____ in a hierarchical model is the equivalent of a record in a file system
Segment
The relational model is hardware-dependent and software-independent. (T/F)
False
The relational model’s foundation is a mathematical concept known as a _____
Relation
A ______ is the conceptual organization of an entire database as viewed by a database administrator.
Schema
The _______ data model is said to be a semantic data model.
Object-oriented
The network model has __ level dependence.
structural level dependence
Each row in the relational table is known as a(n) ______
Entity instance
In the ____ model, the user perceives the database as a collection of records in 1:M relationships, where each record can have more than one parent.
Network
A noun in a business rule translates to a(n) _______ in the data model.
Entity
Which of the following types of HDFS nodes acts as the interface between the user application and the HDFS?
Client node
The _______ data model uses the concept of inheritance.
Object-oriented
In an SQL-based relational database, how are rows related?
rows in different tables are related based on common values in common attributes
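Relating rows through common values in a common attribute can be sketched as a join, using Python's built-in sqlite3 module. The agent and customer tables, their columns, and the sample rows are hypothetical. The shared agent_code column is what links a customer row to an agent row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agent (agent_code INTEGER PRIMARY KEY, agent_name TEXT);
    CREATE TABLE customer (
        cus_code   INTEGER PRIMARY KEY,
        cus_name   TEXT,
        agent_code INTEGER            -- common attribute shared with agent
    );
    INSERT INTO agent VALUES (501, 'Alby'), (502, 'Hahn');
    INSERT INTO customer VALUES (10010, 'Ramas', 502), (10011, 'Dunne', 501);
""")

# Rows are matched wherever the agent_code values are equal
pairs = conn.execute("""
    SELECT c.cus_name, a.agent_name
    FROM customer AS c
    JOIN agent AS a ON a.agent_code = c.agent_code
    ORDER BY c.cus_code
""").fetchall()
print(pairs)
```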
The hierarchical model is software-independent. (T/F)
False
What is true about NoSQL databases?
- not based on the relational model and SQL
- support distributed database architectures.
- provide high scalability, high availability, and fault tolerance.
- support very large amounts of sparse data.
- are geared toward performance rather than transaction consistency.
Why are M:N relationships not appropriate in a relational model?
M:N relationships aren’t implemented directly in relational databases because they cause data redundancy and integrity issues and make queries complex.
Instead, we use associative (bridge) entities to break the M:N relationship into two 1:M relationships.
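One common way to represent an M:N relationship in a relational database is a bridge (associative) table that sits between the two entities, giving two 1:M relationships. The sketch below uses Python's built-in sqlite3 module. The student, class, and enroll tables and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (stu_id INTEGER PRIMARY KEY, stu_name TEXT NOT NULL);
    CREATE TABLE class (class_id INTEGER PRIMARY KEY, class_title TEXT NOT NULL);

    -- Bridge table: each row links one student to one class
    CREATE TABLE enroll (
        stu_id   INTEGER NOT NULL REFERENCES student(stu_id),
        class_id INTEGER NOT NULL REFERENCES class(class_id),
        PRIMARY KEY (stu_id, class_id)
    );

    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO class VALUES (100, 'Databases'), (200, 'Networks');
    INSERT INTO enroll VALUES (1, 100), (1, 200), (2, 100);
""")

# One student is associated with many classes...
classes_for_ana = [row[0] for row in conn.execute("""
    SELECT c.class_title
    FROM enroll AS e JOIN class AS c ON c.class_id = e.class_id
    WHERE e.stu_id = 1
    ORDER BY c.class_title
""")]

# ...and one class is associated with many students
students_in_databases = [row[0] for row in conn.execute("""
    SELECT s.stu_name
    FROM enroll AS e JOIN student AS s ON s.stu_id = e.stu_id
    WHERE e.class_id = 100
    ORDER BY s.stu_name
""")]
print(classes_for_ana, students_in_databases)
```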
In the _______ model, the basic logical structure is represented as an upside-down tree.
Hierarchical
Why must an implementation-ready data model contain enforceable rules? (3 points)
- An implementation-ready data model must contain enforceable rules to guarantee data integrity.
- These rules ensure data accuracy, consistency, and completeness.
- Without them, the data is vulnerable to errors and inconsistencies, making it unreliable.
The _______ model uses the term connectivity to label the relationship types.
entity relationship
A(n) _______ is the equivalent of a field in a file system
attribute
MySQL is an example of the _______.
Relational data model
A(n) ______’s main function is to help one understand the complexities of the real world environment.
Model
VMS/VSAM is an example of the _______.
File system data model
What kind of tool are business rules considered?
They can serve as a communication tool between the users and designers.
A disadvantage of the relational database management system (RDBMS) is its inability to hide the complexities of the relational model from the user. (T/F)
False
NoSQL databases provide ______ tolerance
Fault
_______ are normally expressed in the form of rules.
Constraints