Main Flashcards
What is the Entity-Relationship Model?
The entity-relationship (ER) data model uses a collection of basic objects, called entities, and relationships among these objects.
An entity is a “thing” or “object” in the real world that is distinguishable from other objects. The entity-relationship model is widely used in database design.
What are the languages of Databases?
- Data-definition language (Specify database schema)
- Data-manipulation language (Database queries)
In practice: both are in the same database language, e.g. SQL
What is a Database System?
A Database Management System and a Database
What is the 3-layer schema architecture?
What is a conceptual schema
The conceptual schema defines the ontology (the formal naming and definition of the types, properties, and interrelationships of the entities) of the concepts as the users think of them and talk about them.
What is an external schema?
The external schema defines the view of the data presented to the application programs.
What is an internal schema?
Describes the internal formats of the data stored in the database.
What is Physical data independence?
Changes regarding file structure and access paths (physical layer) have no influence on the conceptual schema (logical layer).
What are some examples of changes on the physical layer?
- New hard disk added
- New processor added
- Files are split into multiple files
What is Logical data independence?
Changes on the logical layer have no infuence on external schemas and applications.
What are some examples of changes on the logical layer?
- Adding another attribute to the conceptual schema
- Changing the name of an attribute on the conceptual schema
What is a Mini-world?
Some part of the real world about which information is stored
What are data/information?
Known facts about the mini-world that can be recorded and have an implicit meaning
What is a database (DB)?
A collection of related data
What are Database Management Systems (DBMS)?
A software package to facilitate the creation and maintenance of a database
Examples: MySQL, PostgreSQL, Microsoft SQL Server, Oracle, etc.
What is a database instance?
The content of a DB at a particular time
Name some of the problems that can occur when managing data without a DBS
- Data can get lost.
- Query evaluation might be very slow, especially for large amounts of data.
- Access rights are hard to control.
- Extending the data with additional attributes is difficult.
- Inconsistent data might be stored.
- Concurrent access to the data cannot be handled efficiently.
Definition of a relation
Assume D1, D2,…,Dn are domains
R ⊆ D1 x … x Dn
- Example: telephoneBook ⊆ string x string x integer
Relational schema
Defines the structure of stored data
- Is denoted as: sch(R) or R
- Notation: R(A1 : D1, A2 : D2, …) with Ai denoting attributes
In what order are tuples in a relation stored?
Tuples in a relation are unordered.
These relations have the same information content:
In what order are attributes in a touple/relation stored?
In accordance with the mathematical definition of tuples, attributes in a tuple/relation are ordered.
These relations have different information content:
Values in a tuple
- Values in a tuple are atomic (indivisible)
- A value cannot be a structure, a record, a collection type or a relation
Null values
A special null value is used to represent values that are unknown or inapplicable to certain tuples.
When are duplicates not allowed?
A relation adheres to the mathematical definition of a set.
No two tuples in a relation may have identical values for all attributes
What does union-compatible mean?
Two relations are union-compatible if:
- They have the same number of attributes (they have the same arity)
- The domain of each attribute in column order is the same in both relations
(the i’th attribute of both R and S have the same domain)
Relational algebra operator: π(pi)
Projection: The result is a relation of n columns obtained by removing the columns that are not specified.
Relational algebra operator: σ(sigma)
Selection: Selects/filters rows of a table according to the selection predicate
- Notation: σF
- Selection predicate F consists of:
- Logic operators: \/(or), /(and), ¬(not)
- Arithmetic comparison operators: <, ≤, =, >, ≥, ≠
- Attribute names of the argument relations or constants as operands
Relational algebra operator: ρ(rho)
Rename:
- Renaming relation R to S: ρS(R)
- Renaming attribute B to A: ρA←B(R)
Relational algebra operator: X
Cartesian product (aka. cross product)
The cartesian product (R x S) between relations R and S consists of all possible combinations (|R| * |S| pairs) of tuples from both relations.
Attributes in the result are referenced as R.A or S.A to resolve ambiguity.
Relational algebra operator: U
Union: The result of a union (R U S) between two relations R and S contains all tuples from both relations without duplicates.
R and S have to be union-compatible.
Relational algebra operator: —
Difference: The difference (R — S or R \ S) of two relations R and S removes all tuples from the first relation that are also contained in the second relation.
R and S have to be union-compatible.
Relational algebra operator: ∩
Intersection: The intersection (R ∩ S) of two relations R and S consists of a set of tuples that occur in both relations.
R and S have to be union-compatible.
Relational algebra operator: ⋈
Natural join: The natural join combines two relations via common attributes (same names and domains) by combining only tuples with the same values for common attributes.
Relational algebra operator: ⟕
Left outer join: keep dangling tuples in the left operand relation
Relational algebra operator: ⟖
Right outer join: keep danging tuples in the right operand relation
Relational algebra operator: ⟗
(Full) outer join: keep dangling tuples of both operand relations
Relational algebra operator: ⋉ and ⋊
Left semi join: ⋉ Right semi join: ⋊
Find all tuples in a relation for which there are matching tuples in the other relation.
Relational algebra operator: γ (gamma)
Grouping: tuples with the same attribute values (for a specific list of attributes) are grouped. An aggregate function is applied to each group (computing one value for each group).
Typical aggregate functions:
- count - number of tuples in a group
- sum - sum of attribute values in a group
- min, max, avg - minimum, maximum and average of attribute values in a group
Relational algebra operator: ÷
Division
Example: Find all wines that have been reviewed by both Parker and Clarke
θ-join (Theta join)
To combine tuples from two relations where the combination condition is not simply the equality of shared attributes then it is convenient to have a more general form of join operator, which is the θ-join.
You can use any binary operator in the set {<, ≤, =, >, ≥} in between the two variables,
Example: price < myMoney or price < 800
When is a variable free in domain relational calculus?
A variable is free if:
- It is not quantified by a universal quantifier
- It is not quantified by an extential quantifier
Universal quantifier
A universal quantification is a type of quantifier, a logical constant, which is interpreted as (∀) or (⇒).
Existential quantifier
An existential quantification is a type of quantifier, a logical constant, which is interpreted as “there exists” (∃)
Universal quantifier: ⇒
P ⇒ Q means “P implies Q”, i.e., “if P is true, then Q must be true”
Domain relational calculus: When is a query safe?
When it returns only a finite number of tuples
What is the expressiveness of relational algebra, tuple relational calculus, and domain relational calculus?
All three languages are equal regarding expressiveness (relationally complete)
How is an attribute marked as a primary key?
By having a line under it
ER model: Entities and entity types
Entities are objects of the real world about which we want to store information
Entities are grouped into entity types (represented as a square).
ER model: Entity set
An entity set is a set of entities of the same type (e.g., all persons having an account at a bank).
ER model: Attributes
Attributes model characteristics of entities or relationships.
- All entities of an entity type have the same characteristics
- Attributes are declared for entity types
- Attributes have a domain or value set
ER model: Single-valued vs multi-valued attributes
Ex: A person might have multiple phone numbers (or a single one)
ER model: Simple attributes vs composite attributes
Ex: An address can be modeled as a string or composed of street and city
ER model: Stored attributes vs derived attributes
Ex: Birthday and age.
Age is computed several times using birthday.
Functional dependency
- A functional dependency is a constraint between two sets of attributes in a relation from a database.
- In other words, functional dependency is a constraint that describes the relationship between attributes in a relation.
- Given a relation R, a set of attributes X in R is said to functionally determine another set of attributes Y, also in R, (written X → Y) if, and only if, each X value in R is associated with precisely one Y value in R; R is then said to satisfy the functional dependency X → Y.
ER model: Primary keys
Primary key attributes are marked by underlining.
Super key
A set of attributes which can uniquely identify an entity. Example:
- Finding the super key of a Person(name, address, licencePlateNumber, phoneNumber)
- We consider that the same phoneNumber can appear in several countries so phone number is not enough to uniquely identify a person
- But the key: (address, phoneNumber, name) is a super key
Remember that super keys are not necessarily minimal.
Both: (address, phoneNumber) and (address, phoneNumber, name) are super keys
Candidate key
A canditate key is a minimal super key (if you remove any attribute from the super key, then you cannot uniquely identify the entity)
One of the candidate keys for an entity is chosen as the primary key.
ER model: Relationships and relationship types
Relationships describe connections between entities.
Relationships between entities are grouped into relationship types.
ER model: Relationship instance and relationship set
An association between two or more entities is called relationship (instance). A relationship set is a collection of relationship instances.
ER model: Role names
Role names are optional and used to characterize a relationship type.
Especially useful for recursive relationship types, i.e., an entity type is participating multiple times in a relationship type.
ER model: Descriptive attributes
Relationship types can also have (descriptive) attributes.
Characteristics of relationship types: Degree
- Number of participating entity types
- Mostly: binary
- Rarely: ternary
- In general: n-ary or n-way (multiway relationship types)
Characteristics of relationship types:
- Cardinality ratio
- Cardinality limits
- Participation constraint
- Cardinality ratio (Chen notation): 1:1, 1:N, N:M
- Cardinality limits ([min,max] notation): [min,max]
- Participation constraint: partial or total
Total participation constraint
Each entity of an entity type must participate in a relationship, i.e., it cannot exist without any participation.
- Double line means that it has to participate
- Example:
- Wine has to be produced
- A producer does not have to produce
Partial participation constraint
Each entity of an entity type can participate in a relationship, i.e., it can exist without any participation.
- Single line means partial
ER model: Weak entity types
A weak entity is an entity that cannot be uniquely identified by its attributes alone; therefore, it must use a foreign key in conjunction with its attributes to create a primary key. The foreign key is typically a primary key of an entity it is related to.
- The weak entity type’s key attributes are marked by underlining with a dashed line (partial key, discriminator).
- The weak entity is marked by a double line around it
- Total participation of the weak entity type.
- Only in combination with 1:N (N:1) (or rarely also 1:1) relationship types
- The strong entity type is always on the ‘1’-side
ER model: ISA relationship type
Specialization and generalization is expressed by the isa relationship type (inheritance).
Mapping relationship types to relations: M:N relationship type
- New relation with relationship type’s attributes
- Add attributes referencing the primary keys of the involved entity type relations
- Primary key: set of foreign keys
Mapping relationship types to relations: 1:N relationship type
- Add information to the entity type relation of the “N”-side:
- Add foreign key referencing the primary key of the “1”-side entity type relation
- Add attributes of the relationship type
Mapping relationship types to relations: 1:1 relationship type
- Add information to one of the involved entity type relations:
- Add foreign key referencing the primary key of the other entity type relation
- Add attributes of the relationship type
Mapping relationship types to relations: N-ary relationship type
All participating entity types are mapped according to the standard rules.
Mapping ERD to relations: How do we handle multi-valued attributes?
Create a separate relation for each multi-valued attribute.
Example:
person(PID, name)
phoneNumber: (PID –> person, number)