Unit 3 - Relational Model Flashcards
Data model
Collection of high-level data description constructs (e.g., tables and links) that hide many low-level storage details
Relational model
Data is in table(s) (aka RELATIONS, not relationships)
What do rows and columns of relation represent?
Rows are tuples or individual records and columns are attributes (or fields)
Can two relations have the same name?
No, name of relation must be distinct
Can two attributes have the same name if they’re in different relations?
Yes.
In the same relation, each attribute must have a distinct name (we cannot have two or more attributes with the same name)
However, two attributes in different relations can have same name.
Can a cell in a relation have a composite value?
No - each cells contains exactly one atomic (single) value
Can you have different types of values in a column?
No - values of an attribute are all from the same domain
Can you have two duplicate tuples?
No - no duplicate tuples.
What order must the attributes be in?
None. Order of attributes has no importance.
What order must the tuples be in?
None. Tuples can appear in any order (but order may impact efficiency)
What’s a schema?
Header of table: relation’s name, name of each field/attribute, domain of each field
e.g., Students (sid : string , name : string , age : integer , gpa : real)
From ER to relations - Step 1: Translate Entity Sets to Relations Steps
Step 1: each atomic attribute of an entity set becomes an attribute (field) in the relation schema; entity set name = relation name; underline primary key
How do you translate a composite attribute in an ER diagram to schema?
Concatenate the root attribute + composite attribute (order of concatenation doesn’t matter) and that becomes the attribute/field
e.g., let’s say Name is composite and its atomic attributes are First and Last. The employee relation is,
Employee (SSN : integer, First_Name : string, Last_Name : string)
What do derived attributes look like in a relational schema?
They are NOT included
What if there’s a multi-value attribute in the ER? (e.g., multiple email addresses associated with an employee)
You should break that table into two tables:
1. original table minus the email column, and
2. each email address + identifier
When we create more tables from an existing table, what becomes the primary key in the new tables?
Usually it’s a concatenated key (e.g., Skillset + SSN), but remember one of the rows is primary key from the previous table (so it’s a FOREIGN key)
Don’t forget to add an arrow from foreign key to primary key!
What are the steps to translate entity sets to relations (tables)?
Step 1: place all single-value attributes in the main relation
Step 2: for each multi-value attribute, create a new relation; determine the composite primary key in new relation (multi value attribute is partial key + primary key from prev table is foreign key)
Step 3: Add an arrow from foreign key (of the new relation) to primary key (of the main relation)
How to translate relationship set to relation?
Create schemas, then add the primary key of one of the schemas as a foreign key to the other schema
How to translate M:N relationship set to relation?
Create a third schema/table that represents the relationship set - its composite primary key will be the primary keys of the two schemas.
Example: you have entity set S, relationship set ‘makes’ and weak entity set T.
What is the primary key for weak entity set T?
It’s a composite primary key:
Primary key S + primary key of T
^ this is foreign key of T (from S)
What are the three main problems of data redundancy?
Insertion, Update, Deletion
^ Anomalies
What is Insertion Anomaly with data redundancy?
For all 1000 records, the same data must be input correctly 1000 times.
What is Update Anomaly with data redundancy?
If one field changes, and there are 1000 records, must update it 1000 times.
If we miss one, data inconsistency.
What is Deletion Anomaly with data redundancy?
Let’s say each record has info on the student and the department.
If the last student is deleted, we lose all stored info about the department!
What is a non-normalized table?
A table with data redunancy
What is normalization?
Techniques/ rules that divide the table into sub-tables (to reduce redunancy)
What is denormalization?
Redundant data is ADDED to improve performance
Tables are COMBINED
First Normal Form (1NF) - what should a table have in order to be in the 1NF?
GET RID OF ARRAYS
Each cell should only hold single (atomic) attributes
Columns must have same domain
All columns should have unique names
Second Normal Form (2NF) - what should a table have in order to be in 2NF?
Must be in 1NF
Should NOT have partial dependency
What is partial dependency (2NF)?
Primary key MUST be a single attribute.
If composite primary key, then each non-key attribute must be FULLY dependent on the ENTIRE primary key and not a subset of the primary key.
Third Normal Form (3NF) - what should a table have in order to be in 3NF?
Must be in 2NF
It doesn’t have transitive dependency
What is transitive dependency (3NF)?
When a non-key attribute is functionally dependent on another non-key attribute
How to solve transitive dependency?
Transitive dependency is when a non-key attribute is functionally dependent on another non-key attribute.
Put both those attributes in a separate table. (can keep one in previous table too)
Do you want to reduce data redundancy (i.e., do you want to normalize a table)?
It depends.
Sometimes, having data redundancy (and denormalizing tables) is better
You gain query performance - if you normalize, you’d have to use ‘join’ operator which is costly