L5/6: Database Design Theory Flashcards
What is
Design Theory
about?
How to represent your data
in such a way that you
avoid Anomalies
caused by data redundancy, null values, etc
What are
Armstrongs 4 Rules
used for?
Discovering Functional Dependencies
from Functional Dependencies that are already known
Armstrong’s Rules
(names only)
- Reduction (trivial)
- Transitivity
- Augmentation
- Split/Combine
Armstrong’s Rules:
Reduction
Given a set of attributes: X, Y, Z
If Y is a subset of X (Y is in X),
Then X is Functionally Dependent on Y
X → Y
This is kind of trivial.
Armstrong’s Rules:
Transitivity
Given a set of attributes: X, Y, Z
If X is functionally dependent on Y (X → Y),
and Y is functionally dependent on Z ( Y → Z),
then X is functionally dependent on Z:
X → Y → Z, therefore X → Z
Armstrong’s Rules:
Augmentation
Given a set of attributes: X, Y, Z
If X is functionally dependent on Y (X → Y),
then XZ is functionally dependent on YZ ( XZ → YZ):
X → Y, therefore XZ → YZ for any Z
Armstrong’s Rules:
Split/Combine
Given a set of attributes: X, Y, Z
Combine:
If X is functionally dependent on Y (X → Y),
and X is functionally dependent on Z ( X → Z),
then X is functionally dependent on YZ:
X → Y , and X→Z, therefore X → YZ
Split:
If X is functionally dependent on YZ,(X → YZ),
then X is functionally dependent on Y ( X → Y)
and X is functionally dependent on Z ( X → Z),
X → YZ therefore: X→Y and X → Z
Good Functional Depenedencies
vs
Bad Functional Depenedencies
Good Functional Depenedency:
- Minimal Redundancy
- Less possibility of anomalies
- Example: EmployeeID → Name, Phone
Bad Functional Depenedency:
- A lot of redundancy
- May tie unrelated things together, allowing data anomalies
- Example: Position → PhoneNumber
Functional Dependencies
as Constraints
- A Functional Dependency will hold on some instances, and will not hold on others
- If designed as part of the schema, a FD can help define a valid instance
- Can check if a FD is violated by checking a single instance
- But cannot prove that a FD exists by checking a single instance
Using Functional Dependencies
for Database Schema Design:
General Steps
- Start with some Relational Schema
- Determine it’s Functional Dependencies (FDs)
- Use the FDs to design a better schema
- One which minimizes the possibility of anomalies
Definition:
Closure of a
Set of Attributes
Given a set of attributes: {A1, …, An }
and a set of FDs, F
The Closure: {A1, …, An }+
is the set of attributes B such that
{A1, …, An } → B
So, the closure is the set of attributes dependent on attributes of A
Algorithm for
Computing Attribute Closure
closure = x;
repeat until there is no change:
if there is an FD: U → V in F
such that U is a subset of the Closure,
then set closure = closure UNION V
Translation:
If an FD implies another attribute that is not in the closure, add it to the closure
Types of Anomalies
- Update Anomaly
- Deletion Anomaly
- Insertion Anomaly
- Redundancy
Anomaly Types:
Update Anomaly
Updating one tuple causes inconsistent data
Anomaly Types:
Deletion Anomaly
Deleting one kind of information makes you lose another
Anomaly Types:
Insertion Anomaly
Storing one kind of info depends on storing another
Anomaly Types:
Redundancy
Storage of multiple instances
of redundant information
What is a
Functional Dependency (FD) ?
- An Integrity Constraint
- not explicity expressed in the E/R diagram
- Occurs when one attribute in a relation uniquely determines another attribute
- Two attributes always occur in the same pair
- Example: classroom and course
- A → B
- If A, then B
Superkeys
and
Keys
Superkey
a set of attributes {A1, …, An} such that
for any other attribute B in a relation R,
{A1, …, An } → B
- All attributes are functionally dependent by a superkey
- A Key is a minimal superkey,
- no subset of the key is a superkey
Steps for finding
Keys and Superkeys
For each set of attributes X:
- Compute X+ (The Closure)
- If X+ = the set of all attributes, then X is a Superkey
- If X is minimal, then X is a Key
Functional Dependency:
Formal Definition
Let A, B be sets of attributes
We write A → B, or say “A functionally determines B”
If, for any tuples t1 and t2 :
t1[A] = t2[A] implies t1[B] = t2[B]
We call A → B a Functional Dependency
First Normal Form
(1NF)
A Relation is in First Normal Form (1NF) if:
Every field contains only Atomic Values
(No lists or sets)
This is an implicit requirement in the Relational Model.
Pretty simple: One column, one value
Second Normal Form (2NF)
A Relation R is in Second Normal Form (2NF) if:
Every non-key attribute in R is has Full Funtional Dependency on the Candidate Key of R
A non-2NF schema can be decomposed to 2NF
Full Functional Dependency
vs
Partial Functional Dependency
X → Y is a:
Full Functional Dependency
if removing attribute A from X makes X → Y not hold
Partial Functional Dependency
if after removing attribute A from X,
X → Y still holds