FDs Flashcards
How to identify “bad” schemas and improve them
- express constraints on the data: functional dependencies
2. use FDs to decompose the relation
Normalization
obtains a schema in the “normal form” that guarantees certain properties
Functional dependencies (FDs)
form of constraint that generalizes the concept of keys
A functionally determines B
if 2 tuples agree on the attributes A = A1, A2…An then they must agree on the attributes B = B1, B2, …Bm
A,B -> C,D can be split how?
A,B -> C
A,B -> D
Attributes on right are independently determined by A,B
A,B -> C,D
= A-> C,D and B->C,D?
cannot split this way
Trivial FDs
X->A
the attribute A belongs to the attribute set X
examples of trivial FDs
A -> A
A, B, C -> C
FD is domain knowledge
- inherent property of the application and data
- not something we can infer from a set of tuples
Given a table with a set of tuples, can we confirm FD is valid?
No, can only confirm FD seems valid or infer FD is invalid, can’t prove FD is valid
Why FDs?
- keys are special cases of FDs
- more integrity constraints
- help detect if schema has redundancies
Armstrongs Axioms
Reflexivity, Augmentation, Transitivity
Reflexivity
For any subset X in {A1..An}
A1, A2, …An -> X
examples reflexivity
A,B -> B
A,B,C -> A,B
A,B,C -> A,B,C
Augmentation
For any attribute sets X,Y,Z:
if X->Y, then X,Z -> Y,Z
examples Augmentation
A -> B implies A,C -> B,C
A,B -> C implies A,B,C -> C
Transitivity
For any attribute sets X, Y, Z: if X-> Y and Y -> Z then X -> Z
examples transitivity
A-> B and B-> C then A-> C
A->C,D and C,D -> then A->E
FD closure
the closure F+ is the set of all FDs logically implied by F
Armstrong’s axioms are:
sound
complete
Sound
any FD generated by an axiom belongs in F+
Complete
repeated application of the axioms will generate all FDs in F+
Attribute Closure
If X is an attribute set, the closure X+ is the set of all attributes B s.t. X -> B
X+ includes all attributes that are functionally determined from X
Closure Algorithm
X = {A1, A2, …An}
Until X doesn’t change repeat: if B1,…Bm -> C is an FD and B1…Bm are all in X, then add C to X
Why is closure needed?
Does X-> Y hold?
We can check if Y is in X+
To compute the closer F+ of FDs
for each subset of attributes X, compute X+, for each subset of attributes Y in X+, ouput FD X->Y
superkey
set of attributes A1, A2,…An s.t. for any other attribute B in the relation:
A1, A2,…An -> B
key
minimal super key
none of its subsets functionally determines all attributes of the relation
Compute X+ for all sets of attributes X. How do we know if X is a key or superkey?
if X+ = all attributes, X is a superkey
if no subset of X is a superkey, X is a key
Is it possible to have many keys in a relation R?
Yes. R(A,B,C) with FDs
A,B -> C
A,C -> B
S is a minimal basis for a set F of FDs if:
S+ = F+
every FD in S has one attribute on the RHS
if we remove a FD from S, the closure is not F+
if we remove 1+ attributes on LHS from any FD in S, closure is not F+