Functional Dependencies Flashcards
How does E-R and Functional Dependencies relate?
What are Functional Dependencies?
In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation.
Explain Assertions on a schema.
Explain Assertions on a domain.
What are FDs really?
What are FDs really?
• Functional dependencies represent a special kind of
constraints of a domain – dependency constraints.
• The database we design should properly capture all
constraints of the domain.
• We can use FDs to verify that our design indeed
captures the constraints we expect, and add more
constraints to the design when needed.
What’s so functional about FDs?
What’s so functional?
• X → A is a (deterministic) function from X
to A. Given values for the attributes in the
set X, we get the value of A.
• Example:
– code → name
– imagine a deterministic function f(code)
which returns the name associated with a
given code.
What is the difference between dependence and reference?
Explain Multiple Attributes on R/LHS
What’s the difference between the LHS of a FD, and a key?
FD: code, period → teacher ?
Given values for a code and a period, starting from any relation where
they appear, is it possible to reach more than one teacher value by following
keys and references?
Explain Trivial FD:s.
Is this a trivial FD?
course, period → course, name
course, period → course, name
Shorthand for
course, period → course (trivial)
course, period → name (not trivial)
Explain Inferring FD:s.
Explain closure of attribute set X.
How do you compute closure?
How to find all the implied FDs?
Simplifying trick to find F+?
Explain how to find keys. What is a super key? Minimal super key?
Schedules(code, name, period, #students,
teacher, room, #seats, weekday, hour)
Are these superkeys?
X = {code, period, weekday, hour}
Y = {code, period, weekday}
Explain Primary Keys.
Primary keys
• There can be more than one key for the
same relation.
• We choose one of them to be the primary
key, which is the key that we actually use
for the relation.
• Other keys could be asserted through
uniqueness constraints.
– E.g. for the self-referencing relation
Syntax for determining primary key for scenario:
Rooms(name, #seats)
NextTo(right, left)
Where do FDs come from?
How NOT to find FDs and why not!
Why Normalizing Databases?
Normalization of Database
Database Normalization is a technique of organizing the data in the database. Normalization is a systematic approach of decomposing tables to eliminate data redundancy(repetition) and undesirable characteristics like Insertion, Update and Deletion Anamolies. It is a multi-step process that puts data into tabular form, removing duplicated data from the relation tables.
Normalization is used for mainly two purposes,
Eliminating reduntant(useless) data.
Ensuring data dependencies make sense i.e data is logically stored.
How can FD:s be used to detect anomalies?
What is the problem and how to fix it?
Explain Boyce-Codd Normal Form.
Explain BCFN violations using the example below:
Explan the BCFN normalization algorithm.
What does Recovery mean?
Explain Lossy joins using this example.
Explain Lossless join.
Lossless join
• Only if we decompose on proper
dependencies can we guarantee that no
facts are lost.
– Schemas from proper translation of correct
E-R diagrams get this ”for free”.
– The BCNF decomposition algorithm
guarantees lossless join.
• A decompositon that does not give
lossless join is bad.
Why not use BCNF decomposition for designing database schemas?
Why go via E-R diagrams?
Why use FDs and decomposition at all?
Why not just go via E-R diagrams?
Explain Equality Constraints
Equality constraints
• FDs don’t always give the full story.
• Equality constraints over circular
relationship paths are relatively common.
– Can sometimes – but not always – be
captured via extra references.
– Extra attributes may be needed – more on
that later…
What is a problem with BCNF?
Explain the Third Normal Form (3NF).
Explain the algorithm of 3NF.
Problem with 3NF?
3NF vs BCNF
What is X+?
X+ = Closure of X = all derivable (from X) attributes
What is F+?
F+ = Closure of F = all implied (from F) FDs
Short description of BCNF?
Boyce-Codd Normal Form (BCNF):
– The LHS (X) of every non-trivial FD (X → A) must be a superkey
Explain Decompostition.
Decomposition:
– Split up relations until normal form (e.g. BCNF) holds
– Make sure to preserve recovery!!! No lossy joins allowed
Why Normal Forms?
Normal Forms?!
• Use normal forms to detect anomalies (e.g. Redundancy)
• Use decomposition to remove anomalies
Scale of Normal Forms?
Howcome?
Explain Independencies (INDs).
Independent how?
If two tuples have the same value for X, different
values for Y and different values for the Z attributes,
then there must also exist tuples where the values
of Y are exchanged, otherwise Y and Z are not
independent! (Visualize)
Explain.
Explain Joining.
Are FD:s Independencies?
What are the possible combinations of attibutes?
Are IND rules equal FD rules?
Problems with IND?
Explain Fourth Normal Form (4NF) holistically.
Fourth Normal Form (4NF)
• The redundancy that comes from IND’s is not removable by putting the databaseschema in BCNF.
• There is a stronger normal form, called 4NF, that (intuitively) treats IND’s as FD’s when it comes to decomposition, but not when determining keys of the relation.
Explain 4NF in detail.
BCNF vs 4NF.
INDs for validation?