Normalization Flashcards
4 Rules of Normalization
1) No redundancy of facts
2) No Cluttering of facts
3) Must preserve information
4) Must preserve functional dependencies
Problems with Updating Non Normalized Relations
If updating a property, any rows with duplicates will have to be updated for consistency, and that causes problems if it’s missed.
Problems with Deletion from Non Normalized Relations
Deleting a row could potentially affect an application expecting data back from some other part of the relation. IE.
User born in 1978 has Salary 62k. If you want the birth year corresponding to 62k, deleting the User could have nasty side effects.
Problems with Inserting into Non Normalized Relations
If we have many columns that are non normalized, we may have functional dependencies we’d like to write in but cannot without creating NULL values elsewhere.
IE. Users have Username, Birth Year, and Salary properties. However I just want to represent that Birth Year relates to a certain Salary. I can’t do that without a null Username column.
Problems with Redundancy in Non Normalized Relations
Repeating certain pieces of information to avoid multivalue problem can cause inconsistency.
Information Loss Problem
If we decompose a Relation into two relations, we could get too many rows back when we recombine them. Causes an inability to distinguish fact vs ficition in the db.
IE. User has Email, Interest, SinceAge, BirthYear, Salary, Current City.
Current City and Salary can have their own relation, and Email, Interest, SinceAge, Birth Year, and Current City.
However joining these could produce more rows than are pertinent due to CurrentCity not having an identifier where to go.
Dependency Loss
When splitting up a Non Normalized Relation into multiple relations, we may lose the ability to enforce Functional Dependencies in the process. This says that the Normalization was done incorrectly.
Functional Dependencies
A property or set of properties that determine another property or set of properties.
Email uniquely determines City and BirthYear
Email + Interest uniquely determine SinceAge.
These cannot be broken in a series of relations otherwise the solution isn’t Normalized.
Fully Functional Dependencies
A functional dependency x -> y where y is functionally dependent on x and y is not functionally dependent on any propert subset of x.
ABC --> D is fully functional if ! A --> D ! B --> D ! C --> D ! AC --> D etc etc. Only ABC together determines D
Keys and Functional Dependencies
Using keys to enforce uniqueness helps to enforce functional dependencies (x–>y)
Non Normal Form
(NF^2) no data normalized
1st Normal Form
(NF1) all attributes are atomic (no composite values)
2nd Normal Form
(2NF) 1NF + every nonkey attribut is fully dependent on the key.
3rd Normal Form
(3NF) 2NF + every nonkey attribute is non-transitively dependent on the key
Boyce Codd Normal Form
(BCNF) 3NF + every determinant is a candidate key.
ie. For each FD X –> Y, X must be a Super Key