Normalization Flashcards

1
Q

4 Rules of Normalization

A

1) No redundancy of facts
2) No Cluttering of facts
3) Must preserve information
4) Must preserve functional dependencies

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Problems with Updating Non Normalized Relations

A

If updating a property, any rows with duplicates will have to be updated for consistency, and that causes problems if it’s missed.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Problems with Deletion from Non Normalized Relations

A

Deleting a row could potentially affect an application expecting data back from some other part of the relation. IE.

User born in 1978 has Salary 62k. If you want the birth year corresponding to 62k, deleting the User could have nasty side effects.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Problems with Inserting into Non Normalized Relations

A

If we have many columns that are non normalized, we may have functional dependencies we’d like to write in but cannot without creating NULL values elsewhere.

IE. Users have Username, Birth Year, and Salary properties. However I just want to represent that Birth Year relates to a certain Salary. I can’t do that without a null Username column.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Problems with Redundancy in Non Normalized Relations

A

Repeating certain pieces of information to avoid multivalue problem can cause inconsistency.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Information Loss Problem

A

If we decompose a Relation into two relations, we could get too many rows back when we recombine them. Causes an inability to distinguish fact vs ficition in the db.

IE. User has Email, Interest, SinceAge, BirthYear, Salary, Current City.
Current City and Salary can have their own relation, and Email, Interest, SinceAge, Birth Year, and Current City.
However joining these could produce more rows than are pertinent due to CurrentCity not having an identifier where to go.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Dependency Loss

A

When splitting up a Non Normalized Relation into multiple relations, we may lose the ability to enforce Functional Dependencies in the process. This says that the Normalization was done incorrectly.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Functional Dependencies

A

A property or set of properties that determine another property or set of properties.

Email uniquely determines City and BirthYear
Email + Interest uniquely determine SinceAge.

These cannot be broken in a series of relations otherwise the solution isn’t Normalized.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Fully Functional Dependencies

A

A functional dependency x -> y where y is functionally dependent on x and y is not functionally dependent on any propert subset of x.

ABC --> D is fully functional if
! A --> D
! B --> D
! C --> D
! AC --> D
etc etc. 
Only ABC together determines D
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Keys and Functional Dependencies

A

Using keys to enforce uniqueness helps to enforce functional dependencies (x–>y)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Non Normal Form

A

(NF^2) no data normalized

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

1st Normal Form

A

(NF1) all attributes are atomic (no composite values)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

2nd Normal Form

A

(2NF) 1NF + every nonkey attribut is fully dependent on the key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

3rd Normal Form

A

(3NF) 2NF + every nonkey attribute is non-transitively dependent on the key

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Boyce Codd Normal Form

A

(BCNF) 3NF + every determinant is a candidate key.

ie. For each FD X –> Y, X must be a Super Key

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Determinant

A

a set of attributes on whice some other attribute is fully functionally dependent.

17
Q

Kent and Diehr quote

A

All attributes must depend on the key (1NF), the whole key (2NF), and nothing but the key (3NF) so help me Codd!

18
Q

Armstrongs Reflexivity Rule

A

if Y is part of X, then X –> Y

ex. Email, Interest –> Interest

19
Q

Armstrong’s Augmentation Rule

A

If X –> Y, then WX –> WY
ex. If Email –> BirthYear then
Email, Interest –> BirthYear, Interest

20
Q

Armstrong’s Transitivity Rule

A

If X –> Y and Y –> Z then X –> Z
Email –> BirthYear and BirthYear –> Salary
then Email –> Salary

21
Q

How to guarantee lossless joins

A

The join field must be a key in at least one of the relations

22
Q

How do we guarantee preservation of FDs?

A

The meaning implied by the remaining functional dependencies must be the same as the meaning implied by the original set. Also using Armstrong’s Rules.

23
Q

Normalization Guideline 1

A

Design relation schema so that it’s easy to explain its meaning.

24
Q

Normalization Guideline 2

A

Design the base relation schema so that no insertion, deletion, or modification anomalies are present.

25
Q

Insertion Anomalies

A

Where a poorly designed table with multiple entities in multiple rows creates a situation where Inserting a row for a single entity forces an implementer to fill in data non pertinent entity sharing the row, or Insert NULL values for those columns.

Inserting NULLs for PKs is impossible so this has to be considered.

26
Q

Deletion Anomalies

A

Where a poorly designed table with multiple entities in multiple rows creates a situation where deleting a row to eliminate one entity may completely delete another entity sharing the row.

27
Q

Modification Anomalies

A

Where a poorly designed table with multiple entities in multiple rows creates a situation where updating an entity forces an implementer update all rows that entity exists in.

Failing to update all of those rows could create consistency issues later on.

28
Q

Normalization Guideline 3

A

Avoid placing attributes in a base relation whose values may frequently be NULL.

29
Q

NULL Values in Tuples

A

NULL Values waste space which can complicate on Logical Level

NULL Values can be interpreted in different ways

1) Attribute doesn’t apply to tuple
2) Attribute value is unknown for this tuple
3) Value is known but absent from the tuple

30
Q

Normalization Guideline 4

A

Design relation schema so that they can be joined with equality conditions on attributes that are appropriately related.

31
Q

Closure of an Attribute

A

The set of attributes determined by the attribute in question.
Denoted with a superscript plus sign or Cl( x ) | x = attribute in question.

ie. R( A B C D)
A –> B
B –> D
C –> B

A+ –> A B D
Cl ( A ) = {A B D}

32
Q

Candidate Key

A

A key not determined by any other attribute, and determines every other attribute.
(Pro Tip, when looking at a set of FDs, candidate keys will always be attributes not present in the right-hand side)

33
Q

Prime Attribute

A

an attribute belonging to a candidate key.

ie R( A B C D )
AB –> C
C –> D
A and B are prime attributes

34
Q

Non-Prime Attribute

A

an attribute in a relation not belonging to a candidate key

ie R( A B C D )
AB –> C
C –> D
C and D are non-prime attributes

35
Q

Super Key

A

Specifies that no two distinct tuples can have the same value for Super Key (ie the combinations of attributes that make up SK will never be the same twice and identify distinct rows)