Normalization Flashcards
4 Rules of Normalization
1) No redundancy of facts
2) No Cluttering of facts
3) Must preserve information
4) Must preserve functional dependencies
Problems with Updating Non Normalized Relations
If a fact is stored redundantly, every row that duplicates it must be updated to stay consistent; missing even one row leaves the relation inconsistent.
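A minimal Python sketch of this (usernames and salaries are made up) showing how a missed update leaves duplicated facts inconsistent:

```python
# Hypothetical denormalized rows: alice's salary is stored once per interest.
rows = [
    {"username": "alice", "interest": "chess",  "salary": 62000},
    {"username": "alice", "interest": "hiking", "salary": 62000},
]

# Update only the first row and forget the second...
rows[0]["salary"] = 70000

# ...and the relation now holds two conflicting salaries for alice.
print({r["salary"] for r in rows if r["username"] == "alice"})  # {70000, 62000}
```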
Problems with Deletion from Non Normalized Relations
Deleting a row can also destroy facts an application still expects from other attributes of that row. IE.
A User born in 1978 has Salary 62k. If an application needs the birth year corresponding to the 62k salary, deleting that User also deletes that fact.
Problems with Inserting into Non Normalized Relations
If we have many columns that are non normalized, we may have facts (functional dependencies) we’d like to record but cannot without creating NULL values elsewhere.
IE. Users have Username, Birth Year, and Salary properties. If I just want to represent that a certain Birth Year relates to a certain Salary, I can’t do that without a NULL Username value.
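A sketch of that situation with hypothetical column names; the only way to record the Birth Year/Salary fact on its own is a row whose Username is NULL (None):

```python
# Denormalized Users relation: Username, BirthYear, Salary all in one table.
users = [
    {"username": "alice", "birth_year": 1978, "salary": 62000},
]

# We only want to state "birth year 1985 corresponds to salary 55k", but the
# schema forces a NULL username -- impossible if Username is the primary key.
users.append({"username": None, "birth_year": 1985, "salary": 55000})
print(users[-1])
```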
Problems with Redundancy in Non Normalized Relations
Repeating certain pieces of information to avoid the multivalue problem can cause inconsistency.
Information Loss Problem
If we decompose a relation into two relations, we could get too many rows back when we recombine them with a join. This causes an inability to distinguish fact from fiction in the db.
IE. User has Email, Interest, SinceAge, BirthYear, Salary, Current City.
Decompose into one relation with CurrentCity and Salary, and another with Email, Interest, SinceAge, BirthYear, and CurrentCity.
However, joining these back together can produce spurious rows, because CurrentCity is not a key in either relation and cannot tell matching rows apart.
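A small Python sketch of this lossy decomposition (attribute names from the example above, data made up). The join on CurrentCity, which is not a key in either projection, returns spurious rows:

```python
# Original facts: two users in the same city with different salaries.
original = [
    {"email": "a@x.com", "current_city": "Oslo", "salary": 62000},
    {"email": "b@x.com", "current_city": "Oslo", "salary": 80000},
]

# Project into (CurrentCity, Salary) and (Email, CurrentCity).
city_salary = [{"current_city": r["current_city"], "salary": r["salary"]} for r in original]
email_city = [{"email": r["email"], "current_city": r["current_city"]} for r in original]

# Natural join on CurrentCity.
rejoined = [
    {**e, "salary": cs["salary"]}
    for e in email_city
    for cs in city_salary
    if e["current_city"] == cs["current_city"]
]

print(len(rejoined))  # 4 rows instead of 2: a@x.com now also appears with 80k
```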
Dependency Loss
When splitting a non normalized relation into multiple relations, we may lose the ability to enforce functional dependencies in the process. This indicates the normalization was done incorrectly.
Functional Dependencies
A property or set of properties that determine another property or set of properties.
Email uniquely determines City and BirthYear
Email + Interest uniquely determine SinceAge.
These dependencies must remain enforceable across the resulting set of relations; otherwise the solution isn’t properly normalized.
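One way to see whether a functional dependency holds in sample data is to check that no two rows agree on the left-hand side while disagreeing on the right-hand side. A minimal sketch with hypothetical rows matching the Email/Interest example:

```python
def fd_holds(rows, lhs, rhs):
    """True if the functional dependency lhs -> rhs holds in the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if key in seen and seen[key] != val:
            return False  # same LHS values, different RHS values -> violated
        seen[key] = val
    return True

rows = [
    {"email": "a@x.com", "interest": "chess",  "since_age": 7,  "birth_year": 1978},
    {"email": "a@x.com", "interest": "hiking", "since_age": 20, "birth_year": 1978},
]
print(fd_holds(rows, ["email"], ["birth_year"]))             # True
print(fd_holds(rows, ["email", "interest"], ["since_age"]))  # True
print(fd_holds(rows, ["email"], ["since_age"]))              # False
```

Note that such a check only shows the FD is not violated by this particular sample; an FD is a statement about every valid instance of the relation.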
Fully Functional Dependencies
A functional dependency X –> Y where Y is functionally dependent on X and Y is not functionally dependent on any proper subset of X.
ABC –> D is fully functional if none of A –> D, B –> D, C –> D, AB –> D, AC –> D, BC –> D hold. Only ABC together determines D.
Keys and Functional Dependencies
Using keys to enforce uniqueness helps to enforce functional dependencies (x–>y)
Non Normal Form
(NF², non-first normal form) no normalization applied
1st Normal Form
(1NF) all attributes are atomic (no composite values)
2nd Normal Form
(2NF) 1NF + every nonkey attribute is fully functionally dependent on the key.
3rd Normal Form
(3NF) 2NF + every nonkey attribute is non-transitively dependent on the key
Boyce Codd Normal Form
(BCNF) 3NF + every determinant is a candidate key.
ie. For each FD X –> Y, X must be a Super Key
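A sketch of the BCNF test, assuming the candidate keys are already known: a determinant is a superkey exactly when it contains some candidate key. The relation and FDs below are the R(A B C D) example used on later cards:

```python
def is_bcnf(fds, candidate_keys):
    """BCNF: every determinant (FD left-hand side) must contain a candidate key."""
    return all(any(ck <= lhs for ck in candidate_keys) for lhs, _rhs in fds)

# R(A B C D) with AB -> C and C -> D; the only candidate key is AB.
fds = [({"A", "B"}, {"C"}), ({"C"}, {"D"})]
print(is_bcnf(fds, [{"A", "B"}]))  # False: determinant C is not a superkey
```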
Determinant
a set of attributes on which some other attribute is fully functionally dependent.
Kent and Diehr quote
All attributes must depend on the key (1NF), the whole key (2NF), and nothing but the key (3NF) so help me Codd!
Armstrong’s Reflexivity Rule
if Y is a subset of X, then X –> Y
ex. Email, Interest –> Interest
Armstrong’s Augmentation Rule
If X –> Y, then WX –> WY
ex. If Email –> BirthYear then
Email, Interest –> BirthYear, Interest
Armstrong’s Transitivity Rule
If X –> Y and Y –> Z then X –> Z
Email –> BirthYear and BirthYear –> Salary
then Email –> Salary
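A toy sketch of the transitivity rule applied mechanically to the FDs above (the set-of-pairs representation is just an illustration, not a standard API):

```python
# FDs written as (lhs, rhs) pairs of frozensets of attribute names.
base = {
    (frozenset({"Email"}), frozenset({"BirthYear"})),
    (frozenset({"BirthYear"}), frozenset({"Salary"})),
}

def transitivity(fds):
    """One pass of Armstrong's transitivity rule: X -> Y and Y -> Z give X -> Z."""
    derived = set(fds)
    for x, y in fds:
        for y2, z in fds:
            if y == y2:
                derived.add((x, z))
    return derived

for lhs, rhs in transitivity(base) - base:
    print(set(lhs), "->", set(rhs))  # {'Email'} -> {'Salary'}
```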
How to guarantee lossless joins
The join field must be a key in at least one of the relations
How do we guarantee preservation of FDs?
The meaning implied by the remaining functional dependencies must be the same as the meaning implied by the original set; Armstrong’s rules can be used to check that the original FDs are still derivable.
Normalization Guideline 1
Design relation schema so that it’s easy to explain its meaning.
Normalization Guideline 2
Design the base relation schema so that no insertion, deletion, or modification anomalies are present.
Insertion Anomalies
Where a poorly designed table with multiple entities in multiple rows creates a situation where inserting a row for a single entity forces an implementer to fill in data for a non-pertinent entity sharing the row, or to insert NULL values for those columns.
Inserting NULLs for PKs is impossible, so this has to be considered.
Deletion Anomalies
Where a poorly designed table with multiple entities in multiple rows creates a situation where deleting a row to eliminate one entity may completely delete another entity sharing the row.
Modification Anomalies
Where a poorly designed table with multiple entities in multiple rows creates a situation where updating an entity forces an implementer to update all rows that entity exists in.
Failing to update all of those rows could create consistency issues later on.
Normalization Guideline 3
Avoid placing attributes in a base relation whose values may frequently be NULL.
NULL Values in Tuples
NULL values waste space and can complicate things at the logical level.
NULL Values can be interpreted in different ways
1) Attribute doesn’t apply to tuple
2) Attribute value is unknown for this tuple
3) Value is known but absent from the tuple
Normalization Guideline 4
Design relation schemas so that they can be joined with equality conditions on attributes that are appropriately related.
Closure of an Attribute
The set of attributes determined by the attribute in question.
Denoted with a superscript plus sign (X+) or Cl(X), where X is the attribute in question.
ie. R( A B C D)
A –> B
B –> D
C –> B
A+ = {A B D}
Cl ( A ) = {A B D}
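The closure can be computed mechanically: start from the attribute set and keep adding the right-hand side of any FD whose left-hand side is already covered. A minimal sketch over the R(A B C D) example above:

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

fds = [({"A"}, {"B"}), ({"B"}, {"D"}), ({"C"}, {"B"})]
print(closure({"A"}, fds))  # {'A', 'B', 'D'}
print(closure({"C"}, fds))  # {'C', 'B', 'D'} -- so neither A nor C determines everything
```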
Candidate Key
A minimal set of attributes that determines every other attribute in the relation (no proper subset of it is also a key).
(Pro Tip: when looking at a set of FDs, any attribute that never appears on the right-hand side of an FD must be part of every candidate key.)
Prime Attribute
an attribute belonging to a candidate key.
ie R( A B C D )
AB –> C
C –> D
A and B are prime attributes
Non-Prime Attribute
an attribute in a relation not belonging to a candidate key
ie R( A B C D )
AB –> C
C –> D
C and D are non-prime attributes
Super Key
A set of attributes for which no two distinct tuples can have the same values (ie the combination of values for the SK attributes is never repeated and identifies distinct rows).