DATA NORMALIZATION Flashcards
What does A -> B mean?
A→ B is a functional dependency. This implies that A functionally determines B such that whenever two tuples agree on A they also agree on B.
What is a superkey?
A set of attributes that contains a key is called a superkey, short for ‘superset of a key’
A superkey satisfies the condition of a key: it functionally determines all other attributes of a relation, however; it need not satisfy the condition of minimality
(It just means that you remove one attribute/column from your key. If that reduced set of columns is still a key (i.e. uniquely identifies any data row), then the previous key was not minimal. )
Explain the splitting rule
If a key functionally determines more than one tuple:
A→ B,C,D
Then splitting the functional dependency is:
A→B A→C A→D
Note: A functional dependency cannot be split on the determinant (left side)
Explain the combining rule
Suppose:
A→B A→C A→D
Combining the functional dependency would result in:
A → B,C,D
Explain the transitive rule
Suppose A→B and B→C then A→C
What is a full functional dependency?
A functional dependency A → B is a full functional dependency if removal of any attribute from A results in the dependency no longer existing
What is a partial dependency?
A functional dependency A→B is a partial dependency if there is some attribute that can be removed from A and yet the dependency still holds.
What is data normalization?
Normalization is the process of splitting/decomposing relations into well structured relations that allow users to insert, delete, and update tuples without introducing database inconsistencies. These split up relations are then linked back together using foreign keys
What is the importance of data normalization
Ensures data integrity such that all integrity constraints are satisfied
Removes data redundancy by preventing anomalies
Brings about isolation of data
Prevents duplication of data
Explain an update anomaly
This occurs as a result of data redundancy (eg a record occurs more than once in a table) and partial update (not all of those records are updated) bringing about inconsistencies
Explain an insertion anomaly
An insertion anomaly is the inability to add data to the database due to the absence of other data.
For example, assume Student_Group is defined so that null values are not allowed. If a new employee is hired but not immediately assigned to a Student_Group then this employee could not be entered into the database.
Explain a deletion anomaly
A deletion anomaly exists when data belonging to certain attributes are lost because of the deletion of data belonging to other attributes in the same relation
For example, assume Student_Group is defined so that null values are not allowed. Then if the Student_Group of a certain employee is deleted the entire record of that employee will also be deleted.
(A delete anomaly is the opposite of an insert anomaly. When a delete anomaly occurs it means that you cannot delete data from the table without having to delete the entire record.)
What is unnormalized form?
A relation that contains one or more repeating groups( attributes with multiple values for a single occurrence) is said to be in Unnormalized form
Describe First normal form 1NF
A relation in which the intersection between each row and column contains only one value is said to be in 1NF. It has no composite attributes or repeating groups.
What are the two ways of moving from UF to 1NF?
- Flattening (Entering appropriate data in the empty attributes of tuples containing the
repeating data) Ensures atomicity
2.Placing the repeating data along with a copy of the original key attribute(s) in a separate relation