Week 6 - Data Normalisation Flashcards
What is Normalization
A methodology to enforce data integrity by construction
Why use normalization
Helps to:
- Preserve functional dependencies
- Avoid data redundancy and insertion/deletion anomalies
- Reduce the need for restructuring the schema
- Increase the life span of programs
What is an insertion anomaly in database design?
An insertion anomaly occurs when it is difficult to add data to a database due to the presence of redundant or incomplete information. For example, adding a new entity may require additional unrelated data to be inserted.
Can you give a practical example of an insertion anomaly?
In a database where customer orders and products are stored together, if a new customer is added who has not yet made an order, you would still need to insert a dummy order just to store the customer’s information, causing redundancy.
What is a deletion anomaly in database design?
A deletion anomaly occurs when removing data from a database causes unintended loss of other important data. For example, deleting a record could inadvertently remove details about other entities that are tied to it.
Can you give a practical example of a deletion anomaly?
In a database where employee and department information is stored together, if an employee is deleted, all the department information associated with that employee might also be removed, even if other employees in the department still exist.
What is a superkey?
A set of one or more columns that uniquely identify each row in your table. It should be functional (meaning if you know the values in the superkey columns, you can determine exactly one set of values for all the other columns)
What is a candidate key?
A minimal superkey. It is the smallest possible set of columns that can uniquely identify each row in the relation. If you remove any column, it stops being a superkey.
Who was Edgar Frank “Ted” Codd
- British computer scientist
- Father of relational model for databases
- Turing award 1981
What does 1NF ensure?
That data is organized in a way that avoids hierarchical structures (e.g. tables inside tables)
A table is in 1NF if?
Each field in a table should contain a single, non-divisible value, and the table must represent a valid relationship between sets of values, containing no duplicates rows.
What are partial dependencies?
When a non prime attribute depends on only a part of a composite candidate key.
What is a prime attribute?
An attribute that is part of a candidate key
What is a non prime attribute?
An attribute that is not part of any candidate key
A table is in 2NF if?
- It is in 1NF
- No functional dependencies (where a non-prime attribute depends on only part of a composite key)