Normalization Flashcards
When designing a database, at which step do you look at normalization?
In the logical step as this is where you pay more attention to the attributes
What is normalization aiming to minimize
Update and deletion inefficiencies and anomalies.
Removing data redundancy
Data normalization is a progressive process; what are the steps?
- The steps are called normal forms.
- Each step increases the efficiency of the database.
- To be in the third normal form the database has to conform to 1st and 2nd normal form.
What is the overall goal of normalization?
To find the minimum number of attributes necessary for a relation that still satisfies the requirements.
Group logical attributes in relations (tables),
Minimize update, delete anomalies
what is the main purpose of normalization
minimize the redundancy and remove Insert, Update and Delete Anomaly.
It divides larger tables to smaller tables and links them using relationships.
What is insertion anomalies?
when data is inserted wrong, for example, more information is added to a column than there should be.
What is a deletion anomaly?
deletion anomalies occur whenever deleting a row inadvertently causes other data to be deleted
when does update anomalies occur?
when updating data has to happen in multiple rows to change one single fact
What is a functional dependency?
When an attribute (B) is dependent on another attribute (A) - then attribute A can identify B.
e.g. attribute A (student_id) -> B (student_name).
Note: if you have B then you can’t be certain about A.
In words: student_name is functional dependent on studfent_id.
How can you identify functional dependencies?
Look at which attributes can be used to uniquely identify a row.
What is a candidate key
A candidate key is an attribute that can be used as a primary key, but is not - it can uniquely identify the row.
What is a partial dependency?
A partial dependency is when you have an attribute that is functionally dependent on a candidate key
What are the requirements of 1st NF? (Normal form)
- Values in the columns must be atomic
2. No multi-valued or composite attributed
What is the 2nd Normal Form?
When all non-primary keys are fully dependent on the primary key.
This means that you should remove partial dependencies. (attributes dependent on a candidate attribute)
How can you remove partial dependencies?
You take the dependent attribute and the candidate key it is dependent on and moves it into a new table.
What is a transitive dependency?
When you have a relation with the attributes A, B and C.
And C is dependent on B that is dependent on A. Then C is transitively dependent on A
What is 3NF?
A table is in 3rd NF when all attributes are mutually independent but at the same time are fully dependent on the primary key.
What is the disadvantage of normalizing the database a lot?
The more normalized data is, the more complex the queries need to be to retrieve data.
Inner joins are often needed and they are computationally expensive
What is denormalisation?
reversing normal forms to optimize the queries for example.
When do you want a denormalised database?
When you need to do reads often. Its used a lot in data warehouses or reporting tables.
Are there any other ways to denormalise other than just reversing normal forms?
Yes, several;
separate active and inactive data.
separate heavily and lightly accessed fields.
making pre-joined tables