Lecture 4: Normalisation Flashcards
What is Normalisation?
improving the efficieny of relational models
The process of Normalisation, allows us to produce a set of suitable relations taht support the data requirements of the use case adn minimise redundancy. We do this to:
- Make it easier for the user to access and maintain Data
- Take up minimal storage space on the computer
Quick Revision.
We ask these questions because the data that we want to extract is often in a different form to the way it is stored.
- What is a relational Model?
- What language creates SQL?
- What represents SQL?
- What is normalisation?
- What are queries?
- What language and diagram does Queries use?
- is a conceptual schema represnts how data is organised in a database
- SQL DDL allows us to create this in MySQL
- An ER diagram allows us to represent this and the associated constraints on paper
- The process of normalisation allows us to produce a set of suitable relations that support the data requirements of the case and minimise redundancy
- Queries are like a question that you ask a databse
- SQL and DML allows us to run these in MySQL. RA allows us to represent these on paper.
Before considering normalisation, we need to familiarise with what?
- Relational Keys - Superkey, Candidate Key, PK, FK
- Dependency
- Determinant
Explain Dependency
Is an important concept in normalisation
- Essentially, an attribute, a set of attributes, is reliant on another
- In a database you want everything in a relation (table), to be dependent on the PK, and the PK to be as simple as possible.
- This helps eliminate redundancy and update anomalies
Functional Dependency. Please explain.
in a given table, an attribute B is said to have a functional dependency on a set of attributes A (written A - B) if and only if each A value is associated with precisely one B value.
e.g. StaffNo = position
Explain Full functional Dependency
An attribute is fully functionally dependent on a set of attributes A if it is:
- functionally dependent on A &
- not functionally dependent on any proper subset of A
e.g. {staffAddress} has a functional dependency on [staffNo, staffName}, but not a full functional dependency, because it is also dependent simply on {staffNo}
What is Transitive Dependency?
is an indirect functional dependency, one in which A - C only by virtue A-B and B -C.
They are important to realise because they can cause update Anomalies!
There are 4 objectives of Normalisation. Please explain each.
- To free the collection of relations from undesirable insertion, update and deletion dependencies
- To reduce the need for restructuring the collection of relations, as new types of data are introduced, and thus increase the life span of application programs
- To make the relational model more informative to users
- To make the collection of relations neutral to the query bias, so unaticipated queries are supported.
Explain the term Redundancy
The major aim of relational database design is to group attributes into relations to minimise data redundancy.
Limiting redundancy, using normalisation, offers potential benefits including:
- Updates to the data stored in the database are achieved with minimal number of operations thus reducing the opportunities for data inconsistencies
- Minimises cost through reduction of file storage space required by the base relations
How can Anomalies occur?
Through:
- Insertion
- Deletion
- Modification
List the different normalisation steps
- 0NF
- 1NF
- 2NF
- 3NF
- BCNF (probably most difficult) stands for Boyce-Codd Normal Form
- 4NF
Explain:
0NF
1NF &
2NF
0NF: a relation that contains one or more repeating groups (the unNormalised form)
1NF: A relation in which the intersection of each row and column contains one and only one value
2NF: A relation that is in 1NF and every non-PK attribute is fully functionally dependent on the PK
Describe
3NF
BCNF
4NF
3NF: A relation this is in 1NF and 2NF in which no-non-primary-key attribute us transistively dependent on the PK
BCNF: A relation that is 3NF and every determinant is a candidate key
4NF: A relation that is BCNF and does not contain nontrivial multi-valued dependencies.
Describe the characteristics of the BCNF normalisation step
A relation is in BCNF, if and only if, every determinant is a candidate key
So what is a:
- Determinant = an attribute, or a group of attributes on which some other attribute is fully functionally dependent
- Super Key = a set of attribtues where there are no duplicate tupels in the set
- Candidate Key = a minimal super key for the relation(An attribute that can uniquely identify any tuple without referring to any other data)
essentially what the definition is saying is that there is a full functional dependency in the table, where the determinant half is not a candidate key we have a problem.
Describe the characteristics of the 4NF
recap: a 4NF is a relation that is in BCND and does not contain nontrivial multi-valued dependencies
multi-valued dependency represents a dependency between attributes (e.g. A,B and C), in a relation such that for each value of A there is a set of values for B and a set of values for C. However, the set of values for B and C are independent of each other.