Week 3 Notes Flashcards
Redundancy causes three kinds of anomolies
update, insertion, and deletion
Redundancy (redundant storage) stores the same data repeatedly
problem: requires more storage and more labor enter the data
update anomaly
if you change one relational variable in one area, you must change the redundant data everywhere it is stored or data will be inconsistent
deletion anomaly
you lose the last copy of some data when you delete other data
insertion anomaly
you cannot insert some data without inserting other data that may not be applicable
normalization of a relational database
process of developing tables, columns, and the dependencies of columns upon one another so they satisfy the normalization rules
each rule builds upon previous ones
a database is in the third normal form if it satisfies the first three rules
goals of normalization (similar to advantages of a relational database)
minimize redundant data, reducing inconsistent data, avoiding insert/ delete/ and update anomalies, and reducing maintenance
requirements for a table to be relational
- each table has a unique name for an owner in the database
- each column has a unique name, data type, and a domain
- each row in the table has a unique combo of values (data integrity)/ or primary key with a unique combo of values
- order of rows and columns is arbitrary
- the value at each row/column is atomic
- NO repeating columns of similar data
dependencies that cause redundancies
- NO repeating columns of similar data = first normal form (1NF)
- NO partial dependencies = (2NF)
- NO transitive dependencies = 3NF
- NO non-key attributes determine the part of the key = boyce/ codd normal form (BCNF)
- NO non-trivial multi-valued dependencies = 4NF
- NO join dependencies = 5NF
- NO remaining dependencies = domain/ key normal form (DKNF)
candidate key
minimal set of attributes which can uniquely identify a tuple
- value unique and not null for every tuple
- there can be more than one candidate key in a relation
super key
set of attributes which can uniquely identify a tuple
- adding zero or more attributes to candidate key generates super key
- candidate is super key but vice versa not true
alternate key
candidate key other than the primary key
surrogate keys
artificial key which uniquely identifies each record
- generated right before a record is inserted into a table
Normalization rule 1
no repeating groups with similar info appear in a table
- procedure: identify the repeating group, remove the repeating group, and move the repeating group into a new table (w copy of column that identifies that group)