Lecture 2 - Relational Schema Good Or Bad Flashcards
What performance metrics may we look at when judging the goodness of a relational schema?
- goodness i.e. whether the attributes form a good relation
- what pitfalls exist
- operation efficiency
- maximize amount of information conveyed while minimizing redundancy
Guidelines for good design of a relational schema?
- attributes of a relation should make sense
- avoid redundant tuples
- relations should minimize the amount of null values held
- design relations to avoid fictitious tuples after join
Explain what is meant by the first guideline : The attributes of a relation should make sense.
Attributes of different entities should not be in the same relation, as to minimize the similarity between relations. Relationships should on be represented by Primary and Foreign keys.
What is the impact of guideline 2: Avoid redundant tuples?
Having repetition of data:
- wastes storage space
- leads to inconsistencies (as replicas must be changed when one tuple is changed) and anomalies
Why have NULL values in tuples?
- to represent inapplicable or invalid attribute for that tuple
- value is unknown
- value is known, but unavailable
What should we do to meet guideline 3?
Attributes that are frequently NULL should be placed in
separate relations to avoid wasting storage & reducing uncertainty!
What is guideline 4 : Design relations to avoid fictitious tuples after join.
Splitting a relation on the wrong attribute, means after we try to combine we may get non-existing entries!