CH 4- Database Design Using Normalization Flashcards
Counting Rows in a Table
•Use the COUNT(*) function to count the number of rows in a table.
SELECT statement
- to determine the number and type of the table’s columns.
–If there are a large number of rows you can limit the number or rows returned by using the TOP function
as shown on the next slide.
Type of Databases
1) Updateable database- operational databases of the company
– If updateable database, we normally want tables in BCNF.
2) Read-only databases - extracted from operational DB to re-format data
– If read-only database, we may not use BCNF tables.
Designing Updateable Databases
-Updatable databases are typically the operational databases of a company, such as the online transaction
processing (OLTP) system.
-If you are constructing an updatable database, then you need to be concerned about modification anomalies and inconsistent data.
• Consequently, you must carefully consider normalization principles.
Advantages of Normalization
– Eliminate modification anomalies – Reduce duplicated data ▪ Eliminate data integrity problems ▪ Save file space – Single table queries will run faster
Disadvantages of Normalization
– More complicated SQL required for multitable subqueries and joins
– Extra work for DBMS can mean slower applications
Multivalued Dependencies
Anomalies from multivalued dependencies are very
problematic.
• Always place the columns of a multivalued dependency into a separate table (4NF).
Common Design Problems
1) The multivalue, multicolumn problem
2) Inconsistent values
3) Missing Values
4) General-purpose remarks column
multivalue, multicolumn problem
- when multiple values of an attribute are stored in more than one column.
- The solution is to use a separate table to store the multiple values
Inconsistent values
-occur when different users or
different data sources use slightly different forms of the
same data value:
– Different codings:
▪ SKU_Description = ‘Corn, Large Can’
▪ SKU_Description = ‘Can, Corn, Large’
▪ SKU_Description = ‘Large Can Corn‘
– Different spellings:
▪ Coffee, Cofee, Coffeee
Inconsistent Values
Particularly problematic are primary or foreign key values.
• To detect:
– Use referential integrity check already discussed for checking keys
– Use the SQL GROUP BY clause on suspected columns
Missing Values
A missing value or null value is a value that has never been provided.
– In a database table, a null value appears in upper case letters as NULL.
Null Values
1)Null values are ambiguous:
– May indicate that a value is inappropriate
2)May indicate that a value is appropriate but unknown
3)May indicate that a value is appropriate and known, but has never been entered
▪ DateOfLastChildbirth is appropriate for a female, and may be known but no one has recorded it in the database
Checking for Null Values
Use the SQL IS NULL operator to check for null values
General-Purpose Remarks Column
is a column with a name such as:
– Remarks
– Comments
– Notes
• It often contains important data stored in an inconsistent, verbal, and verbose way
Such a column may:
– Be used inconsistently
– Hold multiple data items