Section A in Semester Test 1 2016 Flashcards
Unstructured data
Unstructured data Data that exist in their original (raw) state, that is, in the format in which they were collected. Unstructured data exist in a format that does not lend itself to the processing that yields information.
Structured data.
Structured data The result of taking unstructured data and formatting (structuring) such data to facilitate storage, use, and the generation of information. You apply structure (format) based on the type of processing that you intend to perform on the data. Some data might be not ready (unstructured) for some types of processing, but they might be ready (structured) for other types of processing.
Name the five types of users identified in a database system.
System administrators Database administrators Database designers Systems analysts and programmers End users
Clearly indicate the difference between a database and a table.
A table, a logical structure that represents an entity set, is only one of the components of a database.
The database is a structure that houses one or more tables and metadata. The metadata are data about data. Metadata include the data (attribute) characteristics and the relationships between the entity sets.
Indicate what it would imply if a relational database model displays both entity integrity as well as referential integrity.
Entity integrity describes a condition in which all tuples within a table are uniquely identified by their primary key. The unique value requirement prohibits a null primary key value, because nulls are not unique.
Referential integrity describes a condition in which a foreign key value has a match in the corresponding table or in which the foreign key value is null. The null foreign key “value” makes it possible not to have a corresponding value, but the matching requirement on values that are not null makes it impossible to have an invalid value.
Clearly indicate what the concept “surrogate primary key” means within the context of database design.
ANSWER:
A surrogate primary key is an “artificial” PK that is used to uniquely identify each entity occurrence when there is no good natural key available or when the “natural” PK would include multiple attributes.
A surrogate PK is also used if the natural PK would be a long text variable. The reason for using a surrogate PK is to ensure entity integrity, to simplify application development by making queries simpler, to ensure query efficiency.
Describe the difference between specialisation and generalisation.
Specialization is the top-down process of identifying lower-level, more specific entity subtypes from a higher-level entity supertype.
Specialization is based on grouping unique characteristics and relationships of the subtypes.
Generalization is the bottom-up process of identifying a higher-level, more generic entity supertype from lower-level entity subtypes.
Generalization is based on grouping common characteristics and relationships of the subtypes.
Briefly describe the concept of an entity cluster.
An entity cluster is a “virtual” entity type used to represent multiple entities and relationships in the ERD.
An entity cluster is formed
by combining multiple interrelated entities into a single abstract entity object.
An entity cluster is considered “virtual” or “abstract” in the sense that it is not actually an entity in the final ERD.
It is a temporary entity used to represent multiple entities and relationships, with the purpose of simplifying the ERD and thus enhancing its readability.
Normalisation does not eliminate data redundancy. It produces the carefully controlled redundancy that allows database table linkage. Based on the process of normalisation, indicate the rules per normalisation form that would define linked database tables as:
a) First Normal Form (1NF)
b) Second Normal Form (2NF)
c) Third Normal Form (3NF)
a) 1NF A table is in 1NF when all the key attributes are defined (no repeating groups in the table) and when all remaining attributes are dependent on the primary key. However, a table in 1NF still may contain partial dependencies, i.e., dependencies based on only part of the primary key and/or transitive dependencies that are based on a non-key attribute.
b) 2NFA table is in 2NF when it is in 1NF and it includes no partial dependencies. However, a table in 2NF may still have transitive dependencies, i.e., dependencies based on attributes that are not part of the primary key.
c) 3NFA table is in 3NF when it is in 2NF and it contains no transitive dependencies.
Briefly list and describe the three steps involved in converting data into 1NF.
Step 1: Eliminate the Repeating Groups
o Eliminate nulls: each repeating group attribute contains an appropriate data value
Step 2: Identify the Primary Key
o Must uniquely identify attribute value o New key must be composed
Step 3: Identify All Dependencies
o Dependencies are depicted with a diagram
Define the concept of a transitive dependency. Make use of a short example to support your answer.
A transitive dependency is where an attribute is determined by another attribute that is NOT part of the Primary Key
Eg. Non-key attribute = determinant, Non-key attribute = dependent.