1.3.2 Databases Flashcards
What is a database?
A database is nothing more than an organised collection of data.
Organising data into a database allows for easy:
Adding
Modification
Deletion
Searching
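The four operations above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the pupil table, its columns and the sample values are assumptions made up for the example.

```python
import sqlite3

# Hypothetical table and values, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pupil (pupil_id INTEGER PRIMARY KEY, name TEXT, form TEXT)")

# Adding
conn.execute("INSERT INTO pupil VALUES (1, 'Asha', '12A')")
conn.execute("INSERT INTO pupil VALUES (2, 'Ben', '12B')")

# Modification
conn.execute("UPDATE pupil SET form = '12C' WHERE pupil_id = 2")

# Deletion
conn.execute("DELETE FROM pupil WHERE pupil_id = 1")

# Searching
remaining = conn.execute(
    "SELECT name, form FROM pupil WHERE form = '12C'"
).fetchall()
```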
Relational Databases
An entity is an item of interest about which information is stored. A relational database recognises the differences between entities by creating a separate table for each entity.
Flat File
A flat file is a database that consists of a single file. The flat file will most likely be based around a single entity and its attributes.
Flat files are very simple and quick to set up, require little expertise to maintain, and are suitable for storing small amounts of data.
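As a rough sketch, a flat file can be as simple as a CSV file holding one entity and its attributes. The file contents below are invented, and io.StringIO stands in for a real file on disk so the example is self-contained.

```python
import csv
import io

# A hypothetical flat file: one entity (pupil) with its attributes as columns.
flat_file = io.StringIO(
    "pupil_id,name,form\n"
    "1,Asha,12A\n"
    "2,Ben,12B\n"
)

# Each row of the file becomes one record (a dict keyed by attribute name).
records = list(csv.DictReader(flat_file))

# Searching means scanning every record, which is fine for small amounts of data.
names_in_12a = [row["name"] for row in records if row["form"] == "12A"]
```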
Primary Key
A primary key is a unique identifier for each record in the table.
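A minimal sketch of what "unique" means in practice, using sqlite3 with a made-up pupil table: a second record that reuses an existing primary key value is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# pupil_id is declared as the primary key (hypothetical table for illustration).
conn.execute("CREATE TABLE pupil (pupil_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO pupil VALUES (1, 'Asha')")

try:
    # Same primary key value as an existing record.
    conn.execute("INSERT INTO pupil VALUES (1, 'Ben')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```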
Foreign Key
A foreign key is an attribute that links two tables together. It is the primary key of one table, and it appears as a foreign key in another table to refer back to records in the first.
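The link can be sketched with two hypothetical tables, form and pupil, where pupil.form_id is a foreign key pointing at the primary key of form. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on; other database systems may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# One table per entity (table and column names are assumptions for this sketch).
conn.execute("CREATE TABLE form (form_id TEXT PRIMARY KEY, tutor TEXT)")
conn.execute(
    "CREATE TABLE pupil ("
    "pupil_id INTEGER PRIMARY KEY, name TEXT, "
    "form_id TEXT REFERENCES form(form_id))"
)
conn.execute("INSERT INTO form VALUES ('12A', 'Ms Khan')")
conn.execute("INSERT INTO pupil VALUES (1, 'Asha', '12A')")

# The foreign key lets the two tables be joined.
rows = conn.execute(
    "SELECT pupil.name, form.tutor FROM pupil "
    "JOIN form ON pupil.form_id = form.form_id"
).fetchall()

try:
    # '13Z' does not exist in form, so this record has nothing to link to.
    conn.execute("INSERT INTO pupil VALUES (2, 'Ben', '13Z')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```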
Secondary Key
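A secondary key is an additional attribute, other than the primary key, that is indexed so the database can be searched quickly. Unlike the primary key, it does not have to be unique.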
Normalisation
Normalisation tries to accomplish the following things:
- All field names must be unique
- Values in fields should be from the same domain
- Values in fields should be atomic
- No two records can be identical
- Each table needs a primary key
First Normal Form
There must be no attribute that contains more than a single value.
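A small sketch of moving data into first normal form, assuming invented records where a "subjects" field holds several values at once. Each value is split out so that every field is atomic.

```python
# Hypothetical un-normalised records: "subjects" holds more than one value.
unnormalised = [
    {"pupil_id": 1, "name": "Asha", "subjects": "Maths, Physics"},
    {"pupil_id": 2, "name": "Ben", "subjects": "Art"},
]

# One row per (pupil, subject) pair, so no attribute contains multiple values.
first_normal_form = [
    {"pupil_id": r["pupil_id"], "name": r["name"], "subject": s.strip()}
    for r in unnormalised
    for s in r["subjects"].split(",")
]
```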
Second Normal Form
A database which doesn’t have any partial dependencies and is in first normal form can be said to be in second normal form. This means that no attributes can depend on part of a composite key.
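As an illustration, assume a made-up table with the composite key (pupil_id, subject). The name attribute depends only on pupil_id, which is a partial dependency, so it is moved into its own table.

```python
# Hypothetical 1NF table with composite key (pupil_id, subject).
# "name" depends on pupil_id alone: a partial dependency.
enrolment_1nf = [
    {"pupil_id": 1, "subject": "Maths", "name": "Asha"},
    {"pupil_id": 1, "subject": "Physics", "name": "Asha"},
    {"pupil_id": 2, "subject": "Art", "name": "Ben"},
]

# Move the partially dependent attribute into a table keyed by pupil_id.
pupil = {row["pupil_id"]: row["name"] for row in enrolment_1nf}

# The remaining table keeps only the composite key.
enrolment = [(row["pupil_id"], row["subject"]) for row in enrolment_1nf]
```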
Third Normal Form
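If the database is in second normal form and contains no non-key dependencies, it is in third normal form. A non-key dependency is where an attribute depends on the value of another attribute that is not part of the primary key. Removing these means every attribute depends only on the primary key and nothing else.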
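A sketch of removing a non-key dependency, using invented data in which tutor depends on form rather than on the primary key pupil_id. The dependent attribute is moved to a separate table keyed by form.

```python
# Hypothetical 2NF table: "tutor" depends on "form", not on pupil_id.
pupil_2nf = [
    {"pupil_id": 1, "name": "Asha", "form": "12A", "tutor": "Ms Khan"},
    {"pupil_id": 2, "name": "Ben", "form": "12A", "tutor": "Ms Khan"},
    {"pupil_id": 3, "name": "Cara", "form": "12B", "tutor": "Mr Lee"},
]

# New table: each form is stored once, together with its tutor.
form = {row["form"]: row["tutor"] for row in pupil_2nf}

# The pupil table keeps "form" as a link but no longer repeats the tutor.
pupil = [
    {"pupil_id": r["pupil_id"], "name": r["name"], "form": r["form"]}
    for r in pupil_2nf
]
```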
Indexing
Indexing is a method used to store the position of each record ordered by a certain attribute. This is used to look up and access data quickly. The primary key is automatically indexed.
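The idea can be sketched in plain Python. This is a simplified model, not how any particular database builds its indexes: the index is a list of (attribute value, record position) pairs kept in sorted order, so a binary search finds the position without scanning every record. The records and the find_by_name helper are made up for the example.

```python
import bisect

# Records stored in arbitrary order (hypothetical data).
records = [
    {"pupil_id": 3, "name": "Cara"},
    {"pupil_id": 1, "name": "Asha"},
    {"pupil_id": 2, "name": "Ben"},
]

# The index: (attribute value, position in records), sorted by value.
name_index = sorted((rec["name"], pos) for pos, rec in enumerate(records))
keys = [key for key, _ in name_index]


def find_by_name(name):
    """Binary-search the index, then jump straight to the record's position."""
    i = bisect.bisect_left(keys, name)
    if i < len(keys) and keys[i] == name:
        return records[name_index[i][1]]
    return None
```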
Capturing Data
Data needs to be input into the database and there are various ways of doing this. The chosen method is always dependent on the context. For example, if pedestrians are participating in a survey, their responses will need to be manually entered.
Data is also captured when people pay by cheque: banks scan cheques using Magnetic Ink Character Recognition (MICR). Optical Mark Recognition (OMR) is used for multiple choice questions on a test. Other forms use Optical Character Recognition (OCR).
Selecting and Managing Data
Selecting the correct data is an important part of data preprocessing. This could involve selecting only data that meets certain criteria, to reduce the volume of input. Collected data can be managed using SQL to sort, restructure and select certain sections.
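For example, survey responses could be filtered and sorted with a single SQL query. The response table, its columns and the criteria below are assumptions invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE response (respondent_id INTEGER PRIMARY KEY, age INTEGER, answer TEXT)"
)
conn.executemany(
    "INSERT INTO response VALUES (?, ?, ?)",
    [(1, 17, "yes"), (2, 34, "no"), (3, 25, "yes"), (4, 52, "yes")],
)

# Select only adults who answered "yes", sorted oldest first.
selected = conn.execute(
    "SELECT respondent_id, age FROM response "
    "WHERE age >= 18 AND answer = 'yes' "
    "ORDER BY age DESC"
).fetchall()
```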
Exchanging Data
Exchanging data is the process of transferring the data that has been collected. One common example of this is EDI (Electronic Data Interchange).
SQL
SQL stands for Structured Query Language and is a declarative language used to manipulate databases. SQL enables the creating, removing and updating of databases.
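A brief sketch of creating, updating and removing a table, run through Python's sqlite3 module. The book table and its values are hypothetical. "Declarative" shows up in the SELECT statement: it states which result is wanted and leaves the database to decide how to produce it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Creating: define a table and fill it (hypothetical data).
conn.execute("CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT, price REAL)")
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [(1, "Databases", 20.0), (2, "Networks", 30.0)],
)

# Updating: halve the price of one book.
conn.execute("UPDATE book SET price = price * 0.5 WHERE book_id = 2")

# Declarative query: describe the result wanted, not the steps to find it.
prices = conn.execute("SELECT title, price FROM book ORDER BY price").fetchall()
conn.commit()

# Removing: drop the table, then list the tables that are left.
conn.execute("DROP TABLE book")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```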