1.3.2 - Databases Flashcards
What is a database?
Are structured and persistent stores of data for ease of processing i.e. on secondary storage, non-volatile.
-Easier to add, delete, modify and update data
-Data can be backed up and copied easier
-Multiple users, from multiple locations, can access the same database at the same time
What are fields?
A single piece of data in a record
What is a record?
A group of related fields, representing one data entry
What is a table?
A collection of records with a similar structure
What is a primary key?
A unique identifier for each record in a table. Usually, an ID number
What is a foreign key?
A field in a table that refers to the primary key in another table. Used to link tables and create relationships
What is a secondary key?
An attribute that allows a group of records in a table to be sorted and searched differently from the primary key and data to be accessed in a different order.
Basically an index that the DBMS will look for in order to find the relevant record.
What is a database management system?
Software used to manage databases. Examples include MySQL, Oracle, Microsoft SQL Server, PostgreSQL
What is a composite key?
A combination of (2 or more) fields that is unique for all records
What is a flat-file database?
These are simple data structure tables that are easy to maintain as only a limited amount of data is stored.
They are of limited use because they may have redundant and inconsistent data.
No specialist knowledge is needed to operate.
They are harder to update.
Data format is difficult to change.
What is a relational database?
Based on linked tables (relation).
Each table is based on an entity and has rows and columns.
Each row (tuple) in a table is equivalent to a record and is constructed in the same way.
Each column (attribute) is equivalent to a filed and must have just one data type.
One column or combination of columns must be the PK.
Reduces and avoids data duplication and data redundancy to save storage space.
Improves data consistency and data integrity.
Easier to change data and data format.
Data can be added more easily.
Improves levels of data security so easier to control access to data.
What is a ERD?
Entity relationship diagram, necessary when planning relational databases. Uses a diagram to show how data tables relate to each other and help with reducing redundancy (Repeated data)
What are three relationships that are used in ERDs?
One to One relationship - Makes no sense to put the data in separate tables in this case.
One to Many relationship - Used in most well-designed RDBs
Many to Many relationship - Will lead to data redundancy.
What is data redundancy?
Is unnecessary repetition of data that leads to inconsistencies
Data should have redundancy so if part of a database is lost it should be recoverable from elsewhere.
Redundancy can be provided by RAID setup or mirroring servers.
What is normalisation in databases?
A formal, methodical process to design data tables optimally.
Goes through distinct stages to lead to at least 3NF.
Resolves m-m (many to many) relationships.
Minimises repetition to reduce data redundancy.
Ensures all attributes in a table depend on one another to avoid the need to update multiple data entries when
changing a single attribute to reduce the chances of mistakes.
What is 1NF?
First Normal Form Separates out the multiple items/sets of data in each row
Following actions need to be followed:
1) Each record must never contain more than one value (EI Name cannot have a first name and last name)
2) Each row must be unique (using a primary key)
3) Each column name must be unique (cant have city 1 and city 2)
4) Must be no repeating groups (if there are then the repeated record must be put into a new table)
What is 2NF?
All data must depend on the primary key.
Removes data that occurs in multiple rows and puts these data items in a new table.
Creates relationships/links between the tables as necessary by repeated fields.
What is 3NF?
The primary key must fully define all columns and each column should not depend on any other key.
Removes non-key dependencies (i.e. transitive relationships) to their own linked table so every non-key attribute/field depends on the key, the whole key and nothing but the key!
What is indexing?
Indexing is a technique used to speed up data retrieval in a database
It works in a similar way to the index in a book
If a student had a maths book and wanted to find the section on factorising, they could start at the first page and look at each page in turn until they found the section they wanted
But this would be slow, so it is better to look in the index to find where the factorising section is and just go directly to it
Likewise in a database, certain columns can be indexed so that the DBMS does not have to look at every single record during a search and can just go to the relevant records directly
This can greatly speed up searches
Fields that are indexed are known as secondary keys
What are three different types of data files?
Serial files
Sequential files
Indexed sequential files
What are serial files?
Are relatively short and simple files.
Data records are stored chronologically i.e. in the order in which they are entered.
New data is always appended to the existing records at the end of the file.
To access a record, you to search from the first item and read each preceding item.
Easy to implement.
Adding new records is easy.
Searching is easy but slow.
What are sequential files?
Are serial files where the data in the file is ordered logically according to a key field in the record.
What are indexed sequential files
Records are sorted according to a PK
A separate index is kept that allows groups or blocks of records to be accessed directly and quickly
New records need to be inserted in the correct position and the index has to be maintained and updated to be kept
in sync with the data
Is more difficult the manage but accessing individual files is much faster
More space efficient
More suited to large files
What doe DBMSs do?
Is software that creates, maintains and handles the complexities of managing a database.
May provide a UI.
May use SQL to communicate with other programs.
Provides different views of the data for different users.
Provides security features.
Finds, adds and updates data.
Maintains indexes.
Enforces referential integrity and data integrity rules.
Manages access rights.
Provides the means to create the database structures: queries, views, tables, interfaces and outputs.
What is SQL?
Is a declarative database language that allows the creation, interrogation and alteration of a database.
What is referential integrity?
Transactions should maintain referential integrity. This means keeping a database in a consistent state so changes to data in one table must take into account data in linked tables, e.g. you cannot delete data that is linked to existing data in another table. It is often enforced by DBMS.
What are transactions?
Are changes in the state of a database:
- Addition of data
- Deletion of data
- Alteration of data
What rule must transactions follow?
ACID Rules
What are the 4 ACID rules?
Atomicity: They should either succeed or fail but never partially succeed.
Consistency: The transaction should only change the database according to the rules of the database.
Isolation: Each transaction shouldn’t affect or overwrite other transactions concurrently being processed.
Durability: Once a transaction has been started it must remain no matter what happens.
What is record locking?
Is the technique of preventing simultaneous access to objects in a database in order to prevent updates from being lost or inconsistencies in the data arising.
A record is locked whenever a user retrieves it for editing or updating.
Anyone else attempting to retrieve the same record is denied access until the transaction is completed or cancelled, e.g. if one transaction is amending a record, no other transaction can until the first transaction is complete.