Databases Flashcards
Data inconsistency
Data held in more than one place in the database that could be changed in one place but not in the other
Can be a waste of disk space
Data redundancy
Data held in more than one place in the database
The data doesn’t need to be held more than once so the data becomes redundant
Wastes processing time to update different copies
Data independence
Data acts as a repository or pool of data, separate from the applications that deal with it
Keeping the data separate from the applications that use it is known as data independence
Can create applications without changing the way data is stored
Flat file databases
Very basic systems that can have multiple tables but they aren’t linked
Useful if simple information is being recorded
What are the potential problems with flat file databases
Potential inconsistency
Potential redundancy (wasting space as same information stored more than once
Takes longer to enter information as it has to be entered more than once
Potential loss of data
What are the advantages of relational databases over flat file
A relational database allows data to be held in multiple tables that can have relationships between them
A well designed (normalised) database will not gave any of the problems associated with flat file databases (redundancy, inconsistency, etc.)
By splitting the single table into multiple tables it is possible to hold the exact same information without the problems of a flat file (redundancy, inconsistency etc.)
Define a primary key
A field in a table which uniquely identifies a record, e.g. a pin uniquely identifies a student
Composite primary key
More than one field is needed to uniquely identify a record
Surrogate primary key
A record identifier automatically generated by the database management system e.g. CustomerID, SalesID, etc.
Foreign key
A field in a table that is a primary key in another table used to form a link between the two
Indexes
Allows searches to be performed much more quickly
A table can have several indexes if required
Downsides of indexes
More memory is required to hold each index and every time a record is added or deleted the index needs to be updated which can add a slight delay
Therefore only fields which are regularly used in searches should be indexed
Three types of relationship in ERD
Many to many -> create a new entity in the middle to make two one to Many relationships
Many to one
One to many
Rule of ERDS
If you have a one to many relationship, copy the primary key from the entity at the ‘one’ end of the relationship to the ‘many’ end to become the foreign keys
Data normalisation
A database design technique which tries to ensure that a database in designed correctly, avoiding the various problems we have seen with inconsistency and redundancy
Define first normal form
No repeating attributes
Define second normal form
If its already in first normal form
And each field is dependent on the whole primary key (has no partial key dependencies)
Define third normal form
If its already in first and second
Contains no non-key dependencies
If in third normal form the database is normalised
What is a database management system
An application program that provides an interface between the OS of the computer and the user in order to make access to the data as simple as possible
What are the different functions of a database management system
Data storage, retrieval and update - must allow users to store, retrieve and update information as easily as possible, without having to know much about the internal structure of the databse
Allows access to queries and reports
Creation and maintainance of the data dictionary
Managing the facilities for sharing the database
Backup and recovery - the DBMS must provide the ability to recover the database in the event of failure
Security- The DBMS must handle password allocation and checking, and the view of the database that a given user is allowed
What are the different functions of a database management system
Data storage, retrieval and update - must allow users to store, retrieve and update information we easily as possible, without having to know much about the internal structure of the databse
What are the different functions of a database management system
Data storage, retrieval and update - must allow users to store, retrieve and update information we easily as possible, without having to know much about the internal structure of the databse
Big data
Used to describe large collections of data (similar to data warehouse)
But is too large or complex to be processed using standard database techniques
E.g. Netflix data analytics of over 200 million subscribers
Distributed databases
Databases in which the data is stored across two or more computer systems located at different sites on a computer network, perhaps over a wide geographical area
Benefits of distributed databases
By having the processing spread across different systems it can maximise performance and also there will be less data required to pass across network/Internet links so data can be recorded/read more quickly
Concerns about distributed databases
Distributed databases need to be powerful systems that can cope with the synchronisation of the data between the branches to ensure that data inconsistency does not occur and also to ensure data is kept up to date where this is a requirement
Describe cloud services
Increasingly, organisations will run their database systems in the servers in data centres owned by very large cloud providers such as Amazon Web Services and MIcrosoft. This means that organisations don’t need to invest as much more money/time/staffing in their own data centres and use the cloud services to deal with issues such as security, upgrades, backups etc
These services can automatically increase processing power on demand