Database Fundamentals Flashcards
Data models
A data model is an abstract model of what to store and what information needs to be recorded
A relational database is a collection of tables
- Entity = An object to store data about
- Attribute = A characteristic or property of an entity
Databases are made of tables that store information about multiple entities
Keys
- Primary key = An attribute used to uniquely identify a table
- Composite key = A primary key made up of multiple attributes
- Foreign ley = A primary key from one table thats used in another table
Database normalisation
Databases are normalised so that they can be efficient without any compromise to the integrity of their data.
Normalising databases involves ensuring that entities contain no
redundant or repeated data
Benefits of normalisation:
- Easier to maintain and change a normalised database
- Data integrity is maintained
- Smaller tables allow for faster searching and sorting
First, second and third normal form
- 1NF = A database that contains no repeating attributes - the data is atomic, meaning that no single column contains more than one value
- 2NF = A database already in 1NF and with no partial key dependencies - the tables are all dependant on the whole primary key
Can only exist with composite keys - 3NF = A databse already in 2NF and has no non-key dependencies
Database relationships
- One to one = One entity is related to another entity
eg. One manager manages one football team - One to many = One entity is related to many entities
eg. One book borrower borrows multiple books from a library - Many to many = Many entities are related to multiple entities
eg. Many teams can play in multiple competitions
For one to many relationships, the foreign key is always on the many side of the relationship
For many to many relationships, a link table is needed - this is an extra table used to model the relationship between the first two tables and the first two tables with have one to many relationships with the link table
Structured Query Language
SQL is a declarative language used to query databases - a declarative language is one where the programmer says what to do and what result to get rather than how to do it
SQL commands include:
- SELECT = Used to retrieve data from a table
Syntax is:
SELECT value
FROM table
WHERE condition
ORDERBY asc/desc - this is optional - UPDATE = Used to modify values in a database table
Syntax is:
UPDATE table
SET attribute = value
WHERE attribute = value - INSERT = Used to add values into a database table
Syntax is:
INSERT INTO table
VALUES (value1, value2 - DELETE = Used to remove values from a database table
Syntax is:
DELETE FROM table
WHERE condition
Wildcards can be used to specify any value in an SQL command and are notated with an asterisk (*)
- CREATE TABLE creates a new database table
Syntax is:
CREATE TABLE tablename(Attribute1 Datatype, Attribute2 Datatype etc)
This creates an empty table; the INSERT command can be used to add attributed to the table
SQL Datatypes
Datatypes are specified with the CREATE TABLE command
- CHAR(size) = Fixed length C# string
- VARCHAR(size) = Variable length C# string
- INTEGER = C# integer
- FLOAT = C# double or float
- DATE = Date in YYYY-MM-DD formate
Client server databases
A client server database provides simultaneous access to a database for several clients
A database management system allows this to happen
DBMS server software runs on the network server and processes requests for data searches, sorts and reports that might originate from workstations running DBMS client software
Controlling concurrent access
Concurrent access is when two clients access the same record at the same time - this can lead to data being lost if they edit the record at the same time
- Record locking = When a record is accessed by one user, it is immediately locked to other users until the first user has finished using it.
Other users are blocked from accessing or modifying the content of a field until it has been unlocked - Serialisation = Requests from other users are placed in a queue.
Once the first user has finished using the field, the next command in the queue is executed and so on - Timestamp ordering = When multiple commands are sent to the same field in a
database, each is assigned a timestamp which marks the point in time at which the command was initiated.
Commands are carried out on the field in the order of their timestamps - Commitment ordering = An algorithm is used to work out an optimal order in which to execute commands for the same field.
This algorithm will take into account the impact of commands on other parts of the database and attempt to minimise issues from occurring with the database