Databases and Software Development Flashcards
Database
Organised collection of data that is structured in a way to facilitate efficient retrieval, updating and analysis of data
Data model
Describes data, its structure, its relationships and constraints for a given system
Factors to consider when modelling entity relationships
- Data needed to be stored
- What real-world entities this data relates to
- Relationships between entities
- Constraints on data
Entity
Object about which data is being stored
Entity representation in databases
As tables
Attribute
Property of an entity
Attribute representation in databases
As fields/columns
Instance
Details of a particular occurence of an entity
Instance representation in databases
As records/columns
Relationship types
- One-to-one (e.g. a school has one headteacher; a headteacher only has one school)
- One-to-many
- Many-to-one
- Many-to-many (resolved by creating two one-to-many relationships)
Relational database
Type of database where:
* Separate table created for each entity
* Where relationship exists, foreign key links two tables
Entity identifier
(Collection of) attributes that uniquely identifies each instance of an entity
Primary key
Practical implementation of entity identifier
Foreign key
Attribute that creates a join between two tables (is primary key of another table)
Entity descriptions
Entity(Primary key, Attribute2 …)
Entity relationship diagrams
- Diagrammatic way of representing the relationships between the entities in a database
- Crow’s feet = many side
Problem with many-to-many relations
Storing multiple foreign keys per instance -> Unnormalised entity -> Inefficiency
Resolving many-to-many relations
- Composite/link entity created
- Each instance represents each relation by storing both foreign keys
- Create two one-to-many relations
Composite primary key
Consists of more than one attribute
Relation
Entity that is linked to other entities
Tuple
Row within relation (represents instance)
Normalisation
Process of structuring data in a relational database to reduce redundancy, increase accuracy and increase efficiency
Key features of normalisation
- No redundant data
- Each relation represents a single concept
- Data is stored at its atomic level (can’t be decomposed further)
Redundant data
Unnecessarily duplicated data
Why is redundant data bad?
- Redundant data → Wasted space and slower searching → Inefficiency
- Redundant data → ↑ Chance of incorrect copying → Inconsistent and inaccurate data
Atomic-level data
Data has been fully decomposed into multiple attributes
Levels of normal form
- First normal form (1NF)
- Second normal form (2NF)
- Third normal form (3NF)
First normal form (1NF)
Data is fully atomic and doesn’t have multiple values for any attributes (repeating groups)
Second normal form (2NF)
Separates non-key attributes that don’t relate to whole primary key (partial dependencies) into their own relations (only occurs when primary key is composite)
Third normal form (3NF)
- All attributes are dependent on the key, the whole key and nothing but the key
- Achieved by removing non-key attributes that depend on other non-key attributes from a relation via creating more relations
Advantages of normalisation
- Maintaining and modifying database
- Faster sorting and searching
- Deleting records
Maintaining and modifying the database
- Data integrity maintained (no unnecessary duplication of data) -> No possibility of inconsistencies
Faster searching and sorting
- Produces smaller tables with fewer fields -> Faster searching, sorting and indexing (less data involved)
- Holding data once -> Saves storage space
Deleting records
- Won’t allow record on ‘one’ side of one-to-many relationship to be accidentally deleted
*
Structured Query Language (SQL)
Declarative language used for querying, updating and creating tables in a relational database
SELECT statement
Returns data from listed fields where condition is met
SELECT syntax
SELECT field1, field2, etc. FROM table1, table2, etc. WHERE condition ORDER BY field1, field2, etc.
SELECT from multiple tables
SELECT Song.SongTitle, Artist.ArtistName, Song.MusicType FROM Song, Artist WHERE (Song.ArtistID = Artist.ArtistID) AND (Song.MusicType = "Art Pop")
OR
~~~
SELECT Song.SongTitle, Artist.ArtistName, Song.MusicType
FROM Song
WHERE Song.MuscType = “Art Pop”
JOIN Artist ON Song.ArtistID = Artist.ArtistID
~~~
Not equal to
<>
IN
Equal to value within set of values
LIKE
Equality when using wildcards
BETWEEN x AND y
Equal to value within set of values defined by limits x and y
IS NULL
Field doesn’t contain a value
%
- Represents zero or more characters
- E.g. bl% finds black, blue, etc.
_
- Represents a single character
- E.g. b_t finds bat, but, etc.
[]
- Represents any single character within brackets
- b[au]t finds only bat and but
[^]
- Represents any character not in brackets
- E.g. b[^au]t finds any combination except bat and but
-
- Represents any character within brackets specified by range
- E.g. b[a-u]t finds any combination of letters from a to u
Data types
- CHAR(n) (fixed length string)
- VARCHAR(n) (variable length string max size n)
- …
- DATE (stores day, month and year values)
- TIME (stores hour, minute and second values)
- CURRENCY (formats numbers into currency format of current region)
CREATE TABLE syntax
CREATE TABLE Employee ( EmpID INTEGER NOT NULL PRIMARY KEY, EmpName VARCHAR(20) NOT NULL, HireDate DATE, Salary CURRENCY )
Defining linked tables
FOREIGN KEY CourseID REFERENCES Course(CourseID)
Add field
ALTER TABLE Employee ADD Department VARCHAR(10)
Delete a field
ALTER TABLE Employee DROP COLUMN HireDate
Modify field
ALTER TABLE Employee MODIFY COLUMN EmpName VARCHAR(30) NOT NULL
UPDATE statement
Updates record in a table
UPDATE syntax
UPDATE table SET column1 = value1, column2 = value2, ... WHERE columnX = value
DELETE statement
Deletes a record from a table
DELETE syntax
DELETE FROM table WHERE columnX = value
Aggregate functions
Used in conjunction with SELECT statement (replaces fields)
Aggregate function examples
- MIN(field)
- MAX(field)
- COUNT(field)
- SUM(field)
GROUP BY
- Separates output values based on the record’s value in the given field
- Displays aggregate function values clearer
Client-server application model
- Consists of central server (abstraction) and multiple clients connecting to it
- Core data and services stored and provided server-side
- Interaction with data via lighter, ‘simpler’ client-side applications
Advantages of client-server model
- Data stored in one place -> Consistency of database
- Data easily shareable across multiple devices
- Data, backup and recovery can be centrally managed
- Central security management
Disadvantages of client-server model
- If server goes down, risk of data becoming lost or inaccessible
- Multiple users accessing server simultaneously -> Congestion and poor performance
- Requires conflict management
Relational Database Management System (RDBMS)
Provides client-server support, including handling concurrent access
Lost update problem
When two users attempt to update the same record simultaneously -> One update being lost or both being applied incorrectly
Record locks
- Lock applied to record when a transaction (read/write) on it is started
- Prevents other users from reading from or writing to it
Problem with record locks
Deadlock - where two users attempt to access the same record or access a pair of records such that they are both locked out
Serialisation
Method of ordering transactions into a queue
Timestamp ordering
- Each transaction’s timestamp is recorded in DB
- To prevent timestamps being lost, every object has read and write timestamps
- T_w should be aborted if read/write timestamp on record is greater (more recent)
- T_r should be aborted if write timestamp on record is greater
Commitment
Process of writing changes to a database permanently
Commitment ordering
- Extends timestamp ordering
- Concurrent transactions commited in particular order -> Avoids data update issues and risk of deadlock
- Order of commits determined by interaction of commitments & dependencies on common data