Databases and SQL Flashcards
What is a data model?
An abstract description of the essential data within a given context and how data is related.
What is an entity?
A distinct thing about which data must be stored.
What is an attribute?
A single property or piece of information of an entity.
What is data modelling?
The process of producing an abstract model that represents the organisation and structure of the data
What problems exist in a flat-file database?
1) Data redundancy
2) Data inconsistency
What is an entity identifier?
An attribute or combination of attributes that uniquely identify each instance of an entity.
How is information from a flat-file database represented?
A single table with columns for each attribute and each row representing a new instance of each entity.
What is a composite key?
Multiple attributes are combined to form the table’s entity identifier
What information is stored in a entity description?
Entity name, primary key and attributes
(entity is always singular)
EntityName( PrimaryKey , Attribute1)
- primary key must be underlined
How to create a table in SQL?
CREATE TABLE table_name (
primaryAttribute DATATYPE PRIMARY KEY,
attribute2 DATATYPE NOT NULL
);
- remember that you can write NOT NULL, DEFAULT constraints
What is an entity-relationship diagram used for?
E-R diagrams are used to graphically represent the relationships between tables in a database
How can we tell from an entity description what the E-R diagram looks like?
The table which contains a foreign key of the other table holds the “many” side of the relationships.
Understanding E-R relationships
- One-to-many: one tutor, many students (and each student has one tutor)
- Many-to-many: one teacher, many pupils - one student, many teachers
- One-to-one: one school, one headteacher
What are flat-file databases?
- simple ways of storing data in a text file such as CSV
- each line is a record with fields separated by commas
What are advantages and disadvantages of flat-file databases?
+ easy to create and use for datasets
- inefficient to analyse data and gain insight and dataset grows
What is a relational database?
A collection of tables with relationships created through common attributes, namely primary and foreign keys.
What data types are supported in SQL?
1) INTEGER
2) FLOAT
3) TEXT
4) VARCHAR(x) where x is the length
5) DATETIME
6) BOOLEAN
How to insert a new record into a table in SQL?
INSERT INTO table_name
VALUES (field1val, field2val…);
How to select a record from a table in SQL?
SELECT FieldName
FROM TableName
INNER JOIN OtherTableName
ON TableName.ForeignKey = OtherTableName.PrimaryKey
WHERE Field1 = condition
ORDERBY ASC/DESC
What is the wildcard symbol and what does it do?
*
returns all fields
How to use LIKE in SQL?
For conditions:
WHERE FieldName LIKE ‘word%’
- make sure to use single quotation around word
- if % after, word starts with that
- if % before, word ends with that
- if % before and after, words contained in the text
Date range shortcut
for a given age range starting 1/1/25 ending 31/1/25 :
WHERE StartDate < #2025/01/31# AND EndDate > #2025/01/31#
say not equal to default date if needs be
What properties do fullly normalised databases have?
All non-key attributes depend upon the key, the whole key and nothing but the key (and have no repeating groups).
What are the advantages to having a fully normalised database?
1) Eliminate data redundancy
2) Eliminate data inconsistency
3) Limit unnecessary data duplication
What are the problems of data redundancy?
- increase storage costs (larger storage capacity needed)
- reduced performance (slower queries)
What issues arise with data inconsistency?
- data is no longer reliable
- can occur from update, insertion or deletion anomalies
What is database normalisation?
The process of organising data into related tables so each piece of data is only stored once.
What are many-to-many relationship indicative of?
- Lack of database normalisation
- repeating groups
What is a client server database?
A database that allows simultaneous access to a single, central database from multiple clients.
How do clients access client-server databases?
Client applications that are installed locally on users’ computers.
What are benefits of client-server databases?
- all users have access to same, up-to-date data
- data accessed from anywhere without duplicating and distributing data
- less data redundancy and hence lower risk of data integrity issues
- centrally managed data that can be mined for analysis
What are issues with client-server databases?
- data inaccessible if central server is unavailable
- simultaneous access can cause congestion and poor performance
- multiple access for same resources need to be handled
What issues occur with concurrent access?
Data integrity issues due to insertion anomalies and lost updates.
What is an insertion anomaly?
Two clients insert records simultaneously that are logically inconsistent.
What are lost updates?
Two clients or transactions access the same record to update at the same time, which can lead to the earlier update being overridden by the later update.
What is a transaction?
A single unit of work in a databases (which changes the databases when committed to the database).
How to manage the lost update problems and issues with concurrent access generally?
- Record Locks
- Serialisation:
1) Timestamp ordering
2) Transaction queueing
3) Commitment ordering
What happens in record locks?
Maintains information about which records are being accessed at any times and prevents any transactions to the record that can compromise data integrity in this time.
Read lock = no writing transactions
Write lock = no reading or writing transactions
What issues occur with record locks?
1) Deadlock: two transactions have placed locks on records that they are dependent on
2) Performance issues: other users are forced to wait
What is serialisation?
Applies transactions to a database in a logical order.
How does timestamp ordering work?
- Timestamp is associated by transaction when it begins
- Records timestamp of last successful read or write
- Applies rules before committing to database to check if data integrity will be lost
- If it will, abandon transaction
How does transaction queueing work?
- Groups updates into transaction
- Adds transaction to a processing queue
- uses FIFO for committing transactions
This can lead to reduced performance and longer waiting times
How does commitment ordering work?
- Software tracks conflicts between transactions
- Applies transactions in an order that reduces conflicts and reduces data inconsistency
- Can lead to transactions being delayed and abandoned if leads to issues