Databases Flashcards
Characteristics of a database
Data is logically related (entities with attributes and relations to other entities)
Self-describing (schema/metadata)
Data abstraction (separation of internal and external representation)
3-level way of representing a database management system
External level: the user’s view
Internal level: The technical part
Conceptual level: How the two interact
Advantages and disadvantages of using a DBMS
Advantages: control over data redundancy, consistency, standards, improved accessibility, improved accessibility and productivity (can extract more information from the same amount of data), improved maintenance, scaling
Disadvantages: complexity, size, performance (sometimes worse than single-purpose systems), single point of failure
How does the relational data model work
A relation is a table with rows and columns
A column is called an attribute (or field)
A row is called a record
Two tables are often related by a third table (e.g. Students and Courses are related by a StudentCourses table)
What is Entity-Relationship Modelling?
A way of communicating databases graphically: non-technical, free of ambiguities
Examples include crow’s feet notation and UML
Specifies attributes, relationships, constraints
What is normalisation?
The process of reducing data redundancy and minimising the number of attributes across all tables. This helps avoid update anomalies and saves space.
Creating a table with SQL
CREATE TABLE Staff (staffNo VARCHAR(5), LName VARCHAR(15), salary DECIMAL(7,2));
Inserting records into a table with SQL
INSERT INTO Staff VALUES (“SG16”, “Brown”, 8300);
Querying a table with SQL
SELECT staffNo, LName, Salary
FROM Staff
WHERE salary > 10000;
Purpose of three-level architecture
Realises data abstraction: Users can access same data but have different customised views
Users can change views without affecting other users
Internal structure can be changed without affecting users
External level
Users’ view of the database
Multiple different views of the same data (e.g. 20 Jan ‘22 vs 2022-01-20)
Adapted to specific needs
Different entities, attributes, relationships
Possibly derived/calculated/combined (e.g. getting age from date of birth)
Conceptual level
Community view of the database, shared by all users
Describes what data is stored (entities, attributes, relationships, constraints, security/integrity information)
Internal level
Physical representation of the DB
Describes how the data is stored to achieve optimal runtime performance and storage utilisation
Interface to operating system
Compression and encryption
DB schema
The description of the database, the result of the design process
Should not change
Multiple external subschemata (different views)
One conceptual schema (all entitie/attritutes/relationships)
One internal schema (low-level description, storage, indexes etc)
DB instance
The data in the database at a particular point in time, changes whenever data is edited
Logical data independence
External schemata remain the same if we change the logical structure on the conceptual level
Physical data independence
Conceptual schema remains the same if we change the internal schema (data structures, algorithms)
Conceptual design phase
Construct a first, high-level model of the data
Uses entity-relationship modelling
Identify the entities, attributes, relationships and constraints
Based on users’ requirements
Independent of any physical considerations
Logical design phase
Construct the relational data model (the actual tables, e.g. Students, Courses, StudentCourses)
Normalisation takes place
Physical design
Describe the implementation of the logical design (storage structures, access methods, security)
Optimise for performance
Relation
Table with rows and columns (e.g. Student)
Attribute
Named column in a table (e.g. surname)
Field
Tuple
A row in the table (e.g. S002, Jane, Doe)
Record
Domain of an attribute
Set of allowed values (e.g. positive integers, strings starting with capital letter)
Cell
Value of specific attribute in tuple
Degree
Number of attributes/columns
Cardinality
Number of tuples (rows)
When is a relation normalised?
Every cell has exactly one value, no repetition of identical tuples (rows)
NULL
Absence of a value
Represents an attribute value that is currently unknown or not applicable
Has to be handled in a special way
What is a key
A set of attributes whose values uniquely determine a tuple
Candidate key: Candidate Key is a collection of attributes or a single attribute (single column) that has the ability to uniquely identify records in the table (fully functionally determines all other attributes)
Primary key: Like a candidate key but there’s only one of them
Alternate key: Any candidate key which isn’t the primary key
Simple key: A key that consists of a single attribute
Composite key: A key that consists of multiple attributes
Foreign key
A set of attributes in a table that refers to the primary key of another table (e.g. mentorNo in Student)
Integrity constraints
Domain constraints: attribute values have to lie within their respective domain
Entity integrity: Attributes of the primary key cannot be NULL
Referential integrity: Foreign keys must match some primary key or be NULL
Prevents deleting a tuple if it’s referenced somewhere else
What is a view?
A virtual relation (not stored physically)
Derived from one or more base relations
Computed upon request
Main use: showing customised information, with dynamic quantities (age from date of birth)
ER modelling vs Relational Data Model
ER modelling: Object-based, visual representation, used to understand data requirements, high-level and non-technical, intuitive
Relational Data Model: Record-based, mathematically sound, used to define database schema, low-level technical, not very intuitive
Entity in ER modelling
A thing that should be explicitly represented (e.g. a student, a course, a teacher), represented as a labelled rectangle
Relationships in ER modelling
A named association between two or more entity types (e.g. student attends course, person is a citizen of country, represented as a labelled line, triangle represents reading direction
Constraints in ER modelling
min..max
* = no constraint
“Teacher 1..1 is head of 0..1 School” means that a teacher can be head of 0-1 schools, and all schools must have 1 head
Cardinality of a..b <assoc.> c..d is b:d
1:1 = one-to-one, 1:* = one-to-many, *.* = many-to-many. a and c indicate participation</assoc.>
Attributes in ER modelling
Properties of an entity type
Represented as the lower part of an entity rectangle, primary key underlined
Names use camelCaseNotation
Types of attribute
Simple: one component
Composite: multiple components, indented
Single-valued: Only one value for each entity (e.g. name)
Multi-valued: multiple values for each entity (e.g. phone number), range given as [min..max]
Derived: Computed from other attributes (e.g. age), prefixed with “/”
How to turn an ER model into a Relational Data Model
1:1 relationship: Add foreign key to (either) mandatory entity, if both are mandatory then combine into one relation
1:* relationship: Add foreign key to “many” side (e.g. teacherId as foreign key in Course)