Databases Flashcards

1
Q

Characteristics of a database

A

Data is logically related (entities with attributes and relations to other entities)
Self-describing (schema/metadata)
Data abstraction (separation of internal and external representation)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

3-level way of representing a database management system

A

External level: the user’s view
Internal level: The technical part
Conceptual level: How the two interact

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Advantages and disadvantages of using a DBMS

A

Advantages: control over data redundancy, consistency, standards, improved accessibility, improved accessibility and productivity (can extract more information from the same amount of data), improved maintenance, scaling

Disadvantages: complexity, size, performance (sometimes worse than single-purpose systems), single point of failure

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

How does the relational data model work

A

A relation is a table with rows and columns
A column is called an attribute (or field)
A row is called a record

Two tables are often related by a third table (e.g. Students and Courses are related by a StudentCourses table)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is Entity-Relationship Modelling?

A

A way of communicating databases graphically: non-technical, free of ambiguities
Examples include crow’s feet notation and UML
Specifies attributes, relationships, constraints

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What is normalisation?

A

The process of reducing data redundancy and minimising the number of attributes across all tables. This helps avoid update anomalies and saves space.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Creating a table with SQL

A

CREATE TABLE Staff (staffNo VARCHAR(5), LName VARCHAR(15), salary DECIMAL(7,2));

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Inserting records into a table with SQL

A

INSERT INTO Staff VALUES (“SG16”, “Brown”, 8300);

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Querying a table with SQL

A

SELECT staffNo, LName, Salary
FROM Staff
WHERE salary > 10000;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Purpose of three-level architecture

A

Realises data abstraction: Users can access same data but have different customised views
Users can change views without affecting other users
Internal structure can be changed without affecting users

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

External level

A

Users’ view of the database
Multiple different views of the same data (e.g. 20 Jan ‘22 vs 2022-01-20)
Adapted to specific needs
Different entities, attributes, relationships
Possibly derived/calculated/combined (e.g. getting age from date of birth)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Conceptual level

A

Community view of the database, shared by all users
Describes what data is stored (entities, attributes, relationships, constraints, security/integrity information)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Internal level

A

Physical representation of the DB
Describes how the data is stored to achieve optimal runtime performance and storage utilisation
Interface to operating system
Compression and encryption

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

DB schema

A

The description of the database, the result of the design process
Should not change
Multiple external subschemata (different views)
One conceptual schema (all entitie/attritutes/relationships)
One internal schema (low-level description, storage, indexes etc)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

DB instance

A

The data in the database at a particular point in time, changes whenever data is edited

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Logical data independence

A

External schemata remain the same if we change the logical structure on the conceptual level

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Physical data independence

A

Conceptual schema remains the same if we change the internal schema (data structures, algorithms)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Conceptual design phase

A

Construct a first, high-level model of the data
Uses entity-relationship modelling
Identify the entities, attributes, relationships and constraints
Based on users’ requirements
Independent of any physical considerations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Logical design phase

A

Construct the relational data model (the actual tables, e.g. Students, Courses, StudentCourses)
Normalisation takes place

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Physical design

A

Describe the implementation of the logical design (storage structures, access methods, security)
Optimise for performance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Relation

A

Table with rows and columns (e.g. Student)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Attribute

A

Named column in a table (e.g. surname)
Field

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Tuple

A

A row in the table (e.g. S002, Jane, Doe)
Record

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Domain of an attribute

A

Set of allowed values (e.g. positive integers, strings starting with capital letter)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Cell
Value of specific attribute in tuple
26
Degree
Number of attributes/columns
27
Cardinality
Number of tuples (rows)
28
When is a relation normalised?
Every cell has exactly one value, no repetition of identical tuples (rows)
29
NULL
Absence of a value Represents an attribute value that is currently unknown or not applicable Has to be handled in a special way
30
What is a key
A set of attributes whose values uniquely determine a tuple Candidate key: Candidate Key is a collection of attributes or a single attribute (single column) that has the ability to uniquely identify records in the table (fully functionally determines all other attributes) Primary key: Like a candidate key but there's only one of them Alternate key: Any candidate key which isn't the primary key Simple key: A key that consists of a single attribute Composite key: A key that consists of multiple attributes
31
Foreign key
A set of attributes in a table that refers to the primary key of another table (e.g. mentorNo in Student)
32
Integrity constraints
Domain constraints: attribute values have to lie within their respective domain Entity integrity: Attributes of the primary key cannot be NULL Referential integrity: Foreign keys must match some primary key or be NULL Prevents deleting a tuple if it's referenced somewhere else
33
What is a view?
A virtual relation (not stored physically) Derived from one or more base relations Computed upon request Main use: showing customised information, with dynamic quantities (age from date of birth)
34
ER modelling vs Relational Data Model
ER modelling: Object-based, visual representation, used to understand data requirements, high-level and non-technical, intuitive Relational Data Model: Record-based, mathematically sound, used to define database schema, low-level technical, not very intuitive
35
Entity in ER modelling
A thing that should be explicitly represented (e.g. a student, a course, a teacher), represented as a labelled rectangle
36
Relationships in ER modelling
A named association between two or more entity types (e.g. student _attends_ course, person _is a citizen of_ country, represented as a labelled line, triangle represents reading direction
37
Constraints in ER modelling
min..max * = no constraint "Teacher 1..1 is head of 0..1 School" means that a teacher can be head of 0-1 schools, and all schools must have 1 head Cardinality of a..b c..d is b:d 1:1 = one-to-one, 1:* = one-to-many, *.* = many-to-many. a and c indicate participation
38
Attributes in ER modelling
Properties of an entity type Represented as the lower part of an entity rectangle, primary key underlined Names use camelCaseNotation
39
Types of attribute
Simple: one component Composite: multiple components, indented Single-valued: Only one value for each entity (e.g. name) Multi-valued: multiple values for each entity (e.g. phone number), range given as [min..max] Derived: Computed from other attributes (e.g. age), prefixed with "/"
40
How to turn an ER model into a Relational Data Model
1:1 relationship: Add foreign key to (either) mandatory entity, if both are mandatory then combine into one relation 1:* relationship: Add foreign key to "many" side (e.g. teacherId as foreign key in Course)
41
Handling many-to-many relationships (*:*)
Create separate relation with two foreign keys (e.g. Attendance relation containing studentId and courseId). Connect it with two one-to-many relationships. Entity1 A:B action C:D Entity2 => Entity1 1:1 action C:D EntityLink A:B 1:1 Entity2
42
Steps for constructing an ER model
1. Identify entities and relationships from real world scenario 2. Draw entities and connect them 3. Add constraints 4. Resolve many-to-many relationships 5. Add attributes and define primary keys
43
Top-down vs Bottom-up approach to designing databases
Top-down: Entity-Relationship model Start from data requirements, use a graphical description of the database, identify entities in the real world Bottom-up: Start with examples of actual data in tables, analyse relationships between attributes and redesign the tables in a better way
44
What does it mean to redesign tables in a "better way"?
No redundancy: Wastes storage space, may lead to inconsistencies
45
Examples of data redundancy
Multi-valued attributes which need to be stored in separate rows Dependencies (e.g. collegeAddress is dependent on collegeName)
46
Examples of anomalies
Insertion anomaly: New records need to duplicate existing data Modification anomaly: Modifying only one value leads to inconsistencies Deletion anomaly: Deleting one piece of data and losing other data in the process
47
Decomposing a table
Avoids modification anomalies Student(id, forename, surname, collegeName, collegeAddress) => Student(id, forename, surname, college) + College(college, collegeAddress) college in Student is a foreign key
48
Functional dependencies
If A and B are two sets of attributes in a relation, and each value of A is associated with exactly one value of B, then B is functionally dependent on A and A determines B (A => B) Informally this means that if you know the value(s) of A then you know the value(s) of B, and A is the determinant of B
49
Types of functional dependency
Full functional dependency: B is functionally dependent on A and not on any smaller subset of A Partial functional dependency: B is functionally dependent on A and at least one smaller subset of A Transitive functional dependency: If A => B and B => C then C is transitively dependent on A via B
50
Closure of a set of dependencies
Given a set of functional dependencies F, its closure F+ is all the functional dependencies that are implied by the dependencies in F
51
Armstrong's axioms for computing F+ from F
Reflexivity: if B is a subset of A, then A => B (id, forename, surname) => (forename, surname) Augmentation: if A => B, then A+C => B+C (id) => (college) therefore (id, forename) => (college, forename) Transitivity: if A => B and B => C, then A => C (id) => (college) and (college) => (collegeAddress) therefore (id) => (collegeAddress) (De)composition: if A => B,C then A => B and A => C and vice versa Composition: if A=>B and C=>D, then A,C => B,D
52
Finding the closure F+ of F
Start with all dependencies in F Repeatedly apply Armstrong's axioms Add any new dependencies discovered along the way
53
1st Normal Form
No repeating groups: No attributes or groups of attributes with multiple values for a single occurrence of the primary key (i.e. multi-valued variable) If there is a fixed number of repetitions, you can use separate attributes (e.g. phoneType + phoneNumber => mobilePhone + workPhone)
54
General solution for converting to 1st normal form
Flattening: Fill empty cells with repeating values Create separate relation
55
2nd Normal Form (only relevant for relations with composite keys)
A table is in 2nd Normal Form if it's in 1NF and has no partial functional dependencies: every non-key attribute is dependent on the whole primary key (no partially dependent attributes)
56
Converting to 2nd normal form
Remove the partially dependent attributes and put then in a new relation along with a copy of their determinant
57
3rd Normal Form
No transitive dependencies, meaning every dependency must be fully/directly dependent on the primary key(s)
58
Converting to 3rd Normal Form
Remove the transitively dependent attributes and put them in a new relation along with a copy of their determinant
59
Two components of SQL
Data Definition Language: Defines the schema for each relation, defines the domain of each attribute, specifies integrity constraints (e.g. primary/foreign keys) Data Manipulation Language: Inserts/updates/deletes/retrieves data from the database. The query language is the part of the DML that involves retrieval
60
SELECT Statement syntax
SELECT (DISTINCT) FROM Optional: WHERE conditions GROUP BY columnList HAVING conditions ORDER BY columnList (ASC/DESC)
61
SELECT Statement
FROM = specifies the table(s) to be used WHERE = filters the rows according to a condition GROUP BY = forms groups of rows with the same column value HAVING = filters groups according to a condition SELECT = specifies which columns should appear in the output ORDER BY = specifies order of output "SELECT DISTINCT salary" removes all but one of the rows which share the same salary
62
SELECT example
SELECT staffNo, fName, lName, europoorSalary/12 AS monthlySalary FROM Staff WHERE europoorSalary/12 < 5000 (can't reuse monthlySalary alias)
63
SELECT example with derived query
SELECT staffNo, fName, lName, salary AS yearlySalary, (SELECT yearlySalary)/12 AS monthlySalary FROM Staff WHERE salary/12< 1500 (SELECT yearlySalary) is a derived query that allows the alias to be reused
64
Constructing conditions
=, <, >, <=, =>, <> (not equal) IN (setItem1, setItem2, ...) LIKE 'pattern' IS (NOT) NULL x BETWEEN y AND z (inclusive) AND, OR, NOT
65
Using the IN operator
SELECT fName, lName, salary, position FROM Staff WHERE position IN ('Manager', 'Supervisor') The last line is equivalent to: WHERE position='Manager' OR position='Supervisor'
66
Using the LIKE operator (pattern matching)
SQL has two special pattern-matching symbols: % (equivalent to * in cmd.exe), and _ (equivalent to a single arbitrary character) LIKE 'H%' means that the first character is H, but the rest can be anything LIKE 'H___' means there are 4 characters and the first is H NOT LIKE '%e' means the last character is not "e" Example: SELECT fName, lName FROM Staff WHERE fName LIKE 'J%'
67
Combining conditions
SELECT fName, lName, position, salary FROM Staff WHERE position NOT IN ('Manager', 'Supervisor') AND (fName LIKE 'J%' OR salary > 10000)
68
ORDER BY example
SELECT fName, lName, gender FROM Staff ORDER BY gender ASC, fName DESC
69
Aggregate functions
Operates on a single column and returns a single value: SUM(column) AVG(column) MIN(column) MAX(column) COUNT(column) (excludes NULL) COUNT(*) (number of rows in a table, including NULL) Example: SELECT SUM(salary)/COUNT(salary) FROM Staff
70
GROUP BY clause
GROUP BY is used in combination with aggregate functions It groups rows by one or more attributes and then applies aggregate functions to each group separately Example: SELECT position, AVG(salary) FROM Staff GROUP BY position ORDER BY AVG(salary) DESC The "GROUP BY position" makes the AVG(salary) column display AVG(SELECT salary FROM Staff WHERE position='BBCbullorwhatever')
71
Inner join
a JOIN b ON Merges a and b and excludes all rows which do not fulfill SELECT fName, lName, street FROM Staff JOIN Branch ON Staff.branchNo = Branch.branchNo Lists the names of staff and the street of the branch they work at (merges Staff and Branch and excludes all rows where the branchNo values are not the same)
72
Inner join example with aliases
SELECT A.fName, A.lName, A.salary, B.fName, B.lName, B.salary FROM Staff A JOIN Staff B ON B.salary <= A.salary and B.staffNo <> A.staffNo WHERE A.fName LIKE 'A%'
73
Outer join
a LEFT JOIN b ON: Includes missing records from the left table a RIGHT JOIN b ON: Includes missing records from the right table a FULL JOIN b ON: Includes missing records from both tables
74
Subqueries
Allow you to reuse the result of one query in another query SELECT staffNo, fName, lName, position FROM Staff WHERE branchNo = ( SELECT branchNo FROM Branch WHERE street = '163 Main St' ) The inner select only returns a single value ('B003') so this is a single-value subquery
75
Multi-valued subquery
SELECT staffNo, fName, lName, position, salary FROM Staff WHERE salary > SOME (SELECT salary FROM Staff WHERE branchNo = 'B003') This displays all staff whose salary is higher than the salary of at least one member of staff at branch B003 SOME can also be replaced with "ALL"
76
INSERT syntax
INSERT INTO TableName [(columnList)] VALUES (valueList) Example: INSERT INTO Staff(staffNo, fName, lName, salary) VALUES ('SG16', 'Alan', 'Brown', 10000)
77
UPDATE syntax
UPDATE TableName SET columnName=dataValue WHERE Example: UPDATE Staff SET salary = salary*1.03
78
DELETE syntax
DELETE FROM TableName WHERE Example: DELETE FROM Staff WHERE branchNo='B003' Deleting a whole table is done with DROP TABLE Staff
79
Domain types in SQL
CHAR(n): character string of fixed length n VARCHAR(n): character string of variable length at most n BIT(n): bit string of fixed length n INTEGER SMALLINT: -32768 <= x < 32768 NUMERIC(p,d): A decimal number with at most p digits and d decimal digits
80
Creating a domain type
CREATE DOMAIN Postcode AS CHECK ()
81
CREATE TABLE syntax
CREATE TABLE TableName(Attribute Domain [integrity and referential constraints] Constraints: PRIMARY KEY (keyName) FOREIGN KEY (keyName) REFERENCES (otherTable) Specifying what to do with a foreign key when the corresponding primary key is updated/deleted: ON (UPDATE/DELETE) (CASCADE/SET NULL/SET DEFAULT/NO ACTION) CASCADE = update foreign key/delete tuple
82
DROP TABLE syntax
DROP TABLE TableName RESTRICT: The drop is rejected if there are other objects whose existence depends on TableName DROP TABLE TableName CASCADE: The drop is forced and all dependent objects are dropped recursively (default option)
83
ALTER TABLE syntax
Changes the schema of a relation Add new attributes: ALTER TABLE Person ADD salary INTEGER Remove attributes: ALTER TABLE Person DROP married CASCADE Change attribute characteristics: ALTER TABLE Person ALTER gender SET DEFAULT 'F'
84
Defining views
A view is a table that is derived from one or more other tables and is computed on demand Example: CREATE VIEW Developers AS SELECT name, project FROM Employee WHERE department = 'Development'
85
Types of database language
Declarative (like SQL): Specifies what data to retrieve Procedural: Specifies how to retrieve data
86
Relational calculus and algebra
Relations are considered as sets of elements. An element is a tuple of attribute values, and each row in a table specifies one element of the set Relational calculus (declarative): Uses set theoretic expressions (∀x and ∃y for example) to define new sets based on existing ones Relational algebra (procedural): Uses operators to construct a new set from one or more existing ones These two can be used to define/create the same sets. For every calculus expression there are one or more algebraic expressions.
87
Relational calculus
Predicate: A function that returns true/false when arguments are given Proposition: An expression made of up of AND, OR, NOT etc. Example: {x|P(x) ^ Q(x)} means the set of all x such that both P(x) AND Q(x) are true OR = v, NOT = ¬
88
Tuple relational calculus
Variables' values are tuples (entries) To specify that tuple s should be in relation Staff, use as a predicate: Staff(s) Use s.element to access named elements in tuples ∀ = for all, ∃ = there exists {t.A|R(t) ^ t.B > 0} means "Select all A from relation R(A, B) for which the value of B is greater than 0"
89
Domain relational calculus
Variables' values are attribute values (cells) Example: {a | (∃b)(R(a, b) ∧ b > 0)} means select all A from the relation R(a, b) for which the value of B is greater than 0
90
Relational algebra
Procedural: specifies operations to construct a new set from existing ones Unary operations: selection, projection, renaming Binary operations: union, set difference, cartesian product
91
Selection in relational algebra
σ_condition_(R) SQL equivalent: SELECT * FROM R WHERE
92
Projection in relational algebra
Π_col1, col2, ..._(R) Selects the specified subset of attributes/columns from R (without duplicates) SQL equivalent: SELECT DISTINCT col1, col2, ..., FROM R
93
Union in relational algebra
A ∪ B Selects tuples that are in A or B SQL equivalent: SELECT ... UNION SELECT ... Example: Π_city_(Branch) ∪ Π_city_(Property) This lists all cities where there is either a branch or a property
94
Set difference in relational algebra
A - B Selects tuples that are in A but not in B SQL equivalent: SELECT ... MINUS/EXCEPT SELECT ... Example: Π_city_(Property) – Π_city_(Branch) This lists all cities where there is a property but not a branch
95
Intersection
A ∩ B Selects tuples that are in A and in B SQL equivalent: SELECT ... INTERSECT SELECT ... Equivalent to A - (A - B) or B - (B - A)
96
Union compatibility
To compute A (∪/∩/-) B, the schemata of A and B must match: same number of attributes and each pair of matching attributes must have the same domain
97
Cartesian product in relational algebra
A x B Generates all possible combinations of tuples in A and B by concatenating them, resulting in Ra * Rb tuples with Ca + Cb attributes No compatibility requirements Attributes with the same name are prefixed with the relation name Corresponding SQL: SELECT * FROM A, B or SELECT * FROM A CROSS JOIN B
98
Rename in relational algebra
ρ_Y(A, B, C, ...)_(X) returns X renamed as Y, with attributes renamed to A, B, C, ...
99
Division in relational algebra
A ÷ B Only keeps tuples from A that exist in all possible combinations with B Find combinations of A x B that are not in A Project to attributes that are only in A Remove them from A
100
Joining in relational algebra
A [INNER] JOIN B ON A_⨝condition_B Equijoin: contains an = sign Natural join: equijoin over all common attributes Semijoin: inner join and project to attributes of the left relation LEFT OUTER JOIN = ⟕ RIGHT OUTER JOIN = ⟖ FULL OUTER JOIN = ⟗