DataBases Flashcards

1
Q

Definition of a database

A

an organised collection of
related data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Definition of DBMS

A

Database Management System (DBMS):
software that manages & controls access to
the database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is a database application?

A

Database Application: program that interacts
with the database at some point in its
execution

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is a database system?

A

Database System: a collection of programs that interact with the database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Define a file based & database system and the difference between them

A

File-based system: A collection of application programs that perform services for the end-users. Each program defines and manages its own data.
* Database system: A shared collection of logically related data and its description, designed to meet the information needs of an organisation.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are advantages and disadvantages of a file-based approach?

A

Advantages:
* User control:
― End user has total control of the stored data
― Can modify application or data at will
* Applications and data file can be optimised for particular
tasks

Disadvantages:
* Data Duplication
* Data Dependence
* Incompatible file formats
* Proliferation of application programs (APPS)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Define data duplication

A

A record is duplicated in > 1 files
― A change in details requires updates to all relevant files
* Integrity of records may be compromised if the files are not all updated
― i.e. Which file holds the correct information?

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Define data dependence and the two types

A

Definition:

Data dependence: Refers to the relationship between database operations based on their reliance on each other’s results.
Types:

RAW Dependence (Read-after-Write):

Operation depends on the result of a preceding write operation.
Example: Retrieving data after it has been updated.
WAR Dependence (Write-after-Read):

Write operation depends on the result of a preceding read operation.
Example: Updating data based on a prior retrieval.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is the proliferation of apps and name a problem that occurs with it

A

In time, users might want to get more information from their data files than what they originally anticipated when they initially wrote the programs
* To do so they would need to add more application programs to handle the new queries
* This leads to a proliferation of files and application programs which each user has to handle

This can be addressed via the use of DBMS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Name 5 advantages of using the DBMS

A
  • Control of data redundancy
  • Data Consistency
  • Sharing of Data
  • Improved data integrity
  • Improved Security
  • Enforcement of Standards
  • Economy of Scale
  • Balance of conflicting requirements
  • Improved data accessibility and responsiveness
  • Increased Productivity
  • Improved maintenance through data independence
  • Increased concurrency
  • Backup and Recovery services
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

explain how the DBMS helps with the economy of scale

A
  • Combining the entire organisation’s data into one database and
    creating a set of applications that work on this one source
    results in cost savings
    ― For instance, only one information system department and one set of computer systems in the organisation will be responsible for the
    database as opposed to separate information systems in the different departments
    ― This is much cheaper than having many separate small departmental information systems units
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

explain how the DBMS helps with improved maintenance through data
independence

A
  • A DBMS separates data management from the application
    programs
    ― Hence if changes are made to the underlying data structure, the
    DBMS can handle this without changing application programs
  • The DBMS will provide the necessary mapping between
    application program and data stored on the database (using
    system catalogue)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What are some disadvantages of DBMS (6 total)

A
  • Complexity
    ― DBMSs are complex pieces of software and hence require those
    operating and using them to fully understand them if they are to be used to the best advantage
  • Size
    ― The complexity and breadth of functionality of the DBMS makes it a large piece of software that requires a large amount of memory to run efficiently
  • Cost of DBMS
    ― DBMS cost varies depending on whether it is for a single user or a big organisation. Maintenance and operations costs also need to be factored in. For large organisations with lots of data, hardware costs (e.g. hard disks, high performance processors, memory) are
    considerable
  • Cost of conversion
    ― When converting from a file-based system to a DBMS, there are data conversion costs, hardware and software procurement costs, as well as costs of training and recruiting personnel
  • Performance
    ― The DBMS is written for general use. Hence some applications may not run as fast as customised file-based applications
  • Greater impact of failure
    ― Centralisation increases system vulnerability. Failure of the DBMS may result in everyone being affected within an organisation
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What are the three levels present in three level database architecture(ANSI-SPARC)

A

External level
Conceptual level
Internal level

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Describe the External level of databases

A

It deals with how data is presented to and accessed by end users or applications.
Users at this level define their individual views of the database, specifying what data they need and how they want it to be displayed.
Multiple external views can exist, tailored to the specific needs and requirements of different user groups or applications.
Changes to the external level, such as modifications to views or access patterns, do not affect the underlying database structure.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Describe the conceptual level

A

The conceptual level represents the abstract, logical structure of the entire database.
It defines the relationships among data elements, the constraints, and the integrity rules that must be maintained.
The schema at this level provides a comprehensive, unified view of the entire database, independent of any specific user’s perspective.
Modifications at the conceptual level, such as changes to the data model or structure, impact all external views that rely on it.
The conceptual level acts as a bridge between the external and internal levels, providing a logical representation that abstracts the physical storage details.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Describe the Internal level

A

The internal level is the lowest layer, dealing with the physical storage and retrieval of data.
It is concerned with how data is stored, indexed, and processed at the machine level.
The internal schema defines the storage structures, access paths, and techniques used for efficient data retrieval.
Changes at this level, such as modifications to indexing or storage methods, do not affect the external or conceptual levels.
The internal level provides a level of abstraction over the physical storage details, allowing changes in storage technology without impacting the higher levels.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Define data abstraction

A

Hide storage details and present the users with a conceptual view of the database
Data abstraction in databases refers to the process of simplifying complex details and presenting a high-level view of data to users and applications. It involves creating abstract representations of data that hide the implementation details and complexities of the underlying database structure

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Define data independence

A

Data independence in databases refers to the ability to modify the database schema or organization without affecting the applications that use the data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

What is the difference between physical and logical data independence

A

Physical data independence hides how data is stored, allowing changes to storage structure without affecting users or applications while Logical data independence shields users from changes in the database structure, permitting modifications without disrupting user interactions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

what is a database schema

A

Database schema: the description of the structure of the
database
Database schema: the description of the structure of the
database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What is a conceptual data model?

A

Identifies the high level data structure
* Independent of DBMS, application programs, physical considerations.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What is a Logical data model

A
  • Describes the data in terms of data structures such as
    ― Records (nodes) and sets (edges)
  • e.g., Hierarchical model, network model
    ― Relational tables and columns
  • E.g., Relational model
    ― Object-oriented classes
  • E.g., Object-oriented model, object-relational model
  • Independent of a particular DBMS product and storage
    technology.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What is a physical data model

A
  • Describe how data is stored in the computer, representing
    record structures, record orderings, and access paths.
  • Highly dependent on the target DBMS
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Define a relationship and what a degree of relationship means

A

Relationship (type): meaningful associations among
two or more entities.
* Degree of Relationship: The number of participating
entity types in a relationship e.g Binary if 2, ternary if three and Quaternary if 4

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

What is a recursive relationship?

A

Degree one: recursive
― A relationship between occurrences of the same entity type
― E.g., AcademicStaff (professor) supervises AcademicStaff (lecturer).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Define a candidate Key and a primary key

A

Candidate Key: The minimal number of attribute(s) whose
value(s) uniquely identify each entity occurrence
― A candidate key can not contain a null

Primary Key: The candidate key that is selected to uniquely identify
each occurrence of an entity type
* Principles of choosing the primary key:
― Attribute length
― Minimal number of attributes required
― The future certainty of uniqueness

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

What is a strong/weak entity type?

A

Weak Entity Type: existence-dependent on some other
entity
― Each entity occurrence cannot be uniquely identified using only the attributes associated with that entity type

strong is not that ig

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

Define multiplicity

A

Multiplicity: the number of possible occurrences of an
entity type that may relate to a single occurrence of an
associated entity type through a particular relationship

e.g that 1..* stuff

Representation of Multiplicity
Constraint
Meaning
0..1 Zero or one entity occurrence
1..1 (or just 1) Exactly one entity occurrence
0..* (or just ) Zero or many entity occurrences
1..
One or many entity occurrences
3..7 Minimum of 3 up to a maximum of 7entity
occurrences
1, 5, 10-12 1 or 5 or 10 to 12 entity occurrences

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

What is cardinality and Participation in multiplicity ?

A

Cardinality: the maximum values for the multiplicity ranges
on either side of the relationship.
― One-to-one (1:1)
― One-to-many (1:)
― Many-to-many (
:*)

Participation: the minimum values for the multiplicity
ranges on either side of the relationship
― Optional participation (if 0)
― Mandatory participation (if 1 or more)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

Which side of the relationship does participation appear on?

A

The participation of entities appears on the opposite side of the relationship - google if u confused it’s kinda wierd

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

What is a fan trap and when may it occur?

A

A fan trap is a type of join path between three tables when a “one-to-many” join links a table which is in turn linked by another “one-to-many” join. The fanning out effect of “one-to-many” joins can cause incorrect results to be returned when a query includes objects based on both tables.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

What is a chasm trap and when may it occur?

A

A chasm trap is a type of join path between three tables when two “many-to-one” joins converge on a single table, and there is no context in place that separates the converging join paths.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

What are 9 steps of conceptual database design

A
  • [Step 1] Identify entity types
  • [Step 2] Identify relationship types
  • [Step 3] Identify and associate attributes with entity and
    relationship types
  • [Step 4] Determine primary keys for entity types
  • [Step 5] Identify the multiplicity constraints
  • [Step 6] Consider use of enhanced modeling (optional)
  • [Step 7] Check model for redundancy, traps
  • [Step 8] Validate conceptual model against user
    transactions
  • [Step 9] Review conceptual data model with user
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

Explain specialization in terms of Entities

A

Specialisation: The process of maximizing the differences between members of an entity by identifying their distinguishing characteristics. e.g making more entities from one(Manager, SalesPersonnel and Secretary from Staff)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

Explain generalization in terms of Entities

A

Generalisation: The process of minimizing the differences between entities by identifying their common characteristics.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

What are some properties of relations

A
  • A relation has a name that is distinct from all other relation
    names in the database
  • Each cell of the relation contains exactly one atomic (single) value
  • Each attribute has a distinct name
  • The values of an attribute are all from the same domain
  • Each tuple is distinct – there are no duplicate tuples
  • The order of attributes has no significance
  • The order of tuples has no significance
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

What is a relation schema and a relation instance

A

Relation schema: a named relation defined by the relation name
(table name) and attributes.
― 𝑅(𝐴1, 𝐴2, … , 𝐴𝑛)
Student (stdID, stdEmail, stdFstName, stdSecName, gender, programCode)
Primary Key: stdID, Alternate Key: stdEmail
Foreign Key: programCode references Program(programCode)

  • Relation instance/state: a set of tuples from a relation schema.
    ― r(R)
    ^This is like an actual database table
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

Name the three Integrity constraints put on attribute values

A

― Entity integrity
― Referential integrity
― Domain constrain

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

Define Entity Integrity

A

― Uniqueness: No two tuples can have identical values for candidate keys.
― Not-Null: No attribute of a candidate key can be a NULL.
* Null: A value of an attribute is currently unknown or not applicable
for this tuple.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

Define referential integrity

A

Referential integrity: If a foreign key exists in a relation,
― either the foreign key value matches a primary key value in its home relation
― or the foreign key value is NULL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

Define Domain constraints

A

Domain Constraints: enables the RDBMS to carry out checks
on data errors and to determine the range of operations
that can be carried out on the domain
― E.g., if the attribute Age is specified to be in the range16-30, any number outside would give an error
― E.g., If the attribute Name is to hold a set of characters, then any
multiplication or division operations on the attribute values are illegal

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

Define views

A

View: a subset of base relation(s)
Base relation: A named relation
corresponding to an entity in the conceptual
schema, whose tuples are physically stored
in the database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

Name some features of views

A

Virtual/derived relation. It does not necessarily exist in the
database, but can be produced upon request.
* Dynamic. The changes made to the base relation(s) that affect the
view are immediately reflected in the view.
* Generated by applying appropriate relational operations (e.g., 𝜎,
𝜋, x, ⋈, ….)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

What is the purpose of views in databases

A
  • Security mechanism
  • Customize user’s need
  • Simplify complex operations on the base relations
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

How are composite attributes represented in relation schemas

A

For composite attributes, include only the constituent simple attributes.
So for example staffName which is a composite attribute containing staffFName and staffLName would be ignored and the relation schema would look like this:
AcademicStaff(staffID, staffFName,staffLname, academicPosition, gender,
DOB)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

What is Normalisation and what is the purpose of it?

A

Normalization: A technique for producing a set of relations with
desired properties.
― remove redundancy
― remove potential for insertion, modification, deletion anomalies

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

If A and B are attributes of relation R, when is B functionally dependent on A? Out of A and B which one is the determinant

A

if each value of A is associated with exactly one value of B, this is denoted A -> B
In this case A is the determinant

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

If A -> B and B -> C where is the transitive dependency

A

C is transitively dependent on A via B … A -> C

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
50
Q

Given A -> B what would make this a partial functional dependency?

A

if by removing some attribute(s) from A, the dependency still
holds.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
51
Q

Given A -> B what would make this a full functional dependency

A

B is not functionally dependent on any subset of A.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
52
Q

What should be done to a database to put it into first normal form?

A

Repeating groups must be removed, a primary key must exist and duplicate rows must be removed. Additionaly there must be no order dependency

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
53
Q

What should be done to a database to put it into second normal form?

A

All partial dependencies must be removed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
54
Q

Define a partial dependency

A

Partial Dependency occurs when a non-prime attribute is functionally dependent on part of a candidate key.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
55
Q

What should be done to a database to put it into third normal form

A

All transitive dependencies must be removed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
56
Q

Define a transitive dependency

A

A transitive dependency refers to some non-prime attribute other than the candidate key that depends on another non-prime attribute that is dependent entirely on the candidate key.
^this is hard to remember but it’s not that hard to actually do

Show_ID Telecast_ID Telecast_Type CD_Cost ($)
F08 S09 Thriller 50
F03 S05 Romantic 30
F05 S09 Comedy 20

In this case Telecast_Type needs to be removed cos it depends on telecast ID which in turn depends on Show ID

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
57
Q

What needs to occur for a database to be in Boyce-Codd Normal Form(BCNF)

A

Every determinant must be a candidate key
or
Every non-prime attribute must be functionally dependent on the entire primary key

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
58
Q

What are the Unary operations that are represented by these symbols 𝜋, 𝜎, 𝜌

A

― Projection (𝜋 )
― Selection (𝜎)
― Rename (𝜌)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
59
Q

What are the set operations that are represented by these symbols
∪, ∩, −, X

A

― Union (∪)
― Intersection (∩)
― Set difference (−)
― Cartesian Product (X)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
60
Q

What are the Binary operations represented by these symbols
⋈, (⋉, ⋊), ÷

A

― Join (⋈)
― Outer Join (⋉, ⋊)
― Division (÷)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
61
Q

How is the projection operator used?𝜋

A

𝝅<𝒂𝒕𝒕𝒓𝒊𝒃𝒖𝒕𝒆>(𝑹): Produce a new relation that has only some of
𝑅‘s columns</𝒂𝒕𝒕𝒓𝒊𝒃𝒖𝒕𝒆>

e.g
𝝅ID, Name, Salary(Lecturer)

Removes duplicate tuples

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
62
Q

how is the selection operator used? 𝜎

A

𝝈<𝒔𝒆𝒍𝒆𝒄𝒕>(𝑹): Produce a new relation that contains only
those tuples of R that satisfy the condition.</𝒔𝒆𝒍𝒆𝒄𝒕>

e.g
𝝈𝒅𝒆𝒑𝒕𝑰𝑫=′𝟎𝟐′∧ 𝒂𝒈𝒆>𝟓𝟎(𝑳𝒆𝒄𝒕𝒖𝒓𝒆𝒓) or 𝝈barName=’Sues’(Sells)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
63
Q

How is the rename operator used? 𝜌

A

𝝆𝑺 𝑬 : Rename the expression E to S
* 𝝆𝑺(𝒂𝟏,𝒂𝟐,…,𝒂𝒏) 𝑬 : Rename the relation E to S,
and also re-names the attributes as 𝑎1, 𝑎2, … , 𝑎𝑛.

e.g
Q: List all the professors, rename the output relation as Professor:
𝜌𝑃𝑟𝑜𝑓𝑒𝑠𝑠𝑜𝑟(𝜎𝑝𝑜𝑠𝑖𝑡𝑖𝑜𝑛=′𝑝𝑟𝑜𝑓𝑒𝑠𝑠𝑜𝑟′(𝐿𝑒𝑐𝑡𝑢𝑟𝑒𝑟))

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
64
Q

How is the union operator used? ∪

A

Will take the union of both tables (R U S ), both tables must have the same relation schema for this to work

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
65
Q

How is the intersection operator used? ∩

A

Will take the intersections of both tables (R ∩ S ), both tables must have the same relation schema for this to work

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
66
Q

How is the set difference operator used? -

A

Will take the difference (R - S), records that exist in both R and S are remove from the R table and then that’s outputted. Both tables must have the same relation schema for this to work
𝑅 − 𝑆 ≠ 𝑆 − 𝑅

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
67
Q

How is the Cartesian product operator used?

A

This one is kinda weird
* 𝑹 × 𝑺: Produces a relation that is the
concatenation of every tuple of relation 𝑅 with
every tuple of relation 𝑆.
―If both 𝑅 and 𝑆 have the same attribute 𝐴, then use: 𝑅. 𝐴
and 𝑆. 𝐴

Basically just stick S on the back of R and if they have the same attribute then call it R.attribute and S.attribute. Like for example Bar.barName and Sells.barName

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
68
Q

What are all the join operations

A

― Theta join (Θ-join)
― Equijoin
― Natural join
― Outer join

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
69
Q

Explain how theta (Θ) join works

A
  • 𝑹 ⋈<𝒋𝒐𝒊𝒏> 𝑺: Defines a relation that contains
    all combinations of tuples from 𝑅 and 𝑆 that satisfy the
    join condition</𝒋𝒐𝒊𝒏>
  • 𝑅 ⋈<𝑗𝑜𝑖𝑛> 𝑆 = 𝜎<𝑗𝑜𝑖𝑛>(𝑅 × 𝑆)</𝑗𝑜𝑖𝑛></𝑗𝑜𝑖𝑛>
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
70
Q

Explain how Equijoin works

A
  • Equijoin: Produces all the combinations of tuples
    from that satisfy a join condition with only
    equality comparisons.
    ―Equijoin is a special case of theta-join.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
71
Q

Explain how natural join works

A
  • 𝑹 ⋈ 𝑺: an equijoin of the two relations R and S
    over all common attributes.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
72
Q

What is SQL used for

A
  • SQL is a standard language for interacting with a relational
    database
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
73
Q

What does Data Manipulation Language do and what are the statements used in it?

A

It retrieves and updates statements:
-Select
-Insert
-Update
-Delete

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
74
Q

What does Data Definition Language do and what are the statements used in it?

A

Define DB structure & control access:
― CREATE
― ALTER
― DROP
― GRANT
― REVOKE

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
75
Q

What are the 5 basic select conditions?

A
  • Comparison: =, <> (!=), <, <=, >=, >, OR, AND, NOT
  • Range: BETWEEN … AND, NOT BETWEEN … AND
  • Set membership: IN, NOT IN
  • Pattern match: LIKE, NOT LIKE
  • IS NULL, IS NOT NULL
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
76
Q

What are the 5 Aggregate functions?>

A
  • COUNT
  • MIN
  • MAX
  • SUM
  • AVG
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
77
Q

What does a Select - Between statement look like?

A

SELECT *
FROM Products
WHERE Price BETWEEN 10 AND 15;
-Note you can add a not in front of between if needed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
78
Q

What does a Select - In statement look like

A

SELECT *
FROM Customers
WHERE Country IN (‘Germany’, ‘France’, ‘UK’);
-Note you can add a not before IN for the opposite result

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
79
Q

What does a Select - Like statement look like?

A

SELECT *
FROM Customers
WHERE CustomerName LIKE ‘a%’;

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
80
Q

What are the two types of wildcards in a select Like statement and what do they do?

A

% – represents zero, one, or multiple characters
_ – represents a single character

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
81
Q

How can Is Null/ Is not Null be used in select statements

A

SELECT CustomerName
FROM Customers
WHERE Address IS NULL ;

-Not it must be IS NULL or IS NOT NULL, the IS is necessary

82
Q

What are the 5 aggregate functions?

A
  • COUNT
  • MIN
  • MAX
  • SUM
  • AVG
83
Q

Write a query to count how many countries there are in the customers table

A

SELECT COUNT(DISTINCT Country)
FROM Customers;

84
Q

Write a query to calculate the minimal price, maximal price and average price of all products.

A

SELECT MIN(Price), MAX(Price), AVG(Price)q
FROM Products;

85
Q

What are GROUP BY and HAVING used for in a SELECT statement

A

GROUP BY: forms group of rows with the same column value.
HAVING: filters the groups subject to some condition

This is an example of where they are used in a query

SELECT [DISTINCT] column_name(s)
FROM table_name(s)
[WHERE condition]
[GROUP BY column_name(s)]
[HAVING condition]
[ORDER BY column_name(s)];

86
Q

Using GROUP BY how can i create an SQL statement to count how many customers are from each country

A

SELECT COUNT(CustomerID), Country
FROM Customers
GROUP BY Country;

87
Q

In Grouping queries what can the SELECT list only contain?

A

― The grouping columns
― Aggregate functions.

An example is this:
SELECT COUNT(CustomerID), Country, City
FROM Customers
GROUP BY Country, City;

88
Q

When should the HAVING clause replace the WHERE clause?

A

When aggregate functions are involved;
-HAVING clause always includes at least one aggregate function.
-WHERE clause can not include aggregate function.

89
Q

How do Aliases work?

A

An alias is a temporary name assigned to a table or a column for the duration of a query. Aliases are used to make the output of a query more readable or to shorten lengthy table or column names

90
Q

Define a subquery, which clause it must be inside and how it works

A

Subquery (or nested query) : a complete SELECT statement
embedded in another query.
― In WHERE, or HAVING clause

SELECT *
FROM Customers
WHERE Country = (SELECT Country
FROM Customers
WHERE CustomerName= ‘Ernst Handel’);

91
Q

How does IN / NOT IN work for subqueries?

A

Used in cases where the subquery will output more than one value
If the inner SELECT statement outputs a single value, use =
If multiple values, use IN

92
Q

How does EXISTS / NOT EXISTS work for subqueries?

A

The EXISTS operator returns true if the subquery returns one or more rows. It doesn’t matter what the actual values of the rows are; the existence of any row is sufficient for the EXISTS condition to be true.

93
Q

How do inner joins work?

A

Inner joins in SQL are used to combine rows from two or more tables based on a related column between them. The result includes only the rows where the specified condition is met.

94
Q

How do outer joins work?

A

Outer joins in SQL are used to retrieve rows from one or more tables, even if there is no match between the joined tables. Unlike inner joins, which return only the matching rows, outer joins return all rows from at least one of the tables, with the non-matching rows filled with NULL values for columns from the table that does not have a corresponding match.

95
Q

What are the types of outer join and the difference between them?

A

― LEFT JOIN: returns all rows from the left table
(table1), with the matching rows in the right table
(table2). The result is NULL in the right side when
there is no match.

― RIGHT JOIN: …

― FULL JOIN: Retain both the left table and right table
rows. The result is NULL for those no matches.

96
Q

How could i use inner join to satisfy this query? List all the customer’s orderID and order dates. Assume there is a customers and Orders table

A

SELECT o.OrderID, c.CustomerName, o.OrderDate
FROM Orders o
Natural JOIN Customers c;

97
Q

What does self join do?

A

A self join in SQL is a specific case where a table is joined with itself. This is useful when you want to compare rows within the same table, treating it as if it were two separate tables. To perform a self join, you typically use aliases to differentiate between the columns from the same table

98
Q

How does an INSERT INTO statement work

A

Insert adds new rows to the table
Syntax is:
INSERT INTO <table>
(Column1, Column2)
VALUES
(19, ‘ThatValue’),
(23, ‘ThisValue’);

or with a whole record the columns are not needed
INSERT INTO Orders
VALUES
(10447, 90, 6, ‘2017-05-25’,3);

99
Q

What is the syntax for an update statement

A

Example 1 : Give all the products’ price a 3% increase
UPDATE Products
SET Price=Price*1.03;

Example 2

UPDATE Customers
SET ContactName = ‘Alfred Schmidt’,
City= ‘Frankfurt’
WHERE CustomerID = ‘001’;

100
Q

What is the syntax for a delete statement

A

DELETE FROM <table>
WHERE <condition></condition>

101
Q

What is the syntax for table creation

A

CREATE TABLE IF NOT EXISTS Tasks (
task_id INT AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255) NOT NULL,
start_date DATE,
due_date DATE,
status TINYINT NOT NULL,
description TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

IF NOT EXISTS is optional

102
Q

What are the main data types in MySql

A

Numeric types
-INT, TINYINT, DECIMAL, FLOAT, DOUBLE
string types;
-CHAR, VARCHAR, TEXT
Dat/time types
-DATE
-TIME
-DATETIME

103
Q

What are the 5 constraints?

A

– NOT NULL
– UNIQUE
– PRIMARY KEY
– FOREIGN KEY
– CHECK

104
Q

What does the NOT NULL constraint do?

A

NOT NULL: ensures values stored in a column are
not NULL.

105
Q

What does the UNIQUE constraint do?

A

UNIQUE: ensures values in a column or a group of
columns are different
used like :

CREATE TABLE table_name(
…,
column_name data_type UNIQUE,

);

106
Q

How can constraints work for more than one column

A

CREATE TABLE table_name(
…,
[CONSTRAINT constraint_name] UNIQUE(Column_list)
);

107
Q

What does the check constraint do?

A

CHECK: ensures that values stored in a column or group
of columns satisfy a Boolean expression.

used like :
CREATE TABLE table_name(
…,
column_name data_type CHECK(Boolean_expression),

);

108
Q

How is the primary key constraint used?

A

PRIMARY KEY
― Implicitly means NOT NULL + UNIQUE
― A table can have one and only one primary key.

CREATE TABLE table_name(
…,
column_name data_type PRIMARY KEY,

);

109
Q

How is the foreign key constraint used?

A
  • FOREIGN KEY: refers to the PRIMARY KEY in another table

-The domain of the foreign key must be the same as the domain used in the referenced table

CREATE TABLE Products(
productId INT AUTO_INCREMENT PRIMARY KEY,
productName VARCHAR(100) NOT NULL,
categoryId INT,
CONSTRAINT fk_category FOREIGN KEY (categoryId)
REFERENCES categories(categoryId)
) ENGINE=INNODB;

110
Q

What do these three referential options that can be applied to the foreign key do? CASCADE, SET NULL and RESTRICT/NO ACTION

A

― CASCADE: If a row from the parent table is deleted or updated, the
values of the matching rows in the child table automatically deleted or
updated.
― SET NULL: When referenced data in the parent key is deleted or updated,
all rows in the child table that depend on those values have their foreign
keys set to NULL.
― RESTRICT/NO ACTION: Reject the delete/update operation from the
parent table, if a row from the parent table has a matching row in the
child table.

111
Q

Using ALTER TABLE how could you add / drop columns?

A

ALTER TABLE Persons
ADD COLUMN dateOfBirth date NOT NULL,
ADD COLUMN joinDate date default CURRENT_DATE();

ALTER TABLE Persons
DROP COLUMN dateOfBirth;

112
Q

Using ALTER TABLE how could you modify existing columns?

A

ALTER ALTER TABLE Vehicles
MODIFY year SMALLINT NOT NULL,
MODIFY color VARCHAR(20) NULL AFTER make;

113
Q

Using ALTER TABLE how can i add or drop a table constraint?

A

Add a table constraint:
ALTER TABLE table_name
ADD [CONSTRAINT constraint_name] UNIQUE/CHECK/PRIMARY KEY/FOREIGN KEY (…);

drop a table constraint
ALTER TABLE table_name
DROP CONSTRAINT constraint_name;

drop primary/ foreign key
ALTER TABLE table_name
DROP PRIMARY KEY;

ALTER TABLE table_name
DROP FOREIGN KEY constraint_name;

114
Q

Using ALTER TABLE how can i add a constraint for a single table attribute?

A

ALTER TABLE Persons
ADD CONSTRAINT UC_person UNIQUE(ID,lastName);

115
Q

How do i drop a table in SQL?

A

DROP TABLE Shippers;

116
Q

What is AUTO_INCREMENT and how can it be assigned as a constraint

A

Auto-increment allows a unique number to be generated
automatically when a new record is inserted into a table.
― Typically for primary key
― In MySQL, assign AUTO_INCREMENT to a column

example:

CREATE TABLE Persons (
personId int NOT NULL AUTO_INCREMENT PRIMARY KEY,
lastName varchar(255) NOT NULL,
firstName varchar(255),
Age int
);

117
Q

How can we let the AUTO_INCREMEMENT start with another value instead of 1?

A

ALTER TABLE Persons AUTO_INCREMENT 100;
^add a number on the end ¯_(ツ)_/¯

118
Q

What is an index and what does it do?

A

A data structure such as B-Tree that improves the speed of
data retrieval on a table at the cost of additional writes and
storage to maintain it.

Note: Updating a table with indexes takes more time than updating a table without. So, only create indexes on columns that will be frequently searched.

119
Q

How can we index an attribute in SQL and how can we add an index to an already existing attribute?

A

INDEX ( attributeName )
like :
CREATE TABLE Persons (
personid int NOT NULL AUTO_INCREMENT PRIMARY KEY,
lastName varchar(255) NOT NULL,
firstName varchar(255),
age int,
INDEX (lastName)
);

add to existing attribute:

CREATE INDEX idx_persons ON Persons(lastName);

120
Q

Note : MYSQL automatically creates a special index named PRIMARY for the primary key

A
121
Q

How can i create a unique index and what does it do?

A

CREATE UNIQUE INDEX emp_u1
ON EMP (eName, job);

This will Index and ensure that each combination of eName and job are unique.

122
Q

What is the basic syntax to create a view?

A

CREATE [OR REPLACE] VIEW ViewName [column1,…] AS
SELECT …;

example horizontal view
CREATE VIEW Emp_D20 AS
SELECT * FROM Emp
WHERE deptNo = 20;

example vertical view
CREATE VIEW Emp_NoSalary AS
SELECT empNo, eName, job, mgr, hireDate, comm, deptNo
FROM Emp;

123
Q

How can i create a view for multiple tables?

A

CREATE VIEW Num_Emp_Job(deptNo,job,cnt) AS
SELECT deptNo,job,COUNT(*)
FROM Emp
GROUP BY deptNo, job
ORDER BY deptNo;

124
Q

What does view updatability mean and in what case will it not work?

A

All updates to a base table are immediately reflected in all views
that encompass that base table. Similarly, we may expect that if a
view is updated then the base table(s) will reflect that change.

― For a view to be updatable, the DBMS must be able to trace any row or column back to its row or column in the base table. e.g SELECT statement defining the view must be simple, meaning it doesn’t involve complex expressions, aggregates, GROUP BY, DISTINCT, JOINs, or subqueries

125
Q

Define a commit and a rollback

A
  • COMMIT
    ― Causes all outstanding changes that you have made to become
    permanent.
  • ROLLBACK
    ― Causes all uncommitted changes to be discarded
126
Q

Define database implementation

A
  • Physical realization of the database and application designs
    ― Use DDL to create database schemas
    ― Use DML to populate the data
    ― Develop the application programs
  • Database transactions are implemented using the DML, possibly
    embedded in a host programming language.
127
Q

Define Physical Database Design (PDD)

A

The process of producing a description of the implementation
of the database on secondary storage.
― Describe the base relations, file organizations, and indexes used to
achieve efficient access to the data, and any associate integrity
constraints and security measures.

128
Q

What are some prerequisites before PDD can occur?

A

― Logical database design
* Minimally in 3rd normal form
― Transaction analyses
* Most frequent transactions
* Most complex or resource-insensitive transactions
* Distributions of transactions, over time
* Most critical transactions to the applications
― Which DBMS software to use

129
Q

What are the steps of PDD

A

STEP1: Translate the logical data model for target DBMS
― 1.1 Design base relations
― 1.2 Design representation of derived data
― 1.3 Design general constraints
STEP2: Design file organizations and indexes
― 2.1 Choose file organizations
― 2.2 Choose indexes
― 2.3 Estimate disk space requirement
STEP3: Design user views
STEP4: Design security mechanisms
STEP5: Consider controlled redundancy
STEP6: Monitor and tune the operational system

130
Q

How do you design base relations in PDD?

A
  • Defining Tables and Columns: including naming objects,
    choosing data types and lengths, and handling null, default values
  • Defining Primary Keys: may considering the use of surrogate keys
  • Handling referential integrity for foreign keys.

^This is basically this thingy , OrderDetails(OrderID, ProductID, Quantity)

131
Q

What is a surrogate key?

A

A surrogate key is a unique identifier assigned to each record or row in a database table. Unlike natural keys, which are based on existing data attributes and have some inherent meaning, surrogate keys are system-generated and lack any meaningful information about the data they represent. Surrogate keys are typically used as primary keys in database tables.

132
Q

How do you represent derived data in PDD?

A

Remember derived data is the data that’s calculated based on the database and not physically stored

  1. Use a generated column
  2. Create a view
  3. Use triggers*
    Triggers are stored programs executed automatically to
    respond to specific events such as inserting, updating or
    deleting data.
  4. Calculate it each time
133
Q

What is the storage structure for a database?

A

A database is stored as a collection of files. A database file is
partitioned into fixed-length storage units called blocks.
What is a block?

Block/Page: A contiguous sequence of sectors from a single
track

134
Q

What are some properties about primary secondary and stable storage for databases and give an example of each

A

-Primary storage is volatile ( e.g main memory/cache memory
-Primary storage is where the operations happen
-Data blocks can be moved from primary to secondary storage
-Secondary storage is where the entire database is stored e.g(magnetic disk, flash drives etc)
-Stable storage is where the database is backed up (e.g RAID or a Redundant Array of Independent Disks)
-Stable storage is always non-volatile with independent failure modes

135
Q

Define File Organisation

A

The physical arrangement of data in a file

136
Q

What are 3 ways to organize blocks/pages in a file

A

― Heap (unsorted) files: Records are placed on disk in no particular
order
― Sequential (ordered) files: records are ordered by the value of a
specific attribute.
― Hash files: Records are placed on disk according to a hash function

137
Q

What are the two types of index and what do they do?

A

Clustered Index:
With a clustered index the rows are stored physically on the
disk in the same order as the index.
― Only one clustered index is allowed for each table
― If the indexed attribute chosen is a key of a relation, the index is
called primary index.

Non-Clustered(secondary) index:
An index which maintains a
logical ordering of data rows without altering the physical
ordering of the rows.
― A table could have several secondary indexes
― Foreign keys are good candidates for non clustering indexes.

138
Q

Define Deonormalization and when it should be used

A

Denormalization is the process of trying to improve the read
performance of a database, at the expense of losing some write
performance, by adding redundant copies of data or by
grouping data.

― If performance is unsatisfactory and a relation has a low update and a very high query rate.

139
Q

When attempting controlled redundancy why would we create new tables/views?

Why would we Split tables?

A

They can hold data derived in columns from other tables - stops it having to be derived freshly each time

Decompose very large relations into smaller and more
manageable pieces.

140
Q

What is a transaction?

A

Transaction: an executing program that forms a logical unit
of database operations
A transaction includes one or more database operations, e.g.,
insertion, deletion, updating, retrieval.

141
Q

Define ACID in database transactions and what each letter stands for

A
  • Atomicity: All or nothing
    e.g shouldn’t take money from A without giving it to B
  • Consistency: transforms the database from one
    consistent state to another consistent state
    e.g money isn’t lost or gained

Isolation: partial effects of incomplete transactions
should not be visible to other transactions
e.g. other transactions shouldn’t see such a change until completion.

  • Durability: successfully committed transactions are
    permanently recorded in the database, not lost, even in
    the event of a system failure
    e.g. After completion, such a change in A and B is saved in the database.
142
Q

What is the SQL to start a transaction

A

START TRANSACTION;

<operations>
COMMIT;

e.g

START TRANSACTION;

UPDATE Accounts
SET balance = balance - 100
WHERE account = 'A';

UPDATE Accounts
SET balance = balance + 100
WHERE account = 'B’;

COMMIT;
</operations>

143
Q

What is a database Schedule and name and define the two types

A

Schedule: an ordering of operations for concurrent transactions. There are Serial schedules and Non-serial schedules

Serial schedule: a schedule in which the operations for concurrent
transactions are not interleaved.
* But, unacceptable in practice.

Serial schedule: a schedule in which the operations for concurrent
transactions are not interleaved.
* But, unacceptable in practice.

144
Q

What are two desired properties for a schedule

A

Serializability and Recoverability

145
Q

Define a Conflicting operation in a database schedule

A

Conflicting operations: if two operations belong to different
transactions, access the same database item, and at least one
operation is a writing operation.

146
Q

Define when a Transaction has a recoverable schedule

A

A schedule is considered recoverable if, in the presence of failures (such as crashes or errors), it is possible to restore the database to a consistent state.

a schedule, where, for each pair of transactions Ti and Tj, if Tj reads a
data item previously written by Ti, then Ti should commit before Tj.

147
Q

Define concurrency control

A

Concurrency Control is the process of managing simultaneous
operations on the database without having them interfere with
one another.

148
Q

State the two types of concurrency control and what they ential

A

― Pessimistic: Assumes that conflict is likely and take steps to prevent it.
* Locking, timestamping
― Optimistic: Assumes that conflict is unlikely and only checks for it
when transaction commits

149
Q

What does a lock do in database transactions?

A

Locking: When a transaction is accessing the database, a lock may
deny access to other transactions to prevent incorrect results.
― A read operation must acquire a shared/read lock
― An update operation must acquire an exclusive/write lock

150
Q

What can locks be applied to?

A

Locks can be applied to:
― The whole database
― A file
― A page/a disk block
― A record
― A field value of a record

151
Q

In a shared/exclusive locking scheme what must be issued before any read(X)

A

A transaction must issue read_lock(X), or write_lock(X) before
any read(X).

152
Q

In a shared/exclusive locking scheme what must be issued before any write(X)

A

― A transaction must issue write_lock(X) before any write(X)

153
Q

In a shared/exclusive locking scheme what must be issued after read(X) or write(X)

A

― A transaction must issue unlock(X) after read(X) or write(X)

154
Q

Does a shared/exclusive locking scheme guarantee stability?

A

Using shared/exclusive locking scheme does not
guarantee serializability. More strict protocols are
required, e.g. two-phase locking (2PL).

155
Q

Define two-phase locking and each of the two phases in it

A

Two-Phase Locking (2PL): All locking operations precede the first
unlock operation in a transaction.
― Growing/Expanding phase: during which new locks on items can be acquired but none can be released;
― Shrinking phase: during which existing locks can be released but no new locks can be acquired.
* Advantage: it assures serializability

156
Q

What problem can two phase locking cause

A

It can cause a deadlock which is a circular situation where each of two (or more) transactions are waiting to acquire locks that are held by the other.

157
Q

How can deadlocks be prevented by using timeouts?

A

A transaction will wait for a (database defined) period
to acquire a lock.
* If this time runs out then the whole transaction is
rolled back and restarted.

158
Q

How can deadlocks be prevented by the use of a wait-for-graph(WFG)

A
  • A Wait-for Graph (WFG) is constructed
    ― node for each transaction
    ― directed edge from transaction T1 to transaction T2 if T1 is waiting to lock an item currently held by T2
  • A deadlock exists if the graph contains a cycle.
159
Q

What needs to happen when a deadlock is detected? One issue that can occur in deadlock prevention is starvation, define starvation?

A
  • Once deadlock is detected, DBMS needs abort one or more
    of the transactions.
    -The transaction with the minimum cost should be rolled back
  • Starvation: the same transaction is always chosen as a victim,
    and the transaction can never complete.
160
Q

What are the two deadlock prevention algorithms and what do they do?

A
  • Wait-die algorithm
    ― only older transactions can wait for younger ones
  • Wound-wait algorithm
    ― only younger transactions can wait for older ones
161
Q

Define a timestamp in transactions and explain timestamping

A

Timestamp: A unique identifier created by the DBMS that
indicates the relative starting time of a transaction.
― Either a logical counter or the system clock.
* Timestamping: a concurrency control protocol that orders
transactions in such a way that older transactions get priority in
the event of conflict

162
Q

Explain optimistic techniques

A

Optimistic techniques assume that conflict is rare.
― Before a transaction commits, a check is performed to determine
whether conflict has occurred.
― If yes, roll back and restart.

163
Q

Define a backup and the three different kinds

A

A backup is a copy of the database taken periodically and stored in a safe place which allows the database to be restored with an acceptable loss of data

  • Full backup
    ― The whole database.
  • Differential backup
    ― All changes made since the last full backup
  • Incremental
    ― All changes made since the last incremental backup`
164
Q

Define database recovery and the two main techniques used for it

A

Recovery: The process of restoring the database to a correct
state in the event of a failure.

Two main recovery techniques
― Log-based
― Shadow paging

165
Q

Explain a log and how it can be used as recovery

A

Log: a sequence of records, which maintains the records of
actions performed by transactions, recovery can be performed by reversing these operations

166
Q

Explain check-pointing in a log file and the operations involved

A

Checkpoints are scheduled at predetermined intervals and
involve the following operations:
― Write all log records in main memory to secondary storage
― Write the modified blocks in the buffers to secondary storage.
― Write a checkpoint record to the log file. This record contains
transactions which are active at the time of the checkpo

167
Q

When a crash occurs in a database how are logs and Checkpoints used?

A

When a failure occurs,
― The system check the log file backwards from the end to the last
checkpoint.
― REDO all the transactions that committed since the last checkpoint.
― UNDO all the transactions that were active at the time of the crash.

168
Q

What are the two types of database updating schemes and what do they entail?

A

Two types of DB updating schemes:
― Deferred update: Updates are not written to the database until after a transaction has reached its commit point.
― Immediate update: Updates are applied to the database as they occur without waiting to reach the commit point.

169
Q

Define shadow paging

A

Maintain two tables during the lifetime of a transaction:
― A current page table
― A shadow page table

During a transaction the current page is copied into a shadow page table and the shadow page table is never modified during transaction execution. If the transaction fails the current page table is discarded and the shadow table is used instead

170
Q

What are advantages and disadvantages of shadow paging?

A

Advantage:
― The overhead of maintaining the log file is eliminated.
― The recovery is significantly faster since there is no need for
UNDO or REDO operations.

Disadvantage:
― Data fragmentation
― Need for periodic garbage collection to reclaim inaccessible
blocks

171
Q

What are some things that can cause security issues in databases

A

Caused by
– Unauthorized users
– Authorized users
– Programmers
- DBA

172
Q

What can bad security cause a loss of (CIA)

A

― Confidentiality: the protection of data from unauthorized disclosure
― Integrity: the requirement that information is protected from improper
modification.
― Availability

173
Q

Explain Discretionary Access Control (DAC)

A

― Discretionary Access Control (DAC)
* The owner of the object specifies which subjects can access the object.
* Used by most commercial DBMSs
* Supported by SQL: GRANT and REVOKE

174
Q

Explain Mandatory Access control

A

Mandatory Access Control (MAC)
* The system (not the users) specifies which subjects can access specific data objects.
* Not supported by SQL

175
Q

How would i use SQL to grant rfc@localhost the INSERT privilege on classicmodels database?

A

GRANT INSERT
ON classicmodels.*
TO rfc@localhost;

Note: the opposite of GRANT is REVOKE

176
Q

How does the Bell-LaPadula model work?

A

A subject S is only allowed to read an object O if it has a higher level of clearance than the object

― Class(O): the classification of an object O (relation, tuple, column, view, operation)
― Class(S): the clearance of a subject S (user, account, program)

177
Q

What are the 4 security Levels in the Bell-LaPadula model?

A

Top Secret (TS) > Secret (S) > Confidential (C) > Unclassified (U)

178
Q

How can views be used for security?

A

The view mechanism provides a powerful and flexible security
mechanism by hiding parts of the database from certain users.

179
Q

What are the 3 different targets of encryption in a database?

A

― Disk
― Database
― Network traffic

180
Q

How can an audit be used in database security?

A

Rather than preventing certain actions, record who did what.
― Keep track of all operations from each user through each login
session.
― The audit trail (or system log) itself must be secured

181
Q

What is a Redundant Array of Independent Disks(RAID) and what is it used to do?

A

A data storage technique that combines multiple physical disk drive
components into one logical unit
*improves reliability
* increases performance
* improves availability

182
Q

What is Data striping is a technique used in a RAID, what does it do?

A

Data striping: the data is segmented into equal-size partitions (the
striping unit) which are transparently distribute across multiple
disks.
― Improves overall I/O performance by allowing multiple I/Os to be serviced in parallel.

183
Q

What is a Parity (or error correcting) scheme ?

A

Each byte may have a parity bit associated with it that records whether the number of 1s is even or odd.

― Even parity: number of 1’s including parity bit is even
* E.g., 0110110 → 01101100, 0101100 → 01011001
― Odd parity: number of 1’s including parity bit is odd
* E.g., 0110110 → 01101101, 0101100 → 01011000
― Improves reliability

184
Q

What are the 4 different RAID schemes?

A

― RAID 0 : striping
― RAID 1: mirroring
― RAID 10
― RAID 5

185
Q

What is a pro and a con for RAID 0: striping

A

― Pros: improved I/O performance
― Cons: complete vulnerability to drive failures

186
Q

What is a pro and a con for RAID 1: mirroring

A

― Pros: Fast read, availability improvement
― Cons: Write throughput is always slower because every drive
must be updated. Most expensive storage solution.

187
Q

What is a pro and a con for RAID 10: striping and mirroring

A

― Pros: improved I/O performance, rebuilding time is fast if one drive fails.
― Cons: Still expensive because of mirroring.

188
Q

What is a pro and a con for RAID 5: striping and parity

A

― The parity data is also striped across all the disk.
― Pros: Resilience to a single drive failure, fast reading.
― Cons: Writing is slower (due to the parity needed be calculated), complex technology.

189
Q

What is a distributed database?

A

Distributed database: a collection of multiple interconnected
databases, which are
― spread physically across various locations
― communicate via a computer network.

190
Q

What are two desired characteristics for distributed databases?

A
  1. Distribution/location transparency
    ― Users perceive the database as a single, logical entity.
  2. Local autonomy
    ― Local data is locally owned and managed.
    ― Each local database operated independently.
    ― Local systems are able to operate effectively even if remote databases
    are not available.
191
Q

What are two types of distributed Database?

A

Homogeneous
― All sites use the same type of DBMS.

Heterogeneous
― Sites run different DBMSs.
― Translation using gateways: convert the language and model of each different DBMS into the language and model of the relational system.

192
Q

What is Data fragmentation in databases?

A

Fragmentation is to divide a table into a set of smaller tables to be
stored in different sites. The subsets of the table are
called fragments.
― Horizontal
* Subsets of rows are stored at different sites.
― Vertical
* Subsets of attributes are stored at different sites.
― Hybrid (Mixed)

193
Q

What is the correctness rule in Data Fragmentation

A

Correctness rules:
― Completeness: Each data item can be found in R must appear in at least one fragment.

― Reconstruction
* Horizontal: 𝑅 = 𝑅1 ∪ 𝑅2 … ∪ 𝑅𝑛
* Vertical: 𝑅 = 𝑅1 ⋈ 𝑅2 … ⋈ 𝑅𝑛

― Disjointness
* Horizontal: The fragments are disjoint.
* Vertical: The fragments are disjoint except for the primary key

194
Q

Define data allocation

A

Data allocation: The process of assigning each fragment to a
particular site in the distributed system.

195
Q

What are three types of Data replication and allocation?

A
  • Fully replication
    ― A complete copy of the DB at each site
  • Non-redundant allocation
    ― Each fragment is stored at exactly one site
  • Partial replication
    ― Certain transactions that access particular parts of the DB are mostly submitted at a particular site, that fragment can be allocated at that site only.
    ― Data that is accessed at multiple sites can be replicated at those sites. If many updates are performed, it may limit replication.
196
Q

What are some advantages of a distributed database?

A
  • Reflects organizational structure
  • Easier expansion
  • Increased reliability and availability
  • Better response/Improved performance
197
Q

What are some disadvantages of a distributed database?

A
  • Complex to design and implementation
  • Harder to control security and integrity
  • Lack of standard and experience
198
Q

Define a data warehouse and some properties

A

Data warehouse is a database, which
– is kept separate from the organization’s operational database,
– no frequent updating
– possesses consolidated historical data
– helps the organization to analyse its business

199
Q

What are some key features on a data warehouse

A

― Subject Oriented − e.g., a sales data warehouse, or a credit card
transaction data warehouse, etc.
― Integrated − data combined from multiple sources.
― Time Variant − looking for changes over time
― Non-volatile − the previous data is not erased when new data is added to it. Frequent changes in operational database is not reflected in the data warehouse.

200
Q

Define OLAP

A

Online Analytical Processing (OLAP):
― Uses multi-dimensional view of aggregate data to provide quick access to strategic information

201
Q
A