All Flashcards

Get Smarter Yo

1
Q

What is a database?

A

A shared collection of logically related data designed to meet needs of organisation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is a system catalog (metadata)?

A

Provides description of data to enable program-data independence

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What defines logically related data?

A

Comprises of entities, attributes, and relationships of an organisations information

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is a Database Management System?

A

Software system enabling users to define, create, maintain and control access to a database

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What control services does a DBMS provide? (4 total)

A

 A security system
 An integrity system
 A concurrency control system
 A user-accessible catalog

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What services does a DBMS provide? (10 total)

A
	Data storage, retrieval and update
	User-accessible catalog
	Transaction support
	Concurrency control services
	Recovery services
	Authorisation services
	Support for Data Communication
	Integrity Services
	Services to promote data independence
	Utility services
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What advantages does a DBMS provide? (14 total)

A
	Control of data redundancy
	Data consistency
	More information from same amount of data
	Sharing data
	Improved data integrity
	Improved security
	Enforcement of standards
	Economy of scale
	Balance conflicting requirements
	Improved data accessibility and responsiveness
	Increased productivity
	Improved maintenance through data independence
	Increased concurrency
	Improved backup and recovery services
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What disadvantages does a DBMS bring with it? (7 total)

A
	Complexity
	Size
	Cost
	Additional hardware costs
	Cost of conversion
	Performance
	Higher impact of failure
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What is a DB Application Program?

A

A computer program that interacts with a DB by issuing an appropriate request to the DBMS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What does a Data Definition Language do?

A

A DDL permits specification of data types, structures and any data constraints

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

What does a Data Manipulation Language do?

A

A DML is a general enquiry facility, a query language, of the data, providing basic manipulation operations on data in the DB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

What defines a procedural DML do?

A

Allows users to tell system exactly how to manipulate the data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What defines a Non-Procedural DML?

A

User states what data is needed, not how to retrieve it

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is a data model?

A

An integrated collection of concepts for describing data, relationships between data, and constraints on the data in an organisation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What components make up a data model? (3 total)

A

 Structural part
 Manipulative part
 Possibly a set of integrity rules

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is the purpose of a data model?

A

Provides a way to represent data in an understandable way

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What categories of data model are there? (3 total)

A

 Object Based
 Record Based
 Physical

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

What is a system catalog? ( 2 points)

A

A repository of information, or metadata, describing the data in the DB and a fundamental component of any DBMS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

What is middleware?

A

Software that mediates with other software and allows for communication between disparate applications in a heterogeneous system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

When is middleware needed?

A

When distributed systems become too complex to manage efficiently without a common interface

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

What is a Web Service?

A

A software system designed to support interoperable machine-to-web service machine interaction over a network

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

What does XML stand for?

A

Extensible markup language

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What does SOAP stand for?

A

Simple object access protocol

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

What does SOAP do?

A

Platform and language independent communication protocol for exchanging structured information over the internet and based on XML.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

What does WSDL stand for?

A

Web services description language

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

What does WSDL do?

A

Protocol based on XML and used to describe and locate web services

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

What does UDDI stand for?

A

Universal discovery, description and integration

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

What does UDDI do?

A

Platform independent, XML based registry protocol for businesses to list themselves on the internet

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

What is a distributed database?

A

A logically interrelated collection of shared data, physically distributed over a computer network

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

What is the cloud computing?

A

A model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service provider interaction.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

What is a transaction?

A

An action or series of actions carried out by a user or applications, which reads or updates DB contents, forming a logical unit of work on the DB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

What do each of the ACID properties stand for, and what do they mean?

A

Atomicity
 Transactions treated as single unit

Consistency
 Transactions only bring DB from one valid state to another

Isolation
 Ensures concurrent execution of trans leaves DB in same state as if they were executed sequentially
 Main goal of concurrency control

Durability
 Guarantees transaction remains committed, even after system failure

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

What are the objectives of a Three-Level Architecture? (6 total)

A
  • All users should be able to access the same data
  • A users view is immune to changes made in other views
  • Users should not need to know physical DB storage details
  • DBA should be able to change DB storage structures without affecting users’ views
  • Internal structure of DB should be unaffected by changes to physical aspects of storage
  • DBA should be able to change conceptual structure of DB without affecting users’
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

What is the external level and what does it describe?

A
  • Users view of the DB

* Describes part of the DB that is relevant to a particular user

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

What is the conceptual level and what does it describe?

A
  • Community view of the DB

* Describes what data is stored in DB and relationships among data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

What is the internal level and what does it describe?

A
  • Physical representation of the DB on a computer

* Describes how the data is stored in the DB

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

What does logical data independence provide? (3 total)

A
  • Immunity of external schemas to changes in conceptual schema
  • Conceptual schema changes
  • Shouldn’t require changes to external schema or rewrites of application programs
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

What does physical data independence provide? (3 total)

A
  • Immunity of conceptual schema to changes in internal schema
  • Internal schema changes
  • Shouldn’t require changes to conceptual or external schemas
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

What defines an object based data model? (4 total)

A
  • Entity relationship
  • Schematic
  • Functional
  • Object-Oriented
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

What types of record based data model exist? (3 total)

A
  • Relational Data Model
  • Network Data Model
  • Hierarchy Data Model
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

What information does a system catalog store? (5 total)

A
  • Names, types and sizes of data items
  • Constraints on the data
  • Names of authorised users
  • Data items accessible by a user and the type of access
  • Usage statistics
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

In a multi-user DBMS architecture, what defines teleprocessing? (3 total)

A
  • One computer with a single CPU and a number of terminals
  • Processing performed within the same physical computer
  • Terminals are ‘dumb’, incapable of functioning on their own, cabled to central computer
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

In a multi-user DBMS architecture, what defines a file-server architecture? (4 total)

A
  • Processing distributed about the network, typically a local area network
  • File server connected to several workstations across a network
  • DB resides on the file-server
  • DBMS and applications run on each workstation
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

What are the disadvantages of a file-server mulit-user DBMS architecture? (3 total)

A

o Significant network traffic
o Copy of DBMS on each workstation
o Concurrency, recovery, and integrity control more complex

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

What defines each of the tiers in a two-tier client server architecture?

A
  • Tier 1: Client manages UI and runs applications

* Tier 2: Server holds DB and DBMS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

What are the advantages of a two-tier client server? (5 total)

A
o	Wider access to existing DBs
o	Increased performance
o	Possible reduction in hardware costs
o	Reduction in communication costs
o	Increased consistency
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

What makes a three-tier client server better than a two-tier server DBMS architecture?

A

Much more scalable compared to the two-tier client server system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

What are the advantages of a three-tier client server? (5 total)

A

o ‘Thin’ client requiring less expensive hardware
o Application maintenance centralised
o Easier to modify or replace one tier without affecting others
o Separating business logic from DB functions makes it easier to implement load balancing
o Maps quite naturally to Web environment

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

What defines an n-tier architecture?

A

Application servers host API to expose business logic and business processes for use by other applications

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
50
Q

What do you gain from extending a three-tier architecture to n-tier?

A

Additional tiers provide more flexibility and scalability

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
51
Q

What defines a web service and service-oriented architecture? (3 total)

A
  • Shares business logic, data, and processes through a programmatic interface across a network
  • Developers can add web service to a web page, or to an executable program, to offer specific functionality to users
  • Use technologies such as: XML, SOAP, WSDL, UDDI
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
52
Q

What three factors can describe a distributed database in greater detail? (3 total)

A
  • Software system that permits the management of the distributed DB and makes the distribution transparent to users
  • Consists of a single logical DB split into a number of fragments, each stored on one or more computers under the control of separate DBMS, with the computers connected by a network
  • Each site capable of independently processing user requests that require access to local data and is also capable of processing data stored on other computers in the network
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
53
Q

In cloud computing, what is an on-demand self-service?

A

Consumers can obtain, configure an deploy cloud services without help from the provider

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
54
Q

In cloud computing, what is a broad network access service?

A

Accessible from anywhere, from any standardised platform

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
55
Q

What subclasses of cloud computing SaaS’s exist and how do they differ?

A

DaaS and DBaaS, differing only in data management

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
56
Q

What defines a DBaaS? (3 total)

A
  • Offers full DB functionality to application developers
  • Management layer that provides continuous monitoring and configuration of the DB
  • Spares developers from the ongoing DB administration tasks
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
57
Q

What are the advantages of a DBaaS? (4 total)

A

o Optimised scaling
o High availability
o Multi-tenancy
o Effective resource allocation in the cloud

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
58
Q

What defines a DaaS? (3 total)

A
  • Services enable data definition in the cloud and subsequent querying
  • Doesn’t implement typical DBMS interfaces but instead data is accessed via common APIs
  • Enables organisation with valuable data to offer access to others
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
59
Q

What components make up a DBMS and what are their functions? (14 total)

A

Query processor
• Transforms queries into a series of low-level instructions directed to the DB manager

DB Manager
• Interfaces with user-submitted application programs and queries
• The DM examines external and conceptual schemas to determine conceptual records to satisfy the request
• DM then paces call to file manager to perform request

File Manager
• Manipulates underlying storage files and manages allocation of storage space on disk
• Establishes and maintains list of structures and indexes defined in the internal schema

DML Pre-processor
• Converts DML statements into standard function calls in host language.
• Must interact with query processor to generate appropriate code

DDL Compiler
• Converts DDL statements into a set of tables containing metadata
• Tables then stored in system catalog while control information is stored in data file headers

Catalog Manager
• Manages access to and maintains the system catalog

Authorisation Control
• Confirms whether the user has the necessary permission to carry out the required operation

Command Processor
• Control passed to this component once user authority is confirmed

Integrity Checker
• Ensures requested operation satisfies all necessary integrity constraints for an operation that changes the DB

Query optimiser
• Determines optimal strategy for query execution

Transaction manager
• Performs required processing of operations received from transactions.

Scheduler
• Ensures concurrent operations on DB proceed without conflicting with one another
• It controls the relative order in which transaction operations are executed

Recovery manager
• Ensures the DB remains in a consistent state in the presence of failures
• It is responsible for transaction commit and abort.

Buffer manager
• Responsible for the transfer of data between main memory and secondary storage, such as disk and tape.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
60
Q

What does a transaction do?

A

Transforms DB from one consistent state to another, although consistency may be violated during a transaction

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
61
Q

What defines a successful transaction?

A

Transaction is committed to the DB, reaching a new consistent state

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
62
Q

What defines a transaction failure?

A

Transaction aborts, and the DB must be restored to a consistent state before it is started via roll-back

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
63
Q

Can a transaction be aborted?

A

No

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
64
Q

Can an aborted transaction be restarted?

A

Yes, provided it is rolled back

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
65
Q

What is concurrency control in the context of transaction management? (2 total)

A

• Process of managing simultaneous operations on the DB without having them interfere with one another• • Prevents interference when two or more users are accessing DB simultaneously and at least one is updating data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
66
Q

What is the objective of concurrency control?

A

Objective is to schedule transactions to avoid interference

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
67
Q

What limitation does the serial running of transactions bring with it?

A

Limits degree of concurrency or parallelism in system

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
68
Q

What does the lost update problem refer to?

A

Successfully completed update is overwritten by another user

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
69
Q

What does the uncommitted dependency problem refer to?

A

Occurs when one transaction can see intermediate results of another transaction before it is committed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
70
Q

What does the inconsistent analysis problem refer to?

A

Occurs when transaction reads several values but second transaction updates some of them during execution of first

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
71
Q

What does serialisability do?

A

Identifies those executions of transactions guaranteed to ensure consistency and aims to find non-serial schedules that allow transactions to execute concurrently without interfering with one another

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
72
Q

What is a schedule?

A

Sequence of reads/writes by set of concurrent transactions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
73
Q

What is a serial schedule?

A

Schedule where operations of each transaction are executed consecutively without any interleaved operations from other transactions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
74
Q

What is the main issue with a serial schedule?

A

No guarantee that results of all serial executions of a given set of transactions will be identical

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
75
Q

What is a non-serial schedule?

A

Operations from set of concurrent transactions are interleaved

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
76
Q

What defines a serialised schedule?

A

A schedule is serialisable if it is non-serial in nature and identical to some serial schedule

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
77
Q

What are the three rules to consider when identifying potentially conflicting transactions?

A
  • If two transactions only read a data item, they do not conflict and order is not important
  • If two transactions either read or write separate data items, they do not conflict and order is none important
  • If one transaction writes a data item and the other reads or writes, order of execution is important
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
78
Q

What does a conflict serialisable schedule do?

A

Orders any conflicting operations in same way as some serial execution

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
79
Q

How do we test for serialisability? (workflow, 4 stages)

A

Create a precedence graph as follows:

  1. Create a node for each transaction
  2. Add a directed edge from Ti to Tj if Tj reads the value of an item written by Ti
  3. Add a directed edge from Ti to Tj if Tj writes the value into an item after it has been read by Ti

If the precedence graph contains a cycle, then the schedule is not conflict serialisable

80
Q

What defines view serialisability and how does it differ from conflict serialisability? (4 total)

A
  • Schedule is view serialisable if it is view equivalent to a serial schedule
  • It is a less stringent definition of schedule equivalence than conflict serialisability
  • Every conflict serialisable schedule is view serialisable, although the converse is not true
  • Any view serialisable schedule that is not conflict serialisable contains one or more blind writes
81
Q

What algorithmic time complexity class does testing the serialisability of a schedule fall under?

A

Testing if a schedule is serialisable is NPC

82
Q

When are two schedules view equivalent? (3 total)

A
  • For each data item x, if Ti reads initial value of x in S1, Ti must also read initial value of x in S2.
  • For each read on x by Ti in S1,if value read by Ti is written by Tj, Ti must also read value of x produced by Tj in S2.
  • For each data item x, if last write on x performed by Ti in S1, same transaction must perform final write on x in S2.
83
Q

How might we examine the recoverability of transactions within a schedule? (2 total)

A
  • If transactions fail, atomicity requires effects of transaction to be rolled back
  • Durability states that once a transaction commits, its changes cannot be undone without running another compensating transaction
84
Q

What defines a recoverability schedule?

A

A schedule where, for each pair of transactions Ti and Tj, if Tj reads a data item previously written by Ti, then the commit operation of Ti precedes the commit operation of Tj

85
Q

What is the conservative approach to concurrency control?

A

Delay transactions in case they conflict with other transactions

86
Q

What is the optimistic approach to concurrency control?

A

Assume a conflict is rare and only check for conflicts at commit

87
Q

What is locking in the context of concurrency control?

A

Locks are used to deny access of other transactions, preventing incorrect updates. Common approach

88
Q

What are the two forms of lock in concurrency control, and what are their actions?

A

Shared lock, used for reading data items

Exclusive lock, used for writing data items

89
Q

What are the 5 rules of locking in concurrency control?

A
  • If a transaction has a shared lock on item, it can read but not update said item
  • If transaction has exclusive lock on an item, it can both read and write said item
  • Reads cannot conflict, so more than one transaction can hold shared locks simultaneously
  • Exclusive locks gives transaction exclusive access to an item
  • Some systems allow transactions to upgrade read locks to exclusive locks, or downgrade exclusive to shared locks
90
Q

Whats the main issue with an incorrect locking schedule and what effect does this have on ACID principles?

A

Transactions release locks too soon

Results in the loss of total isolation and atomicity

91
Q

How can we guarantee serialisability whilst locking?

A

Through the addition of a protocol concerning the positioning of lock and unlock operations in every transaction

92
Q

What is two-phase locking (2PL) and what are each of the phases?

A

Transaction follows 2PL if all locking operations precede first unlock operation in the transaction

Growing phase: acquires all locks but cannot release any

Shrinking Phase: releases locks but cannot acquire any new locks

93
Q

What can be said about a schedule following 2PL?

A

The schedule is serialisable

94
Q

What issues can arise through the use of 2PL?

A

Problems can occur with interpretation of when locks can be released.

95
Q

What is Cascading Rollback? (2 total)

A

Occurs when transaction aborts with other transactions dependent on said transaction

Results in initial transaction and all subsequent dependent ones rolling back, cascading

96
Q

How can cascade rollback be prevented?

A

2PL, leaving the release of all locks till the end of the transaction

97
Q

How can 2PL be used to implement concurrency control with index structures?

A
  • Could treat each page of index as a data item and apply 2PL
  • As indexes frequently accesses, particular at higher levels, this may lead to lock contention
98
Q

Explain the key points on index traversal in the context of concurrency control with index Structures. (3 total)

A
  • Search path starts from root and moves down to leaves but search never moves back up tree. Thus, once a lower‐level node has been accessed, higher‐level nodes in that path will not be used again.
  • When new index value is inserted into a leaf, if node is not full, insertion will not cause changes to higher‐level nodes.
  • Only have to exclusively lock leaf node in such a case, and only exclusively lock higher‐ level nodes if node is full and has to be split
99
Q

Explain the concurrency lock strategy for searches. (2 total)

A

o Obtain shared locks on nodes starting at root and proceeding downwards along required path.
o Release lock on node once lock has been obtained on the child node.

100
Q

Explain the concurrency lock strategy for insertions. (1 total)

A

Conservative approach would be to obtain exclusive locks on all nodes as we descend tree to the leaf node to be modified.

101
Q

Explain the optimistic approach lock strategy for concurrency control. (3 total)

A

o Obtain shared locks on all nodes as we descend to leaf node to be modified, where obtain exclusive lock.
o If leaf node has to split, upgrade shared lock on parent to exclusive lock.
o If this node also has to split, continue to upgrade locks at next higher level.

102
Q

What is a deadlock and how do they arise?

A

An impasse that may result when two or more transactions are each waiting for locks held by the other to be released.

103
Q

How do we break a deadlock?

A

Abort one or more of the transactions

104
Q

Why should a deadlock be transparent to a user?

A

So that the DBMS can restart transactions

105
Q

Why can’t a DBMS restart an aborted transaction in practice? When would it be possible to do this?

A

Because it is unaware of transaction logic, even if it was aware of the transaction history.

Possible if there is no user input or the input is not a function of the DB’s state

106
Q

What are the three methods for handling deadlocks?

A

Timeouts

Deadlock Prevention

Deadlock Detection and Recovery

107
Q

Describe the action of a deadlock timeout. (3 total)

A
  • Transaction requesting lock will only wait for a system-defined period of time
  • If lock is not granted within this period, lock request times out
  • DBMS assumes transaction deadlocked and it aborts, automatically restarting system
108
Q

Describe the action of deadlock prevention. (2 total)

A
  • DBMS looks ahead to see if deadlocks will occur and never allows it to occur
  • Transactions ordered using transaction timestamps: wait-die or wound-wait
109
Q

What are the two methods of ordering transactions using timestamps in deadlock prevention? Describe their actions.

A

Wait-Die:
 Only an older transaction can wait for younger one
 Otherwise transaction is aborted and restarted with same timestamp

Wound-Wait:
 Only a younger transaction can wait for an older one
 If older transaction requests lock held by younger one, younger aborted

110
Q

What defines deadlock detection and recovery?

A

DBMS allows deadlock to occur but recognises it and breaks it

111
Q

What is the purpose of a wait-for graph and how is one constructed? (2 steps)

A

A wait-for graph is used to show transaction dependencies with a deadlock existing iff the wait-for graph contains a cycle

  1. Create a node for each transaction
  2. Create an edge from Ti to Tj if Tj waiting to lock an item locked by Tj
112
Q

What three considerations must recovery from deadlock detection consider?

A

o The choice in deadlock victim
o How far to roll a transaction back
o Avoiding starvation

113
Q

What is a timestamp?

A

A unique identifier created by a DBMS that indicates relative starting time of a transaction

114
Q

What are the two methods for timestamp generation?

A

o Using system clock at time transaction started

o Incrementing a logical counter every time a new transaction starts

115
Q

How are transactions ordered in timestamping and what effect does this have in the context of conflicts?

A

Transactions are ordered globally so that older transactions get priority in the event of conflict.

Conflicts are resolved by rolling back and restarting the transaction

116
Q

Are deadlocks possible during timestamping?

A

No, no locks are implemented during timestamping

117
Q

How are read and writes organised during timestamping?

A
  • Read/write proceeded only if last update on data item was carried out by an older transaction
  • Otherwise, transaction requesting read/write is restarted and given a new timestamp
118
Q

What is a read-timestamp?

A

Timestamp of last transaction to read a data item

119
Q

What is a write-timestamp?

A

Timestamp of last transaction to write a data item

120
Q

What benefit does multi-version timestamp ordering offer over conventional timestamping methodologies? (8 total)

A
  • Versioning implemented to increase concurrency
  • Basic timestamp ordering protocol assumes only one version of data item exists
  • Basic timestamping only one transaction can access data item at a time
  • Multi-version can allow multiple transactions to read and write different versions of the same data item
  • Ensures each transaction sees consistent set of versions for all data items it accesses
  • Each write operation creates a new version of the data item whilst retaining old version
  • System selects the correct version of data item when read request arrives
  • Versions can be deleted when they are no longer required
121
Q

What assumption are optimistic techniques based upon?

A

Based on the assumption that conflict is rare and more efficient to let transactions proceed without delays than ensure serialisability

122
Q

What happens at transaction commit during optimistic techniques? (2 total)

A

A check is made to determine whether conflict has occurred or not

If there is a conflict, transaction must be rolled back and restarted

123
Q

What key benefit is gained from using optimistic techniques in the context of transaction management?

A

Potential for greater concurrency than traditional protocols

124
Q

What are the three phases of optimistic techniques and what are each of their actions? One phase has two variants

A

Read phase
• Extends from start until immediately before commit
• Transaction reads values from database and stores them in local variables
• Updates applied to a local copy of the data

Validation phase
- Read only transaction:

• Checks that data read still holds current values
• If no interference has occurred, transaction committed, else it is aborted and restarted
- Update transaction:
• Checks transaction leaves DB in a consistent state with serialisability maintained

Write Phase
• Updates made to local copy are applied to the DB.

125
Q

How does granularity effect a system? (3 total)

A

The coarser the data item, the lower the degree of concurrency achievable

The finer the data item, the more reliant the system becomes on locking

Optimal size therefore determined by the type of transaction

126
Q

Describe the action of a granularity hierarchy and explain how a DBMS comes into play. (4 total)

A
  • Granularity of locks can be represented in a hierarchal structure
  • Root node represents the entire DB
  • When a node is locked, all of its children are also locked
  • DBMS should check hierarchal path before granting lock
127
Q

How is a Database restored?

A

Through the process of database recovery, restoring a DB to a correct state in the event of failure

128
Q

What two points explain the need for recovery control?

A
  • Volatile storage does not survive system crashes
  • Stable storage represents information that has been replicated in several non-volatile storage media with independent failure modes
129
Q

List the types of failure than can lead to a need for database recovery (6 total)

A
  • System crashes resulting in the loss of main memory
  • Media failures, resulting in the loss of parts of secondary storage
  • Application software errors
  • Natural physical disasters
  • Carelessness or unintentional destruction of data or facilities
  • Sabotage
130
Q

Describe the purpose and action of the recovery manager during failure? (3 total)

A
  • Recovery manager responsible for atomicity and durability
  • Failure occurring between commit and DB buffers being flushed to secondary storage requires recovery manager to redo transactions to ensure durability
  • Transaction not being committed at failure time, recovery manager has to undo any effects of the transaction for atomicity
131
Q

What is the difference between a partial and global undo?

A

Partial undo’s result in a single transaction being undone, whereas global undos result in all transactions being undone.

132
Q

List the 4 main recovery facilities and their actions.

A

Backup mechanism: makes periodic backup copies of the DB

Logging facilities: keep track of current state of transactions and DB changes

Checkpoint facility: enables updates to DB in progress to be made permanent

Recovery manager: allows DBMS to restore DB to consistent state following failure

133
Q

Describe a log file (4 total)

A
  • Contains information regarding all updates to a DB
  • Transaction record contains transaction identifier, type of log record, log-management information etc…
  • Log files may be duplexed or triplexed
  • Sometimes split into two separate random-access files
134
Q

Whats the main issue that can arise through the use of a log file?

A

Potential bottleneck arises which is critical in determining overall performance

135
Q

What is a checkpoint, what does it record and how is it used at the point of failure?

A
  • A checkpoint is a point of synchronisation between DB and log file, with all buffers forcibly written to disk
  • Checkpoint record is created containing identifiers of all active transactions
  • When failure occurs, redo all transactions that committed since the checkpoint and undo all transactions active at time of crash
136
Q

What recovery technique is used if the DB is damaged?

A

Restore last backup of DB and reapply updates of committed transactions using log file

137
Q

What recovery technique is used if the DB is only inconsistent? (3 total)

A
  • Need to undo changes that caused inconsistency
  • May also need to redo some transactions to ensure updates reach secondary storage
  • Do not need backup, but can restore DB using before and after images in the log file
138
Q

What is meant by the term deferred update in the context of a recovery technique? (3 total)

A
  • Updates not written to the DB until after a transaction has reached its commit point
  • If transaction fails before commit, it will not have modified DB and so no undoing of changes are required
  • May be necessary to redo updates of committed transaction as their effect may not have reached DB
139
Q

What is meant by the term immediate update in the context of a recovery technique? (4 total)

A
  • Updates applied to DB as they occur
  • Need to redo updates of committed transactions following a failure
  • May need to undo effects of transactions that had not committed at time of failure
  • If no transaction commit record in log, transaction was active at failure and undone
  • Undo operations are performed in reverse order in which they were written to the log
140
Q

What is the write-ahead log protocol related to and what does it state?

A

Relates to immediate update recovery technique and states that it is essential that log records are written before write to DB

141
Q

Describe how shadow paging works. (5 total)

A
  • Two page tables during life of a transaction: the current page and the shadow page tables
  • Both are the same at the start of a transaction
  • Shadow page never changed thereafter, used to restore DB in event of failure
  • During transaction, current page table records all updates to the DB
  • When transaction completes, current page table becomes the shadow page table
142
Q

What are standard transaction models good for? (5 total)

A

o Data that has many types, each with small number of instances.
o Designs may be very large.
o Design is not static but evolves through time.
o Updates are far‐reaching.
o Cooperative engineering.

143
Q

What problems can arise from long duration transactions in standard transaction models? (4 total)

A

More susceptible to failure
 Requiring a minimisation in the amount of work lost.

May access large number of data items
 Concurrency limited if data inaccessible for long periods.

Deadlock more likely.

Cooperative use of shared data items restricted by traditional protocols.

144
Q

List some examples of advanced transaction models. (5 total)

A
o	Nested Transaction Model
o	Sagas
o	Multi‐level Transaction Model
o	Dynamic Restructuring
o	Workflow Models
145
Q

Describe the action of a nested transaction model (9 total)

A
  • Transaction viewed as hierarchy of sub-transactions.
  • Top‐level transaction can have number of child transactions.
  • Each child can also have nested transactions.
  • Transactions have to commit from bottom upwards.
  • Transaction abort at a level doesn’t have to affect transaction in progress at higher level.
  • Parent allows to perform its own recovery
  • Updates of committed sub-transactions at intermediate levels are visible only within scope of their immediate parents.
  • Commit of sub-transaction is conditionally subject to commit or abort of its superiors.
  • Top‐level transactions conform to traditional ACID properties of flat transaction.
146
Q

In a nested transaction model, how might a parent transaction perform its own recovery? (4 total)

A

o Retry sub-transaction.
o Ignore failure, in which case sub-transaction non‐vital.
o Run contingency sub-transaction.
o Abort.

147
Q

What does Moss’s proposal state, and what model is it related to?

A

In the nested transaction model, the proposal states that only leaf‐level sub-transactions perform database operations.

148
Q

What advantages are gained through the use of a nested transaction model? (4 total)

A

Modularity
o Transaction can be decomposed into number of sub-transactions for purposes of concurrency and recovery

Finer level granularity for concurrency control and recovery

Intra-transaction parallelism

Intra-transaction recovery control

149
Q

How can nested transactions be emulated?

A

Using savepoints

150
Q

What is a savepoint in the context of advanced transaction models?

A

An identifiable point in flat transaction representing some partially consistent state

151
Q

Describe the action of savepoints in nested transaction models? (2 total)

A
  • During execution of transaction, user can establish savepoint
  • User can use this to roll transaction back
152
Q

How do savepoints and nested transaction models differ?

A

Unlike nested transactions, savepoints don’t support intra‐transaction parallelism.

153
Q

What is the definition of Relational Algebra?

A

RA operations work on one or more relations to define another without changing the original relations

154
Q

In RA, what is meant by the closure property?

A

Expressions are nested, just as in arithmetic

155
Q

In RA, what does the Selection operation do?

A

Works on a single relation R to define a relation that contains only tuples of R that satisfy the specified condition

σpredicate (R)

156
Q

In RA, what does the Projection operation do?

A

Works on a single relation R to define a relation that contains a vertical subset of R.

Extracts values of specified attributes and eliminating duplicates

Πcol1,…,col n (R)

157
Q

In RA, what does the Union operation do? What compatibility constraint must be maintained?

A

The union of two relations R and S defines a relation containing all the tuples of R, S, or both R and S with duplicates removed

R and S must be union compatible

R∪S

158
Q

In RA, what does the Set Difference operation do? What compatibility constraint must be maintained?

A

Defines a relation consisting of the tuples that are in R but not in S

R and S must be union compatible

R-S

159
Q

In RA, what does the Intersection operation do? What compatibility constraint must be maintained?

A

Defines a relation consisting of the set of all tuples that are in both R and S

R and S must be union-compatible

R∩S

160
Q

In RA, what does the Cartesian Product operation do?

A

Defines a relation that is the concatenation of every tuple in relation R with every tuple of relation S

R×S

161
Q

What two RA operations can be combined to emulate a join?

A

Selection with Cartesian Product

162
Q

Which form of operation is the most challenging to implement efficiently? What effect does this have on RDBMSs?

A

Join operations

Causes RDBMS to have intrinsic performance issues

163
Q

In RA, what does the Theta Join operation do?

A

Defines a relation that contains tuples satisfying the predicate F from the Cartesian product of R and S

R ⋈F S

164
Q

In RA, what does the Natural Join operation do?

A

An Equijoin of the two relations R and S over all common attributes x

One occurrence of each common attribute is eliminated from the result

R⋈S

165
Q

In RA, what does the Natural Join operation do? What two forms of Natural Join exist and explain there actions?

A

Used to display rows in the result that do not have matching values in the join column

Left join is a join in which tuples from R that do not have matching values in common columns of S are also included in result relation

R⋊S

Right join is a join in which tuples from S that do not have matching values in common columns of R are also included in result relation

R⋉S

166
Q

In RA, what does the Semi Join operation do?

A

Defines a relation that contains the tuples of R that participate in the join of R with S

R ⊳F S

167
Q

In RA, what does the Division operation do?

A

Defines a relation over the attributes C that consists of set of tuples from R that match combination of every tuple in S

R÷S

168
Q

In RA, what do Aggregate Operations do?

A

Applies aggregate function, AL, to R to define a relation over the aggregate list

SQUIGGLEY-F subscript AL (R) where AL contains one or more pairs

169
Q

In RA, what do Grouping Operations do? (3 total)

A
  • Groups tuples of R by grouping attributes, GA
  • Then applies aggregate function list, AL, to define a new relation
  • Resulting relation contains the grouping attributes, GA, along with results of each of the aggregate functions

[subscript GA SQUIGGLEY-F subscript AL] (R) where AL contains one or more pairs

170
Q

What are the two main techniques for query optimisation?

A

o Heuristic rules that order operations in a query

o Comparing different strategies based on relative costs, and selecting one that minimises resource usage

171
Q

What is the dominant cost in query processing for centralised DBMSs?

A

Disk accesses

172
Q

Define Query Processing

A

The activities involved in retrieving data from the DB

173
Q

What are the two main aims of Query Processing?

A

Transform query written in high-level language into correct and efficient execution strategy expressed in low-level language

Execute strategy to retrieve required data

174
Q

What is query optimisation?

A

The activity of choosing an efficient execution strategy for processing a query

175
Q

Why is Query Optimisation necessary?

A

As there are many equivalent transformations of same high-level query, aim of QO is to choose one that minimises resource usage

176
Q

What two factors are reduced through Query Optimisation?

A

Reduce total execution time of a query

Reduce response time of query

177
Q

What are the four phases of Query Optimisation?

A

o Decomposition, consisting of parsing and validation
o Optimisation
o Code generation
o Execution

178
Q

How/when can the first three phases of Query Optimisation be carried out? (2 total)

A

o Dynamically every time query is run

o Statically when query is first submitted

179
Q

List an advantage and disadvantage of Dynamic query Optimisation.

A

Advantages arise from fact that information is up to date

Disadvantages is performance of query affected, time may limit finding optimum strategy

180
Q

Name an advantage and disadvantage of Static query Optimisation.

A

Advantages are removal of runtime overhead, and more time to find optimum strategy

Disadvantages arise form fact that chosen execution strategy may no longer be optimal when query is run

181
Q

How can the disadvantages of Dynamic and Static Query Optimisation be combatted?

A

By adopting a hybrid approach

182
Q

Define the process of Query Decomposition

A

Transformation of a high-level query into RA query and checks syntactic and semantic correctness

183
Q

What are the five stages of Query Decomposition?

A

Analysis

Normalisation

Semantic Analysis

Simplification

Query Restructuring

184
Q

Describe the Analysis phase of Query Decomposition

A
  • Analyse query lexically and syntactically using compiler techniques
  • Verify relations and attributes exist
  • Verify operations are appropriate for object type
  • Query then transformed into some internal representation more suitable for processing
  • Query tree chosen
185
Q

Describe the workflow for constructing a Query Tree (3 total)

A
  • Leaf node created for each base relation
  • Non-leaf node created for each intermediate relation produced by RA operation
  • Root of tree represents query result

-> Sequence is directed from leaves to root

186
Q

What happens during the Normalisation phase of Query Decomposition?

A

Query converted into a normalised form for easier manipulation

187
Q

What two forms can the predicate be converted into during the Normalisation phase of Query Decompositon?

A

o Conjunctive Normal form

o Disjunctive Normal Form

188
Q

What happens during the Semantic Analysis phase of Query Decomposition?

A

Following normalisation, normalised queries that are incorrectly formulated or contradictory are rejected

189
Q

What defines an incorrectly formulated query?

A

Its components do not contribute to generation of result

190
Q

What defines a contradictory query?

A

Its predicate cannot be satisfied by any tuple

191
Q

How can we test for incorrectly formulated queries?

workflow with 3 stages and a summary

A

Through the construction of a Relation Connection Graph

  1. Create node for each relation and node for result
  2. Create edges between two nodes that represent a join
  3. Add edges between nodes that represent projection

If not connected, query is incorrectly formulated

192
Q

What is the purpose of the Simplification stage of Query Decomposition? (3 total)

A
  • Detects redundant qualifications
  • Eliminates common sub-expressions
  • Transforms query to semantically equivalent but more easily/efficiently computed form
193
Q

What considerations and assumptions are made during Query Decomposition?

A

Typically access restrictions, view definitions and integrity constraints are considered

Assumption that user has appropriate access privileges

194
Q

What are the rules of Heuristic Processing strategies? (5 total)

A

Perform selection operations as early as possible:
o Keep predicates on same relation together

Combine Cartesian Product with subsequent selection whose predicate represents join condition into a join operation

Use Associativity of binary operations to rearrange leaf nodes so leaf nodes with most restrictive selection operations are executed first

Perform projection as early as possible:
o Keep projection attributes on same relation together

Compute common expressions once:
o If common expression appears more than once and result not too large, store result and reuse it when require
o Useful when querying views, as same expression is used to construct view each time

195
Q

What problems are potentially faced by a DBMS when storing statistics? (2 total)

A

If stats updated every time tuple is changed, performance would be impacted

DBMS may update stats on a periodic basis or whenever system idle

196
Q

READ CHAPTERS 5 THROUGH 7

A

UNDERSTAND MATHEMATICS AND RA