Distributed data concepts Flashcards

1
Q

what is a transaction

A

a logical unit of work which either completes in its entirety or not at all; its aim is to keep the database consistent

2
Q

what do transactions end with

A

COMMIT – commit the changes and successfully end
ROLLBACK – an error has occurred so undo everything from the transaction
Programmatic SQL – normal termination of the program is a COMMIT
Programmatic SQL – abnormal/error termination is a ROLLBACK
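
Example (a minimal sketch, not tied to any particular DBMS): the two endings above can be tried out with Python's built-in sqlite3 module. The account table, its CHECK constraint and the transfer helper are hypothetical names made up for this illustration.

```python
import sqlite3

# Hypothetical in-memory table used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single transaction."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.commit()      # normal end: both updates become permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # error: undo everything done in this transaction
        return False


def balances(conn):
    """Return {id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

A transfer that would overdraw the source account fails on the second UPDATE, and the ROLLBACK also undoes the first UPDATE, so the balances are left unchanged.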

3
Q

ACID properties of a transaction

A

Atomicity, Consistency, Isolation, Durability

4
Q

What does Atomicity mean

A

completes in its entirety or not at all

5
Q

what does consistency mean

A

a transaction must take the database from one consistent state to another - the DBMS and application developers both need to ensure this

6
Q

what does Isolation mean

A

each transaction executes independently of others

7
Q

what does durability mean

A

changes made by a transaction must persist - recovery systems must ensure this

8
Q

OLTP

A

Online transaction processing

Quick real-time access to data to read or modify it

9
Q

OLAP

A

Online Analytical Processing

involves fewer, more intensive transactions than OLTP

10
Q

Problems with transactions

A

The lost update problem
The dirty read or uncommitted dependency problem
the inconsistent analysis problem

11
Q

The lost update problem

A

an update done to a data item by a transaction is lost as it is overwritten by the update done by another transaction
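
Example (a toy simulation, assuming a single hypothetical data item called stock and no locking): two transactions both read the item before either writes it back, so the first update is overwritten.

```python
# Hypothetical single data item shared by two transactions T1 and T2.
db = {"stock": 10}

# Interleaved schedule: both transactions read before either writes.
t1_read = db["stock"]        # T1 reads 10
t2_read = db["stock"]        # T2 reads 10
db["stock"] = t1_read - 3    # T1 writes 7
db["stock"] = t2_read - 4    # T2 writes 6, overwriting T1's update
lost_update_result = db["stock"]

# Serial schedule: T2 starts only after T1 has finished.
db["stock"] = 10
db["stock"] = db["stock"] - 3    # T1
db["stock"] = db["stock"] - 4    # T2
serial_result = db["stock"]
```

The interleaved schedule ends with 6 instead of the 3 that a serial run produces, because T1's subtraction is lost.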

12
Q

The dirty read problem (uncommitted dependency problem)

A

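A dirty read occurs when a transaction reads data written by another transaction that has not yet committed; if that other transaction rolls back, the value that was read never officially existed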
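
Example (a toy simulation with hypothetical names; a real DBMS would prevent this with locking or isolation levels): T2 reads a value that T1 has written but not committed, and T1 then rolls back.

```python
committed = {"balance": 100}       # last committed state of the database
working = dict(committed)          # T1's uncommitted working copy

working["balance"] = 0             # T1 updates the balance but has not committed
t2_saw = working["balance"]        # T2 reads T1's uncommitted value (dirty read)
working = dict(committed)          # T1 rolls back, discarding its update

actual = committed["balance"]      # the value that really persisted
```

T2 acted on a balance of 0 even though the committed balance was 100 throughout.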

13
Q

The inconsistent analysis problem

A

when one user is reading several pieces of data while another is updating them, the user reading the data may end up with a mix of old and new values
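
Example (a toy simulation, assuming three hypothetical accounts a, b and c): a reader totals the accounts while a writer moves 50 from a to c part-way through the read.

```python
accounts = {"a": 100, "b": 100, "c": 100}
true_total = sum(accounts.values())    # 300 before and after the transfer

# The reader starts summing and has read only account "a" so far.
running_total = accounts["a"]          # old value of a: 100

# The writer transfers 50 from a to c and finishes.
accounts["a"] -= 50
accounts["c"] += 50

# The reader carries on with the remaining accounts.
running_total += accounts["b"]         # 100
running_total += accounts["c"]         # new value of c: 150
```

The reader ends up with 350, a total that never existed in the database, because it mixed the old value of a with the new value of c.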

14
Q

How to fix the problems with transactions

A

making all transactions run serially; this avoids the problems but causes performance issues, as only one transaction is run at once
read operations can be run in parallel
if a transaction is writing to one part of the database, the other parts, which aren't affected by the update, can still be accessed

15
Q

what is locking

A

transactions lock part of the database before updating

16
Q

what is a shared lock

A

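the data item can only be read (not updated) by the transaction holding the lock; several transactions can hold a shared lock on the same item at once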

17
Q

what is an exclusive lock

A

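the transaction holding the lock can both read and write the data item; no other transaction can lock the item until it is released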
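
Example (a minimal sketch of a lock table; the LockTable class and the "S"/"X" mode labels are hypothetical and only illustrate which lock requests are compatible):

```python
from collections import defaultdict


class LockTable:
    """Toy lock table: item -> {transaction: mode}, where mode is 'S' or 'X'."""

    def __init__(self):
        self.held = defaultdict(dict)

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        others = {t: m for t, m in self.held[item].items() if t != txn}
        if mode == "S":
            # A shared lock is compatible with other shared locks only.
            granted = all(m == "S" for m in others.values())
        else:
            # An exclusive lock is incompatible with any lock held by others.
            granted = not others
        if granted and self.held[item].get(txn) != "X":
            # Record the lock, keeping an existing exclusive lock as it is.
            self.held[item][txn] = mode
        return granted

    def release_all(self, txn):
        """Release every lock held by txn (for example at COMMIT or ROLLBACK)."""
        for holders in self.held.values():
            holders.pop(txn, None)
```

Two transactions can share a read lock on the same item, but a request for an exclusive lock is refused until every other holder has released its lock.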

18
Q

what is a deadlock

A

a situation where 2 (or more) transactions are each waiting for the other to release a lock, so none of them can continue

19
Q

how can a deadlock be resolved

A

timeouts - transaction rolls back after a certain amount of time
deadlock detection - rollback the transaction which would cost the least to stop
deadlock prevention - try to look for the problem in advance (not common, as it is tricky to do)
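
Example (a minimal sketch of deadlock detection, assuming the DBMS keeps a "wait-for" graph; find_deadlock, choose_victim and the cost figures are hypothetical): a deadlock shows up as a cycle in the graph, and the cheapest transaction in the cycle is rolled back.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a cycle in the wait-for graph, or None.

    waits_for maps each transaction to the set of transactions it is waiting on.
    """
    visiting, done = [], set()

    def visit(txn):
        if txn in done:
            return None
        if txn in visiting:
            return visiting[visiting.index(txn):]    # cycle found
        visiting.append(txn)
        for other in waits_for.get(txn, ()):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in waits_for:
        cycle = visit(txn)
        if cycle:
            return cycle
    return None


def choose_victim(cycle, cost):
    """Pick the transaction in the cycle that is cheapest to roll back."""
    return min(cycle, key=lambda txn: cost[txn])
```

For instance, if T1 waits for T2 and T2 waits for T1, the cycle contains both, and the one with the lower rollback cost is chosen as the victim.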

20
Q

what is a distributed database

A

the data is stored in different physical locations, but the DBMS makes this invisible to the end-user
this can cause more performance issues but provides more storage

21
Q

CAP theorem

A

Consistency
Availability
Partition tolerance

22
Q

Consistency (CAP)

A

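all nodes see the same data at the same time, so every read returns the most recent write. This is narrower than consistency in ACID, where the database must be in a consistent state and transactions ensure this by rolling back if an error occurs while it is in an inconsistent state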

23
Q

availability

A

every query request gets a response

24
Q

partition tolerance

A

distributed databases can cope with network failures / network delays

25
Q

Brewer's theorem

A

Can only expect to have 2 of the 3 CAP aspects at any given time. Partition tolerance is a necessity, therefore designers have to trade off consistency against availability
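
Example (a toy model only; the Replica class and the "CP"/"AP" labels are hypothetical): a replica cut off from the primary by a network partition must either refuse to answer (keeping consistency) or answer with a possibly stale value (keeping availability).

```python
class Replica:
    """Toy replica that may be cut off from the primary by a network partition."""

    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP": hypothetical labels for the design choice
        self.value = "v1"           # last value received from the primary
        self.partitioned = False

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Favour consistency: refuse rather than risk returning stale data.
            raise RuntimeError("unavailable during partition")
        # Favour availability: always answer, even if the value may be stale.
        return self.value


def read_during_partition(mode):
    """Simulate a read while the replica cannot receive updates from the primary."""
    replica = Replica(mode)
    replica.partitioned = True      # the primary may have moved on, but updates cannot arrive
    try:
        return replica.read()
    except RuntimeError as err:
        return f"error: {err}"
```

The "AP" replica still answers with its last known value, while the "CP" replica returns an error until the partition heals.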

26
Q

logging

A

the database keeps a log of all transactions made, including before and after values

27
Q

checkpoints

A

these are made periodically, noting which transactions are running, which are committed, etc. Transactions are suspended while the checkpoint is made.
All transactions shown as committed are fully complete and written to the disk

28
Q

archiving

A

the database is regularly archived by its administrator to offline storage

29
Q

what issues can a database be suddenly hit by

A

Sudden crash or loss of power to servers
Hardware problems (e.g. corrupt/broken disks)
Flood/fire, etc. (in the server room)
Accidental issues (e.g. bug in a program accessing the DB, the user doing something silly)
Malicious damage

30
Q

how can a database be recovered

A

using the log file, as it shows what transactions have been made.

any incomplete transactions need to be undone

completed transactions will need to be redone in case the DBMS had not finished writing the changes to the disk

transactions marked as complete at the last checkpoint don’t need to be redone

Depending on the problem, the database may need to be reloaded from the previous archive before the log file entries are redone
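
Example (a simplified sketch of log-based recovery; the tuple-based log record format and the recover function are hypothetical, and it assumes transactions never overwrite each other's uncommitted data): incomplete transactions are undone using before-values, and transactions committed since the last checkpoint are redone using after-values.

```python
def recover(disk, log):
    """Return the recovered database state from the on-disk state and the log.

    Log records: ("start", txn), ("write", txn, item, before, after),
    ("commit", txn) and ("checkpoint",).
    """
    db = dict(disk)
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # Transactions committed before the last checkpoint are already on disk.
    last_cp = max(
        (i for i, rec in enumerate(log) if rec[0] == "checkpoint"), default=-1
    )
    safe = {rec[1] for rec in log[: last_cp + 1] if rec[0] == "commit"}

    # Undo: walk backwards, restoring before-values of incomplete transactions.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, item, before, _ = rec
            db[item] = before

    # Redo: walk forwards, reapplying after-values of transactions
    # committed since the last checkpoint.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed and rec[1] not in safe:
            _, _, item, _, after = rec
            db[item] = after
    return db
```

In this sketch a transaction committed before the checkpoint is left alone, one committed after it is redone, and one that never committed is undone.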

31
Q

where should the log file be stored/archived

A

on a different storage medium to the database; the archive has to be stored somewhere other than the server location