Distributed Data Concepts Flashcards

1
Q

Transaction

A

A group of operations that either all complete successfully or have no effect at all. This lets multiple users access the database without fear of data inconsistencies.

2
Q

Rollback

A

Undoes all changes in a transaction due to an error
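
The all-or-nothing behaviour of a transaction and its rollback can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production pattern: the accounts table, the names in it, and the transfer helper are all hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical accounts table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between two accounts as one all-or-nothing transaction."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.commit()    # make both updates permanent together
        return True
    except ValueError:
        conn.rollback()  # undo the debit that was already applied
        return False


def balances(conn):
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)       # succeeds and commits
failed = transfer(conn, "alice", "bob", 500)  # fails and rolls back
```

After the failed transfer, the balances are the same as they were after the successful one: the debit made before the error is undone.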

3
Q

True or false: A transaction can only have one read operation and one write operation

A

False! A transaction can have as many operations as needed.

4
Q

Online Transaction Processing (OLTP)

A

Real-time access to data through many short transactions (e.g. placing an order or making a payment)

5
Q

Online Analytical Processing (OLAP)

A

Involves fewer, more intensive queries than OLTP, typically analysing large amounts of historical data (e.g. a bank summarising a year of transactions for reporting)

6
Q

Lost update

A

When multiple users update the same data at the same time, one transaction's write overwrites the other's, so only one of the updates survives
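
A lost update can be shown with a toy interleaving. This sketch uses a plain dictionary as a stand-in for the database; the stock item and quantities are made up for illustration.

```python
# A plain dict stands in for the database.
db = {"stock": 10}

# Both transactions read the same starting value...
t1_read = db["stock"]
t2_read = db["stock"]

# ...then each writes back a result based on its own (now stale) read.
db["stock"] = t1_read - 3  # T1 sells 3 items
db["stock"] = t2_read - 4  # T2 sells 4 items and overwrites T1's write

lost_update_result = db["stock"]  # 6: T1's sale has vanished
correct_result = 10 - 3 - 4       # 3: what serial execution would give
```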

7
Q

Dirty read

A

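A transaction reads data that another transaction has changed but not yet committed. If the changes are then lost (e.g. through a rollback or a power cut), the value that was read never really existed.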
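
A dirty read is usually described as one transaction reading another's uncommitted change. The sketch below models that with two dictionaries, one for committed data and one for a transaction's working copy; both are simplifications chosen for illustration.

```python
committed = {"price": 100}

# T1 works on its own copy and changes the price without committing.
t1_working_copy = dict(committed)
t1_working_copy["price"] = 80

# T2 reads T1's uncommitted value: this is the dirty read.
dirty_value = t1_working_copy["price"]

# T1 rolls back, so its change is discarded.
t1_working_copy = dict(committed)

# T2 is now holding a value that was never committed.
actual_value = committed["price"]
```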

8
Q

True or false: In the event of data loss, data is always reset to its original value

A

False! Whether the data is reset depends on whether the changes made were committed beforehand.

9
Q

Inconsistent analysis

A

One user reads data while another user is updating it, resulting in a mix of old and new values
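
As a rough illustration, suppose one user totals three account balances while another user's transfer commits partway through. The account names and amounts are invented for the example.

```python
accounts = {"a": 100, "b": 100, "c": 100}  # the true total is always 300

total = 0
total += accounts["a"]  # reader sees a = 100 (old value)

# A transfer of 50 from a to c commits between the reader's reads.
accounts["a"] -= 50
accounts["c"] += 50

total += accounts["b"]  # reader sees b = 100
total += accounts["c"]  # reader sees c = 150 (new value)

real_total = sum(accounts.values())
```

The reader ends up with a total of 350 even though the accounts never held more than 300, because it mixed an old value of a with a new value of c.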

10
Q

Locking

A

Transactions lock part of the database before accessing it, using a shared lock (read only, can be held by several transactions at once) or an exclusive lock (read/write, held by one transaction only)
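
The compatibility rule between shared and exclusive locks can be sketched as a small lock manager. The LockManager class and its method names are hypothetical, and a real DBMS would queue waiting transactions rather than simply refuse them.

```python
class LockManager:
    """Toy lock table: 'S' = shared (read), 'X' = exclusive (read/write)."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of transactions holding it)

    def acquire(self, txn, item, mode):
        """Return True if the lock is granted, False if txn would have to wait."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if holders == {txn}:
            # The sole holder may keep or upgrade its own lock.
            new_mode = "X" if "X" in (mode, held_mode) else "S"
            self.locks[item] = (new_mode, holders)
            return True
        if mode == "S" and held_mode == "S":
            holders.add(txn)  # any number of readers can share the item
            return True
        return False          # every other combination conflicts

    def release(self, txn, item):
        """Drop txn's lock on item, removing the entry once nobody holds it."""
        held = self.locks.get(item)
        if held is None:
            return
        _mode, holders = held
        holders.discard(txn)
        if not holders:
            del self.locks[item]


lm = LockManager()
r1 = lm.acquire("T1", "row1", "S")  # granted
r2 = lm.acquire("T2", "row1", "S")  # granted: shared locks are compatible
r3 = lm.acquire("T3", "row1", "X")  # refused: readers still hold the row
```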

11
Q

Timeout

A

A transaction that has been waiting for a lock for longer than a set amount of time is rolled back

12
Q

Deadlock detection

A

The DBMS detects transactions that are stuck waiting for each other's locks and rolls back the smallest one so the others can continue
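
One common way to detect a deadlock is to look for a cycle in a "waits-for" graph. The sketch below assumes that approach and takes "smallest" to mean the transaction that has done the least work so far; the graph, the work counts, and the function name are all illustrative.

```python
def find_cycle(waits_for):
    """Return a list of transactions that form a cycle, or None if there is none."""

    def visit(node, path):
        if node in path:
            return path[path.index(node):]  # the part of the path that loops
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# T1 waits for T2 and T2 waits for T1 (a deadlock); T3 is merely queued behind T1.
waits_for = {"T1": ["T2"], "T2": ["T1"], "T3": ["T1"]}
work_done = {"T1": 12, "T2": 3, "T3": 7}  # e.g. number of rows changed so far

cycle = find_cycle(waits_for)
victim = min(cycle, key=lambda txn: work_done[txn])  # cheapest to roll back
```

Here T2 is chosen as the victim because it is part of the cycle and has done the least work. T3 is never considered, since it is waiting but not deadlocked.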

13
Q

Distributed databases

A

Data is stored in different physical locations

14
Q

True or false: Distributed databases can lead to performance issues

A

True! The more locations the data is spread across, the more network communication is needed, so the higher the risk of performance issues.

15
Q

CAP Theorem

A

Consistency - every location returns the same, most recent data; transactions support this by rolling back if an error occurs
Availability - every query request gets a response
Partition tolerance - coping with network failures/delays

16
Q

True or false: A good database needs all CAP aspects

A

False! According to Brewer’s Theorem, a distributed database can only guarantee 2 of the 3 CAP aspects at one time (one being partition tolerance).

17
Q

Why is partition tolerance necessary?

A

Network failures cannot be avoided in a distributed database, and occasional network issues are better than data anomalies

18
Q

Logging

A

Every change a transaction makes is written to a log containing the original and new values, so the change can be undone in case of rollback
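
A log of original and new values is enough to undo a transaction. The following sketch keeps the log as a Python list; the record layout (transaction, key, old value, new value) is an assumption made for the example, not any specific DBMS format.

```python
db = {"x": 1, "y": 2}
log = []  # each record: (transaction, key, original value, new value)


def write(txn, key, new_value):
    """Log the change first, then apply it to the data."""
    log.append((txn, key, db[key], new_value))
    db[key] = new_value


def rollback(txn):
    """Restore original values for txn, newest change first."""
    for logged_txn, key, old_value, _new_value in reversed(log):
        if logged_txn == txn:
            db[key] = old_value


write("T1", "x", 10)
write("T1", "x", 20)
write("T2", "y", 5)
rollback("T1")  # x goes back to 1; T2's change to y is untouched
```

Walking the log backwards matters: T1 changed x twice, and undoing the newest change first leaves x at its true original value.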

19
Q

Checkpoint

A

A periodic log record of which transactions are running, committed, etc. at that moment

20
Q

What are some issues that can damage a database?

A

- Sudden crash or loss of power to servers
- Hardware problems (e.g. corrupt/broken disks)
- Flood/fire/etc. in server room
- Accidents
- Malicious damage

21
Q

True or false: In the event of damage, all incomplete transactions must be undone

A

True! These transactions will have neither COMMIT nor ROLLBACK in the log.

22
Q

True or false: If the database is damaged, completed transactions must be undone

A

False! Completed transactions (that weren’t marked as complete at the last checkpoint) are redone in case the DBMS didn’t finish writing the changes to the disk.
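
The undo and redo rules from the last two cards can be sketched as a function that sorts transactions into two lists after a crash. It is a simplification under stated assumptions: the log holds only BEGIN, COMMIT and ROLLBACK records written since the last checkpoint, a logged ROLLBACK means the undo had already finished, and the function and parameter names are hypothetical.

```python
def plan_recovery(log_since_checkpoint, active_at_checkpoint):
    """Return (redo, undo): transactions to reapply and transactions to reverse."""
    seen = set(active_at_checkpoint)
    committed = set()
    rolled_back = set()
    for txn, action in log_since_checkpoint:
        seen.add(txn)
        if action == "COMMIT":
            committed.add(txn)
        elif action == "ROLLBACK":
            rolled_back.add(txn)
    redo = sorted(committed)                       # completed: make sure they reached disk
    undo = sorted(seen - committed - rolled_back)  # neither COMMIT nor ROLLBACK in the log
    return redo, undo


# T1 was running at the checkpoint and committed later; T3 never finished.
log_records = [
    ("T2", "BEGIN"),
    ("T1", "COMMIT"),
    ("T3", "BEGIN"),
    ("T2", "ROLLBACK"),
]
redo, undo = plan_recovery(log_records, active_at_checkpoint=["T1"])
```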

23
Q

Where should archives and log files be stored?

A

At a different location to the database, preferably not on a server

24
Q

Why is indexing important?

A

It makes query processing more efficient; without it, a DBMS would have to search a whole table to look for data.
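
The difference can be seen by asking SQLite how it plans to run the same query before and after an index is created. The customers table and index name are invented for this sketch, and the exact wording of the plan text varies between SQLite versions, so the code only collects it rather than relying on a particular phrasing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [(f"user{i}", f"city{i % 10}") for i in range(1000)],
)


def query_plan(conn):
    """Return SQLite's description of how it would run the lookup by name."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM customers WHERE name = 'user500'"
    ).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the plan text


plan_before = query_plan(conn)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_customers_name ON customers (name)")
plan_after = query_plan(conn)   # the plan now mentions idx_customers_name
```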

25
Q

Clustered index

A

Controls the order in which a table's rows are physically stored on disk, so a table can only have one

26
Q

Non-clustered index

A

A separate structure, stored apart from the table, that holds the indexed values in sorted order along with a pointer (row locator) to each row
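
The idea of a separate sorted structure holding row locators can be sketched in a few lines. In this toy model the list position of a row stands in for its row locator, and the data and function name are illustrative only.

```python
import bisect

# Rows in the order they were inserted; list position acts as the row locator.
rows = [("carol", 31), ("alice", 25), ("bob", 40)]

# Non-clustered index on name: sorted (key, row locator) pairs kept apart from the rows.
name_index = sorted((name, pos) for pos, (name, _age) in enumerate(rows))


def lookup(name):
    """Binary-search the index, then follow the row locator to the actual row."""
    i = bisect.bisect_left(name_index, (name,))
    if i < len(name_index) and name_index[i][0] == name:
        return rows[name_index[i][1]]
    return None


found = lookup("bob")     # located through the index, not by scanning rows
missing = lookup("dave")  # no such key in the index
```

The rows themselves are never reordered. Only the index is sorted, which is why a table can have several non-clustered indexes at once.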