Long Exam 2 - Distributed Systems Flashcards by Jose Adolfo Talactac

What is a distributed system?

A collection of autonomous computing elements (nodes) that appears to its users as a single coherent system.

What is meant by a collection of autonomous nodes?

Each node is autonomous and has its own notion of time; there is no global clock. This leads to fundamental synchronization and coordination problems.

What is an overlay network?

A network in which each node in the collection communicates only with other nodes in the system (its neighbors).

What are the two types of overlay networks?

Structured (each node has a well-defined set of neighbors, e.g. arranged in a tree or ring) and unstructured (each node's neighbors are randomly selected other nodes).

What are the four goals of a distributed system?

sharing of resources
distribution transparency
openness
scalability

What are the three types of scalability?

size scalability
geographical scalability
administrative scalability

How to design fault-tolerant systems?

Identify all possible faults
Detect and contain the fault
Handle the fault

What does the acronym RAID stand for?

Redundant Array of Inexpensive Disks

What is RAID 1?

mirroring
can recover from single-disk failure
requires 2N disks

What is RAID 4?

dedicated parity disk
can recover from single-disk failure
requires N+1 disks
performance benefits if you stripe a single file across multiple data disks
all writes hit the parity disk
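
As a rough illustration of how a parity disk allows recovery from a single-disk failure, the sketch below assumes parity is the byte-wise XOR of the data blocks in a stripe (the card itself does not say how parity is computed). The disk contents and the helper name `xor_blocks` are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, each holding one 4-byte block of a stripe.
data_disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# Contents of the dedicated parity disk (assumed to be XOR parity).
parity = xor_blocks(data_disks)

# Simulate losing disk 1 and rebuilding it from the survivors plus parity.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because every write to any data disk changes the stripe's parity, the single parity disk is touched on every write, which is the bottleneck the card mentions.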

What is RAID 5?

spread out parity
can recover from single-disk failure
requires N+1 disks
performance benefits if you stripe a single file across multiple data disks
writes are spread across disks
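
To see why spreading parity spreads the write load, the sketch below counts parity writes per disk over eight stripes. The rotation rule in `parity_disk` is one simple layout chosen for illustration; real arrays may rotate parity differently.

```python
from collections import Counter

def parity_disk(stripe, num_disks):
    """Disk index holding parity for a stripe under a simple rotating layout (an assumption)."""
    return (num_disks - 1 - stripe) % num_disks

num_disks = 4
stripes = range(8)

# Spread-out parity: the parity block moves to a different disk on each stripe.
raid5_load = Counter(parity_disk(s, num_disks) for s in stripes)

# Dedicated parity disk: every stripe's parity lands on the last disk.
raid4_load = Counter(num_disks - 1 for _ in stripes)
```

With this layout each of the four disks receives two of the eight parity writes, whereas with a dedicated parity disk one disk receives all eight.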

What is isolation?

An isolated action appears to occur either completely before or completely after every other concurrent action.

What is the golden rule to achieve atomicity?

Never modify the only copy.

How to make renaming shadow copies atomic?

By using single-sector writes.

What is a shadow copy?

A shadow copy is a copy of the data on which all updates are performed. Shadow copies work because the original is never modified in place; the new copy is installed with a single atomic operation.
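
A minimal sketch of the shadow-copy idea for a single file, assuming a POSIX-style file system where `os.replace` (a rename) is atomic. The function name `atomic_update` is hypothetical, and the sketch ignores concurrency between writers.

```python
import os
import tempfile

def atomic_update(path, new_contents):
    """Write new_contents to a shadow copy, then atomically install it at path."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the shadow copy in the same directory so the rename stays on one file system.
    fd, shadow_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as shadow:
            shadow.write(new_contents)
            shadow.flush()
            os.fsync(shadow.fileno())  # make the shadow copy durable first
        os.replace(shadow_path, path)  # the atomic install step
    except BaseException:
        # On any failure, discard the shadow copy; the original is untouched.
        if os.path.exists(shadow_path):
            os.remove(shadow_path)
        raise
```

A reader of `path` sees either the old contents or the new contents, never a partially written file, because the only copy is never modified in place.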

What are the shortcomings of shadow copies?

Hard to generalize to multiple files/directories
Require copying the entire file for even small changes
Do not deal with concurrency at all

What are transactions?

Transactions provide both atomicity and isolation. Each transaction will appear to have run to completion or not at all. When multiple transactions are run concurrently, it will appear as if they were run sequentially.

What are the three types of records used in a log?

UPDATE records include the old and new values of a variable. COMMIT records specify that a transaction committed. ABORT records specify that a transaction aborted.
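
The three record types can be sketched as tuples in a list, with a recovery pass that keeps only the effects of committed transactions. The tuple layout and the `recover` function are assumptions made for this example, not a prescribed log format.

```python
def recover(log):
    """Rebuild variable values from a log, keeping only committed transactions."""
    committed = {tid for kind, tid, *_ in log if kind == "COMMIT"}
    state = {}
    for kind, tid, *rest in log:
        if kind == "UPDATE" and tid in committed:
            variable, old_value, new_value = rest
            state[variable] = new_value
    return state

# Hypothetical log: (record type, transaction id, variable, old value, new value)
log = [
    ("UPDATE", 1, "A", 100, 70),
    ("UPDATE", 1, "B", 50, 80),
    ("COMMIT", 1),
    ("UPDATE", 2, "A", 70, 0),
    ("ABORT", 2),
    ("UPDATE", 3, "B", 80, 90),  # transaction 3 never finished (crash)
]

recovered = recover(log)
```

Transaction 2 aborted and transaction 3 has no COMMIT record, so neither leaves a trace in the recovered state.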

What is the drawback of using cell storage for logging?

Writes work, but each one goes to disk twice (once to the log, once to cell storage) instead of once. Recovery is also slow because the entire log has to be scanned.

What is the drawback of using a cache for logging?

Recovery takes longer as the log grows. Truncating the log may help by flushing all cached updates to cell storage and writing a checkpoint record.
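
A toy sketch of log truncation, assuming the cache and cell storage are plain dictionaries and that no transaction is in progress when the checkpoint is taken. The function name and record shape are hypothetical.

```python
def checkpoint(cache, cell_storage, log):
    """Flush cached updates to cell storage, then truncate the log."""
    cell_storage.update(cache)   # flush all cached updates
    cache.clear()
    log.clear()                  # older records are no longer needed for recovery
    log.append(("CHECKPOINT",))  # mark the point recovery can start from

cache = {"A": 70}
cell_storage = {"A": 100, "B": 50}
log = [("UPDATE", 1, "A", 100, 70), ("COMMIT", 1)]

checkpoint(cache, cell_storage, log)
```

After the checkpoint, recovery only has to scan the records written since the CHECKPOINT record rather than the whole history.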

When do two operations conflict?

Two operations conflict if they operate on the same object and at least one of them is a write.
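
The definition translates almost directly into a predicate. In this sketch an operation is a hypothetical `(kind, obj)` pair, and the two operations are assumed to come from different transactions.

```python
def conflicts(op_a, op_b):
    """Each op is a (kind, obj) pair where kind is "read" or "write"."""
    kind_a, obj_a = op_a
    kind_b, obj_b = op_b
    return obj_a == obj_b and "write" in (kind_a, kind_b)

read_read = conflicts(("read", "x"), ("read", "x"))
read_write = conflicts(("read", "x"), ("write", "x"))
different_objects = conflicts(("write", "x"), ("write", "y"))
```

Two reads of the same object never conflict, and operations on different objects never conflict, regardless of their kind.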

What is conflict serializability?

A schedule is conflict serializable if the order of all of its conflicts is the same as the order of the conflicts in some sequential schedule.
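
One common way to test this property, sketched here under the assumption that a schedule is a list of `(txn, kind, obj)` tuples in execution order, is to build a graph with an edge from one transaction to another whenever an operation of the first conflicts with a later operation of the second. The schedule is conflict serializable exactly when that graph has no cycle.

```python
def conflict_serializable(schedule):
    """schedule: list of (txn, kind, obj) tuples in execution order."""
    edges = set()
    for i, (t1, k1, o1) in enumerate(schedule):
        for t2, k2, o2 in schedule[i + 1:]:
            if t1 != t2 and o1 == o2 and "write" in (k1, k2):
                edges.add((t1, t2))  # t1's operation must come before t2's

    # Repeatedly remove transactions with no incoming edge; a leftover means a cycle.
    remaining = {t for t, _, _ in schedule}
    while remaining:
        sources = {
            t for t in remaining
            if not any(a in remaining and b == t for a, b in edges)
        }
        if not sources:
            return False
        remaining -= sources
    return True

serial_like = [("T1", "read", "x"), ("T1", "write", "x"),
               ("T2", "read", "x"), ("T2", "write", "x")]
interleaved = [("T1", "read", "x"), ("T2", "write", "x"), ("T1", "write", "x")]

ok = conflict_serializable(serial_like)
not_ok = conflict_serializable(interleaved)
```

In the second schedule T1 must come before T2 (read before write) and T2 before T1 (write before write), so no sequential order matches.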

What is two-phase locking?

Each shared variable has a lock
Before any operation on a variable, the transaction must acquire the corresponding lock
After a transaction releases a lock, it may not acquire any other locks

What are two phases in two-phase locking?

Acquire phase, where transactions acquire locks. New locks can be acquired but none can be released.
Release phase, where transactions release locks. Existing locks can be released but no new locks can be acquired.
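
The two phases can be enforced with a small wrapper that refuses to acquire a lock once any lock has been released. This is a single-threaded sketch; the class name and the shared `lock_table` dictionary are assumptions for the example.

```python
import threading

class TwoPhaseTransaction:
    def __init__(self, lock_table):
        self.lock_table = lock_table  # variable name -> threading.Lock
        self.held = set()
        self.releasing = False        # becomes True when the release phase starts

    def acquire(self, variable):
        if self.releasing:
            raise RuntimeError("2PL violation: acquire after release")
        if variable not in self.held:
            self.lock_table[variable].acquire()
            self.held.add(variable)

    def release(self, variable):
        self.releasing = True
        self.held.remove(variable)
        self.lock_table[variable].release()

locks = {"x": threading.Lock(), "y": threading.Lock()}
txn = TwoPhaseTransaction(locks)
txn.acquire("x")
txn.acquire("y")
txn.release("x")  # the release phase begins here
```

Any later call to `txn.acquire(...)` raises, which is the rule that keeps the schedule conflict serializable.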

How to address the possibility of deadlocking in two-phase locking?

Take advantage of atomicity and abort one of the deadlocked transactions. The victim is chosen by victim selection, which typically avoids aborting transactions that have been running for a long time.
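
One possible victim-selection rule, assuming the deadlock has already been detected and each transaction's start time is known: abort the transaction that started most recently, so long-running work is preserved. The function and the sample data are hypothetical.

```python
def choose_victim(deadlocked_txns, start_times):
    """Pick the transaction to abort: the one that started most recently."""
    return max(deadlocked_txns, key=lambda txn: start_times[txn])

# Hypothetical start times in seconds since some reference point.
start_times = {"T1": 10.0, "T2": 42.5, "T3": 17.0}
victim = choose_victim(["T1", "T2", "T3"], start_times)
```

Because transactions are atomic, the aborted victim leaves no partial effects and can simply be retried.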

What are reader and writer locks?

Multiple transactions can hold reader locks for the same variable at once, but only one transaction at a time can hold a writer lock for a variable.
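
A non-blocking sketch of these rules for one variable. It additionally assumes the usual semantics that a writer lock excludes other transactions' reader locks; the class and method names are made up for the example.

```python
class ReaderWriterLock:
    def __init__(self):
        self.readers = set()  # transactions holding a reader lock
        self.writer = None    # transaction holding the writer lock, if any

    def try_acquire_read(self, txn):
        if self.writer is not None and self.writer != txn:
            return False
        self.readers.add(txn)
        return True

    def try_acquire_write(self, txn):
        other_readers = self.readers - {txn}
        if other_readers or self.writer not in (None, txn):
            return False
        self.writer = txn
        return True

    def release(self, txn):
        self.readers.discard(txn)
        if self.writer == txn:
            self.writer = None

lock = ReaderWriterLock()
first_reader = lock.try_acquire_read("T1")
second_reader = lock.try_acquire_read("T2")
writer_while_read = lock.try_acquire_write("T3")
```

Both readers succeed, while the writer is refused until the readers release.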

What are the two phases for two-phase commit?

Prepare: all tasks should be completed before sending prepare
Commit: all prepares should be ACKed before sending commit
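
The message flow can be sketched with an in-process coordinator function and worker objects. This is a simplification under stated assumptions: there are no timeouts, no logging, and no failures, and the `Worker` class is hypothetical.

```python
class Worker:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "working"

    def prepare(self):
        """Return True (an ACK) if this worker's tasks are done and it can commit."""
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

def two_phase_commit(workers):
    # Phase 1: send prepare to every worker and collect the ACKs.
    acks = [worker.prepare() for worker in workers]
    # Phase 2: send commit only if every prepare was ACKed; otherwise abort everyone.
    if all(acks):
        for worker in workers:
            worker.commit()
        return "committed"
    for worker in workers:
        worker.abort()
    return "aborted"

outcome_ok = two_phase_commit([Worker("w1"), Worker("w2")])
outcome_fail = two_phase_commit([Worker("w1"), Worker("w2", can_commit=False)])
```

A single worker that cannot prepare is enough to abort the whole transaction, which keeps the outcome all-or-nothing across workers.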

What to do if workers fail after the commit point?

Recovery from crash.

What is consistency?

All clients see the same data at the same time, no matter which node they connect to.

What is strong consistency?

Whenever data is written to one node, it must be instantly forwarded or replicated to all the other nodes in the system before the write is deemed "successful".

What is the CAP theorem by Eric Brewer?

Any distributed data store can provide only two of the following three guarantees: Consistency, Availability, Partition Tolerance

What is ACID?

Atomicity, consistency, isolation, and durability.

What is a view server?

A view server determines which replica is the primary. All requests go through from the coordinators to the view server.

What happens if the view server fails?

Election for a new view server

What are the six consistency guarantees?

Long Exam 2 - Distributed Systems Flashcards

(35 cards)