RAIDS Theory Flashcards by Gabriel Miki

What are RAIDs?

RAID stands for redundant array of independent disks.

In other words, it is a group of independent disks that is presented as a single, large, high-performance logical disk.

What are RAIDs for?

To increase the performance, capacity, and reliability of storage systems.

What are the consequences of striping data across several disks?

A higher data transfer rate, a higher I/O rate, and load balancing across the disks.

What are the two orthogonal techniques implemented in RAID?

Data striping, to improve performance

Redundancy, to improve reliability

What is data striping?

Data striping is a method used to improve the performance and throughput of storage by distributing data across multiple disks.

It divides a body of data into smaller blocks and spreads these blocks across multiple physical disks in a RAID array. This improves read and write speeds by allowing multiple disks to operate simultaneously.

How does data striping work?

- Data is broken down into chunks or stripes.
- Each chunk is written to a different disk in the array.
- When a file is read, the system can read different parts of the file from multiple disks simultaneously, speeding up the process.
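
The steps above can be sketched in Python as a round-robin mapping. This is a minimal illustration, not any controller's actual layout: the function name `locate_block` is hypothetical, and the sketch assumes one block per stripe unit and a plain round-robin placement.

```python
def locate_block(logical_block: int, n_disks: int) -> tuple[int, int]:
    """Map a logical block number to (disk index, stripe index).

    Assumes one block per stripe unit and a simple round-robin layout.
    """
    if n_disks <= 0:
        raise ValueError("n_disks must be positive")
    disk = logical_block % n_disks      # which disk holds the block
    stripe = logical_block // n_disks   # which row (stripe) on that disk
    return disk, stripe


# Logical blocks 0..7 on a 4-disk array: consecutive blocks land on different disks,
# so a sequential read of blocks 0..3 can be served by four disks at once.
layout = [locate_block(b, 4) for b in range(8)]
```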

What are the stripe unit and the stripe width?

The stripe unit is the size of the unit of data written to a single disk.

The stripe width is the number of disks considered by the striping algorithm.

What is the main motivation for introducing redundancy in RAID?

The more physical disks an array has, the higher the probability that one of them fails.
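
As a rough illustration of this effect, the sketch below computes the probability that at least one disk fails. It assumes that disks fail independently and with the same probability over some fixed period, which is a simplification; the function name and any probability passed to it are made up for the example.

```python
def p_any_disk_fails(p_disk: float, n_disks: int) -> float:
    """Probability that at least one of n_disks fails.

    Assumes independent failures, each with probability p_disk.
    """
    return 1.0 - (1.0 - p_disk) ** n_disks


# With a hypothetical 2% per-disk failure probability, the risk grows with array size.
risk_by_size = {n: p_any_disk_fails(0.02, n) for n in (1, 4, 8)}
```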

What is the main drawback of redundancy?

Since write operations must also update the redundant information, they perform worse than traditional writes.

What are the orthogonal techniques present in each RAID type?

RAID 0: striping only
RAID 1: mirroring only
RAID 0+1: nested levels
RAID 1+0: nested levels
RAID 4: block interleaving (redundancy, dedicated parity disk)
RAID 5: block interleaving (redundancy, distributed parity)
RAID 6: greater redundancy

Describe RAID 0

Data are written to a single logical disk and split into several blocks, which are distributed across the physical disks according to a striping algorithm.

What are the primary concerns of RAID 0?

Performance and capacity, rather than reliability

What are the advantages and disadvantages of RAID 0?

Lower cost (it does not employ redundancy)

Best write performance (it does not need to update redundant data, and writes are parallelized)

The drawback is that a single disk failure results in data loss

How would a four-disk RAID 0 array with two stripes be organized, and what would be its capacity?

Capacity of 4 physical disks

Stripe   | Disk 0 | Disk 1 | Disk 2 | Disk 3
Stripe 1 |   B1   |   B2   |   B3   |   B4
Stripe 2 |   B5   |   B6   |   B7   |   B8

What is the impact of the chunk size on disk array performance?

Smaller chunks lead to greater parallelism

Bigger chunks reduce seek time
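
To make the trade-off concrete, the sketch below counts how many disks a single request touches for a given chunk size. It assumes a round-robin chunk layout; the function name `disks_touched` and the request sizes are hypothetical.

```python
def disks_touched(offset: int, length: int, chunk_size: int, n_disks: int) -> int:
    """Count the distinct disks touched by a request of `length` bytes at `offset`.

    Assumes chunks are placed round-robin across the disks.
    """
    if length <= 0:
        return 0
    first_chunk = offset // chunk_size
    last_chunk = (offset + length - 1) // chunk_size
    n_chunks = last_chunk - first_chunk + 1
    # Consecutive chunks sit on consecutive disks, so at most n_disks are involved.
    return min(n_chunks, n_disks)


# A 64 KiB request on a 4-disk array:
small_chunks = disks_touched(0, 64 * 1024, 4 * 1024, 4)    # spread over all 4 disks
large_chunks = disks_touched(0, 64 * 1024, 64 * 1024, 4)   # served by a single disk
```

With small chunks the request is spread over every disk (more parallelism within one request); with large chunks it stays on one disk, so only that disk has to position its head and the others remain free for other requests.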

Analyze RAID 0

Capacity is equal to the number of disks (N)

There is no reliability: if any drive fails, data is permanently lost. This means that the mean time to data loss (MTTDL) is equal to the mean time to failure (MTTF)

Sequential and random read and write operations can be fully parallelized

What is the key idea of RAID 1?

Make two copies of all data

What are the advantages and disadvantages of RAID 1?

Advantages: high reliability, fast reads (data can be retrieved from the disk with the shorter queue), fast writes (no error-correcting code needs to be computed)

Disadvantages: high cost (only 50% of the raw capacity is usable)

What is the description of RAID 0+1?

Striping first, then mirroring: disks are striped into RAID 0 sets, and the sets are mirrored

What is the description of RAID 1+0?

Mirroring first, then striping: disks are mirrored in pairs, and data is striped across the pairs

What is the main difference between RAID 0+1 and RAID 1+0?

RAID 0+1: lower fault tolerance

RAID 1+0: higher fault tolerance
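
One way to see the difference is to enumerate every two-disk failure on a small array. The sketch below assumes a hypothetical 4-disk array split into two groups, and takes RAID 0+1 to mean a mirror of two striped sets and RAID 1+0 to mean a stripe across two mirrored pairs; the function names are illustrative only.

```python
from itertools import combinations

GROUPS = [{0, 1}, {2, 3}]  # hypothetical 4-disk array split into two groups


def survives_01(failed: set[int]) -> bool:
    """RAID 0+1 (assumed): each group is a striped set; the two sets mirror each other.

    Data survives only if at least one striped set has no failed disk.
    """
    return any(group.isdisjoint(failed) for group in GROUPS)


def survives_10(failed: set[int]) -> bool:
    """RAID 1+0 (assumed): each group is a mirrored pair; data is striped across pairs.

    Data survives if every pair still has at least one healthy disk.
    """
    return all(bool(group - failed) for group in GROUPS)


double_failures = [set(c) for c in combinations(range(4), 2)]
ok_01 = sum(survives_01(f) for f in double_failures)
ok_10 = sum(survives_10(f) for f in double_failures)
print(ok_01, ok_10)  # prints "2 4"
```

Under these assumptions, RAID 0+1 survives 2 of the 6 possible double failures, while RAID 1+0 survives 4 of them. Both survive any single failure.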

What are the capacity and reliability of RAID 1?

The capacity is equal to the number of disks divided by two (N/2)

Reliability: in the worst case only one disk failure is tolerated, but if you are lucky up to N/2 disks can fail (one in each mirrored pair)

What are the sequential write and read capabilities of RAID 1?

Since half of the disks hold copies of the data, the array delivers half of the peak throughput: (N/2) * S, where S is the sequential throughput of a single disk

What are the capacity and reliability of RAID 0+1 and RAID 1+0?

What are the sequential write and read capabilities of RAID 0+1 and RAID 1+0?

What are the random read and write capabilities of RAID 1?

A random read can be served in parallel across all disks, so the rate is N * R.

A write, on the other hand, must be replicated to both copies, which halves the throughput: (N/2) * R

What is the difficulty of guaranteeing atomic mirrored writes, and what do RAID controllers include to mitigate it?

Atomicity across both copies is difficult to guarantee (e.g., a power failure between the two writes). For this reason, many RAID controllers include a write-ahead log: battery-backed, non-volatile storage of pending writes

Describe the RAID 4 mechanism

Disk N stores only parity information for the other N - 1 disks
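
Parity in RAID 4 is conventionally the bytewise XOR of the data blocks in a stripe. The sketch below, with made-up two-byte blocks and a hypothetical helper `xor_blocks`, shows how the parity block is computed and how a lost data block can be rebuilt from the survivors.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]  # blocks on the three data disks
parity = xor_blocks(data)                        # block stored on the parity disk

# If data disk 1 is lost, XOR of the surviving blocks and the parity rebuilds it.
rebuilt = xor_blocks([data[0], data[2], parity])
```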

How is parity updated when blocks are written?

Additive parity: after a block is updated, the contents of all the other data disks in the stripe are read in order to recompute the parity block

Subtractive parity: the old data value, the new data value, and the old parity value are used to calculate the new parity value
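
Both update methods can be sketched with XOR parity. The function names below are hypothetical and the one-byte blocks are made up; the point is only that the two methods arrive at the same new parity while reading different blocks.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def additive_parity(new_block: bytes, other_blocks: list[bytes]) -> bytes:
    """Recompute parity from the new block plus every other data block in the stripe."""
    parity = new_block
    for block in other_blocks:
        parity = xor(parity, block)
    return parity


def subtractive_parity(old_block: bytes, new_block: bytes, old_parity: bytes) -> bytes:
    """Update parity using only the old data, the new data, and the old parity."""
    return xor(xor(old_block, new_block), old_parity)


stripe = [b"\x01", b"\x02", b"\x04"]          # data blocks of one stripe
old_parity = additive_parity(stripe[0], stripe[1:])
new_block = b"\x08"                            # new contents for the first block

p_add = additive_parity(new_block, stripe[1:])
p_sub = subtractive_parity(stripe[0], new_block, old_parity)
```

The additive method reads the N - 2 untouched data blocks, while the subtractive method always reads two blocks (old data and old parity), so the latter tends to be cheaper on wide arrays.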

What is the analysis of RAID 4?

Capacity: N - 1 (the total number of disks minus the parity disk)
Reliability: one disk can fail
Sequential read and write: parallelized across all non-parity blocks in the stripe: (N-1) * S
Random read: parallelized over all disks except the parity disk: (N-1) * R
Random write: every write must also update the parity disk, which takes one read and one write on that disk: R/2

What is the analysis of RAID 5?

Capacity: N-1 (same as RAID 4)
Reliability: 1 drive can fail (same as RAID 4)
Sequential read and write: (N-1) * S (same as RAID 4); parallelized across all non-parity blocks
Random read: N * R (vs. (N-1) * R for RAID 4); unlike RAID 4, reads parallelize over all drives
Random write: (N/4) * R (vs. R/2 for RAID 4); unlike RAID 4, writes parallelize over all drives, and each write requires 2 reads and 2 writes, hence N/4

Compare all the RAID levels

See picture 4S in Comp Infra
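
The formulas from the cards can also be collected into a small helper for side-by-side comparison. This is only a sketch: `raid_summary` is a hypothetical function, `s` and `r` stand for the sequential and random throughput of a single disk, and the RAID 0 throughput values assume full parallelization across all N disks.

```python
def raid_summary(level: str, n: int, s: float, r: float) -> dict[str, float]:
    """Capacity (in disks) and throughput of an n-disk array, per the card formulas.

    s: sequential throughput of one disk; r: random throughput of one disk.
    """
    if level == "0":
        return {"capacity": n, "seq": n * s, "rand_read": n * r, "rand_write": n * r}
    if level == "1":
        return {"capacity": n / 2, "seq": n / 2 * s, "rand_read": n * r, "rand_write": n / 2 * r}
    if level == "4":
        return {"capacity": n - 1, "seq": (n - 1) * s, "rand_read": (n - 1) * r, "rand_write": r / 2}
    if level == "5":
        return {"capacity": n - 1, "seq": (n - 1) * s, "rand_read": n * r, "rand_write": n / 4 * r}
    raise ValueError(f"unsupported RAID level: {level}")


# Hypothetical 4-disk array with s = 100 and r = 2 (arbitrary units).
table = {level: raid_summary(level, 4, 100.0, 2.0) for level in ("0", "1", "4", "5")}
```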

Describe RAID 6

More fault tolerance than RAID 5: 2 concurrent failures are tolerated
Uses Reed-Solomon codes with two redundancy schemes (P + Q), distributed and independent
N + 2 disks required
High overhead for writes (computation of parities): each write requires 6 disk accesses because both the P and Q parity blocks must be updated (slow writes)
Minimum set of 4 data disks

Best performance and most capacity? Greatest error recovery? Balance between space, performance, and recoverability?

Best performance and most capacity? -> RAID 0
Greatest error recovery? -> RAID 1 (1+0 better than 0+1) or RAID 6
Balance between space, performance, and recoverability? -> RAID 5

RAIDS Theory Flashcards

(34 cards)