RAID Setup in Linux Flashcards by S R

What does RAID stand for?

Originally Redundant Array of Inexpensive Disks; now usually Redundant Array of Independent Disks.
RAID combines a collection of disks into a pool that is presented as a single logical volume.

Logical volume

A logical disk, logical volume or virtual disk (VD[1] or vdisk[2] for short) is a virtual device that provides an area of usable storage capacity on one or more physical disk drive(s) in a computer system.
The disk is described as logical or virtual because it does not actually exist as a single physical entity in its own right.
The goal of the logical disk is to provide computer software with what seems a contiguous storage area, sparing them the burden of dealing with the intricacies of storing files on multiple physical units.
Most modern operating systems provide some form of logical volume management.
https://en.wikipedia.org/wiki/Logical_disk

A combination of drives forms a group of disks called a RAID ____ or RAID ____

RAID set or RAID array. A RAID is made up of such groups, which are called sets or arrays.

Explain the concept of parity

Parity lets a RAID array regenerate lost content from stored parity information. RAID 5 and RAID 6 are based on parity.
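
As a rough illustration of how lost content can be regenerated from parity, the sketch below uses byte-wise XOR, the scheme usually associated with single-parity levels such as RAID 5. The block contents and the helper name `xor_blocks` are made up for this example; the second parity block of RAID 6 needs a different computation and is not shown.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three data blocks, one per (hypothetical) data disk.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(data)

# Simulate losing the second disk, then rebuild it from the
# surviving data blocks plus the parity block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])  # True
```

XOR works here because XOR-ing a value with itself cancels it out, so combining every surviving block with the parity leaves only the missing block.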

Explain the concept of striping

In computer data storage, data striping is the technique of segmenting logically sequential data, such as a file, so that consecutive segments are stored on different physical storage devices.
An example of data striping. Files A and B, of four blocks each are spread over disks D1 to D3.

Striping is useful when a processing device requests data more quickly than a single storage device can provide it. By spreading segments across multiple devices which can be accessed concurrently, total data throughput is increased. It is also a useful method for balancing I/O load across an array of disks. Striping is used across disk drives in redundant array of independent disks (RAID) storage, network interface controllers, disk arrays, different computers in clustered file systems and grid-oriented storage, and RAM in some systems.
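
The idea can be sketched as a simple round-robin placement of consecutive blocks. This is only a toy model: the function name `stripe` and the disk names are illustrative, and the resulting layout is one plausible arrangement for the two four-block files mentioned above, not necessarily the one in the original figure.

```python
def stripe(blocks, num_disks):
    """Distribute consecutive blocks round-robin across num_disks disks."""
    disks = [[] for _ in range(num_disks)]
    for index, block in enumerate(blocks):
        disks[index % num_disks].append(block)
    return disks


# Files A and B, four blocks each, spread over three disks.
blocks = ["A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4"]
for name, contents in zip(["D1", "D2", "D3"], stripe(blocks, 3)):
    print(name, contents)
# D1 ['A1', 'A4', 'B3']
# D2 ['A2', 'B1', 'B4']
# D3 ['A3', 'B2']
```

Because neighbouring blocks land on different disks, a sequential read of file A can keep all three disks busy at once.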

Explain the concept of mirroring

Mirroring means keeping an identical copy of the same data on more than one disk. It is used in RAID 1 and RAID 10. In RAID 1, everything written is also saved to the other disks in the set/array.

Explain the concept of Hot Spare

A hot spare is a standby drive in an array that automatically replaces a failed drive. If any one of the drives in the array fails, the hot spare takes its place and the array is rebuilt onto it automatically.

Explain the concept of Chunks

A chunk is the amount of data written to one disk before the array moves on to the next disk. The chunk size can be as small as 4 KB or larger. Choosing a chunk size that suits the workload can improve I/O performance.
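
To make the role of the chunk size concrete, here is a minimal sketch that maps a logical byte offset to a disk and an offset on that disk, assuming a plain RAID 0 style layout in which chunks are placed round-robin. The function name `locate` and the two-disk setup are assumptions for illustration, not how any particular RAID implementation is written.

```python
def locate(offset, chunk_size, num_disks):
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    chunk_index = offset // chunk_size   # which chunk holds the byte
    disk = chunk_index % num_disks       # chunks rotate across the disks
    row = chunk_index // num_disks       # how many full rows precede it
    return disk, row * chunk_size + offset % chunk_size


CHUNK = 4 * 1024  # 4 KB chunks
print(locate(0, CHUNK, 2))               # (0, 0)
print(locate(CHUNK, CHUNK, 2))           # (1, 0)
print(locate(2 * CHUNK + 10, CHUNK, 2))  # (0, 4106)
```

With a larger chunk size, a small request is more likely to stay on a single disk; with a smaller one, a large request is spread over more disks.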

RAID 0

RAID 0 (also known as a stripe set or striped volume) splits (“stripes”) data evenly across two or more disks, without parity information, redundancy, or fault tolerance.
https://en.wikipedia.org/wiki/Standard_RAID_levels#RAID_0

RAID 0 benefits

This configuration is typically implemented having speed as the intended goal.[2][3] RAID 0 is normally used to increase performance, although it can also be used as a way to create a large logical volume out of two or more physical disks.

A RAID 0 array of n drives provides data read and write transfer rates up to n times as high as the individual drive rates, but with no data redundancy. As a result, RAID 0 is primarily used in applications that require high performance and are able to tolerate lower reliability, such as in scientific computing[5] or computer gaming.[6]

Some benchmarks of desktop applications show RAID 0 performance to be marginally better than a single drive.[7][8] Another article examined these claims and concluded that “striping does not always increase performance (in certain situations it will actually be slower than a non-RAID setup), but in most situations it will yield a significant improvement in performance”.[9][10] Synthetic benchmarks show different levels of performance improvements when multiple HDDs or SSDs are used in a RAID 0 setup, compared with single-drive performance. However, some synthetic benchmarks also show a drop in performance for the same comparison

RAID 0 Drawbacks

Since RAID 0 provides no fault tolerance or redundancy, the failure of one drive will cause the entire array to fail; as a result of having data striped across all disks, the failure will result in total data loss.

Striping also does not guarantee a speed-up. One article concluded that “striping does not always increase performance (in certain situations it will actually be slower than a non-RAID setup), but in most situations it will yield a significant improvement in performance”.[9][10] Some synthetic benchmarks likewise show a drop in performance for a RAID 0 setup compared with a single drive.

RAID 1

RAID 1 consists of an exact copy (or mirror) of a set of data on two or more disks; a classic RAID 1 mirrored pair contains two disks.

RAID 1 benefits

This layout is useful when read performance or reliability is more important than write performance or the resulting data storage capacity.[13][14]
The array will continue to operate so long as at least one member drive is operational.

Any read request can be serviced and handled by any drive in the array; thus, depending on the nature of I/O load, random read performance of a RAID 1 array may equal up to the sum of each member’s performance,[a] while the write performance remains at the level of a single disk. However, if disks with different speeds are used in a RAID 1 array, overall write performance is equal to the speed of the slowest disk

RAID 1 drawbacks

This configuration offers no parity, striping, or spanning of disk space across multiple disks, since the data is mirrored on all disks belonging to the array, and the array can only be as big as the smallest member disk.

If disks with different speeds are used, overall write performance is equal to the speed of the slowest disk.
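
Both limits come down to taking a minimum over the member disks. The sketch below is a simplified model with made-up sizes and speeds; the function names are hypothetical and real throughput depends on the workload and the implementation.

```python
def raid1_capacity(disk_sizes_gb):
    """Usable capacity of a mirror: the size of the smallest member."""
    return min(disk_sizes_gb)


def raid1_write_speed(write_speeds_mb_s):
    """Every write goes to every member, so the slowest disk sets the pace."""
    return min(write_speeds_mb_s)


# A mirror built from a 500 GB disk and a 1000 GB disk.
print(raid1_capacity([500, 1000]))   # 500
print(raid1_write_speed([150, 90]))  # 90
```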

RAID 2

RAID 2, which is rarely used in practice, stripes data at the bit (rather than block) level, and uses a Hamming code for error correction. The disks are synchronized by the controller to spin at the same angular orientation (they reach index at the same time[16]), so it generally cannot service multiple requests simultaneously.[17][18]

RAID 2 benefits

With a high-rate Hamming code, many spindles operate in parallel to transfer data simultaneously, so that “very high data transfer rates” are possible,[19] as for example in the DataVault, where 32 data bits were transmitted simultaneously.

RAID 2 drawbacks

The disks are synchronized by the controller to spin at the same angular orientation (they reach index at the same time[16]), so it generally cannot service multiple requests simultaneously.[17][18]

With all hard disk drives implementing internal error correction, the complexity of an external Hamming code offered little advantage over parity so RAID 2 has been rarely implemented; it is the only original level of RAID that is not currently used.[17][18]

RAID 3

RAID 3, which is rarely used in practice, consists of byte-level striping with a dedicated parity disk.

The requirement that all disks spin synchronously (in a lockstep) added design considerations that provided no significant advantages over other RAID levels. Both RAID 3 and RAID 4 were quickly replaced by RAID 5.[20] RAID 3 was usually implemented in hardware, and the performance issues were addressed by using large disk caches.[18]

RAID 3 benefits

Because every I/O operation involves all member disks, RAID 3 is suitable for applications that demand the highest transfer rates in long sequential reads and writes, for example uncompressed video editing. Applications that make small reads and writes from random disk locations will get the worst performance out of this level.[18]

RAID 3 drawbacks

One of the characteristics of RAID 3 is that it generally cannot service multiple requests simultaneously, which happens because any single block of data will, by definition, be spread across all members of the set and will reside in the same physical location on each disk. Therefore, any I/O operation requires activity on every disk and usually requires synchronized spindles.

RAID 4
Diagram 1 (“.” marks the parity block):
Group | Device #1 | Device #2 | Device #3 | Device #4
------|-----------|-----------|-----------|----------
1     | A1        | A2        | A3        | A.
2     | B1        | B2        | B3        | B.
3     | C1        | C2        | C3        | C.
4     | D1        | D2        | D3        | D.

RAID 4 consists of block-level striping with a dedicated parity disk.

In diagram 1, a read request for block A1 would be serviced by disk 0 (Device #1). A simultaneous read request for block B1 would have to wait because B1 is on the same disk, but a read request for B2 could be serviced concurrently by disk 1 (Device #2).
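
The rule behind this example can be written down in a few lines. This sketch assumes the block labels used in the diagram (a letter for the group, a number for the position) and three data disks; `disk_for` and `can_read_concurrently` are hypothetical helper names.

```python
def disk_for(block, data_disks=3):
    """Map a label such as 'B2' to the zero-based data disk that holds it."""
    position = int(block[1:])
    if not 1 <= position <= data_disks:
        raise ValueError(f"{block} is not a data block in this layout")
    return position - 1


def can_read_concurrently(first, second):
    """Two reads can proceed together only if they hit different disks."""
    return disk_for(first) != disk_for(second)


print(disk_for("A1"))                     # 0
print(can_read_concurrently("A1", "B1"))  # False
print(can_read_concurrently("A1", "B2"))  # True
```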

RAID 4 benefits

As a result of its layout, RAID 4 provides good performance of random reads, while the performance of random writes is low due to the need to write all parity data to a single disk.[21]

RAID 4 drawbacks
Diagram 1 (“.” marks the parity block):
Group | Device #1 | Device #2 | Device #3 | Device #4
------|-----------|-----------|-----------|----------
1     | A1        | A2        | A3        | A.
2     | B1        | B2        | B3        | B.
3     | C1        | C2        | C3        | C.
4     | D1        | D2        | D3        | D.

Requests for blocks on the same disk must wait for each other. In diagram 1, a read request for block A1 would be serviced by disk 0 (Device #1). A simultaneous read request for block B1 would have to wait, but a read request for B2 could be serviced concurrently by disk 1 (Device #2).

RAID 5

RAID 5 consists of block-level striping with distributed parity. Unlike in RAID 4, parity information is distributed among the drives. It requires that all drives but one be present to operate. Upon failure of a single drive, subsequent reads can be calculated from the distributed parity such that no data are lost.[5] RAID 5 requires at least three disks.[22]
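
To picture what "distributed" means, the sketch below prints a layout in which the parity block moves one disk to the left on each stripe. This is only one possible rotation, chosen for illustration; real implementations offer several layouts, and the function name `raid5_layout` is hypothetical. As in the RAID 4 diagram, "." marks the parity block.

```python
def raid5_layout(num_disks, num_stripes):
    """Build rows of block labels with the parity block rotating per stripe."""
    rows = []
    for stripe in range(num_stripes):
        letter = chr(ord("A") + stripe)
        # Assumed rotation: start on the last disk and move left each stripe.
        parity_disk = (num_disks - 1 - stripe) % num_disks
        row = []
        data_number = 1
        for disk in range(num_disks):
            if disk == parity_disk:
                row.append(letter + ".")
            else:
                row.append(f"{letter}{data_number}")
                data_number += 1
        rows.append(row)
    return rows


for row in raid5_layout(4, 4):
    print(" | ".join(row))
# A1 | A2 | A3 | A.
# B1 | B2 | B. | B3
# C1 | C. | C2 | C3
# D. | D1 | D2 | D3
```

Each disk ends up holding parity for exactly one of the four stripes, so no single disk carries all the parity traffic.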

RAID 5 benefits

In comparison to RAID 4, RAID 5's distributed parity evens out the stress of a dedicated parity disk among all RAID members. Additionally, write performance is increased since all RAID members participate in the serving of write requests. Although it will not be as efficient as a striping (RAID 0) setup, because parity must still be written, this is no longer a bottleneck.[23]

RAID 5 drawbacks (minor)

Since parity is calculated over the full stripe, small changes to the array experience write amplification: in the worst case, when a single logical sector is to be written, the original sector and the corresponding parity sector must be read, the original data removed from the parity, the new data calculated into the parity, and both the new data sector and the new parity sector written.
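
The read-modify-write sequence described above can be sketched with XOR parity. The helper names and sample bytes are made up for this example; the final check confirms that the incremental update gives the same parity as recomputing it over the whole stripe.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_parity, old_data, new_data):
    """Remove the old data from the parity, then add the new data."""
    without_old = xor_bytes(old_parity, old_data)
    return xor_bytes(without_old, new_data)


# A stripe of three data blocks and its parity.
stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])

# Overwrite the middle block: two reads (old data, old parity)
# and two writes (new data, new parity).
new_block = b"\xff\x00"
new_parity = update_parity(parity, stripe[1], new_block)
stripe[1] = new_block

recomputed = xor_bytes(xor_bytes(stripe[0], stripe[1]), stripe[2])
print(new_parity == recomputed)  # True
```

A single logical write therefore costs four disk operations, which is the write amplification the card refers to.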

RAID 6

RAID 6 extends RAID 5 by adding another parity block; thus, it uses block-level striping with two parity blocks distributed across all member disks.[24] According to the Storage Networking Industry Association (SNIA), the definition of RAID 6 is: "Any form of RAID that can continue to execute read and write requests to all of a RAID array's virtual disks in the presence of any two concurrent disk failures. Several methods, including dual check data computations (parity and Reed-Solomon), orthogonal dual parity check data and diagonal parity, have been used to implement RAID Level 6."[25]

RAID 6 benefits

RAID 6 arrays tolerate the failure of any two disks.
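
For comparison across the levels covered so far, here is a small sketch of usable capacity and tolerated disk failures. It assumes equally sized disks and the standard capacity rules (one disk's worth of space for RAID 5 parity, two for RAID 6) and a four-disk minimum for RAID 6, which are general facts about these levels rather than statements from the cards. The function name `summarize` is hypothetical.

```python
MIN_DISKS = {0: 2, 1: 2, 5: 3, 6: 4}


def summarize(level, num_disks, disk_size_tb):
    """Return (usable capacity in TB, number of disk failures tolerated)."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 0:
        data_disks, tolerated = num_disks, 0
    elif level == 1:
        data_disks, tolerated = 1, num_disks - 1
    elif level == 5:
        data_disks, tolerated = num_disks - 1, 1
    else:  # level 6
        data_disks, tolerated = num_disks - 2, 2
    return data_disks * disk_size_tb, tolerated


for level in (0, 1, 5, 6):
    print(level, summarize(level, 4, 2))
# 0 (8, 0)
# 1 (2, 3)
# 5 (6, 1)
# 6 (4, 2)
```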

RAID 6 drawbacks

RAID 6 does not have a performance penalty for read operations, but it does have a performance penalty on write operations because of the overhead associated with parity calculations. Performance varies greatly depending on how RAID 6 is implemented in the manufacturer's storage architecture—in software, firmware, or by using firmware and specialized ASICs for intensive parity calculations. RAID 6 can read up to the same speed as RAID 5 with the same number of physical drives.[26]

What are nested RAID levels?

Combinations of two or more standard RAID levels.

Give some examples of Nested RAID

```
RAID 0+1 or RAID 01
RAID 0+3 or RAID 03
RAID 1+0 or RAID 10
RAID 5+0 or RAID 50
RAID 6+0 or RAID 60
RAID 10+0 or RAID 100
```

What can become the most significant bottleneck for hardware RAID implementations when fast disks are combined into RAID 0, RAID 1, RAID 10, or RAID 5?

The RAID controller can be a significant bottleneck when building a RAID system from high-speed SSDs. In a measurement of the I/O performance of five filesystems on five storage configurations (single SSD, RAID 0, RAID 1, RAID 10, and RAID 5), F2FS on RAID 0 and RAID 5 with eight SSDs outperformed EXT4 by 5 times and 50 times, respectively.

JBOD

"Just a bunch of disks" refers to non-RAID drive architecture. Other examples include SPAN/BIG and MAID (massive array of idle disks).

WIP - add Non-standard RAID levels

https://en.wikipedia.org/wiki/Non-standard_RAID_levels

WIP - add more Non-RAID drive architectures

https://en.wikipedia.org/wiki/Non-RAID_drive_architectures

WIP - create a new deck

https://raid.wiki.kernel.org/index.php/A_guide_to_mdadm

(36 cards)