Module 4: Intelligent Storage Systems (RAID) Flashcards
What is RAID?
Redundant Array of Independent Disks - combines multiple drives into RAID set for protection and performance
How are RAID sets organized?
divided into multiple LUNs - each LUN connected to a host and appears as a single drive
How does RAID help with protection and performance?
protects against drive failures
serves IOs from multiple drives simultaneously to help performance
What are the techniques RAID uses to protect against data loss during drive failure?
mirroring and parity
How is RAID typically implemented?
using specialized controller called RAID controller - present either on the storage or compute system
What is software RAID?
uses compute system based software to perform RAID at the OS level
What are the different RAID techniques?
striping
mirroring
parity
What is striping?
spreading LUN data across multiple drives to use the drives in parallel
all read/write heads work simultaneously - allows more data to be processed in shorter time
What is the set of strips across drives in striping called?
stripe - strip size is the number of blocks in a strip - every strip in a stripe must have the same number of blocks - stripe size is the strip size multiplied by the number of drives
For example: in a four-disk striped RAID set with a strip size of 64 KB, the stripe size is 256 KB (64 KB x 4)
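The arithmetic in the example above can be sketched in a few lines, along with one way a striped set could map a logical offset to a drive. The function names and the round-robin strip layout are assumptions made for illustration; the cards do not specify a layout.

```python
def stripe_size_kb(strip_size_kb: int, num_drives: int) -> int:
    """Stripe size = strip size x number of drives in the striped set."""
    return strip_size_kb * num_drives


def locate(offset_kb: int, strip_size_kb: int, num_drives: int) -> tuple[int, int]:
    """Map a logical offset (in KB) to (drive index, stripe number).

    Assumes strips are laid out round-robin across the drives.
    """
    strip_index = offset_kb // strip_size_kb
    return strip_index % num_drives, strip_index // num_drives


print(stripe_size_kb(64, 4))  # 256
print(locate(300, 64, 4))     # (0, 1): fifth strip wraps back to drive 0
```

Consecutive strips land on different drives, which is what lets the drives work in parallel.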
What is the reason for striping?
high performance - no protection
What happens if a drive fails during striping only?
no way to get that data back so whole LUN is considered unusable
What is mirroring?
where same data is stored on two or more disks resulting in multiple copies of the same data
What happens during drive failure in mirroring?
data remains intact on the surviving disk drive - controller continues to service data requests from surviving disk
when failed disk is replaced controller copies data from surviving disk to new one
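A toy model of the failure and rebuild behavior described above. The class and its method names are hypothetical, and each disk is just a Python dict; it is a sketch of the idea, not of how any real controller is built.

```python
class MirroredPair:
    """Toy RAID 1 pair: each disk is a dict of block number -> data."""

    def __init__(self):
        self.disks = [{}, {}]  # None marks a failed disk

    def write(self, block, data):
        # Every write lands on each healthy disk.
        for disk in self.disks:
            if disk is not None:
                disk[block] = data

    def read(self, block):
        # Any surviving copy can service the read.
        for disk in self.disks:
            if disk is not None:
                return disk[block]
        raise IOError("both disks in the mirrored pair have failed")

    def fail(self, index):
        self.disks[index] = None

    def replace(self, index):
        # Rebuild is a straight copy from the surviving disk.
        survivor = self.disks[1 - index]
        if survivor is None:
            raise IOError("no surviving disk to copy from")
        self.disks[index] = dict(survivor)


pair = MirroredPair()
pair.write(0, b"payroll")
pair.fail(0)
print(pair.read(0))  # b'payroll', served by the surviving disk
pair.replace(0)      # new disk is repopulated from the survivor
```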
What is a drawback of mirroring?
very expensive - preferred for mission critical apps where data loss can’t be afforded
write performance slows since it needs to write twice
What is a benefit of mirroring?
full protection from failure
read performance improves since reads can be serviced by both disks in pair
What is parity?
method to protect striped data from drive failure w/o cost of mirroring
parity = mathematical construct that allows recreation of missing data
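One common choice for that mathematical construct is bytewise XOR, which the sketch below assumes. The strip contents and the helper name `xor_strips` are made up for illustration.

```python
from functools import reduce


def xor_strips(strips):
    """Bytewise XOR of equal-length strips."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


data = [b"AAAA", b"BBBB", b"CCCC"]  # strips on three data drives
parity = xor_strips(data)           # strip stored as parity

# The drive holding data[1] fails: XOR the surviving strips with the
# parity strip to recreate the missing data.
rebuilt = xor_strips([data[0], data[2], parity])
print(rebuilt)  # b'BBBB'
```

Only one extra strip per stripe is stored, rather than a full duplicate of the data.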
What is the advantage of parity?
ensures protection of data without maintaining a full set of duplicate data
What is a downfall of parity?
RAID controller needs to do more work than for mirroring - has to do drive rebuilds by making calculations which slows rebuild times
not as safe as full mirror due to longer rebuild times
RAID 0?
striping
RAID 1?
mirroring
RAID 1/0?
mirroring + striping
high protection but expensive
RAID 5?
striped set w/ independent disk access and distributed parity
RAID 6?
striped set w/ independent disk access and dual distributed parity
What are the factors in choosing RAID set?
app performance
data availability requirements
cost
What is the minimum RAID drive number for RAID 1/0?
4 - must always be an even number
What happens if both disks in the same mirrored pair fail in RAID 1/0?
data will be lost since there is no parity in this RAID level
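A minimal sketch of that failure rule. It assumes drives (0, 1), (2, 3), and so on form the mirrored pairs; the function name and the pairing scheme are illustrative assumptions.

```python
def raid10_data_lost(num_drives: int, failed: set[int]) -> bool:
    """Return True if some mirrored pair has lost both of its drives.

    Assumes drives (0, 1), (2, 3), ... are the mirrored pairs.
    """
    if num_drives < 4 or num_drives % 2:
        raise ValueError("RAID 1/0 needs an even number of drives, at least 4")
    return any({2 * p, 2 * p + 1} <= failed for p in range(num_drives // 2))


print(raid10_data_lost(4, {0, 2}))  # False: one drive lost in each pair
print(raid10_data_lost(4, {2, 3}))  # True: both drives of one pair lost
```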
How does RAID 1/0 work?
data first mirrored over each pair and then both copies striped across all the drives in the set
How do you replace a drive in RAID 1/0?
only the mirror is rebuilt - storage controller uses surviving disk in mirrored pair to recover
What is the minimum number of disks in RAID 6?
4 - can take up to two disk failures instead of one like RAID 5
What is the impact on performance of RAID 5?
every disk write manifests as 4 IOs - two reads two writes
What is the impact on performance of RAID 6?
every disk write manifests as 6 IOs - 3 reads 3 writes
What is the impact on performance of RAID 1 and RAID 1/0?
every disk write manifests as 2 IOs - 2 writes
What is a write penalty in RAID?
every write operation translates into more IO overhead for the disks - happens in mirroring/parity RAID groups
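To see what the write penalty means for the disks, the sketch below applies the penalties from the cards above (2 for RAID 1 and RAID 1/0, 4 for RAID 5, 6 for RAID 6) to a hypothetical workload. The 1000 IOPS and 40% write mix are made-up figures, and the function name is an assumption.

```python
WRITE_PENALTY = {"RAID 1": 2, "RAID 1/0": 2, "RAID 5": 4, "RAID 6": 6}


def backend_iops(host_iops: int, write_pct: int, level: str) -> float:
    """Disk-side IOPS: reads pass through, each write costs `penalty` IOs."""
    writes = host_iops * write_pct / 100
    reads = host_iops - writes
    return reads + writes * WRITE_PENALTY[level]


for level in WRITE_PENALTY:
    print(level, backend_iops(1000, 40, level))
# RAID 1 1400.0
# RAID 1/0 1400.0
# RAID 5 2200.0
# RAID 6 3000.0
```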
How is data written in a parity RAID 5 configuration?
new data comes in - parity computed by reading old parity and old data - manifests 2 reads
after new parity computed controller completes write IO by writing the new data and new parity onto the disk - manifests 2 writes
write penalty = 4
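The four IOs can be traced in a toy single-stripe model. It assumes XOR parity, so new parity = old parity XOR old data XOR new data; the class and counter names are hypothetical.

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


class Raid5Stripe:
    """Toy single stripe: a list of data strips plus one parity strip."""

    def __init__(self, strips):
        self.strips = list(strips)
        self.parity = bytes(len(self.strips[0]))  # all zeros to start
        for strip in self.strips:
            self.parity = xor(self.parity, strip)
        self.reads = 0
        self.writes = 0

    def small_write(self, index: int, new_data: bytes) -> None:
        old_data = self.strips[index]   # read 1: old data
        self.reads += 1
        old_parity = self.parity        # read 2: old parity
        self.reads += 1
        new_parity = xor(xor(old_parity, old_data), new_data)
        self.strips[index] = new_data   # write 1: new data
        self.writes += 1
        self.parity = new_parity        # write 2: new parity
        self.writes += 1


stripe = Raid5Stripe([b"AAAA", b"BBBB", b"CCCC"])
stripe.small_write(1, b"ZZZZ")
print(stripe.reads, stripe.writes)  # 2 2
```

The other data strips are never touched, which is why the cost stays at 4 IOs regardless of how many drives are in the set.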
How is data written in a parity RAID 6 configuration?
disk write requires 3 reads - 2 parity and 1 data
both new parities are calculated and the controller performs 3 writes - 2 parity and 1 data
What is a Nested RAID Level?
combining RAID Levels together - RAID 1/0
What do most RAID configurations have?
Hot spares - can take over in case of drive failure in the RAID group
What does a RAID controller do?
controls the RAID - handles all r/w operations between host and front end and r/w operations between back end and disk
What is something always true about RAID controllers?
will always have two for sake of redundancy
What helps w/ RAID controller performance?
has a certain number of front end ports for host connections and a certain amount of cache to defer r/w requests so they do not have to go to disk immediately
What is a LUN?
logical slice of a single disk
host treats it as its own drive w/ specific space - has no copy on another drive so it's a single point of failure
How is LUN made?
LUN is partitioned
goes through allocation process and is assigned to the front end ports via mapping
LUN masking process takes the LUN to the physical host
What happens when the host writes to a LUN?
first writes the block to cache - once the cache block is filled, the controller acknowledges to the host that the operation has completed
allows more blocks to be written to cache for fast performance - eventually once all cache blocks are filled RAID controller will dump to disk
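The write path above can be sketched as a tiny write-back cache. The class name, the dict standing in for the disk, and the flush-when-full policy are simplifying assumptions; the flush here also runs synchronously, which a real controller would avoid.

```python
class WriteBackCache:
    """Toy write-back cache in front of a disk (both modeled as dicts)."""

    def __init__(self, capacity_blocks: int):
        self.capacity = capacity_blocks
        self.cache = {}  # blocks written by the host, not yet on disk
        self.disk = {}   # stand-in for the drive(s) behind the LUN

    def write(self, block: int, data: bytes) -> str:
        self.cache[block] = data  # the block lands in cache first
        if len(self.cache) >= self.capacity:
            self.flush()          # all cache blocks filled: dump to disk
        return "ack"              # completion reported to the host

    def flush(self) -> None:
        self.disk.update(self.cache)
        self.cache.clear()


lun = WriteBackCache(capacity_blocks=2)
print(lun.write(0, b"a"), len(lun.disk))  # ack 0: still only in cache
print(lun.write(1, b"b"), len(lun.disk))  # ack 2: cache filled and dumped
```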
How does the RAID controller relate to LUNs?
aware LUN is unprotected on its own - will dump data from cache to unprotected LUN since in a no RAID setup it's the only place it can go
Cons of no RAID setup?
if drive fails then it's full data loss
performance based off a single drive so can cause bottlenecking
What is a RAID Group?
logical group of one or more drives you eventually want to slice into LUNs
How does the hot spare process work?
RAID controller takes all data written onto the surviving drive and copies it back to cache
drive sitting there doing nothing gets placed into RAID group - RAID controller copies data from cache to new drive
What is RAID 5 best for?
random reads and writes
What is RAID 3 best for?
sequential reads and writes