Mass-Storage Flashcards
What is the structure of a disk?
One-dimensional arrays of logical blocks (logic_blocks[])
Data is addressed by logical block number (the index into the array)
What makes a good disk?
- Speed (bandwidth)
- Speed (Access time, or latency)
- Reliability
- Power
- Cost
- Capacity
What is disk scheduling?
Deciding the order in which to service the queue of pending requests that builds up when multiple processes use the disk at once.
What affects the performance of disks when it comes to disk scheduling?
Order in which requests are serviced.
Minimisation of head movement.
Fairness
What are the advantages of FCFS Scheduling?
Simple
Fair - no chance of starvation
What are the disadvantages of FCFS scheduling?
May involve excessive head movement as it doesn’t take into account other requests in the queue.
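A minimal sketch of how FCFS head movement adds up. The request queue and the starting head position (53) are made-up illustrative values, and `fcfs_head_movement` is a hypothetical helper name.

```python
def fcfs_head_movement(start, requests):
    """Total cylinders travelled when requests are served in arrival order."""
    total = 0
    position = start
    for cylinder in requests:
        total += abs(cylinder - position)  # seek distance for this request
        position = cylinder
    return total


# Illustrative queue of cylinder numbers, in arrival order (assumed values).
queue = [98, 183, 37, 122, 14, 124, 65, 67]
print(fcfs_head_movement(53, queue))  # 640
```

The jump from 183 to 37 and back up to 122 is the kind of excessive movement FCFS cannot avoid.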
What is SSTF Scheduling?
Shortest Seek-Time First Scheduling.
Next request processed is the one with the shortest seek time.
Try to service all requests in local area before moving, pick request with closest cylinder number.
What are the disadvantages of SSTF Scheduling?
Starvation can occur. A request can be held up indefinitely if it is far from the local area being processed.
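A rough sketch of SSTF on a fixed queue, assuming seek time is proportional to cylinder distance. The queue, the head position and the name `sstf_order` are illustrative assumptions.

```python
def sstf_order(start, requests):
    """Service order when the closest pending cylinder is always picked next."""
    pending = list(requests)
    position = start
    order = []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # assumed example values
print(sstf_order(53, queue))  # [65, 67, 37, 14, 98, 122, 124, 183]
```

The request for cylinder 183 is served last here. If new requests kept arriving near the head, it could in principle wait indefinitely, which is the starvation risk.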
What is SCAN scheduling?
Head continually scans disk from one end to the other and back (Elevator algorithm)
Requests are serviced as the head passes. Fair.
Non-uniform delays
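One way to sketch a single SCAN pass over a fixed queue. The function name, the queue and the 0..199 cylinder range are assumptions for illustration.

```python
def scan(start, requests, max_cylinder, direction="down"):
    """Return (service order, head movement) for one sweep and the return sweep."""
    if direction == "down":
        first = sorted((c for c in requests if c <= start), reverse=True)
        second = sorted(c for c in requests if c > start)
        # Head travels all the way to cylinder 0, then up to the farthest request.
        movement = start + (second[-1] if second else 0)
    else:
        first = sorted(c for c in requests if c >= start)
        second = sorted((c for c in requests if c < start), reverse=True)
        # Head travels to the last cylinder, then down to the farthest request.
        movement = (max_cylinder - start) + (max_cylinder - second[-1] if second else 0)
    return first + second, movement


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # assumed example values
order, movement = scan(53, queue, max_cylinder=199, direction="down")
print(order, movement)
```

The non-uniform delay shows up in the order: a request just behind the head waits for almost two full sweeps, while one just ahead is served at once.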
What is C-SCAN scheduling?
Variant on SCAN, returns to start immediately on reaching end (circular).
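When the head reaches the end, the heaviest density of pending requests is at the start, and those requests have also waited longest.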
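A sketch of C-SCAN on a fixed queue, assuming the head sweeps upwards and that the return trip to cylinder 0 counts as head movement (texts differ on this). Names and values are illustrative.

```python
def c_scan(start, requests, max_cylinder):
    """Return (service order, head movement) for an upward circular sweep."""
    ahead = sorted(c for c in requests if c >= start)
    behind = sorted(c for c in requests if c < start)
    # Sweep up to the last cylinder, jump back to 0 without servicing anything,
    # then sweep up again as far as the last remaining request.
    movement = (max_cylinder - start) + max_cylinder + (behind[-1] if behind else 0)
    return ahead + behind, movement


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # assumed example values
print(c_scan(53, queue, max_cylinder=199))
```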
What is LOOK / C-LOOK Scheduling?
Variants of SCAN / C-SCAN in which the head only goes as far as the final request in each direction (i.e. bounded).
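A sketch of the bounded variants. Because the head never travels past the last request, the movement is just the path length through the service order. Function names and the queue are assumptions.

```python
def path_length(start, order):
    """Cylinders travelled when visiting `order` from `start`."""
    total, position = 0, start
    for cylinder in order:
        total += abs(cylinder - position)
        position = cylinder
    return total


def look(start, requests, direction="down"):
    here = [c for c in requests if c == start]  # already under the head
    below = sorted((c for c in requests if c < start), reverse=True)
    above = sorted(c for c in requests if c > start)
    order = here + (below + above if direction == "down" else above + below)
    return order, path_length(start, order)


def c_look(start, requests):
    ahead = sorted(c for c in requests if c >= start)
    behind = sorted(c for c in requests if c < start)
    order = ahead + behind  # jump from the highest request to the lowest
    return order, path_length(start, order)


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # assumed example values
print(look(53, queue))    # movement 208
print(c_look(53, queue))  # movement 322
```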
How do SSDs deal with requests?
Usually a simple FCFS policy. Seek-optimising scheduling policies bring no benefit as an SSD doesn’t have a moving head.
In what time are Reads and Writes on SSDs?
Read: Uniform
Write: Non-uniform
Some SSD schedulers merge only adjacent write requests.
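A toy sketch of merging adjacent writes in an otherwise FCFS queue. It assumes a request is a `(start_block, block_count)` pair and that "adjacent" means the next request in the queue starts exactly where the previous one ends. Real schedulers differ in detail.

```python
def merge_adjacent_writes(requests):
    """Merge queue neighbours that cover contiguous blocks; keep FCFS order."""
    merged = []
    for start, count in requests:
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)  # extend previous write
        else:
            merged.append((start, count))
    return merged


writes = [(0, 4), (4, 4), (16, 2), (18, 1), (8, 4)]  # assumed example queue
print(merge_adjacent_writes(writes))  # [(0, 8), (16, 3), (8, 4)]
```

The last write is not merged with the first even though its blocks follow on, because it is not adjacent in the queue.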
How do block interfaces to SSDs work?
Logical only.
SSDs run software that manages wear on components. Requesting a particular logical block doesn’t guarantee a particular hardware block address.
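A toy model of the idea, not how any real drive firmware works: each write of a logical block is redirected to the least-written free physical block, so the logical address stays the same while the hardware address changes. `ToyFTL` and its fields are hypothetical names.

```python
class ToyFTL:
    """Toy logical-to-physical block mapping with naive wear levelling."""

    def __init__(self, physical_blocks):
        self.writes = [0] * physical_blocks   # wear counter per physical block
        self.free = set(range(physical_blocks))
        self.mapping = {}                     # logical block -> physical block
        self.cells = {}                       # physical block -> stored data

    def write(self, logical, payload):
        # Pick the least-worn free block (raises ValueError if none is free).
        target = min(self.free, key=lambda p: (self.writes[p], p))
        self.free.remove(target)
        old = self.mapping.get(logical)
        if old is not None:
            self.free.add(old)  # a real device would erase it before reuse
        self.mapping[logical] = target
        self.cells[target] = payload
        self.writes[target] += 1
        return target

    def read(self, logical):
        return self.cells[self.mapping[logical]]


ftl = ToyFTL(physical_blocks=4)
first = ftl.write(7, b"v1")
second = ftl.write(7, b"v2")
print(first, second, ftl.read(7))  # same logical block, different physical blocks
```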
What is deduplication?
Don’t store things twice on the drive.
How do we check for duplication?
We use hashing, on both a per-file basis and block basis.
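A minimal sketch of hash-based deduplication: content is stored under its digest, so identical content is only kept once. SHA-256 and the `DedupStore` class are assumptions; the notes do not name a hash function. The same store works per file or per block, depending on what is passed to `put`.

```python
import hashlib


class DedupStore:
    """Keeps one copy of each distinct piece of content, keyed by its hash."""

    def __init__(self):
        self.blobs = {}  # digest -> data

    def put(self, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # no-op if already stored
        return digest

    def get(self, digest):
        return self.blobs[digest]


store = DedupStore()
a = store.put(b"same contents")
b = store.put(b"same contents")
print(a == b, len(store.blobs))  # True 1
```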
What are the issues with per-file hashing?
Adding anything to a file breaks the hash similarity
Is block hashing better than file hashing?
Yes but not in all cases.
Inserting data whose size is not a multiple of the block size shifts every later block, which breaks the similarity.
Requires more computational power.
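A small sketch of why fixed-size block hashing survives appends but not shifts. The 4-byte block size, SHA-256 and the helper names are illustrative assumptions.

```python
import hashlib


def block_hashes(data, block_size=4):
    """Hash each fixed-size block of `data` (the last block may be short)."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def shared_blocks(a, b, block_size=4):
    """Number of distinct block hashes the two inputs have in common."""
    return len(set(block_hashes(a, block_size)) & set(block_hashes(b, block_size)))


original = b"AAAABBBBCCCCDDDD"
appended = original + b"EEEE"   # whole block added at the end
shifted = b"x" + original       # one byte inserted at the front

print(shared_blocks(original, appended))  # 4: every old block still matches
print(shared_blocks(original, shifted))   # 0: every block boundary moved
```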
How does variable length hashing work?
Uses a rolling hash.
Block boundaries are derived from the content, so they resync after an insertion because the data that follows is the same.
Shift resistant hashing.
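A toy sketch of variable-length (content-defined) chunking. The rolling hash here is just the sum of the last `window` bytes, and a chunk ends wherever that sum is divisible by `divisor`. Both choices are simplifying assumptions; practical systems use stronger rolling hashes such as Rabin fingerprints.

```python
def cdc_chunks(data, window=4, divisor=16):
    """Split `data` at positions chosen by a rolling hash of the last bytes."""
    chunks = []
    start = 0
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte                   # byte entering the window
        if i >= window:
            rolling -= data[i - window]   # byte leaving the window
        # Cut after byte i once the window is full and the hash hits the target.
        if i + 1 >= window and rolling % divisor == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])       # trailing partial chunk
    return chunks


original = bytes(range(40))               # assumed example data
shifted = b"\xff" + original              # one byte inserted at the front
before = cdc_chunks(original, window=3, divisor=8)
after = cdc_chunks(shifted, window=3, divisor=8)
print(len(set(before) & set(after)), "of", len(before), "chunks still match")
```

Each boundary depends only on the last few bytes, so only the chunk containing the insertion changes and later boundaries fall in the same places relative to the data. The sequential scan that finds the boundaries is also what makes this approach harder to parallelise.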
What issue exists with variable length hashing?
Hard to parallelise.
What is RAID?
Many disks attached to a computer system.
Improves Read/Write performance and reliability.
Redundant Array of Inexpensive Disks
What is the effect of redundancy in RAID?
Improves reliability as more than one copy of the data is stored.
However, this causes a storage overhead.
What is mirroring?
Duplicate each disk, single logical disk consists of 2 mirrored physical disks. Write to both, read from one.
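A toy sketch of a mirrored logical disk: writes go to both copies, reads come from the first copy that is still working. The class and its `failed` flags are hypothetical, and the disks are modelled as plain dictionaries.

```python
class MirroredDisk:
    """One logical disk backed by two mirrored physical disks."""

    def __init__(self):
        self.copies = [{}, {}]          # block number -> data, per physical disk
        self.failed = [False, False]

    def write(self, block, data):
        for disk, copy in enumerate(self.copies):
            if not self.failed[disk]:
                copy[block] = data      # write to every working copy

    def read(self, block):
        for disk, copy in enumerate(self.copies):
            if not self.failed[disk]:
                return copy[block]      # read from one copy only
        raise IOError("both mirrors have failed")


disk = MirroredDisk()
disk.write(3, b"payload")
disk.failed[0] = True                   # simulate losing the first physical disk
print(disk.read(3))                     # still readable from the mirror
```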
How does RAID affect parallelism?
Read request service rate is doubled with mirroring, as a request can be serviced from either copy.
Transfer rate increased through striping.
What is data striping?
Data is fragmented and each fragment written to a different disk.
Increases transfer rate, as fragments can be read from several disks in parallel.
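A minimal sketch of block-level striping, assuming fragments are dealt round-robin across the disks. `stripe` and `unstripe` are hypothetical helper names.

```python
def stripe(blocks, disk_count):
    """Distribute blocks round-robin over `disk_count` disks."""
    disks = [[] for _ in range(disk_count)]
    for index, block in enumerate(blocks):
        disks[index % disk_count].append(block)
    return disks


def unstripe(disks):
    """Reassemble the original block order from the striped disks."""
    blocks = []
    for row in range(max(len(d) for d in disks)):
        for disk in disks:
            if row < len(disk):
                blocks.append(disk[row])
    return blocks


blocks = [b"A", b"B", b"C", b"D", b"E"]
disks = stripe(blocks, disk_count=2)
print(disks)                      # [[b'A', b'C', b'E'], [b'B', b'D']]
print(unstripe(disks) == blocks)  # True
```

Consecutive blocks sit on different disks, so a large sequential read can be served by all disks at once.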
What do each of the different RAID levels mean?
Different combinations of mirroring, striping and parity techniques
Cost/performance trade-offs differ for each.
What are the attributes of RAID 0?
Block-level striping, no redundancy but improved performance. Min 2 disks.
What are the attributes of RAID 1?
Disk mirroring.
Min 2 disks.
- High storage overhead.
What are the attributes of RAID 2?
Error Correcting Code organisation, correct single-bit errors and detect double-bit errors.
ECC stored on additional disks.
What happens on read with a RAID 2 organisation?
Every disk takes part in every I/O request. Data and ECC bits delivered to controller on read.
What are the attributes of RAID 3?
Bit-interleaved parity.
Like RAID 2 but with only a single parity disk. Disk controllers can detect read errors, so a single parity bit is enough to correct them.
What are the attributes of RAID 4?
Block interleaved parity. Parity block stored on additional disk.
Block access involves a single disk, multiple I/O requests serviced in parallel.
What are the attributes of RAID 5?
Block-interleaved distributed parity. Parity blocks distributed among disks.
Avoid bottleneck of single parity disk.
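A sketch of how parity protects a stripe: the parity block is the byte-wise XOR of the data blocks, so any single lost block is the XOR of the survivors. The rotating placement in `parity_disk_for` is one possible layout, assumed here for illustration only.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk_for(stripe_index, disk_count):
    """Disk holding the parity block of a stripe (assumed rotating layout)."""
    return (disk_count - 1 - stripe_index) % disk_count


data_blocks = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]   # assumed example stripe
parity = xor_blocks(data_blocks)

# Simulate losing the second data block and rebuilding it from the rest.
recovered = xor_blocks([data_blocks[0], data_blocks[2], parity])
print(recovered == data_blocks[1])                       # True

# With 4 disks the parity block moves to a different disk on each stripe.
print([parity_disk_for(s, 4) for s in range(5)])         # [3, 2, 1, 0, 3]
```

Because consecutive stripes keep their parity on different disks, parity updates are spread out instead of all landing on one disk.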
What are the attributes of RAID 6?
RAID 5 but with two parity calculations, meaning it can handle two disk failures.
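A rough way to compare the storage overhead of the levels above, assuming equally sized disks and, for RAID 1, mirrored pairs. RAID 2 is left out because its overhead depends on the ECC scheme. `usable_disks` is a hypothetical helper.

```python
def usable_disks(level, disk_count):
    """Number of disks' worth of capacity left for data at a given RAID level."""
    if level == 0:
        return disk_count          # striping only, no redundancy
    if level == 1:
        return disk_count // 2     # every disk has a mirror
    if level in (3, 4, 5):
        return disk_count - 1      # one disk's worth of parity
    if level == 6:
        return disk_count - 2      # two disks' worth of parity
    raise ValueError(f"level {level} not covered by this sketch")


for level in (0, 1, 5, 6):
    print(f"RAID {level}: {usable_disks(level, 6)} of 6 disks hold data")
```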