Mass Storage (11) Flashcards

Question 1

Q

HDD

Answer

A

HDDs spin platters of magnetically coated material under a moving read-write head
- drive speed: 60 to 250 rotations per second (3,600 to 15,000 RPM)
- transfer rate: rate of data flow between drive and computer (around 1 Gb/sec in practice)
- positioning time (random access time): time to move the disk arm to the desired cylinder (seek time, from 3 ms to 12 ms) plus time for the desired sector to rotate under the disk head (rotational latency: worst case 60/RPM seconds, average half of that)
- Access latency = average access time = average seek time + average rotational latency
- Average I/O time = average access time + (amount to transfer / transfer rate) + controller overhead
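
As a rough check of the formulas above, the sketch below computes the average I/O time for one request. The drive parameters (7,200 RPM, 5 ms average seek, 1 Gb/sec transfer rate, 0.1 ms controller overhead, 4 KB transfer) are illustrative assumptions, not values from the card, and the function name is hypothetical.

```python
def average_io_time(rpm, avg_seek_s, transfer_bits, rate_bits_per_s, overhead_s):
    """Average I/O time in seconds, following the formulas on this card."""
    avg_rotational_latency = 0.5 * (60.0 / rpm)   # half a rotation on average
    access_latency = avg_seek_s + avg_rotational_latency
    transfer_time = transfer_bits / rate_bits_per_s
    return access_latency + transfer_time + overhead_s

# Assumed example values (not from the card): 4 KB read from a 7,200 RPM drive.
t = average_io_time(rpm=7200, avg_seek_s=0.005, transfer_bits=4096 * 8,
                    rate_bits_per_s=1e9, overhead_s=0.0001)
print(f"{t * 1000:.2f} ms")
```

With these assumed numbers, positioning dominates: seek time plus rotational latency account for roughly 9.2 ms, while the transfer itself takes about 0.03 ms.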

Question 2

Q

Question 3

Q

Nonvolatile memory devices

Answer

A

Can be drive-like (SSDs); also include USB drives
Can be more reliable than HDDs
More expensive per MB
Shorter average life span, so they need careful management
Less capacity
Much faster
Standard buses can be too slow for them -> connect directly to PCI
No moving parts, no seek time or rotational latency
Read and written in page increments (equivalent of sectors)
Can’t be overwritten in place
- must be erased and erases happen in larger block increments
- Limited number of erase cycles (about 100,000) before a cell wears out
- Life span measured in drive writes per day (DWPD)
  - A 1 TB NAND drive rated at 5 DWPD is expected to sustain 5 TB written per day for the whole warranty period without failing.
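
To make the DWPD rating concrete, the sketch below converts it into total data written over the warranty period. The 5-year warranty is an assumption made for illustration (the card does not state one), and the function name is hypothetical.

```python
def total_writes_tb(capacity_tb, dwpd, warranty_years):
    """Total terabytes the drive is rated to absorb over the warranty period."""
    return capacity_tb * dwpd * 365 * warranty_years

# 1 TB drive rated at 5 DWPD; the 5-year warranty is an assumed value.
print(total_writes_tb(capacity_tb=1, dwpd=5, warranty_years=5))
```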

Question 4

Q

NAND Flash Controller Algorithms

Answer

A

Because pages cannot be overwritten in place, a device usually holds a mix of valid and invalid pages, plus some blocks that no longer function.
Tracking valid logical blocks: the controller maintains a flash translation layer (FTL) table
It also implements garbage collection to free the space held by invalid pages
It allocates over-provisioning to provide working space for garbage collection (GC)
Wear leveling: each cell has a limited life span, so the controller tries to spread erases evenly across all cells.
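
The interplay of the FTL table, invalid pages, and garbage collection can be sketched with a toy model. Everything here is hypothetical (class name, page states, page-granular erase); real controllers erase whole blocks and also perform wear leveling, which this sketch omits.

```python
class TinyFTL:
    """Toy flash translation layer: maps logical pages to physical pages."""

    def __init__(self, num_physical_pages):
        self.mapping = {}                             # logical page -> physical page
        self.state = ["free"] * num_physical_pages    # free | valid | invalid
        self.data = [None] * num_physical_pages

    def write(self, logical_page, payload):
        # Pages cannot be overwritten in place, so always take a free page.
        physical = self.state.index("free")           # ValueError if none is free
        old = self.mapping.get(logical_page)
        if old is not None:
            self.state[old] = "invalid"               # old copy becomes garbage
        self.state[physical] = "valid"
        self.data[physical] = payload
        self.mapping[logical_page] = physical

    def read(self, logical_page):
        return self.data[self.mapping[logical_page]]

    def garbage_collect(self):
        # Simplification: reclaim page by page instead of erasing whole blocks.
        reclaimed = 0
        for page, status in enumerate(self.state):
            if status == "invalid":
                self.state[page] = "free"
                self.data[page] = None
                reclaimed += 1
        return reclaimed

ftl = TinyFTL(num_physical_pages=4)
ftl.write(0, "v1")
ftl.write(0, "v2")      # same logical page: lands on a new physical page
print(ftl.read(0), ftl.state)
```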

Question 5

Q

Volatile memory

Answer

A

DRAM is often used as a mass-storage device, even though technically it is not secondary storage because it is volatile
RAM drives present as raw block devices, commonly formatted with a file system.
Used as high-speed temporary storage
Computers already buffer and cache via RAM, so why RAM drives?
- Caches and buffers are allocated and managed by the programmer, OS, or hardware
- RAM drives are under user control
  - available in all major operating systems
    - Linux /dev/ram, macOS diskutil

Question 6

Q

Address mapping

Answer

A

Disks are addressed as large one-dimensional arrays of logical blocks:

logical block: smallest unit of transfer
Low level formatting creates logical blocks on physical media

The 1D array is mapped onto the sectors of the disk sequentially:

Sector 0 is the first sector of the first track on the outermost cylinder
Mapping proceeds in this order..
Logical-to-physical translation should be easy
- Except for bad sectors
- Except for the non-constant number of sectors per track on constant-angular-velocity drives
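
For the idealized case (no bad sectors, the same number of sectors on every track), the sequential mapping can be sketched as below. The geometry values and the function name are hypothetical, and sectors are numbered from 0 to match "sector 0" on the card; real drives hide their true geometry behind the controller.

```python
def lba_to_chs(lba, heads, sectors_per_track):
    """Map a logical block address to (cylinder, head, sector), all 0-based."""
    blocks_per_cylinder = heads * sectors_per_track
    cylinder = lba // blocks_per_cylinder          # cylinder 0 = outermost (assumed)
    head = (lba // sectors_per_track) % heads      # which track within the cylinder
    sector = lba % sectors_per_track               # position along the track
    return cylinder, head, sector

# Hypothetical geometry: 16 heads, 63 sectors per track.
for lba in (0, 62, 63, 16 * 63):
    print(lba, lba_to_chs(lba, heads=16, sectors_per_track=63))
```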

Question 7

Q

HDD Scheduling

Answer

A

The OS is responsible for using the hardware efficiently, which for disks means low access time and high bandwidth.

There are many sources of I/O requests:

OS
System processes
User processes

I/O requests are made of:

mode: input or output
disk address
memory address
number of sectors to transfer

The OS maintains a queue of requests per disk or device; an idle device can start on a request immediately, while a busy device leaves new requests waiting in the queue.

Optimization is only possible when a queue exists.

In the past the operating system was also responsible for disk head scheduling; now it is built into the storage device controllers.

There are several possible algorithms to use with a request queue.

Question 8

Q

Disk scheduling algo: FCFS

Answer

A

First-come, first-served: requests are serviced in arrival order (simple first in, first out)
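
The cost of this policy can be measured as total head movement. The sketch below assumes a request queue of cylinders 98, 183, 37, 122, 14, 124, 65, 67 with the head starting at cylinder 53; these values are a commonly used illustration, not taken from the card, and the function name is hypothetical.

```python
def fcfs_head_movement(start, requests):
    """Total cylinders travelled when requests are served in arrival order."""
    total, position = 0, start
    for cylinder in requests:
        total += abs(cylinder - position)
        position = cylinder
    return total

queue = [98, 183, 37, 122, 14, 124, 65, 67]   # assumed example queue
print(fcfs_head_movement(53, queue))           # 640
```

The large total comes from swings such as 183 to 37 and 14 to 124, which a reordering policy would avoid.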

Question 9

Q

Disk scheduling algo: SCAN

Answer

A

AKA elevator algorithm.
The disk arm starts at one end of the disk and moves toward the other end, servicing requests along the way; when it reaches the far end, the head movement is reversed and servicing continues.
Note that if requests are uniformly distributed, then when the head reverses few requests are pending near it (those cylinders were just serviced); the largest density of pending requests is at the other end of the disk, and those wait the longest.
The illustration shows a total head movement of 208 cylinders for the given queue.
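
The sweep can be sketched as a distance calculation. The queue and starting cylinder below are assumed (cylinders 98, 183, 37, 122, 14, 124, 65, 67, head at 53 moving toward cylinder 0); the card does not list the queue behind its illustration. Under that assumption, a strict SCAN that travels all the way to cylinder 0 covers 236 cylinders, while reversing at the last pending request (the LOOK variant) covers 208, which matches the figure quoted on the card.

```python
def sweep_distance(start, requests, go_to_edge):
    """Cylinders travelled when the head first moves toward cylinder 0."""
    lower = [c for c in requests if c <= start]
    upper = [c for c in requests if c > start]
    # Strict SCAN turns around at cylinder 0; LOOK turns at the lowest request.
    turn = 0 if go_to_edge else min(lower, default=start)
    distance = start - turn
    if upper:
        distance += max(upper) - turn   # sweep back up to the highest request
    return distance

queue = [98, 183, 37, 122, 14, 124, 65, 67]   # assumed example queue
print(sweep_distance(53, queue, go_to_edge=True))    # strict SCAN
print(sweep_distance(53, queue, go_to_edge=False))   # LOOK variant
```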

Question 10

Q

Disk scheduling algo: C-SCAN

Answer

A

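Circular SCAN
Provides a more uniform wait time than SCAN.
The head services requests while moving in one direction only; when it reaches the end, it returns to the beginning of the disk without servicing requests on the return trip.
Treats the cylinders as a circular list that wraps around from the last cylinder to the first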
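
A minimal sketch of the resulting service order, assuming the head sweeps toward higher cylinder numbers. The queue and the starting cylinder are illustrative assumptions, and the function name is hypothetical.

```python
def c_scan_order(start, requests):
    """Order in which C-SCAN services requests when sweeping upward."""
    ahead = sorted(c for c in requests if c >= start)    # served on this sweep
    wrapped = sorted(c for c in requests if c < start)   # served after wrapping around
    return ahead + wrapped

queue = [98, 183, 37, 122, 14, 124, 65, 67]   # assumed example queue
print(c_scan_order(53, queue))
```

Requests behind the head are never picked up on the way back, which is what evens out the wait times.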

Question 11

Q

Choosing disk scheduling algo

Answer

A

Shortest seek time first (SSTF, greedy) is common
SCAN and C-SCAN perform better for systems that place a heavy load on the disks
Linux implements a deadline scheduler:
- Maintains separate queues for reads and writes
  - reads have priority
- Implements four queues: 2 x read, 2 x write
  - 1 read and 1 write queue sorted in LBA order (essentially C-SCAN)
  - 1 read and 1 write queue sorted in FCFS order
  - I/O requests are sent in batches, sorted in the order of the LBA queue
  - After each batch, checks whether any request in the FCFS queues is older than the configured maximum age (500 ms by default)
    - if so, the LBA queue containing that request is selected for the next batch of I/O
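
The four-queue idea can be illustrated with a toy model. This is a sketch of the behaviour described on the card, not the Linux implementation: the class name, the batch size of 2, and the explicit `now` timestamps are all assumptions made for illustration.

```python
import itertools
from collections import deque

class ToyDeadlineScheduler:
    """Toy model: per direction, one FIFO queue and one LBA-sorted queue."""

    def __init__(self, max_age=0.5, batch_size=2):
        self.max_age = max_age          # seconds; 500 ms as on the card
        self.batch_size = batch_size    # assumed value, for illustration only
        self._ids = itertools.count()
        self.fifo = {"read": deque(), "write": deque()}   # (arrival, id, lba)
        self.by_lba = {"read": [], "write": []}           # (lba, id), kept sorted

    def submit(self, kind, lba, now):
        request_id = next(self._ids)
        self.fifo[kind].append((now, request_id, lba))
        self.by_lba[kind].append((lba, request_id))
        self.by_lba[kind].sort()

    def _select_direction(self, now):
        # An expired request forces its own direction to be served next.
        for kind in ("read", "write"):
            queue = self.fifo[kind]
            if queue and now - queue[0][0] > self.max_age:
                return kind
        # Otherwise reads have priority over writes.
        for kind in ("read", "write"):
            if self.fifo[kind]:
                return kind
        return None

    def next_batch(self, now):
        kind = self._select_direction(now)
        if kind is None:
            return None, []
        chosen = self.by_lba[kind][:self.batch_size]
        del self.by_lba[kind][:self.batch_size]
        served = {request_id for _, request_id in chosen}
        self.fifo[kind] = deque(r for r in self.fifo[kind] if r[1] not in served)
        return kind, [lba for lba, _ in chosen]

scheduler = ToyDeadlineScheduler()
scheduler.submit("write", 500, now=0.0)
scheduler.submit("read", 90, now=0.1)
scheduler.submit("read", 10, now=0.2)
first = scheduler.next_batch(now=0.3)     # nothing expired yet: reads go first
scheduler.submit("read", 40, now=0.55)
second = scheduler.next_batch(now=0.6)    # the write is now older than max_age
print(first, second)
```

In this run the pending write is served second even though a newer read is waiting, because its age has passed the limit.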

Question 12

Q

NVM scheduling

Answer

A

No disk heads or rotational latency, but there is still room for optimization
NOOP scheduling (no reordering) is used, but requests for adjacent LBAs are combined
- NVM is best at random I/O, HDD at sequential I/O
- Throughput can be similar
- Many more I/O operations per second (IOPS) with NVM (about 1000x)
- Write amplification: one write can cause many additional reads and writes because of garbage collection
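
Combining adjacent LBA requests without reordering can be sketched as follows. Representing a request as a `(start_lba, block_count)` tuple and the function name are assumptions made for illustration; only back-to-back requests in arrival order are merged.

```python
def merge_adjacent(requests):
    """Combine consecutive requests whose LBA ranges touch, keeping arrival order."""
    merged = []
    for start, count in requests:
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_count = merged[-1]
            merged[-1] = (prev_start, prev_count + count)   # extend previous request
        else:
            merged.append((start, count))
    return merged

# Hypothetical queue of (start_lba, block_count) requests.
print(merge_adjacent([(0, 8), (8, 8), (32, 4), (36, 4), (100, 1)]))
```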

Question 13

Q

Storage device management: low level formatting

Answer

A

Low-level formatting (physical formatting) is the process of dividing the disk into sectors that the disk controller can read and write.

Each sector holds a header, a data area, and an error-correcting code (ECC).

A sector usually holds 512 bytes of data.

Question 14

Q

Storage device management: partitions, logical formatting

Answer

A

To use a disk to hold files, the OS must still record its own data structures on it.

A partition is one or more groups of cylinders, each treated as a separate logical disk.

Logical formatting means creating a file system.

File systems usually group blocks into clusters to increase efficiency:

Disk I/O: block level
File I/O: cluster level

Question 15

Q

Storage device management: boot block

Answer

A

The root partition contains the OS; other partitions can hold other operating systems, other file systems, or be raw
- Mounted at boot time
At mount time, file-system consistency is checked
- if errors are found -> try to fix them
The boot block initializes the system:
- A small bootstrap program is stored in ROM (firmware)
- The full bootstrap loader is stored in the boot blocks of the boot partition
The boot block can point to
- Boot volume
- Boot loader: set of blocks with code to load the kernel
- Boot management program: for multi-OS support

Question 16

Q

Storage device management: sector sparing, raw disk access

Answer

A

Sector sparing is a method for managing bad blocks: the controller replaces each bad sector logically with a spare sector.

Raw access to the disk may be desirable for applications that want to do their own block management (e.g. databases).

Question 17

Q

Swap-space management

Answer

A

Used for moving entire processes or individual pages from DRAM to secondary storage when DRAM is too full.
The OS provides swap-space management:
- secondary storage is slower than DRAM -> optimizing swap performance is important
- swap space can be on a raw partition or in a file within a file system.

Question 18

Q

Storage attachment

Answer

A

Computers access storage in 3 ways:
- host-attached:
  - through local I/O ports, using technologies such as USB or Fibre Channel (FC).
- network-attached storage (NAS):
  - made available over a network
  - NFS and CIFS are the common protocols
  - implemented through remote procedure calls (RPCs) over TCP or UDP
- cloud
  - API based: programs use the provider's APIs to access the storage

Question 19

Q

Storage array

Answer

A

Attaches disks or arrays of disks
Has controller(s) that provide services to hosts:
- ports to connect hosts
- memory and controlling software
- RAID
- shared storage
- extra features: snapshots, clones, replication
Faster than NAS, because NAS traffic has to share network bandwidth

Question 20

Q

Storage area network

Answer

A

Multiple hosts are connected to multiple storage arrays, typically over Fibre Channel.

Hosts attach to the SAN through switches.

Easy to add and remove storage, and easy to allocate storage to a new machine on the network.

Question 21

Q

RAID

Answer

A

RAID: redundant array of inexpensive disks
- reliability is provided by redundancy
Increases the mean time to failure
Mean time to repair: the exposure window in which another failure could cause data loss
Mean time to data loss: combines the two factors above
Example:
- Mean time to failure (MTTF) of one disk = 100k h
- Mean time to repair (MTTR) = 10 h
- Mirrored RAID: 2 disks
- Mean time until the first of the two disks fails = MTTF/2 = 50k h
- Probability of the second disk failing during the repair = MTTR/MTTF = 10/100k = 10^-4
- Mean time to data loss (MTTDL) = (MTTF/2) / (MTTR/MTTF) = MTTF^2/(2*MTTR) = 5*10^8 h, about 57,000 years
Can be combined with NVRAM to improve write performance
Striping uses a group of disks as one storage unit
Mirroring duplicates the data on a second disk
Other features:
- Snapshot
- Hot spare disk
Extension
- RAID doesn’t prevent or detect data corruption, only disk failures
- Solaris ZFS extends it with checksums
  - can detect and correct data and metadata corruption
- ZFS has no separate volumes or partitions; disks are gathered into storage pools
  - file systems in a pool allocate and release space like malloc() and free()
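
The mirrored-disk example on this card can be checked with a few lines of arithmetic. The function name is hypothetical, and the formula assumes independent disk failures, which the derivation implicitly relies on.

```python
HOURS_PER_YEAR = 24 * 365

def mirrored_mttdl_hours(mttf_hours, mttr_hours):
    """Mean time to data loss for a two-disk mirror: MTTF^2 / (2 * MTTR)."""
    return mttf_hours ** 2 / (2 * mttr_hours)

mttdl = mirrored_mttdl_hours(mttf_hours=100_000, mttr_hours=10)
print(f"{mttdl:.0f} hours, about {mttdl / HOURS_PER_YEAR:,.0f} years")
```

Correlated failures (same batch of disks, shared power supply) would shorten this figure considerably, so it is best read as an upper bound.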

(21 cards)