P3L5: I/O Management Flashcards
What are the steps in sending/receiving a command to/from a device?
- The user makes a system call when they need to access a device, which goes to the kernel.
- The kernel runs the stack associated with the device (and maybe formats the request for the device driver).
- The kernel invokes the correct driver, which configures the request for the device.
- The device driver issues commands and sends data via either programmed I/O (PIO) or direct memory access (DMA) operations.
- The driver ensures the data is delivered and is not overwritten.
For receiving, the same steps happen in reverse: the device responds, the driver and kernel process the response, and the result is returned to the user process. A minimal user-space sketch of the first step follows.
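Below is a minimal sketch of the user-level side of this flow, assuming a hypothetical character device exposed as /dev/example; everything below the system-call boundary (kernel stack, driver, PIO/DMA) happens inside the kernel and is invisible to this code.

```c
/* Step 1 of the flow: the process enters the kernel via system calls.
 * "/dev/example" is a hypothetical device node used only for illustration. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd = open("/dev/example", O_WRONLY);   /* system call: enter the kernel */
    if (fd < 0) { perror("open"); return 1; }

    const char cmd[] = "reset";                /* request handed down to the driver */
    if (write(fd, cmd, sizeof cmd) < 0)        /* driver issues it via PIO or DMA */
        perror("write");

    close(fd);                                 /* kernel/driver tear down the request */
    return 0;
}
```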
What is OS-bypass?
In OS-bypass, devices are accessed directly by user processes. This requires a user-level driver (a library).
What are the two different routes we can take from device to CPU? What are their pros and cons?
1) Devices can generate interrupts to the CPU.
The downside is the cost of running interrupt handlers, which steals CPU cycles. There may also be setting/resetting of interrupt masks, as well as more indirect effects such as cache pollution. The upside is that the device can raise an interrupt as soon as it has information for the CPU, so events are noticed immediately.
2) CPUs can poll devices by reading their status registers to determine if they have some response/data for the CPU.
With polling, the OS chooses when it will poll, e.g., at times when cache pollution will be at its lowest. However, polling introduces delay in how quickly an event is observed or handled, since handling happens some time after the device generated the event. In addition, polling too frequently may introduce CPU overhead that is not affordable.
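A rough polling sketch: the status register layout and STATUS_READY bit are made up, and a volatile variable stands in for a real memory-mapped register so the snippet compiles and terminates on its own.

```c
/* Polling sketch: the OS/driver reads the device's status register at a time
 * of its choosing instead of taking an interrupt. */
#include <stdint.h>
#include <stdio.h>

#define STATUS_READY 0x1u

/* Stand-in for a memory-mapped status register; pretend the device has
 * already set the READY bit so the sketch terminates. */
static volatile uint32_t fake_status_reg = STATUS_READY;

int main(void) {
    while ((fake_status_reg & STATUS_READY) == 0) {
        /* Each polling iteration costs CPU cycles: polling too often adds
         * overhead, polling too rarely delays handling of the event. */
    }
    printf("device reported data ready; handle the event now\n");
    return 0;
}
```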
Describe programmed I/O (PIO)
In PIO, no extra hardware is needed: the CPU itself writes directly to the device's command and data registers. Registers are small, though, so sending a large amount of data requires many repeated CPU writes (one register-sized chunk at a time) and acknowledgements on the part of the device.
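A rough illustration of why PIO is expensive for large transfers; the 4-byte data register is hypothetical, represented by a volatile stand-in variable.

```c
/* Programmed I/O sketch: the CPU itself moves the data, one register-sized
 * write at a time. */
#include <stdint.h>
#include <string.h>

static volatile uint32_t fake_data_reg;   /* stand-in for the device's data register */

static void pio_send(const uint8_t *buf, size_t len) {
    /* A 1500-byte packet with a 4-byte register needs ~375 CPU writes,
     * which is why PIO gets costly for large transfers. */
    for (size_t off = 0; off < len; off += sizeof(uint32_t)) {
        uint32_t word = 0;
        size_t n = (len - off < sizeof word) ? len - off : sizeof word;
        memcpy(&word, buf + off, n);
        fake_data_reg = word;             /* one register-sized chunk per write */
    }
}

int main(void) {
    uint8_t packet[1500] = {0};
    pio_send(packet, sizeof packet);
    return 0;
}
```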
Describe direct memory access (DMA)
DMA is only possible with additional hardware (a DMA controller). With DMA, the CPU still writes the command to the device, but the data lives in a buffer in DRAM: the CPU configures the DMA controller with the buffer's address and size, and the device then accesses that memory directly, without the CPU moving the data itself.
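A sketch of the CPU's side of a DMA transfer, with a made-up descriptor layout and doorbell register; real DMA controllers define their own formats, so this only illustrates the idea of describing a buffer rather than copying it word by word.

```c
/* DMA sketch: the CPU describes an in-memory buffer (address + length) and
 * hands that description to the device; the device moves the data itself. */
#include <stdint.h>
#include <stdlib.h>

struct dma_descriptor {
    uint64_t buf_addr;   /* address of the DRAM buffer (a real driver would use a physical/bus address) */
    uint32_t buf_len;    /* number of bytes the device should transfer */
};

static volatile uint64_t fake_doorbell_reg;  /* stand-in for a device register */

int main(void) {
    uint32_t len = 64 * 1024;
    uint8_t *buf = malloc(len);              /* buffer the device will access directly */
    if (!buf) return 1;

    struct dma_descriptor desc = { (uint64_t)(uintptr_t)buf, len };

    /* Hand the descriptor to the device; from here on the CPU does not touch
     * each word of the data. */
    fake_doorbell_reg = (uint64_t)(uintptr_t)&desc;

    free(buf);
    return 0;
}
```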
What does the virtual file system (VFS) do?
The virtual file system abstracts away the details of underlying file systems, allowing user processes to interact with a variety of file systems (both local and remote) as if they were all one.
What are the elements of the VFS stack?
- File
- Inode
- Directories
- dentry and dentry cache
- superblock
What is a file?
The VFS supports a file abstraction: this is what user processes interact with and the element that the VFS operates on. It’s a logical storage unit that maps to a physical storage location. The OS represents files with file descriptors (integers), which are created when a file is opened.
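A small example of the file abstraction in practice, using standard POSIX calls; notes.txt is just a placeholder path.

```c
/* open() returns an integer file descriptor, and the same read()/close()
 * calls work whether the file lives on a local or a remote filesystem. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd = open("notes.txt", O_RDONLY);     /* fd is created when the file is opened */
    if (fd < 0) { perror("open"); return 1; }

    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf);    /* the VFS routes this to the right filesystem */
    if (n >= 0)
        printf("read %zd bytes via file descriptor %d\n", n, fd);

    close(fd);
    return 0;
}
```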
What is an inode?
For each file, the VFS maintains an inode. The inode is a persistent data structure that maintains information about a file, including a list of all data blocks (pointers to data blocks) that correspond to the file, permissions, file size, whether the file is locked, etc. The inode is necessary because a file’s data blocks may be scattered all over the storage medium.
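A simplified, illustrative inode layout (not the real Linux struct inode); the field names and the number of direct block pointers are made up for the sketch.

```c
/* Per-file metadata plus a list of pointers to the file's data blocks,
 * which may be scattered anywhere on the storage device. */
#include <stdint.h>
#include <stdio.h>

#define NUM_DIRECT_PTRS 12   /* illustrative count of direct block pointers */

struct toy_inode {
    uint32_t size_bytes;                   /* file size */
    uint16_t mode;                         /* permission bits */
    uint16_t locked;                       /* whether the file is locked */
    uint32_t block_ptrs[NUM_DIRECT_PTRS];  /* block numbers of the file's data blocks */
};

int main(void) {
    printf("toy inode occupies %zu bytes\n", sizeof(struct toy_inode));
    return 0;
}
```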
What is the problem related to inode size and how do we get around it?
The size of the inode limits the maximum size of a file. For example, a 128-byte inode with 4-byte pointers to 1 KB blocks can hold at most 32 block pointers, so a file can be at most 32 KB. To allow larger files, we use indirect pointers, which point to blocks of pointers (and those can themselves point to further blocks of pointers, or directly to data blocks). Worked numbers are below.
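The worked numbers for the example above (1 KB blocks, 4-byte pointers, 32 direct pointers), computed in a small C program:

```c
/* With 1 KB blocks and 4-byte block pointers, each indirect block holds
 * 256 pointers; each level of indirection multiplies the reachable size. */
#include <stdint.h>
#include <stdio.h>

int main(void) {
    const uint64_t block = 1024;                /* 1 KB data blocks */
    const uint64_t ptrs_per_block = block / 4;  /* 4-byte pointers -> 256 per block */

    uint64_t direct          = 32 * block;                               /* 32 KB  */
    uint64_t single_indirect = ptrs_per_block * block;                   /* 256 KB */
    uint64_t double_indirect = ptrs_per_block * ptrs_per_block * block;  /* 64 MB  */

    printf("direct only:       %llu KB\n", (unsigned long long)(direct / 1024));
    printf("+ single indirect: %llu KB\n", (unsigned long long)(single_indirect / 1024));
    printf("+ double indirect: %llu MB\n", (unsigned long long)(double_indirect / (1024 * 1024)));
    return 0;
}
```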
What are VFS directories, dentries, and dentry caches?
The OS maintains directories, which are like files except that their contents are information about files and their inodes. While resolving a path, the VFS creates a dentry (directory entry) for each path component it traverses (/, /user, and /user/myname separately). These dentry objects are kept in the dentry cache, which in other words holds information about previously visited directories; it can be consulted later when searching for other files. dentry objects exist only in memory; they are not persistent.
What is a superblock?
The superblock abstraction maintains info about how a filesystem is laid out on a storage device. The superblock works like a map, helping the filesystem find its inodes and data blocks.
What are four optimizations that can reduce the overheads associated with accessing the physical device?
- Caching
- I/O scheduling
- Prefetching
- Journaling/Logging
What is caching (with respect to filesystems)?
Filesystems can cache blocks in main memory to reduce the number of disk accesses.
What is I/O scheduling?
One of the largest overheads is moving the disk head, so filesystems benefit from smart scheduling of I/O requests that reduces head movement, maximizing sequential over random access. For example, if the head is at block 7 and requests for blocks 25 and 17 are pending, servicing 17 before 25 avoids an extra seek. A toy scheduling sketch follows.
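A toy sketch of the idea, with made-up block numbers: reorder pending requests by block number instead of servicing them in arrival order, so the head sweeps in one direction rather than jumping back and forth.

```c
/* Simplified, elevator-style reordering of pending block requests. */
#include <stdio.h>
#include <stdlib.h>

static int by_block(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int main(void) {
    int head = 7;                      /* current disk head position */
    int pending[] = {25, 17, 9, 40};   /* requests in arrival order */
    size_t n = sizeof pending / sizeof pending[0];

    qsort(pending, n, sizeof pending[0], by_block);  /* reorder by block number */

    printf("head at block %d; service order:", head);
    for (size_t i = 0; i < n; i++)
        printf(" %d", pending[i]);
    printf("\n");
    return 0;
}
```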