21. Intro to File Systems Flashcards
What are the 5 components of the UNIX file interface?
- open(“foo”): “I’d like to use the file named foo”
- close(2): “I’m finished with file handle 2” (close takes the handle returned by open, not the file name)
- read(2): “I’d like to perform a read from file handle 2 at the current position”
- write(2): “I’d like to perform a write to file handle 2 at the current position”
- lseek(2, 100): “Please move my saved position for file handle 2 to position 100”
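A minimal sketch of the five calls in order, using Python's os module, which wraps the UNIX system calls of the same names. The scratch directory and the contents written are assumptions made for illustration; only the file name foo comes from the card.

```python
import os
import tempfile

# Hypothetical scratch location; only the name "foo" comes from the card.
path = os.path.join(tempfile.mkdtemp(), "foo")

# open: ask for a file handle (descriptor) for the named file.
fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)

# write: store bytes at the current position, which then advances.
written = os.write(fd, b"hello world")

# lseek: move the saved position to byte 6, counted from the start.
position = os.lseek(fd, 6, os.SEEK_SET)

# read: fetch up to 5 bytes starting at the current position.
data = os.read(fd, 5)

# close: finished with this handle.
os.close(fd)
```

Note that every call after open refers to the file by its handle, and that the saved position belongs to the handle, not to the file name.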
What does a hierarchical file system allow users to do (or not do)?
Users don’t have to look at everything all at once. Hierarchical file systems allow users to store and examine related files together.
Ex: letters/Mom/letter.txt, letters/Chuchu/woof.txt, letters/Suzanna/letter1.txt, letters/Suzanna/letter2.txt
Describe how location plays a role in hierarchical file systems.
Locations for files require the ability to navigate up and down directories, so directories need to include pointers to other (deeper or higher) directories.
Location is also only meaningful if it is tied to the name of the file, so hierarchical file systems implement name spaces, which require that a file’s name map to a single unique location within the file system.
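A toy sketch of directories as name-to-pointer maps, where ".." points up and every other entry points down. The Directory class, the resolve helper, and the location numbers 101 and 103 are all hypothetical; real file systems store these maps in disk blocks.

```python
class Directory:
    """Toy directory: a map from names to children (assumed design)."""

    def __init__(self, parent=None):
        # ".." points up; in this sketch the root's ".." points at itself.
        self.entries = {"..": parent if parent is not None else self}

    def mkdir(self, name):
        child = Directory(parent=self)
        self.entries[name] = child  # pointer down to the new directory
        return child


def resolve(start, path):
    """Follow one pointer per path component, starting from `start`."""
    node = start
    for name in path.split("/"):
        node = node.entries[name]  # KeyError if the name does not exist
    return node


root = Directory()
letters = root.mkdir("letters")
mom = letters.mkdir("Mom")
suzanna = letters.mkdir("Suzanna")
mom.entries["letter.txt"] = 101       # stand-in for the file's location
suzanna.entries["letter1.txt"] = 103  # stand-in for the file's location

from_root = resolve(root, "letters/Suzanna/letter1.txt")
from_sibling = resolve(suzanna, "../Mom/letter.txt")
```

Here from_root follows only downward pointers, while from_sibling goes up one level through ".." before going back down.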
Why do file systems usually require that files be organized into an acyclic graph with a single root (aka tree)?
In a tree, there is exactly one path from the root (the top-level directory) to the file you’re looking for.
If there is a cycle in your graph, a file could have multiple (even infinitely many) canonical names, which violates the one-name-one-location rule of hierarchical file systems.
How many relative and canonical names exist for a given file in a tree-implemented hierarchical file system?
One canonical name (/you/used/to/love/well) and an infinite number of relative names (you/used/to/love/me/../well, love/me/../../love/me/../well, and so on).
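A small sketch showing that the relative names on this card collapse to the same canonical name. It uses posixpath.normpath, which simplifies ".." purely lexically and does not consult a real file system (so it ignores symbolic links). The starting directories used for each relative name are assumptions.

```python
import posixpath

canonical = "/you/used/to/love/well"

# Each pair is (assumed starting directory, relative name from the card).
relative_names = [
    ("/", "you/used/to/love/me/../well"),
    ("/you/used/to", "love/me/../../love/me/../well"),
]

# Join each relative name to its starting directory, then collapse "..".
resolved = [
    posixpath.normpath(posixpath.join(start, name))
    for start, name in relative_names
]
```

Adding more me/.. detours to either name yields yet another relative name for the same file, which is why the count is unbounded.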
What are the 5 file system design goals?
- Efficiently translate file names to file contents
- Allow files to move, grow, shrink, and otherwise change
- Optimize access to single files
- Optimize access to multiple files, particularly related files
- Survive failures and maintain a consistent view of file names and contents
What are the two types of disk blocks?
Data blocks (contain file data)
Index nodes (aka inodes, contain metadata about files rather than file data)
What three decisions distinguish one file system from another?
- On-disk layout. (How does each file system decide where to put data and metadata blocks in order to optimize file access?)
- Data structures. (What data structures does each file system use to translate names and locate file data?)
- Crash recovery. (How does each file system prepare for and recover from crashes?)
What is the primary challenge of file systems? Why is it hard?
To maintain a large and complex data structure using disk blocks as storage.
This is hard because a single change can require updating many different on-disk structures.
Say a process wants to write data to the end of a file. What does the file system have to do?
- Find empty disk blocks to use and mark them as in use.
- Associate those blocks with the file that is being written to.
- Adjust the size of the file that is being written to.
- Actually copy the data to the disk blocks being used.
* From the perspective of a process, all of these things need to happen synchronously.
* This creates both a consistency problem and a performance problem.
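The four steps can be sketched with a toy in-memory file system. ToyFS, Inode, and append are hypothetical names, and the sketch assumes the file currently ends on a block boundary so that every append starts in a fresh block.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4096  # bytes per block, matching the 4K figure on these cards


@dataclass
class Inode:
    size: int = 0
    block_numbers: list = field(default_factory=list)


class ToyFS:
    def __init__(self, num_blocks):
        self.in_use = [False] * num_blocks  # free-block map
        self.blocks = [b""] * num_blocks    # stand-in for data blocks


def append(fs, inode, data):
    # Assumption: the file ends on a block boundary (keeps the sketch short).
    assert inode.size % BLOCK_SIZE == 0
    needed = -(-len(data) // BLOCK_SIZE)  # ceiling division
    # 1. Find empty disk blocks and mark them as in use.
    free = [i for i, used in enumerate(fs.in_use) if not used][:needed]
    if len(free) < needed:
        raise OSError("no free blocks")
    for b in free:
        fs.in_use[b] = True
    # 2. Associate those blocks with the file.
    inode.block_numbers.extend(free)
    # 3. Adjust the size of the file.
    inode.size += len(data)
    # 4. Copy the data into the blocks.
    for i, b in enumerate(free):
        fs.blocks[b] = data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]


fs = ToyFS(num_blocks=8)
f = Inode()
append(fs, f, b"x" * 5000)  # needs two blocks: 4096 + 904 bytes
```

Each step touches a different structure (free map, inode, data blocks), so a crash between any two steps would leave them disagreeing with each other. That is the consistency problem; waiting for all four updates to reach the disk is the performance problem.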
What is a sector?
The smallest unit that the disk allows to be written, traditionally 512 bytes.
What is a block?
The smallest unit that the file system actually writes, usually 4 KB.
What is an extent?
A set of contiguous blocks used to hold part of a file. Described by a start and end block
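A short sketch of how sectors, blocks, and extents relate arithmetically. The 512-byte sector size, the inclusive end block, and the Extent class itself are assumptions made for illustration; substitute the values for your own disk and file system.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512  # assumed sector size in bytes
BLOCK_SIZE = 4096  # assumed block size in bytes

# How many sectors the disk writes for each file system block.
sectors_per_block = BLOCK_SIZE // SECTOR_SIZE


@dataclass
class Extent:
    start_block: int
    end_block: int  # assumed inclusive

    def length_bytes(self):
        return (self.end_block - self.start_block + 1) * BLOCK_SIZE

    def block_for_offset(self, offset):
        """Map a byte offset within the extent to a disk block number."""
        block = self.start_block + offset // BLOCK_SIZE
        if offset < 0 or block > self.end_block:
            raise ValueError("offset lies outside this extent")
        return block


extent = Extent(start_block=100, end_block=103)
total = extent.length_bytes()
block = extent.block_for_offset(5000)  # byte 5000 falls in the second block
```

Because the blocks are contiguous, finding the block for any offset is a single division instead of a lookup per block.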
Why would file systems not write chunks smaller than 4K?
Because contiguous writes are good for disk head scheduling, and because 4K is the typical memory page size, which affects in-memory file caching.
Why would file systems want to write file data in even larger chunks?
Because contiguous writes are good for disk head scheduling and many files are larger than 4K!