0. Refresher Materials Flashcards
Refresh on file systems, memory systems, multithreaded programming, and networking.
What key abstractions does a file system use?
1. File
2. Filename
3. Directory tree
How does UNIX keep track of access rights?
For each user / file pair we need to know which permissions that user has (e.g., read, write, execute). Unix assigns an owner (aka user) and group to every file. There are separate read, write, execute bits for owner/user, group, and other.
How do read, write, and execute rights affect directories in UNIX?
Read affects your ability to see what’s in a directory. Write affects your ability to create, delete, and rename files in a directory. Execute affects your ability to “pass through” a directory and manipulate its contents in ways other than manipulating file names. NOTE: to create a new file in a directory from outside it, you need both write and execute permission on that directory: execute to “pass through” it and write to create the file.
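As a minimal sketch of how the owner/group/other rwx bits pack into a mode, here is a hypothetical helper `can` (the mode value 0o754 is just an example, not something the card specifies):

```python
# rwx bits for owner, group, and other occupy the low 9 bits of a file's
# mode; 0o754 = rwxr-xr-- (owner rwx, group r-x, other r--).
mode = 0o754

def can(mode, who, perm):
    """Check one permission bit. who: 'owner'|'group'|'other'; perm: 'r'|'w'|'x'."""
    shift = {"owner": 6, "group": 3, "other": 0}[who]
    bit = {"r": 4, "w": 2, "x": 1}[perm]
    return bool((mode >> shift) & bit)
```

Per the note above, creating a file from outside a directory requires both `can(mode, who, "w")` and `can(mode, who, "x")` on that directory.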
What are the two primary ways of interacting with a file?
1. Cursor (a file position that reads and writes advance)
2. Memory (mapping the file into the address space)
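Both styles can be shown side by side with the standard library; this is a sketch using a throwaway temp file, not a specific API the card names:

```python
import mmap, os, tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"hello world")

# Cursor-style access: the OS tracks a file position that seek/read advance.
os.lseek(fd, 6, os.SEEK_SET)
cursor_read = os.read(fd, 5)

# Memory-style access: the file is mapped into the address space and
# indexed like an ordinary byte array; no cursor is involved.
with mmap.mmap(fd, 0) as m:
    mapped_read = bytes(m[0:5])

os.close(fd)
os.remove(path)
```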
What are two allocation strategies for keeping track of whether a block is free?
1. Free list
2. Busy bit vector
How does a file allocation table (FAT) represent a file? What role does a directory table play?
Each file is represented as a linked list of blocks. The links are stored in the file allocation table (FAT), which is indexed by block number: given a file’s starting block number, the next block can be found with a constant-time lookup in the table. A value of -1 marks the last block in a chain, and each entry also contains a bit indicating whether the block is free, so the FAT holds data about every block in every file.

The directory table tells us the starting block of each file.
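The chain-following described above can be sketched with a toy table; the filenames, block layout, and `FREE` marker here are made up for illustration:

```python
FREE = -2   # toy marker for a free block (real FATs use reserved values)
EOF = -1    # -1 marks the last block in a file's chain

# fat[i] = next block after block i; this toy table holds two files.
fat = [3, FREE, EOF, 7, FREE, FREE, 2, EOF]

# The directory table maps a filename to its starting block.
directory = {"a.txt": 0, "b.txt": 6}

def blocks_of(name):
    """Follow the FAT chain from the file's starting block."""
    chain, block = [], directory[name]
    while block != EOF:
        chain.append(block)
        block = fat[block]   # one constant-time table lookup per link
    return chain
```

Starting from `directory["a.txt"] = 0`, the chain 0 → 3 → 7 is recovered one table lookup at a time.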
How does inode represent a file?
Each file on disk has an inode structure associated with it.
An inode contains metadata, 12 direct pointers, 1 indirect pointer, 1 double indirect pointer, and 1 triple indirect pointer.
Just like in FAT, directories are treated like files and also use inode structures.
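A rough illustration of what those pointers buy: the maximum file size they can address. The 4 KiB block size and 4-byte pointer size are assumptions for the arithmetic, not something the card specifies:

```python
BLOCK = 4096               # assumed block size in bytes
PTR = 4                    # assumed size of one block pointer
PER_BLOCK = BLOCK // PTR   # pointers that fit in one indirect block (1024)

direct = 12 * BLOCK                  # 12 direct pointers
single = PER_BLOCK * BLOCK           # 1 indirect pointer
double = PER_BLOCK**2 * BLOCK        # 1 double indirect pointer
triple = PER_BLOCK**3 * BLOCK        # 1 triple indirect pointer

max_file_size = direct + single + double + triple   # ~4 TiB here
```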
What’s a buffer cache and how does it optimize file access?
How are reads affected?
How are writes affected?
A buffer cache is an in-memory cache of the contents on disk. When data is read from disk it’s stored in the buffer cache so subsequent reads can access it there.
Writing to disk is usually done with a write-back policy: the change is made only in the buffer cache at first and the page is marked as dirty. The slower operation of writing the data to disk is postponed to a more opportune time. Thus, a call to write isn’t a guarantee that the data has changed on disk (call fsync or msync to flush changes to disk).
Pros
- The program can resume faster after a write
Cons
- If the program crashes before the write to disk, changes are lost
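The write-back behavior above can be sketched as a toy cache over a dict standing in for the disk; `BufferCache` and its `flush` method (the fsync-like step) are hypothetical names for illustration:

```python
class BufferCache:
    """Toy write-back buffer cache; self.disk is a dict standing in for disk."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}
        self.dirty = set()

    def read(self, block):
        if block not in self.cache:      # miss: go to "disk" once,
            self.cache[block] = self.disk[block]
        return self.cache[block]         # then serve later reads from memory

    def write(self, block, data):
        self.cache[block] = data         # fast: memory only
        self.dirty.add(block)            # remember to flush later

    def flush(self):
        for block in self.dirty:         # the fsync-like step
            self.disk[block] = self.cache[block]
        self.dirty.clear()
```

After `write`, the "disk" still holds the old data until `flush` runs, which is exactly the crash-loss window listed under Cons.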
What is journaling and how does it optimize file access?
We quickly write changes to a contiguous region of the disk (the journal) and mark the dirty blocks in memory as clean. Then, at a more opportune time, we apply all the changes in the journal to their actual positions on disk (taking the time penalty of seeking).
Pros
- Writing to the journal is fast (sequential), and that’s the write we do at the critical moment when we find we have too many dirty blocks
- Reduces the risk of data loss: after a crash, the journal can be replayed on recovery
Cons
- Any time we read from the disk we first have to check if there are changes in the journal
- Total time spent writing increases since we have to write each change twice
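The journal/checkpoint/read interplay above can be sketched with an append-only list; the names `journaled_write` and `checkpoint` are made up for illustration:

```python
# Toy journal: writes go to an append-only log first (fast, sequential),
# then get applied to their real disk locations at checkpoint time.
disk = {0: b"old0", 5: b"old5"}
journal = []

def journaled_write(block, data):
    journal.append((block, data))   # sequential append: the fast write

def checkpoint():
    while journal:                  # apply entries in order, then drop them
        block, data = journal.pop(0)
        disk[block] = data

def read(block):
    # Reads must prefer the newest journal entry for this block, if any --
    # this is the read-time check listed under Cons.
    for b, data in reversed(journal):
        if b == block:
            return data
    return disk[block]
```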
What’s direct memory access and how does it optimize file access?
Streaming devices like disks have their own controller (a DMA controller) capable of transferring data along the bus without the CPU copying each byte.
The CPU sends commands to the DMA controller along the bus. While this still means the CPU and the DMA controller compete for use of the bus, the CPU can often find its data in the cache, which reduces its use of the bus and allows the DMA controller to “steal” cycles.
Compare Memory vs Disk
Memory
- Fast
- Random access (constant time lookups)
- Volatile (contents lost on power off)
Disk
- Slow
- Sequential access
- Durable
What is the motivation for using a cache?
It significantly speeds up time to look up data because it’s much faster to access the cache than main memory. If we get a cache hit, we don’t have to go to main memory. If we don’t get a hit, we have to go to main memory, but we can store the data in the cache to save ourselves the trip next time.
What is a memory hierarchy and why is it useful?
The main tradeoff w/ caches is speed vs capacity. A small cache means a fast lookup, but a low hit rate whereas a large cache means a slower lookup, but a higher hit rate.
A memory hierarchy means we have several caches of various sizes, which allows us to get both speed and capacity.
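The speed/capacity tradeoff can be sketched with a toy two-level hierarchy; the set contents and cycle costs here are made-up numbers for illustration:

```python
# Toy two-level hierarchy: a tiny fast cache in front of a bigger slower
# one, in front of main memory. Each miss falls through to the next,
# slower level; costs are invented cycle counts.
L1 = {"a"}            # small and fast: quick lookup, low hit rate
L2 = {"a", "b", "c"}  # bigger and slower: higher hit rate
COSTS = {"L1": 1, "L2": 10, "RAM": 100}

def access_cost(key):
    if key in L1:
        return COSTS["L1"]
    if key in L2:
        L1.add(key)               # promote on an L2 hit
        return COSTS["L1"] + COSTS["L2"]
    L1.add(key); L2.add(key)      # full miss: fill both levels
    return COSTS["L1"] + COSTS["L2"] + COSTS["RAM"]
```

Repeated accesses get cheaper as data is promoted toward the fast level, which is how the hierarchy delivers both speed and capacity.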
What is cache locality?
What is temporal cache locality?
What is spatial cache locality?
In order for a cache to be effective, it needs a high hit rate, which means we need to be able to anticipate what data it will need next. We use locality to determine this.
Temporal locality is the tendency to refer to the same memory close in time. (e.g., a caching policy puts data you’ve just used in the cache w/ the assumption you’re likely to use it again soon)
Spatial locality is the tendency to access memory that is close in address. (e.g., a caching policy puts not just the memory address you accessed into the cache, but the whole block)
Given an address how do I determine if the data is in the cache (direct map cache)?
The address is split into three fields: tag, index, and offset.
The index tells us which cache line to look in.
The tag is stored alongside the cache line and is compared against the address’s tag to confirm the cached block is actually the one we want (different addresses can share the same index).
The offset tells us where within the cache block our data is.
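The field split can be sketched with bit arithmetic; the 64-byte blocks (6 offset bits) and 256 lines (8 index bits) are assumed parameters, not something the card specifies:

```python
# Splitting an address for a direct-mapped cache, assuming 64-byte
# blocks (6 offset bits) and 256 lines (8 index bits); all remaining
# high bits form the tag.
OFFSET_BITS = 6
INDEX_BITS = 8

def split(addr):
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

def lookup(cache, addr):
    """cache[index] holds the tag of the block currently stored there."""
    tag, index, _ = split(addr)
    return cache.get(index) == tag   # hit only if the stored tag matches
```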