25. Log-Structured File Systems Flashcards
From 1982 to 1991, what happened to the relative speed of seeks on hard disks?
Seek times barely improved while CPUs got much faster, so relative to the rest of the system seeks got slower
Seeks were becoming the performance bottleneck
How do we make a big, slow thing look faster?
Use a cache (in other words, put a smaller, faster thing in front of it)
In the case of the file system, the smaller, faster thing is memory
What do we call the memory used to cache file system data?
The buffer cache
With a large cache, what benefit do we get when it comes to disk reads?
With a large cache, we should be able to avoid almost all disk reads
How does a cache assist with the latency associated with writes to disk?
The cache will help collect small writes in memory until we can do one larger write
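A minimal sketch of that write-collecting idea, assuming a toy in-memory buffer. The class name WriteBuffer and the flush_threshold parameter are made up for illustration and do not come from any real file system:

```python
class WriteBuffer:
    """Toy write-back cache: collects small writes, flushes them as one large write."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold  # bytes to accumulate before flushing (assumed policy)
        self.pending = []         # small writes waiting in memory
        self.pending_bytes = 0
        self.disk_writes = []     # each entry stands for one large disk write

    def write(self, data):
        self.pending.append(data)
        self.pending_bytes += len(data)
        if self.pending_bytes >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Join everything collected so far into a single large write.
        if self.pending:
            self.disk_writes.append(b"".join(self.pending))
            self.pending = []
            self.pending_bytes = 0


buf = WriteBuffer(flush_threshold=16)
for _ in range(8):
    buf.write(b"abcd")        # eight small 4-byte writes
buf.flush()                   # nothing left to flush here, so this is a no-op
print(len(buf.disk_writes))   # 2 large writes instead of 8 small ones
```

Eight small writes reach the disk as two 16-byte writes, so the disk pays for two positioning operations instead of eight.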
What is the best way to avoid seeks when writing?
Write everything to the same place
More realistically, write everything very close together
What is LFS?
Log-structured File System
What is the main idea behind log-structured file systems?
All writes go to an append-only log
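A minimal sketch of an append-only log, assuming the disk is modeled as a Python list whose index plays the role of a disk address. The class name AppendOnlyLog is hypothetical:

```python
class AppendOnlyLog:
    """Toy log: the only way to write is to append at the tail."""

    def __init__(self):
        self.blocks = []  # list index stands in for the disk address

    def append(self, payload):
        self.blocks.append(payload)
        return len(self.blocks) - 1  # address of the block just written


log = AppendOnlyLog()
v1 = log.append("file A, block 0, version 1")
v2 = log.append("file A, block 0, version 2")  # an "overwrite" is just another append
print(v1, v2)  # 0 1
```

The old version at address 0 is never modified in place. It simply becomes dead space once nothing points to it any more.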
What does a write to disk look like in the normal case, and what changes once reads are served from the cache?
Let’s say we want to change an existing byte in a file
Normally, we do this:
- Seek to read the inode map
- Seek to read the inode
- Seek to write (modify) the data block
- Seek to write (update) the inode
Let’s assume that our big friendly cache is going to soak up the reads. Now what happens?
- The reads of the inode map and the inode are served from the cache, so they need no seeks
- Seek to write (modify) the data block
- Seek to write (update) the inode
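A small sketch that counts seeks for the four steps, assuming (as the card does) that a read served from the cache costs no seek. The function name count_seeks is made up for illustration:

```python
# The four steps of the update: (kind of I/O, target)
STEPS = [
    ("read", "inode map"),
    ("read", "inode"),
    ("write", "data block"),
    ("write", "inode"),
]


def count_seeks(steps, reads_cached):
    """Count the steps that still need a disk seek."""
    return sum(1 for kind, _ in steps if kind == "write" or not reads_cached)


print(count_seeks(STEPS, reads_cached=False))  # 4
print(count_seeks(STEPS, reads_cached=True))   # 2
```

With the reads absorbed, the two write seeks are all that is left, which is why LFS goes after the writes.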
When do we write to the log?
When the user calls sync or fsync, or when blocks are evicted from the buffer cache
How did FFS translate an inode number to a disk block?
It stored the inode map in a fixed location on disk
Why can't LFS keep the inode map in a fixed location on disk the way FFS does?
Inodes are just appended to the log, so they move every time they are updated, and updating a fixed-location map after each move would bring the seeks back
What does LFS do to handle the fact that the inodes can move?
It logs the inode map
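A minimal sketch of logging the inode map, assuming a toy log held in a Python list. The names TinyLFS, write_file and read_file are hypothetical, and appending a full copy of the map on every write is a simplification made for this sketch:

```python
class TinyLFS:
    """Toy model: data blocks, inodes and the inode map are all appended to one log."""

    def __init__(self):
        self.log = []    # list index stands in for the disk address
        self.imap = {}   # in-memory copy: inode number -> log address of the latest inode

    def _append(self, record):
        self.log.append(record)
        return len(self.log) - 1

    def write_file(self, inum, data):
        data_addr = self._append(("data", inum, data))
        inode_addr = self._append(("inode", inum, data_addr))  # the inode moves on every write
        self.imap[inum] = inode_addr                           # so the map must be updated
        self._append(("imap", dict(self.imap)))                # and the map is logged as well
        return inode_addr

    def read_file(self, inum):
        _, _, data_addr = self.log[self.imap[inum]]  # imap -> inode -> data block
        return self.log[data_addr][2]


fs = TinyLFS()
first = fs.write_file(7, "hello")
second = fs.write_file(7, "world")
print(first, second, fs.read_file(7))  # 1 4 world
```

The inode for file 7 moves from address 1 to address 4, and the lookup still works because it always goes through the map.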
How does LFS handle file system metadata, like inode and data block allocation bitmaps, etc?
We can log that stuff too!
What happens when the log reaches the end of the disk?
There is probably a lot of unused space earlier in the log due to overwritten inodes, data blocks, etc.
This space is reclaimed when we clean the log by identifying empty space and compacting used blocks
Conceptually, you can think of this as happening across the entire disk at once, but for performance reasons LFS divides the disk into segments that are cleaned separately
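A minimal sketch of cleaning one segment, assuming each segment is a list of block names and the set of live blocks is already known. The function name clean_segment and its parameters are made up for illustration, and choosing which segment to clean is left out:

```python
def clean_segment(segments, victim, live, tail):
    """Copy the live blocks of the victim segment to the tail segment, then free the victim.

    Assumes victim != tail.
    """
    survivors = [blk for blk in segments[victim] if blk in live]
    segments[tail].extend(survivors)  # compact the used blocks at the tail of the log
    segments[victim] = []             # the whole victim segment is empty space again
    return survivors


# a1 and b1 were overwritten by a2 and b2, so only c1 is still live in segment 0.
segments = [["a1", "b1", "c1"], ["a2", "b2"]]
live = {"c1", "a2", "b2"}

moved = clean_segment(segments, victim=0, live=live, tail=1)
print(moved)     # ['c1']
print(segments)  # [[], ['a2', 'b2', 'c1']]
```

Only one block has to be copied to reclaim a whole segment, which shows why segments that are mostly dead are cheap to clean.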