25. Log-Structured File Systems Flashcards

Question 1

Q

From 1982 to 1991, what happened to the relative speed of seeks on hard disks?

Answer

A

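Seek times barely improved while CPU speed, memory size, and disk transfer bandwidth grew quickly, so in relative terms disk seeks were still dog slow.

They were becoming a performance bottleneck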

Question 2

Q

How do we make a big, slow thing look faster?

Answer

A

Use a cache (in other words, put a smaller, faster thing in front of it)

In the case of the file system, the smaller, faster thing is memory

Question 3

Q

What do we call the memory used to cache file system data?

Answer

A

The buffer cache

Question 4

Q

With a large cache, what benefit do we get when it comes to disk reads?

Answer

A

With a large cache, we should be able to avoid almost all disk reads

Question 5

Q

How does a cache assist with the latency associated with writes to disk?

Answer

A

The cache will help collect small writes in memory until we can do one larger write
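
A minimal sketch of this batching idea, not taken from the course material. The `WriteBuffer` class, its `flush_threshold` parameter, and the `disk_writes` list standing in for the disk are all hypothetical names chosen for illustration.

```python
class WriteBuffer:
    """Toy buffer that batches small writes into one larger disk write."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold  # bytes to collect before flushing
        self.pending = []                       # small writes waiting in memory
        self.pending_bytes = 0
        self.disk_writes = []                   # each entry models ONE large disk write

    def write(self, data):
        # Small writes only touch memory until enough bytes have piled up.
        self.pending.append(data)
        self.pending_bytes += len(data)
        if self.pending_bytes >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Everything collected so far goes to disk as a single write.
        if self.pending:
            self.disk_writes.append(b"".join(self.pending))
            self.pending = []
            self.pending_bytes = 0


buf = WriteBuffer(flush_threshold=8)
for chunk in (b"ab", b"cd", b"ef", b"gh", b"ij"):
    buf.write(chunk)
buf.flush()  # models an explicit sync of whatever is left
```

Five small writes reach the modeled disk as two larger ones, so the seek cost is paid twice instead of five times.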

Question 6

Q

What is the best way to avoid seeks when writing?

Answer

A

Write everything to the same place

More realistically, write everything really, really close together

Question 7

Q

What is LFS?

Answer

A

Log-structured File System

Question 8

Q

What is the main idea behind log-structured file systems?

Answer

A

All writes go to an append-only log
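
The append-only idea can be sketched as follows. This is an illustrative toy rather than how any real LFS lays out its disk: `AppendOnlyLog`, the string keys, and the in-memory `latest` index are assumptions made for the example.

```python
class AppendOnlyLog:
    """Toy log: updates never overwrite in place, they append a new version."""

    def __init__(self):
        self.blocks = []   # the log itself; list index plays the role of disk address
        self.latest = {}   # key -> address of the newest version of that block

    def write(self, key, data):
        addr = len(self.blocks)          # always the current end of the log
        self.blocks.append((key, data))
        self.latest[key] = addr
        return addr

    def read(self, key):
        return self.blocks[self.latest[key]][1]

    def dead_addresses(self):
        # Older versions stay in the log but are no longer referenced.
        live = set(self.latest.values())
        return [a for a in range(len(self.blocks)) if a not in live]


log = AppendOnlyLog()
log.write("file.txt:0", b"old")
log.write("notes.txt:0", b"hello")
log.write("file.txt:0", b"new")   # appended, the old version is left behind
```

Because every write lands at the end of the log, consecutive writes sit next to each other on disk. The overwritten version at address 0 becomes dead space.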

Question 9

Q

What is an example of a write to disk, first without a cache and then with a cache absorbing the reads?

Answer

A

Let’s say we want to change an existing byte in a file

Normally, we do this:

Seek to read the inode map
Seek to read the inode
Seek to write (modify) the data block
Seek to write (update) the inode

Let’s assume that our big friendly cache is going to soak up the reads. Now what happens?

Read the inode map (cached, no seek)
Read the inode (cached, no seek)
Seek to write (modify) the data block
Seek to write (update) the inode
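
The effect of the cache on these four steps can be counted in a small sketch. It assumes, as a simplification, that every uncached operation costs exactly one seek and that a cached read costs none. The `count_seeks` helper is a hypothetical name.

```python
# The four steps needed to change one byte of an existing file.
steps = [
    ("read", "inode map"),
    ("read", "inode"),
    ("write", "data block"),
    ("write", "inode"),
]


def count_seeks(steps, reads_cached):
    # Writes always seek in this model; reads seek only when they miss the cache.
    return sum(1 for kind, _ in steps if kind == "write" or not reads_cached)


seeks_without_cache = count_seeks(steps, reads_cached=False)
seeks_with_cache = count_seeks(steps, reads_cached=True)
```

With the reads absorbed, the remaining seeks all come from writes, which is the cost LFS goes after.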

Question 10

Q

When do we write to the log?

Answer

A

When the user calls sync or fsync, or when blocks are evicted from the buffer cache

Question 11

Q

How did FFS translate an inode number to a disk block?

Answer

A

It stored the inode map in a fixed location on disk

Question 12

Q

Why is keeping the inode map in a fixed location on disk a problem for LFS?

Answer

A

Inodes are just appended to the log and so they can move!

Question 13

Q

What does LFS do to handle the fact that the inodes can move?

Answer

A

It logs the inode map
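
A rough sketch of what logging the inode map could look like. All names are hypothetical. The variable `latest_imap_addr` stands in for whatever mechanism lets the file system find the newest copy of the map, and that mechanism is an assumption of this example rather than something the cards describe.

```python
log = []  # append-only list of (kind, payload); list index plays the role of disk address


def append(kind, payload):
    log.append((kind, payload))
    return len(log) - 1


def write_file(imap, inode_number, data):
    """Append the data block, a new inode, and a new copy of the inode map."""
    data_addr = append("data", data)
    inode_addr = append("inode", {"data_addr": data_addr})
    new_imap = dict(imap)
    new_imap[inode_number] = inode_addr   # the inode moved, so the map must change
    imap_addr = append("imap", new_imap)  # the map is logged like everything else
    return new_imap, imap_addr


def read_file(imap_addr, inode_number):
    """Follow imap -> inode -> data, starting from a logged copy of the map."""
    _, imap = log[imap_addr]
    _, inode = log[imap[inode_number]]
    _, data = log[inode["data_addr"]]
    return data


imap, latest_imap_addr = write_file({}, 7, b"v1")
imap, latest_imap_addr = write_file(imap, 7, b"v2")
```

After the second write, inode 7 lives at a new address, and only the newest logged map points to it. The older map at address 2 still describes the old version of the file.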

Question 14

Q

How does LFS handle file system metadata, like inode and data block allocation bitmaps, etc.?

Answer

A

We can log that stuff too!

Question 15

Q

What happens when the log reaches the end of the disk?

Answer

A

There is probably a lot of unused space earlier in the log due to overwritten inodes, data blocks, etc.

This space is reclaimed when we clean the log by identifying empty space and compacting used blocks

Conceptually you can think of this happening across the entire disk all at once, but for performance reasons LFS divides the disk into segments which are cleaned separately
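
Cleaning can be sketched as copying the live blocks out of some segments and packing them together. This is a simplified illustration: `clean_segments` and the `is_live` callback are hypothetical, and a real cleaner would also have to update the inodes and inode map that point at the moved blocks.

```python
def clean_segments(segments, is_live):
    """Compact the live blocks of the given segments into as few segments as possible.

    segments: list of equally sized lists of block ids.
    is_live:  function telling whether a block is still referenced.
    Returns the compacted segments and the number of segments freed.
    """
    capacity = len(segments[0])
    # Read every segment and keep only the blocks that are still in use.
    live = [block for seg in segments for block in seg if is_live(block)]
    # Write the survivors back out, packed tightly.
    compacted = [live[i:i + capacity] for i in range(0, len(live), capacity)]
    freed = len(segments) - len(compacted)
    return compacted, freed


segments = [["a1", "b1", "a2", "c1"], ["b2", "a3", "c2", "d1"]]
live_blocks = {"a2", "c1", "b2", "d1"}
compacted, freed = clean_segments(segments, lambda block: block in live_blocks)
```

Two half-dead segments become one full segment plus one empty segment that the log can reuse.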

Question 16

Q

When should we run the cleaner in LFS?

Answer

A

Probably when the system is idle, which may be a problem on systems that don’t idle much

Question 17

Q

What size segments should we clean when doing a log clean?

Answer

A

Large segments amortize the cost to read and write all of the data necessary to clean the segment

But small segments increase the probability that all blocks in a segment will be “dead”, making cleaning trivial

Question 18

Q

What can we say about the performance effects of log cleaning?

Answer

A

Cleaner overhead is very workload-dependent, making it difficult to reason about the performance of log-structured file systems

(and easy to fight about their performance!)
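
One way to see the workload dependence is a simple cost model of the kind used in the original LFS analysis. It is an idealized assumption, not a measurement: to clean a segment whose live fraction is `u`, the cleaner reads the whole segment and writes back the live part, gaining only `1 - u` of a segment for new data. The function name `write_cost` is chosen for this sketch.

```python
def write_cost(u):
    """Disk traffic per unit of new data when cleaning segments with live fraction u.

    Model: read 1 segment + write u live data + write (1 - u) new data,
    all divided by the (1 - u) of new data actually written, i.e. 2 / (1 - u).
    A completely dead segment (u == 0) needs no read, so the cost is 1.
    """
    if not 0 <= u < 1:
        raise ValueError("u must be in [0, 1)")
    if u == 0:
        return 1.0
    return 2.0 / (1.0 - u)


cost_mostly_dead = write_cost(0.2)   # few live blocks to copy
cost_half_live = write_cost(0.5)
cost_mostly_live = write_cost(0.8)   # most of the work is copying old data
```

Under this model the cost climbs steeply as segments get fuller, so how dead the cleaned segments are, which depends on the workload, dominates cleaner overhead.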

Question 19

Q

Let’s say that the cache does not soak up as many reads as we were hoping. What problem can LFS create?

Answer

A

Block allocation is extremely discontiguous, meaning that reads may seek all over the disk

Question 20

Q

What are three arguments in favor of reading research papers?

Answer

A

Researchers have some great ideas about how to improve computer systems! Many times the design principles apply outside of the specific project described in the paper
Both academic and industrial research labs publish papers. Papers are frequently the best/only way to find out details about exciting production systems in use by companies like Google, Microsoft, Facebook, etc.
Because reading the code takes way too long!

Question 21

Q

What drives research in computer systems?

Answer

A

It ain't features.

Hardware changes and other technology trends which both expose problems with existing systems and create new opportunities for better systems.

(21 cards)