23. File System Caching and Consistency Flashcards

1
Q

How do we make a big, slow thing look fast?

A

Use a cache! (Put a smaller, faster thing in front of it)

2
Q

What is a buffer cache?

A

The memory used to cache file system data.

Memory serves as the smaller, faster thing (compared to disk) that we use to make a big, slow thing (disk) look faster.

3
Q

What are two ways that operating systems use memory?

A
  1. As memory (duh)
  2. To cache file data in order to improve performance

4
Q

When the OS splits memory use into main memory and file system buffer cache, what are the consequences of allocating a big buffer cache and a small main memory?

A

With a big buffer cache and a small main memory, file access is fast, but there is an increased potential for thrashing in the memory subsystem.

5
Q

When the OS splits memory use into main memory and file system buffer cache, what are the consequences of allocating a small buffer cache and a large main memory?

A

With a small buffer cache and a large main memory, little swapping occurs (good), but file access is extremely slow.

6
Q

What does the swappiness Linux kernel parameter do?

A

It controls how aggressively the operating system prunes unused process memory pages and hence the balance between memory and buffer cache.
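
To peek at the current setting, here is a minimal sketch. It assumes a Linux machine, where the value is exposed at /proc/sys/vm/swappiness; the helper name `read_swappiness` is made up for this card, and it returns None anywhere the file is missing.

```python
from pathlib import Path

# Standard Linux location of the tunable; other systems will not have it.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current swappiness value as an int, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


value = read_swappiness()
if value is None:
    print("swappiness not available on this system")
else:
    print(f"vm.swappiness = {value}")
```

A higher value makes the kernel more willing to swap out process pages, which leaves more memory for the buffer cache.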

7
Q

Describe what happens for open, read, write, and close operations for “above the file system” placement of the buffer cache.

A

Open - Pass down to the underlying file system

Read

  • If the file is not in the buffer cache, pass down to underlying file system and load contents into the buffer cache
  • If the file is in the cache, return the cached contents

Write

  • If the file is not in the buffer cache, pass down to the underlying file system, load contents into the buffer cache, and then modify them
  • If the file is in the cache, modify the cached contents

Close

  • Remove from the cache (if necessary) and flush contents through the file system
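
The four operations above can be sketched as a toy whole-file cache. This is only an illustration under loud assumptions: the "file system" is a plain dict, the class name `FileLevelCache` is hypothetical, and write simply appends.

```python
class FileLevelCache:
    """Toy cache above a hypothetical file system; it caches whole files."""

    def __init__(self, fs):
        self.fs = fs          # stand-in file system: dict of name -> bytes
        self.cache = {}       # name -> bytearray of cached contents
        self.dirty = set()    # names modified in the cache but not yet flushed

    def open(self, name):
        # Open is passed down; here that just creates the file if it is absent.
        self.fs.setdefault(name, b"")

    def _load(self, name):
        if name not in self.cache:                  # miss: go to the file system
            self.cache[name] = bytearray(self.fs[name])
        return self.cache[name]

    def read(self, name):
        return bytes(self._load(name))              # hit, or freshly loaded copy

    def write(self, name, data):
        self._load(name).extend(data)               # modify the cached copy only
        self.dirty.add(name)

    def close(self, name):
        if name in self.dirty:                      # flush through the file system
            self.fs[name] = bytes(self.cache[name])
            self.dirty.discard(name)
        self.cache.pop(name, None)                  # remove from the cache


fs = {"a.txt": b"hi"}
cache = FileLevelCache(fs)
cache.open("a.txt")
cache.write("a.txt", b"!")
before_close = fs["a.txt"]      # the file system has not seen the write yet
cache.close("a.txt")
after_close = fs["a.txt"]       # now it has
```

Notice that the file system only learns about the write at close time, which is exactly the visibility problem this placement creates.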

8
Q

Describe the pros and cons of “above the file system” placement of the buffer cache.

A

Pros:

  • Buffer cache sees file operations, which may enable better prediction and performance

Cons:

  • Hides many file operations from the file system, preventing it from providing consistency guarantees
  • Can’t cache file system metadata: inodes, superblocks, etc.
9
Q

With the “above the file system” implementation of the buffer cache, what do we cache?

A

Entire files and directories!

10
Q

What is the buffer cache interface for “above the file system” implementation?

A

open, close, read, and write

Same as the file system call interface

11
Q

With the “below the file system” implementation of the buffer cache, what do we cache?

A

Disk blocks!

12
Q

What is the buffer cache interface for “below the file system” implementation?

A

readblock and writeblock

Same as the disk interface
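
A rough sketch of what "same as the disk interface" means in practice. The `Disk` and `BlockCache` classes are hypothetical, the disk is just a list of 16-byte blocks, and writes are passed straight to the disk to keep the example short.

```python
class Disk:
    """Hypothetical in-memory disk: a list of fixed-size blocks."""
    BLOCK_SIZE = 16

    def __init__(self, nblocks):
        self.blocks = [bytes(self.BLOCK_SIZE) for _ in range(nblocks)]
        self.reads = 0                      # counts how often the disk is hit

    def readblock(self, n):
        self.reads += 1
        return self.blocks[n]

    def writeblock(self, n, data):
        self.blocks[n] = bytes(data)


class BlockCache:
    """Exposes readblock/writeblock, so the file system above cannot tell
    whether it is talking to the cache or to the disk itself."""

    def __init__(self, disk):
        self.disk = disk
        self.cached = {}                    # block number -> bytes

    def readblock(self, n):
        if n not in self.cached:            # miss: fetch from the disk once
            self.cached[n] = self.disk.readblock(n)
        return self.cached[n]

    def writeblock(self, n, data):
        self.cached[n] = bytes(data)
        self.disk.writeblock(n, data)       # passed to disk at once, for simplicity


disk = Disk(4)
cache = BlockCache(disk)
cache.readblock(2)
cache.readblock(2)                          # second read is served from memory
```

Because the cache only sees block numbers, it could hold an inode block or a superblock just as easily as file data.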

13
Q

Describe the pros and cons of “below the file system” placement of the buffer cache.

A

Pros:

  • Can cache all blocks including file system data structures, inodes, superblocks, etc
  • Allows file system to see all file operations even if they eventually hit the cache

Cons:

  • Cannot observe file semantics or relationships

14
Q

Do modern operating systems place the buffer cache “above” the file system or “below” the file system?

A

Below the file system

To understand why, think about each of the pros of placing it below the file system

15
Q

How can the cache cause consistency problems?

A

Objects in the cache are lost on failures (like loss of power)

16
Q

What does almost every file system operation involve?

A

Modifying MULTIPLE disk blocks

17
Q

Walk through the 5 steps of creating a new file in an existing directory. Include mention of what would happen if a crash occurs at each step.

A
  1. Allocate an inode, mark the used inode bitmap
    [If crash here - an inode is incorrectly marked in use - down one inode]
  2. Allocate data blocks, mark the used data block bitmap
    [If crash here - Data blocks are incorrectly marked in use.]
  3. Associate data blocks with the file by modifying the inode
    [If crash here - Dangling file not present in any directory]
  4. Add inode to the given directory by modifying the directory file
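    [If crash here - File is visible in the directory, but its data blocks have not been written yet, so its contents are garbage]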
  5. Write data blocks
    [If crash here - data loss]
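
One way to play with the five steps is to simulate them on a tiny in-memory "disk" and stop partway through. Everything here is an assumption made for illustration: the dict layout, inode number 7, block number 42, the file name, and the idea that steps reach the disk strictly in order. The `check` function is a hypothetical, very simplified consistency check.

```python
def empty_state():
    return {"inode_bitmap": set(), "block_bitmap": set(),
            "inodes": {}, "directory": {}, "blocks": {}}


def create_file(state, crash_after=None):
    """Apply the five steps in order; stop after `crash_after` steps to simulate a crash."""
    steps = [
        lambda s: s["inode_bitmap"].add(7),               # 1. allocate inode 7
        lambda s: s["block_bitmap"].add(42),              # 2. allocate data block 42
        lambda s: s["inodes"].update({7: [42]}),          # 3. inode 7 -> block 42
        lambda s: s["directory"].update({"new.txt": 7}),  # 4. add directory entry
        lambda s: s["blocks"].update({42: b"hello"}),     # 5. write the data block
    ]
    for i, step in enumerate(steps, start=1):
        if crash_after is not None and i > crash_after:
            break
        step(state)
    return state


def check(state):
    """Return a list of inconsistencies found in the simulated on-disk state."""
    problems = []
    referenced = set(state["directory"].values())
    for ino in state["inode_bitmap"]:
        if ino not in state["inodes"]:
            problems.append("inode marked in use but never initialized")
        elif ino not in referenced:
            problems.append("inode not present in any directory")
    owned = {b for blocks in state["inodes"].values() for b in blocks}
    for b in state["block_bitmap"] - owned:
        problems.append("block marked in use but owned by no inode")
    for b in owned:
        if b not in state["blocks"]:
            problems.append("file points at a block that was never written")
    return problems


for k in range(1, 6):
    print(k, check(create_file(empty_state(), crash_after=k)))
```

Only the run that completes all five steps comes back with an empty problem list.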
18
Q

Observation: File system operations that modify multiple blocks may leave the file system in an inconsistent state if partially completed.

How does caching exacerbate this situation?

A

May increase the time span between when the first write of the operation hits the disk and the last is completed.

19
Q

What is the safest approach to dealing with maintaining file system consistency?

A

Don’t buffer writes!

We call this a “write through” cache because every write goes straight through the cache to disk instead of being held in memory.

20
Q

What is the most dangerous approach to dealing with maintaining file system consistency?

A

Buffer ALL operations until blocks are evicted.

We call this a “write back” cache

21
Q

Which approach to maintaining file system consistency is better for safety?

A

“Write through” caching, where writes go directly to disk instead of being buffered in the cache.

Now we can’t lose writes on power failure.

22
Q

Which approach to maintaining file system consistency is better for performance?

A

“Write back” caching, where we buffer all operations until blocks are evicted (when the cache runs out of space in memory).

This is much faster than “write through”, but it’s dangerous: in the event of a power failure, everything not yet written to disk is lost, and the file system may be left inconsistent.
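
A small sketch contrasting the two policies. The class names are hypothetical, the "disk" is a dict, the write-back cache evicts the oldest-inserted block, and the write-through variant keeps a copy in memory for later reads (an assumption of this sketch).

```python
class WriteThroughCache:
    def __init__(self, disk):
        self.disk = disk      # dict: block number -> data, stands in for the disk
        self.cache = {}

    def write(self, n, data):
        self.cache[n] = data
        self.disk[n] = data   # every write reaches the disk immediately

    def crash(self):
        self.cache.clear()    # memory contents vanish; the disk is untouched


class WriteBackCache:
    def __init__(self, disk, capacity=2):
        self.disk = disk
        self.capacity = capacity
        self.cache = {}       # dicts keep insertion order: first key is the oldest

    def write(self, n, data):
        if n not in self.cache and len(self.cache) >= self.capacity:
            oldest = next(iter(self.cache))
            self.disk[oldest] = self.cache.pop(oldest)   # flush only on eviction
        self.cache[n] = data

    def crash(self):
        self.cache.clear()    # buffered writes are lost


disk_a, disk_b = {}, {}
wt = WriteThroughCache(disk_a)
wb = WriteBackCache(disk_b, capacity=2)
for n in range(3):
    wt.write(n, f"v{n}")
    wb.write(n, f"v{n}")
wt.crash()
wb.crash()
print(sorted(disk_a))  # [0, 1, 2]
print(sorted(disk_b))  # [0]
```

After the simulated crash, the write-through disk has all three blocks, while the write-back disk only has the one block that happened to be evicted.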

23
Q

Describe a middle ground approach (between performance and safety) to maintaining file system consistency.

A

Write important file system metadata structures - superblock, inode maps, bitmaps, etc. - immediately, but delay data writes.

File systems also give user processes some control through sync (sync the entire file system) and fsync (sync one file).
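
From a user process, the per-file version might look like the sketch below. `os.fsync` is the real Python wrapper around fsync; the helper name `durable_write` and the demo file name are made up for this card.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data, then ask the OS to push this one file's cached blocks to disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # move Python's user-space buffer into the kernel
        os.fsync(f.fileno())   # ask the kernel to flush this file to stable storage


demo_path = os.path.join(tempfile.gettempdir(), "fsync_demo.txt")  # hypothetical name
durable_write(demo_path, b"important\n")
# os.sync() is the whole-file-system counterpart on Unix-like systems.
```

The flush before fsync matters: fsync only covers data the kernel already has, not bytes still sitting in the process's own buffer.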

24
Q

What is “journaling” in the context of file system consistency?

A

Track pending changes to the file system in a special area on disk called the journal.

Following a failure, replay the journal to bring the file system back to a consistent state.

25
Q

What happens (with the journal) when we flush cached data to disk?

A

Update the journal! This is called a checkpoint.

Example:
The journal has 4 changes logged in it. A cache flush runs, so all 4 changes are now on disk. The journal records a checkpoint, meaning nothing logged before it needs to be replayed in the event of a failure.

Note: any journal entry left after the last checkpoint indicates work that still needs to be done to bring the on-disk file system back to a consistent state.

26
Q

When journaling is implemented, what happens on recovery?

A

Start at the last checkpoint and work forward, updating on-disk structures as needed.
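
Logging, checkpointing, and recovery can be tied together in one toy model. This is a sketch under heavy assumptions: the class name `JournaledDisk` is hypothetical, the journal is a Python list that is assumed to survive crashes, and real journals also group changes into transactions, which is left out here.

```python
class JournaledDisk:
    def __init__(self):
        self.disk = {}        # on-disk structures: block number -> data
        self.journal = []     # on-disk journal: ("write", n, data) or ("checkpoint",)
        self.cache = {}       # in-memory buffer cache, lost on a crash

    def write(self, n, data):
        self.journal.append(("write", n, data))   # log the pending change first
        self.cache[n] = data                      # then buffer it in memory

    def flush(self):
        self.disk.update(self.cache)              # cached data reaches the disk
        self.cache.clear()
        self.journal.append(("checkpoint",))      # nothing before this needs replay

    def crash(self):
        self.cache.clear()                        # only memory is lost

    def recover(self):
        # Find the last checkpoint, then replay every entry logged after it.
        start = 0
        for i, entry in enumerate(self.journal):
            if entry[0] == "checkpoint":
                start = i + 1
        for _, n, data in self.journal[start:]:
            self.disk[n] = data
        self.journal.append(("checkpoint",))      # the disk is consistent again


d = JournaledDisk()
d.write(1, "a")
d.write(2, "b")
d.flush()                 # checkpoint: blocks 1 and 2 are safely on disk
d.write(3, "c")           # logged, but only buffered in memory
d.crash()
state_after_crash = dict(d.disk)
d.recover()
state_after_recovery = dict(d.disk)
```

Block 3 is missing right after the crash and reappears once the entry logged after the checkpoint is replayed.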