File System Crash Consistency Flashcards by Cloud Kanou

File system data structures must _______

persist

File system data structures are stored in _______ devices to survive for a long time

storage

The main challenge is to _____ persistent data structures in spite of _________ and _____ failures

update, crashes, power

In a scenario with 3 writes (data block, inode, and data bitmap) where only 1 write succeeds, what happens if only the data block is written to disk?

This is as if the write did not happen because the inode was never updated -> nothing to worry about

In a scenario with 3 writes to data block, inode, and data bitmap with only 1 write succeeding, what happens if only the inode is updated?

The inode points to a new data block, but there's no actual data there -> reading the file returns garbage data, and the bitmap still marks the block as free (inconsistency)

In a scenario with 3 writes to data block, inode, and data bitmap with only 1 write succeeding, what happens if only the bitmap is written?

Bitmap indicates data block is used, but nothing points to it -> space leak as data block will never be used (inconsistency)

In a scenario with 3 writes to data block, inode, and data bitmap with 2 writes succeeding, what happens if the inode and bitmap are written?

The data block is not updated -> following inode leads to garbage data (inconsistency)

In a scenario with 3 writes to data block, inode, and data bitmap with 2 writes succeeding, what happens when the inode and data block are written?

Allocation of data block not recorded -> data block may get re-allocated (data corruption)

In a scenario with 3 writes to data block, inode, and data bitmap with 2 writes succeeding, what happens when the bitmap and data block are written?

We have no idea which file the newly allocated data block belongs to, because no inode points to it (inconsistency)
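
The crash scenarios in the cards above can be enumerated mechanically. The sketch below is only an illustration: the names `WRITES`, `classify`, and `all_outcomes` are made up for this example, and the outcome labels paraphrase the card answers.

```python
from itertools import combinations

WRITES = ("data", "inode", "bitmap")

def classify(done):
    """Describe the on-disk state when only the writes in `done` reached disk."""
    done = frozenset(done)
    data, inode, bitmap = "data" in done, "inode" in done, "bitmap" in done
    if not inode and not bitmap:
        return "consistent (as if the write never happened)"
    if data and inode and bitmap:
        return "consistent (update complete)"
    if inode and bitmap:
        return "metadata agrees, but the file reads garbage"
    if inode:
        return "inconsistent: inode points to a block the bitmap calls free"
    return "inconsistent: block marked used but no inode points to it (leak)"

def all_outcomes():
    """Map every subset of the three writes to its outcome."""
    outcomes = {}
    for k in range(len(WRITES) + 1):
        for subset in combinations(WRITES, k):
            outcomes[subset] = classify(subset)
    return outcomes
```

Calling `all_outcomes()` lists all eight combinations, including the two trivially consistent ones (nothing written, everything written).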

The goal of the crash consistency problem is to move the file system from one consistent state to another ________

atomically

What are 2 challenges of the crash consistency problem?

1. Disk only commits one write at a time (but we need to do many) | 2. Crashes or power failures may happen between writes

One solution to the crash consistency problem is the file system _______

checker

fsck is a _____ used to find inconsistencies in a file system and ____ them

tool, fix

What structures are checked by fsck?

superblock, free blocks, inode state, inode pointers, etc

fsck cannot fix all problems because a file system may look ________ but the inode points to _______

consistent, garbage

The real goal of the fsck is to make sure the file system ________ is consistent

metadata

fsck performs _______ checks on the superblock, like making sure the file system size is greater than the number of allocated _____

sanity, blocks

How does fsck check free blocks?

Scan inodes, direct, and indirect blocks to learn which blocks are allocated, then cross check with the bitmaps
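
A minimal sketch of that cross-check, assuming a toy model in which each inode's direct and indirect pointers are already flattened into one list. The function name and data layout are hypothetical, not how any real fsck stores its state.

```python
def check_free_blocks(inodes, bitmap):
    """Cross-check block usage found via inodes against the allocation bitmap.

    inodes: dict mapping inode number -> list of block numbers it points to
    bitmap: set of block numbers the bitmap marks as allocated
    Returns (leaked, unmarked) as two sets of block numbers.
    """
    referenced = set()
    for blocks in inodes.values():
        referenced.update(blocks)
    leaked = bitmap - referenced      # marked used, but nothing points to them
    unmarked = referenced - bitmap    # pointed to, but the bitmap says free
    return leaked, unmarked
```

A repair step could then rebuild the bitmap from `referenced`; whether to trust the inodes over the bitmap is a policy choice that this sketch leaves out.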

If the fsck suspects the inode state is corrupted after a ______ check, it clears it as it is not an easy fix

sanity

What does fsck check for inode links and how does it do it?

fsck checks the link count. It does it by scanning the entire directory tree to build up new link counts
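
As a rough illustration, rebuilding link counts amounts to counting directory entries per inode and comparing against the stored counts. The names below are hypothetical, and the directory tree is modeled as a plain dict.

```python
from collections import Counter

def recompute_link_counts(directories):
    """directories: dict mapping directory inode -> list of (name, inode) entries."""
    counts = Counter()
    for entries in directories.values():
        for _name, ino in entries:
            counts[ino] += 1
    return counts

def find_link_mismatches(directories, stored_counts):
    """Return {inode: (stored, recomputed)} for every inode whose counts differ."""
    actual = recompute_link_counts(directories)
    return {
        ino: (stored, actual.get(ino, 0))
        for ino, stored in stored_counts.items()
        if stored != actual.get(ino, 0)
    }
```

An inode whose recomputed count is 0 is allocated but unreachable from the directory tree.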

fsck checks for duplicates by checking if 2 different _______ point to the same ____

inodes, block

fsck checks for bad blocks by scanning through the list of _______ and checking for bad pointers, i.e., pointers that refer to a block number outside the ______ size

pointers, partition
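
The duplicate check and the bad-pointer check can share one pass over all pointers. This is a sketch under the same toy model as before (inode number mapped to a flat list of block numbers); `check_pointers` and `total_blocks` are illustrative names.

```python
def check_pointers(inodes, total_blocks):
    """Find out-of-range pointers and blocks claimed by more than one inode.

    inodes: dict mapping inode number -> list of block numbers
    total_blocks: number of blocks in the partition (valid range is 0..total_blocks-1)
    Returns (bad, duplicates): a list of (inode, block) pairs and a dict
    mapping block -> sorted list of inodes that point to it.
    """
    owners = {}
    bad = []
    for ino, blocks in inodes.items():
        for block in blocks:
            if not 0 <= block < total_blocks:
                bad.append((ino, block))
            else:
                owners.setdefault(block, set()).add(ino)
    duplicates = {b: sorted(inos) for b, inos in owners.items() if len(inos) > 1}
    return bad, duplicates
```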

How does fsck check directories?

fsck does integrity checks. Make sure . and .. are first 2 entries and each inode referred to in the directory is allocated
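
A small sketch of those two directory checks, assuming a directory is given as an ordered list of (name, inode) pairs. The function name and the wording of the problem strings are made up for illustration.

```python
def check_directory(entries, allocated_inodes):
    """Return a list of problems found in one directory.

    entries: ordered list of (name, inode number) pairs
    allocated_inodes: set of inode numbers currently marked allocated
    """
    problems = []
    first_two = [name for name, _ino in entries[:2]]
    if first_two != [".", ".."]:
        problems.append("first two entries are not . and ..")
    for name, ino in entries:
        if ino not in allocated_inodes:
            problems.append(f"{name} refers to unallocated inode {ino}")
    return problems
```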

fsck does not know about the _______ of normal files

content

fsck needs detailed _______ of the file system

knowledge

fsck is an ________ tool and is typically invoked at ______

offline, boot

What is the main disadvantage of fsck?

It is too slow as it scans the entire disk

Another solution to crash consistency problem is __________

journaling

In journaling, we note down "what to do" before actually doing the ________

updates

Journaling is also known as _______-______ logging

write-ahead

In journaling, any permanent change must be recorded in the ____ first

log

In journaling, the log is also stored _______ on disk

permanently

In journaling, if there is a ______, we look at the log upon reboot and try again

crash

Journaling makes more work during _______, but reduces work during ______ ________

updates, crash recovery

The journal occupies space within the _______

partition

In data journaling, update content is wrapped in a _________

transaction

What is a transaction?

A group of updates that must either all happen (atomically) or not happen at all

Storing the exact contents of the updated blocks in the transaction is called _______ _______

physical logging

What are the 5 components in a transaction?

TxB (transaction begin), I[v] (updated inode), B[v] (updated bitmap), Db (data block), TxE (transaction end)

In data journaling, both the TxB and TxE contain the __________ ID

transaction

In journaling, a _______ is when, after the transaction is safely logged, the updates are written to their final locations on disk

checkpoint

In our first naive procedure for write(), what 2 steps do we have?

1. Journal write (transaction data to log) | 2. Checkpoint (write updates to final location)

Our first naive procedure for write() with 2 steps doesn't work because the blocks of the transaction may be written to the log ___ __ _____

out of order

What are 3 ways to write a transaction to log?

1. One request at a time (for a total of 5 writes) | 2. One big write containing all the writes | 3. Two writes: one for everything except TxE, and one for TxE

Writing the transaction one request at a time 5 times is ___

slow

Writing the transaction with one big request is _______ because the writes can finish in any order: items between TxB and TxE could be missing, yet the transaction still looks complete to the system after a crash.

unsafe

Writing the transaction with 2 requests depends on the _______ guarantee of the storage device. TxE has to be ________ than a fixed size (it must fit in a single write that the device performs atomically)

atomicity, smaller

In our second naive procedure for write(), what 3 steps do we have?

1. Journal Write (everything except TxE) | 2. Journal Commit (TxE) | 3. Checkpoint

How do we recover when a crash happens before the transaction is written?

Skip the pending updates

When a crash happens after a transaction is written but before the checkpoint, we scan all _______ transactions, then checkpoint them

committed

When a crash happens during checkpointing, we ____ the checkpointing process

redo
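
The three recovery cases above can be tried out on a toy model. This is a sketch only: `ToyJournal`, its record tuples, and the dict standing in for the disk are all assumptions made for illustration, not a real journal format.

```python
class ToyJournal:
    """Toy data journal: a list stands in for the log, a dict for the disk."""

    def __init__(self):
        self.disk = {}   # block number -> contents at the final location
        self.log = []    # journal records, in the order they reached disk

    def journal_write(self, tid, updates):
        """Step 1: log TxB and the block contents (everything except TxE)."""
        self.log.append(("TxB", tid))
        for block, value in updates.items():
            self.log.append(("DATA", tid, block, value))

    def journal_commit(self, tid):
        """Step 2: log TxE; the transaction is now committed."""
        self.log.append(("TxE", tid))

    def checkpoint(self, updates):
        """Step 3: write the updates to their final locations."""
        self.disk.update(updates)

    def recover(self):
        """Replay committed transactions; skip ones that have no TxE."""
        committed = {rec[1] for rec in self.log if rec[0] == "TxE"}
        for rec in self.log:
            if rec[0] == "DATA" and rec[1] in committed:
                _kind, _tid, block, value = rec
                self.disk[block] = value
```

Because replay just rewrites the logged contents, running `recover()` again after a crash during checkpointing gives the same result, which is why redoing is safe.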

If we modify two files in the same directory, the same block gets updated twice using two __________, leading to ________ writes

transactions, excessive

To prevent the same block getting updated twice, we can _______ journal writes into a ________ transaction

buffer, global

When batching log writes, we mark the relevant blocks as _____, add them to a list of blocks to write for the current ___________, then commit the global transaction after a _________

dirty, transaction, timeout
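
A minimal sketch of that batching, assuming an in-memory buffer keyed by block number. `GlobalTransaction` is a hypothetical name, and the timeout is left to the caller (something would call `commit()` periodically).

```python
class GlobalTransaction:
    """Buffers dirty blocks so repeated updates to one block are logged once."""

    def __init__(self):
        self.dirty = {}       # block number -> latest in-memory contents
        self.committed = []   # committed transactions, kept for illustration

    def mark_dirty(self, block, contents):
        """Record the newest contents; an earlier version of the block is replaced."""
        self.dirty[block] = contents

    def commit(self):
        """Called after the timeout: log all dirty blocks as one transaction."""
        if self.dirty:
            self.committed.append(dict(self.dirty))
            self.dirty.clear()
```

Creating two files in the same directory marks the directory block dirty twice, but only its final contents end up in the single committed transaction.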

What two problems arise when the log becomes full?

1. Long recovery time | 2. No further transactions can be committed to the disk

To prevent the log from becoming full, we treat the log as a ________ structure and reuse journal space once the transaction has been __________

circular, checkpointed

The journal superblock marks the _____ and ______ transactions in the log

oldest, newest
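
One way to picture the circular log is a fixed-size ring with a pointer to the oldest live transaction. The sketch below is an assumption-laden toy: `CircularLog`, `head`, and `count` are illustrative names, and each slot holds a whole transaction rather than individual blocks.

```python
class CircularLog:
    """Fixed-size ring of transactions; head/count play the journal superblock's role."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.head = 0    # index of the oldest transaction not yet freed
        self.count = 0   # number of live transactions in the log

    def append(self, txn):
        """Add the newest transaction, failing if the log is full."""
        if self.count == self.capacity:
            raise RuntimeError("log full: checkpoint and free a transaction first")
        tail = (self.head + self.count) % self.capacity
        self.slots[tail] = txn
        self.count += 1

    def free_oldest(self):
        """Reclaim the oldest slot once its transaction has been checkpointed."""
        if self.count == 0:
            raise RuntimeError("log is empty")
        txn = self.slots[self.head]
        self.slots[self.head] = None
        self.head = (self.head + 1) % self.capacity
        self.count -= 1
        return txn
```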

What four steps do we have in the final write procedure with journaling?

1. Journal Write (except TxE) | 2. Journal Commit | 3. Checkpoint | 4. Free (mark log space as free by updating journal superblock)

A problem of data journaling is that every data block is written to disk _____

twice

In metadata journaling, we only log changes to _______

metadata

Metadata journaling is also known as _______ ______

ordered journaling

What are the 2 alternatives to picking when to write data blocks to disk during metadata journaling?

1. Write data after transaction | 2. Write data to disk before transaction

In metadata journaling, if we write data to disk after the transaction, the inode may end up pointing to ________ _____ if the data write fails

garbage data

What are the 5 steps in the write protocol with metadata journaling?

1. Data Write (write data to final location) | 2. Journal metadata write (no TxE) | 3. Journal Commit | 4. Checkpoint metadata | 5. Free
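
To see why the data write comes first, one can check every possible crash point for each ordering. This sketch assumes a crash can only fall between whole steps; the step names and both function names are hypothetical.

```python
ORDERED = ["data_write", "journal_metadata_write", "journal_commit",
           "checkpoint_metadata", "free"]
DATA_LATE = ["journal_metadata_write", "journal_commit", "data_write",
             "checkpoint_metadata", "free"]

def crash_states(steps):
    """Yield (data_on_disk, metadata_committed) for a crash after each prefix."""
    for n in range(len(steps) + 1):
        done = steps[:n]
        yield ("data_write" in done, "journal_commit" in done)

def can_expose_garbage(steps):
    """True if some crash leaves committed metadata pointing at unwritten data."""
    return any(committed and not data for data, committed in crash_states(steps))
```

With the data written first, committed metadata never refers to a block whose data is missing; with the data written after the commit, it can.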

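Block reuse can cause data corruption in metadata journaling. Say block 1000 holds directory contents that are logged, the directory is then deleted, and block 1000 is reused for a new file's data (which is not logged). If a crash occurs while the old log entry is still in the journal, replay writes the stale directory contents back to block 1000, so the new file's data gets ______

overwritten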

One solution to the problem of block reuse is adding a ______ _______ to the journal

revoke record

The revoke record prevents data from being __________ during journal replay

replayed
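
A toy replay with revoke records might look like the sketch below. The record shapes `("write", block, contents)` and `("revoke", block)` are made up for this example, and treating a revoke as cancelling only the writes logged before it is an assumption of the sketch.

```python
def replay(log, disk):
    """Replay logged writes onto disk, skipping writes cancelled by a later revoke.

    log: list of ("write", block, contents) or ("revoke", block) records, in order
    disk: dict mapping block number -> contents; updated in place and returned
    """
    # First pass: find the position of the last revoke record for each block.
    last_revoke = {}
    for i, rec in enumerate(log):
        if rec[0] == "revoke":
            last_revoke[rec[1]] = i
    # Second pass: apply only the writes logged after the block's last revoke.
    for i, rec in enumerate(log):
        if rec[0] == "write" and i > last_revoke.get(rec[1], -1):
            disk[rec[1]] = rec[2]
    return disk
```

With a revoke logged when the directory using block 1000 is deleted, the stale directory contents are skipped and the new file's data in that block survives recovery.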

(67 cards)