File System Implementation (14) Flashcards

Question 1

Q

FS Structure

Answer

A

File structure
- Logical storage unit
- Collection of information that are related
File systems resides on disks
- Provided user interface to storage, that maps logical to physical addr
- Provides efficient and cheap access to disk: allows data to be stored, located and retrieved
Disk provides in-place rewrite and random-access
- I/O transfers perfomed in blocks of sectors (usually 512bytes)
FCB (File control block): structure where all file info are contained
Device driver: controls the physical dev
The file system is layered.

Question 2

Q

Layers of the File system

Answer

A

Device drivers manage I/O devices at the I/O control layer

Given commandsl ike“read drive1,cylinder72,track2,sector 10, into memory location 1060” outputs low-level hardware specific commands to hardware controller

Basic file system given command like “retrieve block 123” translates to device driver

Also manages memory buffers and caches (allocation, freeing, replacement)
- Buffers hold data in transit
- Caches hold frequently used data

File organization module understands files, logical address, and physical blocks

Translates logical block # to physical block #
Manages free space and disk allocation

Logical File system manages metadata info

converts file names into numbers, file handle, location by mantaining file control blocks
directory management
protection

Layering is useful for reducing complexity and redundancy, but adds overhead and can decrease performance.

Operating Systems support more than one file system at the time.

Question 3

Q

File Control Block info

Answer

A

FCB that describes a file contains many details:
- I node number, permission, size, dates
- NTFS stores the FCB into the master file table in a similar fashion to relational db

Question 4

Q

File System Operations

Answer

A

Boot control block

contains info needed by system to boot OS from that volume
Needed if volume contains OS, usually first block of volume

Volume control block

contains volume details
Total # blocks, # of free blocks, block size, free block pointers or array

Direcotry structures:

Organizes the files: names, inode numbers, master file table

Question 5

Q

In-memory file system structures

Answer

A

Mount table storing file system mounts, mount points, file system types
system-wide open-file table contains
- a copy of the FCB of each file
- synchronization primitives
- ref. count
- block # and offset in the block
per-process open-file table contains pointers to appropriate entries in system-wide open-file table as well
- access method
- pointer to file
- take a look at pic
- More processes may open the same file, on system-wide table there is only one entry per file.
Open return a file handle
Buffers hold data blocks that come from disk
Data read is eventually copied into a specif user process memory

Question 6

Q

Directory Implementations

Answer

A

Linear List of file names

Simple to code
Time-consumeing to execute O(n)
- use B+-Trees?

Hash Table

O(1) search time
Collissions must be dealt with
Good if entries are fixed size
- or we should use chained overflow method

Question 7

Q

Allocations Methods: Contiguos

Answer

A

Contiguous allocation – each file occupies set of contiguous blocks

Usually best performance
Simple
- only starting location (block#) and length(number of blocks) are required
Problems:
- finding space for file
- knowing file size
- external fragmentation
- need for compaction off-line (downtime) or on-line
  - Compaction is the process of grouping the free space in a specifi part of the disk

Question 8

Q

Allocation methods: Extent-Based Systems

Answer

A

Many newer file systems (i.e., Veritas File System) use a modified contiguous allocation scheme.

An extent is a contiguous block of disks

Extents are allocated for file allocation
A file consists of one or more extents

Contiguos chunk is allocated initially, if the amount is not large enough, another chunk of contiguos space is added.

The location of a file’s block is recorded, as well as block count and a link to the first block of the next extent.

Question 9

Q

Allocation Methods: Linked

Pros and Cons

Answer

A

Linked allocation: each file a linked list of blocks

File ends at nil pointer
Good:
- No external fragmentation: the are no small unused small memory blocks
- No compaction: no need to move objects around
- Free space management system called when new block is needed
- Improve efficiency by clustering blocks into groups but increasse internal frag.
Bad
- Locating a file may take many I/Os and disk seeks
- Reliability can be a problem

Question 10

Q

FAT (File Allocation Table)

Answer

A

Linked allocation with some features;

beginning volume has table indexed by block number
similar to linked list, but faster on disk and cacheable
new block allocation simple
provided that the FAT is reliable, if we loose one block, we have the rest of the file
- this is a limite of non-FAT linked allocation methods

Question 11

Q

Linked Allocation

Answer

A

Each file is a linked list of blocks
Mapping
- If we consider that a pointer has a size of one word
  - for each block we have 511 words for data and 1 for the pointer
  - we need to divide the size by 511 to get the required number of blocks
- LA/511
  - Q is the logical address
  - Displacement into block is R + 1
  - Physical and logical cannot be mapped directly there needs to be an addition

Question 12

Q

Allocation Methods: Indexed

Answer

A

Each file has its own index block(s) containing pointers to data blocks
Index table is needed
random access
dynamic access without external fragmentation, but have overhead of index block
Mapping from logical to physical in a file of maximum size of 256K bytes and block size of 512bytes. We need only 1 block for index table
- LA/512
  - Q = displacement into index tbale
  - R = displacement into block

Question 13

Q

Allocation Methods: Indexed

Unbounded Length File

Answer

A

Linked Scheme:

mapping from logical to physical in a file of unbounded length (block size of 512 words)

LA/(512x511) =
- Q1: displacement of index table
- R1/(512) =
  - Q2: displacement into block index table
  - R2: displacement into block of file

Two levels:

4K blocks can store 1024 four-byte pointers in outer index, that is 1M data blocks and up to 4GB of files.

As before but we don’t have the pointer displacement in the block:

LA/(512x512) =
- Q1: displacement into outer-index table
- R1/(512) =
  - Q2: displacement into block index table
  - R2: displacement into block of file

Question 14

Q

Allocation: Combined Scheme: UNIX UFS

Answer

A

4K bytes per block, 32-bit addresses

More index blocks than can be address with 32-bit file pointer.

The inode is a sort of multi-level index table, that allows small files to be stored in the first few blocks, but if the file is larger, the single/double/triple index are used.

Question 15

Q

Performance of Access Methods

Answer

A

Best method depends on what we are going to do
- contiguos is great for sequential and random
- Linked good for sequential, not random
Declare access type at creation
Index more complex
- Single block access could require 2 index block reads then data block read
- clustering can help improve throughput, reduce cpu overhead
For nvm, no disk head so different algorithms and optimizations needed
- old algo uses many cpu cycles trying to avoid non-existend head movement
- with nvm goal is to reduce cpu cycles and overall path needed for I/O’
Doing other tasks during the I/O operations must be done, otherwise the CPU will be useless for a lot of time

Question 16

Q

Free Space Management:

Answer

Study These Flashcards

A

Free space list
- list of free blocks linked
- cannot gat contiguos space easily
- no space waste, because we can insert the information about wheter the next block is free inside the metdata of the block.
- no need to traverse to find a free node
Bit vector or bit map
- bit map requires extra space
- easy to get contiguos file
- Example
  - Block size : 4KB = 2^12
  - Disk size: 1TB = 2^40
  - n = 2^40/2^12 = 2^28 bits (32MB)
  - We could group blocks of 4, b[i] (are blocks i * 4, i *4 + 1, i * 4 + 2, i * 4 + 3 free ?), and the dimension would be 8MB

Grouping

Linked list stores address of next n-1 free blocks in first free block, plus a pointer to next block that contains another group of free-block pointers.

and counting:

because space is frequently used and freed contiguosly
- keep address of first free block and count of following free blocks
- free space list then has entries containing addresses and counts

Space maps:

used in zfs

Question 17

Q

TRIMing Unused Blocks

Answer

Study These Flashcards

A

SSD need to be erased before being written
HDDs can write in place

We need different strategies for different types of storage.

TRIMing is used in NVM:

Blocks keep data after they are freed (untile overwritten), only the pointer will be destroyed.
Since NVM must be erased before written, that erases are made in large chunks (blocks, composed of pages) and that the latter is a slow process
TRIM is a way that the file system has to inform the device that the page is free
- Garbage collection can be done, or if free block can be erased.

Question 18

Q

Efficiency and performance

Answer

Study These Flashcards

A

Efficiency dependent:

disk allocation and directory algos
types of data kept in file’s directory entry
pre-allocation or as-needed allocation of metadata structures
fixed-size or varying size data structures

Performance:

keep data and metadata close
buffer cache - section of main memory used for frequently used blocks
synchronous writes: requested by some apps
- no bufferin/ caching :
free-behind and read-ahead
- optimize sequential access
- read-ahead
  - if sequential read and I read b 10
  - read also block 11
- free-behind
  - if I read 10
  - I won’t read 10 again, I can free it
Reads frequently slower than writes

Question 19

Q

Page Cache

Answer

Study These Flashcards

A

Different from buffer cache.

Cache pages rather thatn disk blocks using virtual memory techniques and addresses.

Memory-mapped I/O uses a page cache.

Routine I/O through the file system uses the buffer cach.

Question 20

Q

Unified Buffer Cache

Answer

Study These Flashcards

A

A unified buffer cache uses the same page cache to cache both memory-mapped pages and ordinary file system I/O to avoid double caching.

Question 21

Q

Recovery

Answer

Study These Flashcards

A

Done through consistency checking:

Comparing redunant data: directory structure vs data blocks on disk
- tries to fix incosistencies
- Can be slow and fail sometimes
Use file system programs to back up data from disk to another dev
Recover lost file by restoring file from backup

Question 22

Q

Log structured file system

Answer

Study These Flashcards

A

Log structured file systems record each metadata update to the file system as a transaction in the log.
Let’s put the intention of doing an operation, if the file system fails in the middle of the operation, we can recover it.
Faster recovery from crash, remove chances of inconsistencies.

File System Implementation (14) Flashcards

(22 cards)