File Systems Flashcards

1
Q

secondary storage

A

backs up main memory
usually a disk

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

file systems provide

A

a mechanism for storage of data and programs on the disk and accessing them
files are mapped by OS to physical address

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

different storage devices

A
  • some transfer a character or a block of character at a time
  • some can be accessed sequentially/randomly
  • some transfer data synchronously/asynchronously
  • some can be read-only/read-write
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

essential requirements for long-term information storage

A
  1. It must be possible to store a very large amount of information
  2. Information must survive termination of process using it
  3. Multiple processes must be able to access information concurrently
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

file system

A
  • A way to organise and (persistently) store information
  • An abstraction over storage devices:
    • Hard disk, SSD, network, RAM, …
  • Organised in files and (typically) directories
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

file system flow

A

user program syscall -> virtual fs -> page cache -> fs driver (FAT, NFTS…) -> buffer cache ->storage

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

files

A
  • Abstract storage nodes, i.e. logical storage unit
  • a file is a named collection of related information that is recorder on secondary storage
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

file types

A

regular: directories, soft links
special: device files, metadata…

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

file structure

A

OS perspective: files as streams of bytes
program’s perspective: archives, executables, etc.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

file naming

A

extensions
name length
case sensitivity
once a file is named it becomes independent

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

file types

A
  1. executable
  2. object
  3. source
  4. batch
  5. text
  6. word processors
  7. library
    8.print or view
  8. archive
  9. multimedia
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

executable file

A

exe, com, bin or none
ready-to-run machine language program

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

object

A

obj, o
compiled machine language, not linked

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

source code

A

c, cpp, java, py
j

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

batch

A

bat, sh
commands to the command interpreter

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

text

A

txt, doc

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

word processor

A

wp, doc, rtf

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

library

A

lib, a
libraries f routines for programmes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

print or view

A

ps, pdf, jpg
ASCII or binary in a format for printing or viewing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

archive

A

arc, zip, tar
related files grouped into one file sometimes compressed for archiving and storage

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

multmedia

A

mp3,m mp4, avi
binary file containing audio or A/V info

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

MAC OS X file identification

A

each file has a type specified with it to identify its type

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

UNIX file identification

A

magic number stored at the beginning of some files to indicate the type

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

file attributes

A

name: human readable
identifier
type
location
size
creation date
last modified

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Where are file attributes stored?

A

in file control block

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

file operations

A
  1. create & delete
  2. open & close
  3. read & write
  4. append
  5. seek
  6. get & set attributes
  7. rename
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

creating a file

A
  1. check if space is available and find spot for it
  2. make an entry in the directory
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

writing a file

A
  1. system call with the name of the file and the info to write
  2. OS finds the file
  3. write from the write pointer
  4. update pointer
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

reading a file

A
  1. sys call name of the file and which part to read
  2. OS locates the file
  3. read from the pointer
  4. update pointer
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

seek a file

A
  1. OS finds the file
  2. current-file-position pointer is repositioned to a given value
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

deleting a file

A
  1. OS finds the file
  2. release the space occupied by that file
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

truncating a file

A
  1. keep all attributes same except file length → set to zero
  2. release the file space
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

ENOENT

A

file doesn’t exist

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

EBADF

A

bad file descriptor

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

opening a file returns

A

a handel (file descriptor) for future operations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

unlink(“file”)

A

removing a file

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

rename(“file”, “new_name”)

A

renaming

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

chmod(“file”, 0755)

A

change file permission

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

chown(“file”, uid, gid)

A

change owner

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

sequential file access

A
  • information is processed in order
  • most common method in editors, compilers etc.
  • read: read next → advance pointer
  • write: write next → appends to the end of the file → move pointer to new end of file
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

direct file access (relative access)

A
  • the file is viewed as a numbered sequence of blocks or records
  • great for accessing large amount of information → databases
  • read: read n → reads from block n
  • write: write n → writes to block n
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

directories

A
  • Data structures organising and maintaining information about files
  • Often stored as file entries with special attributes
  • disk is split into: partitions/slices/minidisk → combined to form volumes
    • each volume can be treated as a virtual disk → allows running multiple OSs and having multiple file systems
    • partition: directory + files
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

device directory

A

keeps information about the files in the system

44
Q

.

A

current dir

45
Q

..

A

parent dir

46
Q

dir in Unix

A

/

47
Q

dir in Windows

A

\

48
Q

single level directory system +/-

A

+ simple implementation
+ file operations are easily done: creating, deleting, repositioning, renaming
+ searching is fast for a small number of smaller files

  • more files, bigger files
  • all files must have unique names
  • users can’t have same file names
  • poor organisation → difficult to keep track of all the files (user view)
49
Q

hierarchical directory systems

A
  • allows users to create their own subdirectories and organise their files more efficiently
  • current directory: contains most of the files that are of current interest to the process
    • if a file is needed that isn’t in it the whole path name needs to be specified or cd
  • absolute path: from root
  • relative path: from current directory
50
Q

deleting a non empty dir

A

first delete all contents

51
Q

rm -r in UNIX

A

removes the whole dir

52
Q

mounting a file system

A
  1. OS is given the mount point (location within the file structure where the fs is to be attached)
  2. OS verifies the device contains a valid fs
  3. OS notes there is another fs
53
Q

file sharing

A
  • a user-oriented OS must
  • multiple users
    • system must mediate the file-sharing
    • the system can either
      1. allow a user to access files of other users by default
      2. require that a user specifically grant access to the files
54
Q

owner + group

A

owner: can change attributes and grant access
group: can execute a subset of operations defined by the owner

55
Q

remote file systems

A
  • methods of remote file-transfer
  • manually transferring files between system via programs like FTP (File Transfer Protocol
  • DFS (Distributed File System)
    • data stored on a server
    • accessed as if it was on the local client machine
  • WWW
    • a browser is needed to gain access to the remote files
56
Q

the client-server model

A
  • machine containing the files: server
  • machine seeking access: client
  • client identification (e.g. IP address - unsafe, encryption keys - not scalable)
57
Q

file system layout

A

MBR + partition table + disk partitions

-> boot block + superblock + free space mgmt + i-nodes + root dir + files and dirs

58
Q

MBR

A

master boot

59
Q

partition types

A

primary
extended
sub-partitions

60
Q

contiguous allocation

A
  • Store files as a contiguous stream of bytes → each file occupies a set of contiguous blocks
  • Requires max file size, prone to fragmentation
61
Q

contiguous allocation +/-

A

+ easy file access
+ for sequential access: the fs remembers the disk address of the last block referenced
+ for direct access: access block b + i (block start + index)

  • finding space for a new file is difficult
  • (external) fragmentation
62
Q

block-based allocation strategies

A

linked list
FAT
indexed disk space allocation
i-nodes
i-nodes with indirect blocks

63
Q

linked-list alloaction

A

each file is a linked list of disk blocks
the dir contains a pointer to the first and last blocks of the files

+ no fragmentation
+ the size of a file doesn’t need to be declared when the file is created
+ creating, reading and writing to files can be easily done

  • efficient only for sequential access files
  • requires a lot of space - pointers
64
Q

File allocation table

A

linked list allocation using a file allocation table in main memory
the table has an entry for each disk block and is indexed by block number
the dir entry contain the block number of the first block of the file
the table entry indexed by that number contains the block number of the next block in the file

65
Q

allocating a new block to a file with FAT

A
  1. find the first 0-valued table entry
  2. replace the previous end-of-file value with the address of the new block
  3. replace 0 with the end-of-file value
66
Q

indexed disk space alloacation

A
  • the index block: an array of disk-block addresses (i-th entry points to the i-th block of the file)
  • dir entry: points to the index block
  • support direct access without external fragmentation
  • it suffers from wasted space
    • one whole block for each file
    • some larger files require bigger index blocks
      • linked scheme: multiple index blocks
      • multilevel index
      • combined scheme
67
Q

UNIX i-nodes

A
  • when a fs is created so is a specific amount if inodes
  • inode structure
  • file info:
    • mode: file type and permissions
    • owners: access details
    • timestamps: access, modification…
    • size block
    • count: num of blocks
  • block info:
    • direct blocks: direct address pointing to data blocks
    • single indirect: points to an index block
    • double indirect: points to one index block whick points to other index blocks → data blocks
    • triple indirect
68
Q

inode

A

a data structure in UNIX OS that contains important information about the files

69
Q

on-disk structures

A

boot control block
volume control block
directory structure per file system
per file FCV (inode)

70
Q

boot control block

A

one for each volume
empty -> no OS in volume

71
Q

volume control block

A
  1. number of blocks in the partition
  2. size of the blocks
  3. free blick count
  4. free-blick pointers
  5. free file control block count and fcb pointers
72
Q

directory structure per file system

A

file names and the associated inode numbers

73
Q

per file FCB (inode)

A

file attributes

74
Q

in-memory structures

A

the data is loaded at mount time and discarded at dismount
includes:
1. in memory mount table
2. the system wide open-file table
3. in-memory dir-structure cache
4. per-process open-file table

75
Q

in memory mount table

A

info about each mounted volume

76
Q

the system wide open-file table

A

contains a copy of the FCB of each open file

77
Q

in-memory dir-structure cache

A

holds the dir info that were recently accessed

78
Q

per-process open-file table

A

contains a pointer to the appropriate entry in the system-wide open-file table

79
Q

object-oriented techniques

A
  • allows very different file system types to be implemented
  • layers:
    1. the file-system interface
    2. VFS interface
      • consists of several dozen func- tion calls that the VFS can make to each file system to get work done
    3. the layer implementing the file system type
80
Q

directory implementation

A

linear list: simple but slow
hash table: faster but not scalable

81
Q

opening files

A

Before a file can be read, it must be opened. The operating system uses the path name to locate the directory entry which provides information for locating disk blocks

82
Q

File Attributes Storage

A
  • Attributes can be stored directly in directory entries or in i-nodes.
    • Storing in directory entries can be space-consuming.
    • i-nodes can make entries shorter with benefits in organisation.
83
Q

file name length

A
  • Modern systems support long, variable-length names.
    • Simpler designs set a fixed character limit (e.g., 255), but this is inefficient.
    • Variable-size directories accommodate length differences but can lead to unused space gaps.
84
Q

directory Entry Structures

A
  • Fixed-size entries with a heap for file names ensure new entries fit but require additional management.
  • Directories are typically searched linearly, which can be slow for long directories.
85
Q

improving Search Efficiency

A

Hash Tables: Allow faster searches by hashing file names to specific entries
Caching: Results of previous searches can be stored and quickly accessed, advantageous when a few files have frequent lookups

Trade-offs: Hash tables provide quicker lookups but are complex to administer. Caching speeds access at the cost of increased memory usage

86
Q

FAT12/FAT16 layout

A

boot sector + reserver + FAT #1 + FAT #2 + root dir + clusters

  • Reserved later used for FS information
  • FAT #1/#2: preallocated File Allocation Tables
  • Preallocated root directory
  • Clusters represent addressable blocks:
    • FAT contains chains indexed by starting cluster
87
Q

FAT entry

A

32 bytes
file name 8
extension 3
attributes 1
reserved 10
time 2
date 2
first block number 2
size 4

88
Q

UNIX dir entry

A

16 bytes
i node 2
file name 14

  • Directory entry stores only name and #i-node
  • Attributes are stored in the i-node
  • One i-node per file, first i-node is the root
89
Q

hard links

A

reference a shared file (i-node)
- The file is removed only with no hard links left
- Space-efficient (only 1 directory entry per hard link)
- Good to manage shared files across owners

90
Q

soft (or symbolic) links

A

reference a file name
- Removing the file renders its soft links invalid
- Less space-efficient (need 1 i-node per soft link)
- More flexible (can reference file names across FSes)

91
Q

block size

A
  • Important speed-space tradeoff to consider
  • Larger block sizes allow transferring more data per block
    • Better overall data rates
  • Smaller block sizes reduce space overhead for small files
    • Less fragmentation
92
Q

free block management

A
  • free-space list
  • creating a file
    1. search list
    2. allocate the new list
    3. space is removed from the list
  • deleting a file
    1. the disk space is added to the free list
  • bitmap
    • each block is represented by 1 bit
      • 1 - free, 0 - allocated
    • easy finding the first free block
    • unless kept in main memory it is inefficient
    • size is fixed even if disk is full
  • linked list
    • all the free blocks are linked together
    • less metadata the fuller the disk gets → more efficient
  • grouping
    • store the addresses of n free blocks in the first free blocks
    • the first n-1 of these blocks are actually free, the last block contains the addresses can now be found quickly
93
Q

minimising disk access with caching

A
  • Buffer cache: Cache disk blocks in RAM
  • Page cache: Cache VFS pages (before going to driver)
    • Often same data as buffer cache, so OS may merge them
94
Q

cache management

A
  • Caches have LRU semantics adapted to:
    • Blocks with poor temporal locality
  • Critical blocks for file system consistency
  • Write-through caching vs. periodic syncing
  • Disk read-ahead: read blocks that may be used soon
95
Q

minimising disk access with log-structure fs

A
  • idea: Optimise for frequent small writes
  • Collect pending writes in a log segment with i-nodes, entries, blocks
  • Segments are often flushed to disk and can be large (e.g., 1MB)
  • i-node map to find i-nodes in the log
  • Garbage collection to reclaim old log entries
96
Q

Minimise Seek Time: HDD Layout Management

A
  • Not an issue for modern SSDs
    • Seek time is constant
  • Important for traditional HDDs
  • Different strategies to minimise HDD seek time:
    • Try to allocate files contiguously
    • Defragment disk
    • Store small file data “inline” in i-node
    • Spread i-nodes over the disk rather than at the start
97
Q

disk failure

A

bad blocks
whole-disk errors

98
Q

power failure

A

(meta)data inconsistency written to disk

99
Q

software bugs

A

bad (meta)data written to disk

100
Q

user errors

A

rm *.o
lost or stolen computer

101
Q

remote file system failure modes

A

network failures
failures in the connection
networking implementation issues
recovery
1. maintain a state at both client and server side
2. stateless DFS

102
Q

fs backups

A
  • Backups to disk are generally made to handle one of two potential problems:
    1. Recover from disaster
    2. Recover from stupidity
103
Q

backup properties

A

Incremental(changes from last backup) vs. full(all data)
Online vs. offline
Physical(disk blocks) vs. logical(files)
Compressed vs. uncompressed
Local vs. remote

104
Q

fs consistency check

A
  • Programs like fsck and chkdsk try to find (and fix) consistency errors in file
    system metadata:
    • Corrupt values
    • Used blocks also marked as free
    • Blocks not marked as free nor used
    • Blocks being used multiple times
105
Q

journaling file systems

A
  • Idea: Use “logs” for crash recovery
  • First write transactional operations in log
    • E.g., removing a file:
      • Remove file from its directory
      • Release i-node to the pool of free i-nodes
      • Return all disk blocks to pool of free blocks
  • After crash, replay operations from log
  • Single operations need to be idempotent
  • Should support multiple, arbitrary crashes
  • Journaling widely used in modern FSes (e.g., Ext4, NTFS)
106
Q

creating a file system

A
  • format partition
  • create basic, FS specific, layout
  • unix: mkfs