Storage & File systems Flashcards

1
Q

Which is better to use: a file descriptor (FD) or a filename?

A

FD-based versions of calls are more secure in some sense
• Because the association of an FD to its underlying file is immutable
– Once an FD exists, it always points to the same file (until it is closed)

• Whereas the association between a file & its name is mutable

• Using names might lead to TOCTTOU (time of check to time of use)
races: the name can be re-pointed at a different file between the check and the use
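
A minimal sketch of the point above, assuming a Linux-like system; the file names are hypothetical. After a rename, the open FD still reaches the same inode, while the old name no longer resolves.

```python
import os
import tempfile


def fd_survives_rename():
    """Return (fd_still_same_inode, old_name_still_resolves) after a rename."""
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "report.txt")    # hypothetical name
        new = os.path.join(d, "renamed.txt")   # hypothetical name
        with open(old, "w") as f:
            f.write("data")

        fd = os.open(old, os.O_RDONLY)
        try:
            ino_before = os.fstat(fd).st_ino
            os.rename(old, new)                # the name -> file association changes
            fd_same = os.fstat(fd).st_ino == ino_before
            name_resolves = os.path.exists(old)
        finally:
            os.close(fd)
        return fd_same, name_resolves


print(fd_survives_rename())
```

Checking by name and then using by name leaves a window in which the name can change meaning; checking and using through one FD does not.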

2
Q

What are the 6 types of POSIX files?

A
– Regular file
– Directory
– Symbolic link (= shortcut), a.k.a. soft link
– FIFO (named pipe)
– Socket
– Device file (character or block)
3
Q

What is an access control list (ACL)?

A

Most OSes/filesystems support some form of ACLs
– Many groups/users can be associated with a file
– Each group/user can be associated with the 3 attributes (r/w/x)
– Or more, finer attributes (“can delete”, “can rename”, etc.)

CON: not part of POSIX

4
Q

Is there a difference between a filename and filepath in UNIX?

A

No, filename = filepath = path

5
Q

Are a file and its filename the same thing in UNIX?

A

No.
– In fact, the name is not even part of the file’s metadata
– A file can have many names, which appear in unrelated places in the
filesystem hierarchy
– Creating another name => creating another “hard link”
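
A small sketch of this, assuming a filesystem that supports hard links; the paths are hypothetical. Two names in different directories share one inode, and the file survives removal of its original name.

```python
import os
import tempfile


def hard_link_demo():
    """Return (same_inode, link_count_with_two_names, content_after_unlink)."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")           # hypothetical first name
        sub = os.path.join(d, "sub")
        os.mkdir(sub)
        b = os.path.join(sub, "b.txt")         # second name, in another directory
        with open(a, "w") as f:
            f.write("hello")

        os.link(a, b)                          # create another name (hard link)
        sa, sb = os.stat(a), os.stat(b)
        same_inode = sa.st_ino == sb.st_ino
        links = sa.st_nlink                    # the inode's reference count

        os.unlink(a)                           # remove the first name only
        with open(b) as f:
            content = f.read()
        return same_inode, links, content


print(hard_link_demo())
```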

6
Q

Can we have hard links to directories?

A

– Hard links to directories are usually disallowed & unsupported by the
filesystem (though POSIX does allow directory hard links)
– => The hierarchy stays an acyclic graph (no cycles)

Still, all filesystems that adhere to POSIX provide at least
some support for directory hard links
– Due to the special directory names “.” and “..”
– What’s the minimum number of hard links for a directory?
• 2 (its name in the parent directory, plus its own “.”)
– What’s the maximum?
• Depends on how many subdirectories nest in it (each adds one via its “..”)

7
Q

What’s the difference between a hard link and a soft link?

A

Hard links
– Point to the actual underlying file object

Symlinks (“shortcuts” in Windows terms)
– Point to the name of a “target” file (their content is typically this name)
– Are not counted in the file’s ref count
– Can be “broken” / “dangling” (point to a nonexistent path)
– Can refer to a directory (unlike hard links in most cases)
– Can refer to files outside of the filesystem / mount point
(whereas hard links must point to files within the same filesystem)
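
A hedged sketch of three of these properties, assuming a system where unprivileged symlink creation works; the names are hypothetical. The symlink stores the target's name, does not raise the target's link count, and dangles once the target is removed.

```python
import os
import tempfile


def symlink_demo():
    """Return (stores_target_name, target_link_count, dangling_after_removal)."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.txt")  # hypothetical target
        link = os.path.join(d, "link")          # hypothetical symlink name
        with open(target, "w") as f:
            f.write("x")

        os.symlink(target, link)
        stores_name = os.readlink(link) == target   # content is the target path
        nlink = os.stat(target).st_nlink             # symlink is not counted

        os.unlink(target)                            # remove the target's only name
        # islink() inspects the link itself; exists() follows it to the target.
        dangling = os.path.islink(link) and not os.path.exists(link)
        return stores_name, nlink, dangling


print(symlink_demo())
```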

8
Q

What does the unlink function remove?

A

When called on a symlink, unlink removes the symlink itself, not the target file

9
Q

What’s the inode?

A

The OS data structure that represents the file
Internally, file names “point” to inodes
• This inode is determined via the path-resolution algorithm

The inode contains all the metadata of the file

10
Q

What’s a directory file?

A

A simple flat file comprised of directory entries (dirents).

11
Q

Where is the name of the file stored?

A

In the directory file, not the inode.

12
Q

Do symlinks have their own inode?

A

POSIX doesn’t specify whether a symlink should have an
inode.

But filesystems often define an inode per symlink. The target’s name is then stored inside that inode if it is short enough, or, if the name is too long, in a data block that the inode points to. Either way the symlink records the target’s name, not a pointer to the target’s “real” inode.

13
Q

What is the lower bound on the time complexity of the path-resolution process?

A

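n (the number of components in the path), because each component must be resolved in turn.

Moreover, finding each individual directory component along the path may itself be a linear process (a scan over that directory’s entries).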
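
A toy sketch of path resolution, under stated assumptions: the in-memory table `DIRS`, the root inode number `ROOT_INO`, and all inode numbers are hypothetical and stand in for on-disk directory files. The outer loop visits each path component once, and the inner loop is the linear scan of a directory's entries.

```python
ROOT_INO = 2  # hypothetical inode number of "/"

# Hypothetical directory files: inode number -> list of (name, inode number).
DIRS = {
    2: [(".", 2), ("..", 2), ("home", 5), ("etc", 7)],
    5: [(".", 5), ("..", 2), ("alice", 9)],
    7: [(".", 7), ("..", 2)],
    9: [(".", 9), ("..", 5), ("notes.txt", 12)],
}


def resolve(path):
    """Return the inode number that an absolute path names, or None."""
    ino = ROOT_INO
    for component in path.split("/"):
        if not component:              # skip empty pieces from "/" and "//"
            continue
        entries = DIRS.get(ino)
        if entries is None:            # current inode is not a directory
            return None
        for name, child in entries:    # linear scan of the dirents
            if name == component:
                ino = child
                break
        else:
            return None                # no such entry in this directory
    return ino


print(resolve("/home/alice/notes.txt"))
print(resolve("/home/bob"))
```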

14
Q

What’s the block/sector size in VSFS?

A

4 KB.

15
Q

What is the layout of VSFS?

A
Block 0 = superblock
Block 1 = inode bitmap
Block 2 = data-block bitmap
Blocks 3-7 = inode table
Blocks 8-63 = data blocks
16
Q

What is the superblock?

A

Contains information about the particular filesystem

Location of the superblock (of any FS) must be well-known

17
Q

What’s a disk partition?

A

Partition = contiguous disjoint part of the disk that can host a filesystem

18
Q

Given inumber (=index of inode in table), how can we find inode block?

A

block = (inodeStartAddr + inumber x sizeof(inode_t)) / blockSize, where inodeStartAddr is the byte address at which the inode table starts
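
A worked sketch of this computation. The 4 KB block size and the inode table starting at block 3 follow the VSFS cards above; the 256-byte inode size is an assumption made only for illustration.

```python
BLOCK_SIZE = 4096                    # 4 KB blocks
INODE_SIZE = 256                     # assumption: sizeof(inode_t) = 256 bytes
INODE_START_ADDR = 3 * BLOCK_SIZE    # inode table begins at block 3


def inode_location(inumber):
    """Return (block number, byte offset inside that block) of an inode."""
    byte_addr = INODE_START_ADDR + inumber * INODE_SIZE
    return byte_addr // BLOCK_SIZE, byte_addr % BLOCK_SIZE


print(inode_location(0))
print(inode_location(17))
print(inode_location(32))
```

With these numbers each block holds 16 inodes, so inode 17 is the second inode of the second inode-table block.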

19
Q

How is the multi-level index (in the classic Unix FS) implemented?

A
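  • 12 pointers in the inode point directly to data blocks
  • Single-indirect pointer points to a block completely comprised of pointers to data blocks
  • Double-indirect pointer points to a block of pointers to single-indirect blocks
  • Triple-indirect pointer points to a block of pointers to double-indirect blocks

Sum = 15 pointers, giving a total coverage of about 4 TB (assuming 4 KB blocks and 4 B pointers).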
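
The coverage figure can be checked with a short calculation. This sketch assumes 4 KB blocks and 4 B pointers, so each indirect block holds 1024 pointers; the function and parameter names are made up for illustration.

```python
def max_file_size(block_size=4096, ptr_size=4, direct=12):
    """Bytes addressable via direct, single-, double- and triple-indirect pointers."""
    ptrs = block_size // ptr_size                     # pointers per indirect block
    blocks = direct + ptrs + ptrs ** 2 + ptrs ** 3    # data blocks reachable
    return blocks * block_size


size = max_file_size()
print(size, "bytes, i.e. about", size // 2 ** 40, "TB")
```

The triple-indirect level dominates: it alone covers 1024^3 blocks of 4 KB, which is 4 TB.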

20
Q

Are the indirect pointer blocks included in the file size?

A

No.

21
Q

What’s an extent?

A
  • Extent = contiguous area comprised of
    variable number of blocks
    – inode saves a list of (pointer, size) pairs
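
A minimal sketch of how a logical block of a file could be located through such a list. The representation as (start block, length) pairs in file order and the block numbers are assumptions for illustration.

```python
def extent_lookup(extents, logical_block):
    """Map a file's logical block index to a disk block number, or None.

    extents: list of (start_block, length_in_blocks) pairs in file order.
    """
    for start, length in extents:
        if logical_block < length:
            return start + logical_block   # falls inside this extent
        logical_block -= length            # skip past this extent
    return None                            # beyond the end of the file


extents = [(100, 4), (500, 2), (40, 10)]   # hypothetical file of 16 blocks
print(extent_lookup(extents, 5))
```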
22
Q

Describe FAT layout

A

– One table for all files
– One table entry for every disk block, holding the number of the next block of the same file
– An entry of -1 (0xFFFF) marks “the end” of a file’s chain of blocks
– Directory file: each entry points to the first block of its file
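
A sketch of walking one file's chain in the table, assuming a well-formed table with no cycles. Using 0 for unused entries and the block numbers shown are assumptions for illustration.

```python
END = 0xFFFF   # "the end" marker from the card above


def file_blocks(fat, start):
    """Return the disk blocks of a file, in order, given its first block."""
    blocks = []
    block = start
    while block != END:
        blocks.append(block)
        block = fat[block]     # each entry names the file's next block
    return blocks


fat = [0] * 8                  # hypothetical 8-block disk, 0 = unused entry
fat[2], fat[5], fat[3] = 5, 3, END
print(file_blocks(fat, 2))
```

Reaching the i-th block of a file means following i entries, which is cheap only when the table is held in memory.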

23
Q

Is the FAT table copied to memory?

A

FAT (table) is copied to (cached in) memory
– But of course must also be saved on disk
– Solves the random-access problem of file pointers (the chain can be followed in memory, without disk reads)

24
Q

Describe the pros and cons of FAT.

A

Pros
– Simple to implement:
• File append, free block allocation management, easy allocation
– No external fragmentation
– Fast random access since table is in memory (for block pointers, not
necessarily for the blocks themselves)

Cons
– File contiguity can be lost (this is why extents were invented)
– Table can be huge
• 32 GB (disk) / 4 KB (block) = 2^5 x 2^30 / 2^12 = 2^23 = 8 M entries
• Assuming a 4 B pointer per entry, this means a 32 MB table

25
Q

What’s RAID 0, pros and cons?

A

Non-redundant disk array
– Files striped evenly across N ≥ 2 disks

Pros
– High read/write throughput
– Faster aggregate seek time, because the disks can seek concurrently

Con
– Any disk failure results in data loss
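
One common way to stripe is round-robin by block. The sketch below assumes a stripe unit of one block and a hypothetical helper name; real arrays may use larger stripe units.

```python
def raid0_locate(logical_block, n_disks):
    """Return (disk index, block index on that disk) for a logical block."""
    return logical_block % n_disks, logical_block // n_disks


# Four consecutive logical blocks land on four different disks,
# so they can be read or written concurrently.
print([raid0_locate(b, 4) for b in range(6)])
```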

26
Q

What’s RAID 1, pros and cons?

A

• Mirrored disks
– Files are striped across half the disks
– Data written to 2 places: data disk & mirror disk

Pros
– On failure, can immediately use surviving disk
– Read performance: similar to RAID 0 (can read concurrently as well)
– Write performance: every write must go to two disks (data & mirror), so writes cost twice the work

Cons
– Wastes half the capacity

27
Q

What’s RAID 4, pros and cons?

A

• Use parity disks
– Each block (= multiple of sector) on the parity disk is a parity function
(=xor) of the corresponding blocks on all the N-1 other disks

• Failure => “degraded read”
– Read remaining disks plus parity
disk to compute missing data

• Pros
– In terms of capacity, less wasteful than RAID-1 (wastes only 1/N)
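– Read performance similar to RAID-0
– A write must also update the parity block: a full-stripe write can compute the new parity directly, whereas a small write must first read the old data and old parity (new parity = old parity xor old data xor new data)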
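
A small sketch of the xor parity idea, with made-up two-byte blocks standing in for disk blocks. It shows computing parity, a degraded read that rebuilds a lost block, and the parity update for a small write.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise xor of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical contents of one stripe on three data disks.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_blocks(data)

# Degraded read: disk 1 failed, rebuild its block from the survivors plus parity.
recovered = xor_blocks([data[0], data[2], parity])

# Small write to disk 0: new parity = old parity xor old data xor new data.
new_block = b"\x0f\x0f"
new_parity = xor_blocks([parity, data[0], new_block])

print(parity.hex(), recovered == data[1])
```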

28
Q

What’s RAID 5, pros and cons?

A

• Similar to RAID-4, but uses block interleaved distributed parity
– Distribute parity info across all disks

• Pros
– Like RAID-4, but better because
it eliminates the hot spot disk
– E.g., when performing two small
writes in RAID-4
• They must be serialized
– Not necessarily so in RAID-5
• => better performance
29
Q

What’s RAID 6, pros and cons?

A

• Extends RAID-5 by adding an additional “parity” block
– Ap and Aq must be independent of each other, algebraically speaking

• Pros & cons relative to RAID-5
– Can withstand 2 disk failures (2 equations, can find 2 variables)
– But wastes 2/N rather than 1/N of the capacity

30
Q

What are RAID 2 and RAID 3, pros and cons?

A

Like RAID-4, but at bit and byte granularity, respectively

31
Q

What does RAID n+k mean?

A
  • n blocks of regular data
  • k blocks that provide redundancy

32
Q

Which caches does Linux maintain in the context of file systems?

A
  1. The page cache, for the contents of the data region.
  2. The inode cache, for the contents of the inode table.
  3. Additional caches for the bitmaps or other data structures of the filesystem.
33
Q

What is the structure of directories in FAT?

A

Directories are regular files that contain a list of records, each with the fields:
filename, metadata, starting block
Each record holds the file’s metadata, so the inode is effectively embedded in the record.

34
Q

Does FAT support hard links?

A

No, because the file’s metadata is stored in the directory record.