P4L2. Distributed File Systems Flashcards

1
Q

What is the upload/download model?

Examples?

Pros/Cons?

A

Client downloads a file from the server, performs updates on it locally, then uploads the file.

Examples: FTP, SVN

Pros

+ local reads/writes at client

Cons

  • entire file download/upload even for small accesses
  • server gives up control
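
A minimal sketch of the upload/download model, assuming a hypothetical in-memory FileServer class (not a real API). It only illustrates the main drawback: a one-byte edit still moves the whole file in both directions.

```python
class FileServer:
    """Hypothetical in-memory server that only supports whole-file transfer."""

    def __init__(self):
        self.files = {}             # file name -> bytes
        self.bytes_transferred = 0  # total bytes moved over the "network"

    def download(self, name):
        data = self.files[name]
        self.bytes_transferred += len(data)
        return data

    def upload(self, name, data):
        self.bytes_transferred += len(data)
        self.files[name] = data


server = FileServer()
server.files["notes.txt"] = b"x" * 1000

local_copy = bytearray(server.download("notes.txt"))  # whole file comes down
local_copy[0:1] = b"y"                                 # one-byte local edit
server.upload("notes.txt", bytes(local_copy))          # whole file goes back

print(server.bytes_transferred)  # 2000 bytes moved for a 1-byte change
```

Between download() and upload() the server sees nothing of what the client does, which is the sense in which it gives up control.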
2
Q

What is the true remote file access model?

A

Every access to a remote file goes to the server; nothing is done locally.

Pros

+ file access centralized, easy to reason about consistency, multiple clients can’t overwrite a file at the same time

Cons

  • every file operation pays a network latency cost (even reads, since the client cannot cache the file)
  • limits server scalability b/c everything has to go through server
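
A minimal sketch of true remote access, assuming a hypothetical RemoteFile client handle and a plain dict standing in for the server. With no local cache, repeating the same read costs another round trip.

```python
class RemoteFile:
    """Hypothetical client handle: no local cache, every call goes to the server."""

    def __init__(self, server_files, name):
        self.server_files = server_files  # dict standing in for the server
        self.name = name
        self.round_trips = 0              # counts simulated network round trips

    def read(self, offset, length):
        self.round_trips += 1             # every read pays the network cost
        return self.server_files[self.name][offset:offset + length]


server_files = {"log.txt": b"hello world"}
f = RemoteFile(server_files, "log.txt")

first = f.read(0, 5)
second = f.read(0, 5)   # same bytes again, but still a full round trip

print(first, second, f.round_trips)  # b'hello' b'hello' 2
```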
3
Q

What is a stateless file server?

Pros/Cons?

A

Stateless means the server doesn’t keep any information (e.g., which clients access which files, how many clients there are, etc.). Every request has to be self-contained (include everything it needs to do its work, like file name, offset, data).

Pros

+ no resources used on the server side to maintain state (CPU/memory)

+ resilient: on failure, just restart

Cons

  • cannot support caching and consistency management (we need state to do this)
  • every request self-contained => more bits transferred to describe request
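
A minimal sketch of self-contained requests, assuming a hypothetical StatelessServer class. The server stores file contents but nothing about clients, so every call must carry the file name and offset itself.

```python
class StatelessServer:
    """Hypothetical server: keeps file contents but no per-client state."""

    def __init__(self):
        self.files = {}  # file name -> bytearray

    def read(self, name, offset, length):
        # No open-file table or cursor: the request names the file and offset.
        return bytes(self.files[name][offset:offset + length])

    def write(self, name, offset, data):
        buf = self.files.setdefault(name, bytearray())
        if len(buf) < offset:
            buf.extend(b"\0" * (offset - len(buf)))  # pad a gap with zero bytes
        buf[offset:offset + len(data)] = data
        return len(data)


server = StatelessServer()
server.write("a.txt", 0, b"hello")
server.write("a.txt", 5, b" world")

print(server.read("a.txt", 6, 5))  # b'world'
```

Because no request depends on an earlier one, a restarted server can answer the next request as long as the file contents themselves survive.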
4
Q

What is a stateful file server?

Pros/Cons?

A

A server that keeps the state needed to track what is cached/accessed (e.g., who has portions of the file cached, who has written to a file, etc.)

Pros

+ can support locking, caching, incremental operations

Cons

  • need checkpointing and recovery mechanisms to handle failure
  • overheads to maintain state and consistency => depends on caching mechanism and consistency protocol
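
A minimal sketch of why locking needs server state, assuming a hypothetical StatefulServer class with a per-file lock table. It is an illustration only, not how any particular DFS implements locks.

```python
class StatefulServer:
    """Hypothetical server that remembers which client holds each file's lock."""

    def __init__(self):
        self.lock_owner = {}  # file name -> client id (this is per-client state)

    def acquire(self, name, client):
        # Grant the lock if it is free or already held by this client.
        owner = self.lock_owner.setdefault(name, client)
        return owner == client

    def release(self, name, client):
        # Only the current owner may release the lock.
        if self.lock_owner.get(name) == client:
            del self.lock_owner[name]


server = StatefulServer()
print(server.acquire("f", "A"))  # True
print(server.acquire("f", "B"))  # False, A holds the lock
server.release("f", "A")
print(server.acquire("f", "B"))  # True
```

If the server crashes, lock_owner is lost unless it was checkpointed, which is the recovery cost listed above.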
5
Q

What is caching state in a DFS?

A

Clients can locally maintain a portion of state (e.g., file blocks).

Clients can locally perform operations on cached state (e.g., open/read/write)

A coherence mechanism is required to keep the cached portions of files consistent with the server representation.

6
Q

What is “UNIX semantics”?

A

Every write is immediately visible to all clients.

7
Q

What is “session semantics”?

A
  • write-back on close(), update on open()
  • easy to reason about, but may be insufficient (changes stay invisible to other clients for as long as the file is held open)
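
A minimal sketch of session semantics, assuming a hypothetical Session class and a plain dict standing in for the server. Data is fetched on open, written locally, and pushed back only on close.

```python
class Session:
    """Hypothetical client session: fetch on open, write back on close."""

    def __init__(self, server_files, name):
        self.server_files = server_files          # dict standing in for the server
        self.name = name
        self.cache = server_files.get(name, b"")  # update on open()

    def read(self):
        return self.cache

    def write(self, data):
        self.cache = data  # local only, not yet visible to anyone else

    def close(self):
        self.server_files[self.name] = self.cache  # write-back on close()


server_files = {"f": b"v0"}
a = Session(server_files, "f")
b = Session(server_files, "f")

a.write(b"v1")
print(b.read())         # b'v0', a's write is not visible before close()

a.close()
c = Session(server_files, "f")
print(c.read())         # b'v1', a new session sees the written-back data
print(b.read())         # b'v0', b's already-open session is still stale
```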
8
Q

What is “periodic updates”?

A
  • client writes back periodically => clients have a “lease” on how long they can use the cached data (not necessarily exclusive)
  • server invalidates periodically => provides bounds on inconsistency; conflicts are easier to correct b/c they are fewer & smaller
  • augment with flush()/sync() API
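
A minimal sketch of the lease idea, assuming a hypothetical LeasedCache class. The current time is passed in explicitly so the behavior is easy to follow; a real client would read a clock.

```python
class LeasedCache:
    """Hypothetical client cache whose entries are trusted for lease_seconds."""

    def __init__(self, server_files, lease_seconds):
        self.server_files = server_files  # dict standing in for the server
        self.lease_seconds = lease_seconds
        self.entries = {}                 # file name -> (data, time fetched)

    def read(self, name, now):
        entry = self.entries.get(name)
        if entry is None or now - entry[1] >= self.lease_seconds:
            # No copy yet, or the lease expired: refetch from the server.
            entry = (self.server_files[name], now)
            self.entries[name] = entry
        return entry[0]


server_files = {"f": b"old"}
cache = LeasedCache(server_files, lease_seconds=30)

print(cache.read("f", now=0))   # b'old', fetched from the server
server_files["f"] = b"new"      # another client updates the file
print(cache.read("f", now=10))  # b'old', stale but still within the lease
print(cache.read("f", now=30))  # b'new', lease expired so the data is refetched
```

The lease length bounds how long a client can keep using stale data, here at most 30 seconds.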
9
Q

What is “immutable files”?

A
  • files are never modified; every change creates a new file
10
Q

What is replication?

Pros/Cons?

A

Each machine holds all files

Pros

+ load balancing

+ availability

+ fault tolerance

Cons

  • writes become more complex (synchronously write to all, or write to one then propagate to others)
  • replicas must be reconciled (e.g., voting)
  • scalability (machines only get so large)
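
A minimal sketch of replicated writes and reconciliation by voting, assuming hypothetical helpers write_all and read_by_vote and plain dicts standing in for replica machines. Majority voting is one possible reconciliation rule, used here for illustration.

```python
from collections import Counter


def write_all(replicas, name, data):
    """Synchronous write: update every replica before returning."""
    for replica in replicas:
        replica[name] = data


def read_by_vote(replicas, name):
    """Return the value held by a strict majority of replicas."""
    votes = Counter(replica.get(name) for replica in replicas)
    value, count = votes.most_common(1)[0]
    if count <= len(replicas) // 2:
        raise RuntimeError("no majority among replicas")
    return value


replicas = [{}, {}, {}]           # each dict stands in for one machine
write_all(replicas, "f", b"v1")

replicas[2]["f"] = b"stale"       # simulate one replica that diverged
print(read_by_vote(replicas, "f"))  # b'v1', two of three replicas agree
```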
11
Q

What is partitioning?

Pros/Cons?

A

Each machine has a subset of files

Pros

+ availability vs single server design

+ scalability w/ file system size

+ single file writes are simple

Cons

  • on failure, lose portion of data
  • load balancing harder; if not balanced, then hot spots possible
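
A minimal sketch of partitioning by file name, assuming a hypothetical owner() helper that hashes the name to pick a server. Hashing is just one placement policy; systems may also partition by directory or by range.

```python
import hashlib


def owner(name, num_servers):
    """Map a file name to the index of the single server that stores it."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


servers = [{} for _ in range(3)]  # each dict stands in for one machine


def put(name, data):
    servers[owner(name, len(servers))][name] = data  # single-server write


def get(name):
    return servers[owner(name, len(servers))][name]


for fname in ["a.txt", "b.txt", "c.txt", "d.txt"]:
    put(fname, fname.encode("utf-8"))

print(get("c.txt"))               # b'c.txt'
print([len(s) for s in servers])  # how many files landed on each server
```

Each file lives on exactly one server, so losing that server loses the file, and an uneven spread of popular files produces hot spots.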