Distributed File Systems Flashcards
Advantages of a DFS?
1) Larger storage space
2) Users have access to files from any computer
3) Data redundancy. If one server goes down, data may still be available elsewhere.
What is remote file transfer?
The user has to explicitly connect to the remote machine. They can't access files directly, and instead have to manually copy files to and from it. Consistency has to be maintained by the user. (e.g. FTP)
What is direct access approach?
Files are prefixed with their network location.
Remote users can directly access the files.
User must explicitly know which computer contains which file.
Replication is not possible (file paths have to be unique)
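A minimal sketch of what a location-prefixed name looks like to a program. The `host:/path` syntax and the host names `alpha` and `beta` are assumptions chosen for illustration, not a specific system's format.

```python
def split_remote_path(path):
    """Split a hypothetical 'host:/dir/file' name into (host, local_path)."""
    host, sep, local_path = path.partition(":")
    if not sep or not host or not local_path.startswith("/"):
        raise ValueError(f"expected 'host:/path', got {path!r}")
    return host, local_path


# The host is part of the name, so the same local path on two machines
# names two different files. The user has to know which host to ask.
a = split_remote_path("alpha:/home/sam/notes.txt")
b = split_remote_path("beta:/home/sam/notes.txt")
print(a)
print(b)
```

Because the host is baked into the name, a copy kept on another machine necessarily has a different name, which is why this approach cannot hide replicas.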
What is transparency in DFS?
Local and distributed file systems should behave the same way to the user (programmer)
Location transparency?
The user cannot tell the location of a file based on its name or path
Migration transparency?
Moving files around should not require changes to programs. Same as a local FS.
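One way to picture location and migration transparency together is a name service that maps location-independent paths to servers. This is a sketch under assumptions: `NameService`, `server-a` and `server-b` are hypothetical names, not part of any particular DFS.

```python
class NameService:
    """Maps location-independent paths to the server currently holding the file."""

    def __init__(self):
        self._location = {}  # path -> server name

    def register(self, path, server):
        self._location[path] = server

    def migrate(self, path, new_server):
        # Only the mapping changes; the path that programs use stays the same.
        if path not in self._location:
            raise KeyError(path)
        self._location[path] = new_server

    def resolve(self, path):
        return self._location[path]


names = NameService()
names.register("/projects/report.txt", "server-a")
before = names.resolve("/projects/report.txt")
names.migrate("/projects/report.txt", "server-b")
after = names.resolve("/projects/report.txt")
print(before, after)
```

The path contains no host, so it says nothing about location, and a program that opens `/projects/report.txt` needs no change after the move.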
What is the difference between remote service and caching?
Remote service transfers individual blocks of data over the network as they are accessed. (like an HDD, but networks can be slow)
Caching first downloads larger parts of the file to the client, so later accesses are served locally, which improves performance.
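The difference can be sketched by counting network round trips. This is a toy model, not a real protocol: `RemoteFile`, `CachingClient` and the 4-byte block size are assumptions, and the cache here fetches the whole file on first access.

```python
BLOCK_SIZE = 4  # assumed block size, kept tiny for the example


class RemoteFile:
    """Stands in for a file on a server and counts network round trips."""

    def __init__(self, data):
        self.data = data
        self.requests = 0

    def read_block(self, index):
        self.requests += 1
        start = index * BLOCK_SIZE
        return self.data[start:start + BLOCK_SIZE]

    def read_all(self):
        self.requests += 1
        return self.data


class CachingClient:
    """Fetches the file once, then serves every block from the local copy."""

    def __init__(self, remote):
        self.remote = remote
        self.cache = None

    def read_block(self, index):
        if self.cache is None:
            self.cache = self.remote.read_all()
        start = index * BLOCK_SIZE
        return self.cache[start:start + BLOCK_SIZE]


# Remote service: every block access goes over the network.
remote = RemoteFile(b"abcdefgh")
for i in (0, 1, 0, 1):
    remote.read_block(i)
remote_service_requests = remote.requests

# Caching: one transfer, then local reads.
remote2 = RemoteFile(b"abcdefgh")
client = CachingClient(remote2)
blocks = [client.read_block(i) for i in (0, 1, 0, 1)]
caching_requests = remote2.requests

print(remote_service_requests, caching_requests)
```

The cached copy can go stale if another client changes the file, which is the consistency cost of caching.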
Client-server vs cluster-based model?
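Client-server: each file is saved whole on one server. Cluster-based: files are split into chunks saved on multiple servers.
Client-server: clients talk only to the file server. Cluster-based: a separate metadata server holds the info about where the chunks of each file are located on the data servers.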
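A rough sketch of the cluster-based layout, with the metadata kept apart from the chunk data. The `Cluster` class, the 4-byte chunk size and the round-robin placement are all assumptions for illustration; real systems choose placement differently.

```python
CHUNK_SIZE = 4  # assumed chunk size, kept tiny for the example


class Cluster:
    def __init__(self, n_servers):
        # Each data server is modelled as a dict: chunk_id -> bytes.
        self.data_servers = [dict() for _ in range(n_servers)]
        # Metadata server: file name -> ordered list of (server_index, chunk_id).
        self.metadata = {}

    def write(self, name, data):
        placement = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk_no = start // CHUNK_SIZE
            server = chunk_no % len(self.data_servers)  # round-robin (assumption)
            chunk_id = f"{name}#{chunk_no}"
            self.data_servers[server][chunk_id] = data[start:start + CHUNK_SIZE]
            placement.append((server, chunk_id))
        self.metadata[name] = placement

    def read(self, name):
        # Ask the metadata server where the chunks are, then fetch each one.
        return b"".join(
            self.data_servers[server][chunk_id]
            for server, chunk_id in self.metadata[name]
        )


cluster = Cluster(n_servers=3)
cluster.write("a.txt", b"hello world!")
servers_used = [server for server, _ in cluster.metadata["a.txt"]]
contents = cluster.read("a.txt")
print(servers_used, contents)
```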
Benefits of remote service?
Easier to implement.
Less of a consistency problem, as accesses happen at block level.
Less memory is needed.
Behaves the same as local storage.
Benefits of cached?
More efficient: fewer network requests, so it is usually faster for repeated accesses and puts less load on the server, so it scales better.
What are the two ways to deal with consistency on DFS?
UNIX semantics - any changes made are visible immediately to all other processes
Session semantics - Changes are not visible to other processes until the file is closed. Locks are required if multiple processes are using the file.
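The two semantics can be contrasted with a toy in-memory model. `Server`, `UnixHandle` and `SessionHandle` are hypothetical names, and whole-file writes are an assumption to keep the sketch short.

```python
class Server:
    def __init__(self):
        self.files = {}  # name -> contents


class UnixHandle:
    """UNIX semantics: every write goes straight to the shared copy."""

    def __init__(self, server, name):
        self.server, self.name = server, name

    def write(self, data):
        self.server.files[self.name] = data

    def read(self):
        return self.server.files[self.name]


class SessionHandle:
    """Session semantics: work on a private copy, publish it on close."""

    def __init__(self, server, name):
        self.server, self.name = server, name
        self.local = server.files[name]  # snapshot taken at open

    def write(self, data):
        self.local = data

    def read(self):
        return self.local

    def close(self):
        self.server.files[self.name] = self.local


server = Server()
server.files["f"] = "v1"
server.files["g"] = "v1"

# UNIX semantics: the reader sees the write immediately.
writer, reader = UnixHandle(server, "f"), UnixHandle(server, "f")
writer.write("v2")
unix_seen = reader.read()

# Session semantics: the write is invisible until the writer closes the file.
s_writer, s_reader = SessionHandle(server, "g"), SessionHandle(server, "g")
s_writer.write("v2")
seen_before_close = s_reader.read()
s_writer.close()
seen_by_new_open = SessionHandle(server, "g").read()

print(unix_seen, seen_before_close, seen_by_new_open)
```

If two session writers close the same file, the last close silently overwrites the other, which is where locks come in.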
Stateful remote service?
The server keeps track of each client's open files and the changes being made to them. Recovery is difficult if the server crashes, as all state info is lost.
A stateless server does not record this information. The client machine has to keep track of open(), close(), file pointers, etc., and send this information with every request.
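A small sketch of why a crash hurts a stateful server more. Both classes and the `crash_and_restart` method are hypothetical; the point is only where the file offset lives.

```python
class StatefulServer:
    def __init__(self, files):
        self.files = files
        self.open_files = {}  # handle -> [name, offset]; lives only in memory
        self.next_handle = 0

    def open(self, name):
        handle = self.next_handle
        self.next_handle += 1
        self.open_files[handle] = [name, 0]
        return handle

    def read(self, handle, n):
        name, offset = self.open_files[handle]
        data = self.files[name][offset:offset + n]
        self.open_files[handle][1] = offset + len(data)
        return data

    def crash_and_restart(self):
        self.open_files = {}  # all per-client state is gone


class StatelessServer:
    def __init__(self, files):
        self.files = files

    def read(self, name, offset, n):
        # Everything needed to serve the request arrives with the request.
        return self.files[name][offset:offset + n]

    def crash_and_restart(self):
        pass  # nothing to lose


files = {"f": b"abcdefgh"}

stateful = StatefulServer(files)
handle = stateful.open("f")
first = stateful.read(handle, 4)
stateful.crash_and_restart()
try:
    stateful.read(handle, 4)
    stateful_survived = True
except KeyError:
    stateful_survived = False  # the handle no longer means anything

stateless = StatelessServer(files)
offset = 0  # the client tracks the file pointer itself
part1 = stateless.read("f", offset, 4)
offset += len(part1)
stateless.crash_and_restart()
part2 = stateless.read("f", offset, 4)

print(first, stateful_survived, part1, part2)
```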
Is NFS stateless?
It was stateless before version 4. NFSv4 is stateful.
Which operating systems will NFS work with?
Most will work. The machines do not have to form a homogeneous system.