Interlude: Files and Directories Flashcards
Persistent memory
Hard disk drives and solid-state storage devices store data permanently and keep it intact even without power.
Two key abstractions in the virtualization of storage: file and directory.
File: linear array of bytes, each of which you can read or write. Each file has an inode number.
Directory: also has an inode number; contains a list of pairs (user-readable name, low-level name). Each entry in a directory refers to a file or to another directory, so directories can be nested to form a directory tree (directory hierarchy).
Creating files
use the open() system call and pass in the O_CREAT flag
example:
int fd = open("foo", O_CREAT|O_WRONLY|O_TRUNC, S_IRUSR|S_IWUSR);
File descriptor
integer, private per process and used in UNIX systems to access files
- read or write with the file descriptor
- fd is a capability, an opaque handle that gives you power to perform certain operations
- can think of it as a pointer to an object of type file
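A minimal sketch of using a file descriptor end to end: create a file, write through the descriptor, close it. The helper name write_file is hypothetical (not part of any API); the loop is there because write() may write fewer bytes than requested.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or truncate) the file at `path`, write `len` bytes from `buf`,
 * then close it. Returns 0 on success, -1 on failure.
 * `write_file` is a hypothetical helper name used only for illustration. */
int write_file(const char *path, const void *buf, size_t len) {
    /* Same flags as the open() example: create if missing, write-only,
     * truncate existing contents; owner gets read/write permission. */
    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);  /* may be a short write */
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return close(fd);
}
```

Usage: write_file("foo", "hello\n", 6) should leave a 6-byte file named foo in the current directory.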
->echo hello > foo
->cat foo
hello
open("foo", O_RDONLY|O_LARGEFILE) = 3   // opened read-only; O_LARGEFILE requests 64-bit offsets
// returns 3 because each running process already has three files open: stdin (0), stdout (1), and stderr (2)
read(3, "hello\n", 4096) = 6
write(1, "hello\n", 6) = 6
hello
read(3, "", 4096) = 0   // 0 bytes read means end of file
close(3)
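The trace above follows an open / read / write / close pattern. A rough sketch of that loop in C, assuming a hypothetical helper cat_file and the same 4096-byte buffer size seen in the trace (this is not the actual source of cat):

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the contents of `path` to descriptor `out_fd` (pass 1 for stdout).
 * Returns the number of bytes copied, or -1 on error.
 * `cat_file` is a hypothetical name used only for illustration. */
long cat_file(const char *path, int out_fd) {
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    char buf[4096];
    long total = 0;
    ssize_t n;
    /* read() returns 0 at end of file, which ends the loop. */
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        ssize_t done = 0;
        while (done < n) {              /* handle short writes */
            ssize_t w = write(out_fd, buf + done, (size_t)(n - done));
            if (w < 0) {
                close(fd);
                return -1;
            }
            done += w;
        }
        total += n;
    }
    close(fd);
    return n < 0 ? -1 : total;
}
```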
Open File Table
each process maintains an array of file descriptors, each of which refers to an entry in the system-wide open file table.
each entry in this table tracks which underlying file the descriptor refers to, the current offset, and other details like if the file is readable and/or writable.
read() and write() vs lseek()
read and write will update the current offset
processes can use lseek() to change the current offset explicitly, which enables random access to different parts of the file
lseek() system call function prototype: off_t lseek(int fildes, off_t offset, int whence);
first argument: file descriptor
second: offset, which positions the file offset to a particular location within the file
third: whence, determines how the seek is performed
= SEEK_SET, offset = offset bytes
= SEEK_CUR, offset = current location + offset bytes
= SEEK_END, offset = size of the file + offset bytes
- calling lseek() does NOT perform a disk seek. A disk seek happens when a read or write issued to the disk is not on the same track as the last read or write and thus necessitates a head movement
- lseek() only changes a variable in OS memory that tracks the offset at which the process's next read or write will start
“current” offset for each file a process opens
- part of the abstraction of an open file is that it has a current offset
- updated in two ways:
implicitly: when a read or write of N bytes takes place, N is added to the current offset
explicitly: lseek which changes the offset as specified above
struct file {
    int ref;            // reference count
    char readable;
    char writable;
    struct inode *ip;   // underlying file
    uint off;           // current offset
};
Shared File Table Entries: fork()
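a parent process creates a child process with fork(); the child's descriptors refer to the same open file table entries as the parent's. Example: the child adjusts the current offset with a call to lseek() and then exits, and the parent then sees the changed offset.
when a file table entry is shared, its reference count is incremented; the entry is removed only when both processes have closed the file (or exited)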
Shared File Table Entries: dup()
dup() call allows a process to create a new file descriptor that refers to the same underlying open file as an existing descriptor
- useful when writing a UNIX shell and performing operations like output redirection
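A sketch of output redirection in the style of a shell. It uses dup() to save the original stdout and dup2(), a related POSIX call not named in these notes, to point descriptor 1 at the file. The helper name write_redirected is hypothetical.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Temporarily redirect stdout to `path`, write `len` bytes of `msg` to
 * descriptor 1, then restore the original stdout.
 * Returns 0 on success, -1 on failure. Hypothetical helper. */
int write_redirected(const char *path, const char *msg, size_t len) {
    int out = open(path, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
    if (out < 0)
        return -1;

    int saved = dup(STDOUT_FILENO);     /* second descriptor for the old stdout */
    if (saved < 0) {
        close(out);
        return -1;
    }
    if (dup2(out, STDOUT_FILENO) < 0) { /* descriptor 1 now refers to the file */
        close(out);
        close(saved);
        return -1;
    }
    close(out);                         /* descriptor 1 keeps the file open */

    ssize_t n = write(STDOUT_FILENO, msg, len);

    dup2(saved, STDOUT_FILENO);         /* restore the original stdout */
    close(saved);
    return n == (ssize_t)len ? 0 : -1;
}
```

The sketch calls write() directly rather than printf() so that stdio buffering does not delay output past the point where stdout is restored.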
Renaming Files
mv command on the command line
- using strace on mv: it uses the system call rename(char *old, char *new)
- the rename() call is implemented as an atomic call: the file ends up with either the old name or the new name, with no in-between state
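One common use of that atomicity is replacing a file's contents safely: write the new version to a temporary file, then rename() it over the original. The sketch below assumes a hypothetical helper replace_file and a caller-chosen temporary path; the fsync() call, which asks the OS to flush the data to disk before the rename, is an addition not mentioned in these notes.

```c
#include <fcntl.h>
#include <stdio.h>      /* rename() */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Replace the contents of `path` with `len` bytes from `buf` by writing
 * `tmp_path` first and then renaming it over `path`.
 * Returns 0 on success, -1 on failure. Hypothetical helper. */
int replace_file(const char *path, const char *tmp_path,
                 const void *buf, size_t len) {
    int fd = open(tmp_path, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd < 0)
        return -1;

    /* Treat a short write as a failure to keep the sketch small. */
    if (write(fd, buf, len) != (ssize_t)len || fsync(fd) < 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) < 0) {
        unlink(tmp_path);
        return -1;
    }
    /* Atomic step: `path` refers to either the old or the new contents. */
    if (rename(tmp_path, path) < 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```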
Metadata
- information the file system keeps about each file it is storing
- to see the metadata for a certain file, we can use the stat() or fstat() system calls
- stat() takes a pathname and fstat() takes a file descriptor; each fills in a stat structure
- the command-line tool stat shows the same information
- file systems usually keep this type of information in a structure called an inode, a persistent data structure kept by the file system that holds the metadata
- all inodes reside on disk; copies of active ones are usually cached in memory to speed up access
Removing files
- rm uses the unlink() system call; in the strace output it appears as unlink("name of file") = 0, where 0 means success