Exam 1 Flashcards
What is a Virtual Machine?
A piece of software that runs as an application on your OS. It can be configured to run its own OS internally, allowing multiple OSes to run on the same computer.
It runs in user space. The OS inside the VM is also called a "Guest Operating System".
What are the three layers of the computer?
- Bottom Layer: hardware the computer runs on
- Operating System: the OS that runs on your computer
- User Space: where we run different applications
Host OS
The original operating system that comes on your machine. It’s also known as the OS that runs on “bare metal”.
Why do we use VM’s?
- They allow multiple OSes to run on the same machine.
- They allow developers to sandbox and test code on a specific operating system version.
- They enabled the cloud computing revolution.
What are the primary differences between Code Repositories and DropBox-like file sharing systems?
DROPBOX:
- Files are automatically synchronized.
- Resolves conflicts by creating two copies.
- Some support for versioning but no branches.
CODE REPOSITORIES:
- Requires programmer explicit push and pull.
- Automatic merging of source code.
- Excellent support for branches and versioning.
Programmers use code repos over DropBox-like systems because:
- DropBox systems automatically sync changes, meaning you don't have control over reconciling version differences.
- DropBox systems are not designed for code editing, so you end up with many file copies instead of files merged with updates from multiple places.
- Code repos have great support for branching and versioning.
What do the following vagrant commands do?
- vagrant up
- vagrant ssh
- vagrant suspend
- vagrant destroy
- vagrant halt
- logout
- vagrant up: Starts VM and provision according to vagrant file.
- vagrant ssh: Accesses your VM so that all future commands run on the VM.
- vagrant suspend: Saves the VM state and stops the VM.
- vagrant destroy: Removes all traces of the VM on your computer.
- vagrant halt: Gracefully shuts down the VM OS and powers down the VM.
- logout: Stops ssh connection and returns to your terminal.
What are the three primary sections of a vagrant file?
- Define: provide the host name for the VM and specify the guest OS we desire.
- Provisioning: install the tools (and the correct versions of those tools) that we'd like. Some examples are valgrind, gdb, etc.
- Machine: specify the # of CPUs and RAM that we want to allocate.
Operating System (OS)
A system that provides the interface between a user’s application and the hardware. It is the process scheduler for the computer.
- Software that usually comes pre-installed on a computer. It includes a graphical user interface for you to interact with.
- The OS allows you to run multiple applications at the same time and switch seamlessly between those apps.
- It stores data on persistent storage devices.
- It allows apps to interact with external devices like the monitor, mouse, etc.
- The OS does memory management, which allows multiple apps to be stored in memory at the same time and executed.
The Kernel
Forms the core of the operating system and provides all the key functionality for managing interactions between user-level programs and the hardware.
The Shell
A command interpreter for the operating system.
Data Storage in the OS
All data is stored in the form of files, and the OS maintains file names, sizes, locations, directory structure, and file access rights.
What are the two main functions of the OS?
- Abstract away the messy details of the hardware so that user programs can be written without needing to understand hardware interaction.
- Resource management.
User Mode
In this mode, an application can only run instructions that affect its own state; it cannot touch the OS or other applications. User applications execute in user mode.
Kernel Mode
Handles operating system functions. When these functions are run, the computer flips to kernel (OS) mode. This only happens for select system calls and carries a lot of overhead. The division between kernel mode and user mode helps keep the operating system safe from errors and attacks.
What is a Process?
A program in execution. When a program runs, the OS needs to keep the program's contents in memory and on disk. Each process runs its instructions sequentially, but multiple processes can run at the same time, even for the same application. For example, running an application twice is treated as two separate process contexts.
Process Context
When a process stops, we get a snapshot of it. This includes the process's data, memory utilization, and execution context. The snapshot is savable, so the OS can record the process's place, switch away, and come back later.
Why does the OS run apps with the process abstraction?
At any point in time, the OS has to manage many applications at the same time. The abstraction allows it to package processes up into one bundle of information to make transitioning between them easier.
The OS also has its own processes for things like memory management.
How big is the address space of a process?
Each process has an address space that goes from byte 0 to 2^64 - 1 (on a 64-bit machine).
What part of memory are applications run in?
The CODE section. Also sometimes called the "text" section; it is read-only.
Program Counter
Holds the address of the next instruction to execute. It points into the code segment.
Where is temporary data from a program stored?
The STACK.
Temporary data gets pushed onto the stack, which grows from the top of the address space downward. New values get pushed on, and when a function call ends, its stack data is no longer tracked. We track the growing end of the stack (the stack pointer) to ensure we don't run into the heap and overwrite data there.
What are Data Segments?
These hold global variables and variables defined with the static keyword. Their sizes are determined at compile time.
The Heap
Stores Dynamically Allocated Memory and it grows at run time as data is added to it. It can grow and shrink as data is added/freed.
What is the order of memory segments from top to bottom?
Stack (grows downward)
Heap (grows upward but can have items randomly placed)
Static Data
Code
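A minimal C sketch of where different kinds of variables end up (assuming a typical Linux process layout; the variable names are made up for illustration):

#include <stdio.h>
#include <stdlib.h>

int counter = 0;                        /* static data segment: global variable */

int main(void) {
    static int calls = 0;               /* static data segment: static local */
    int local = 42;                     /* stack: temporary data for this function call */
    int *buf = malloc(8 * sizeof(int)); /* heap: dynamically allocated at run time */
    if (buf == NULL) return 1;
    printf("%d %d %d %p\n", counter, calls, local, (void *)buf);
    free(buf);                          /* heap space must be freed explicitly */
    return 0;                           /* 'local' disappears when the stack frame pops */
}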
What is the PID?
Stands for Process ID.
Each process gets its own. The OS maintains a table (array) indexed by PID; each entry points to that process's Process Control Block (PCB).
What is a PCB?
Stands for Process Control Block.
The PCB stores the context of a process. This includes:
- Stack Pointer
- PC
- CPU Register Values
- Segment Register Values
- Status
When a process is executing, only the hardware registers are updated. When the process pauses or ends, all of its information is stored in the PCB. The PCB does not update in real time; it just gets the values when the process pauses/stops. The PCB does not store the contents of memory, just the mapping to the physical memory where they exist.
What is the PPID?
Stands for Parent Process ID.
Parent and child processes have their own virtual address spaces and can run independently, but they share some resources like file descriptors and semaphores.
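A tiny sketch using the standard getpid()/getppid() calls to see a process's PID and PPID (the printed values will vary from run to run):

#include <stdio.h>
#include <unistd.h>

int main(void) {
    printf("my PID:  %d\n", (int)getpid());   /* this process's ID */
    printf("my PPID: %d\n", (int)getppid());  /* the parent process's ID */
    return 0;
}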
Define the hierarchy of Linux processes
- The Root is the Init Process that starts when your OS first boots up.
- Then the System Processes are started. They perform internal OS management tasks, typically related to memory management.
- Then there is sometimes an SSH server that listens for requests.
- Then comes the User Shell which accepts user commands.
- Any app processes the user runs.
What is the fork() system call?
Creates a new child process.
In the parent, fork() returns the new child's PID (e.g. 100); in the child, it returns 0.
What is a Child Process?
It is created by forking from another process. It starts off almost identical to its parent because it begins as a copy of the parent's address space. It even starts with the same PC value, so both processes continue from the point of the fork and may run concurrently. The copy is "lightweight" (a "copy on write" or "lazy copy"), meaning the child initially shares the same physical memory as the parent.
There is one big difference: the return value of fork(). The parent's local child-PID variable gets the child's PID (e.g. 100), while the child gets 0. If the child forks its own child, it gets that grandchild's PID (e.g. 200). The 0 returned to the child is not a real PID; it just marks "you are the child."
After forking, code checks this return value: a non-zero value means we are in the parent, 0 means we are in the child (see the sketch below).
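A minimal fork() sketch (the PID values are examples; real values come from the OS):

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t child_pid = fork();      /* e.g. returns 100 in the parent, 0 in the child */

    if (child_pid == 0) {
        /* child: an almost-identical copy, continuing from the same point */
        printf("child: my PID is %d\n", (int)getpid());
    } else if (child_pid > 0) {
        /* parent: fork() handed back the child's PID */
        printf("parent: my child's PID is %d\n", (int)child_pid);
    } else {
        perror("fork");            /* fork failed */
    }
    return 0;
}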
Process Memory Construction
A process's memory is broken into smaller parts called pages. With copy-on-write, the child gets its own copy of a page only when that page is modified.
The wait() call
A way for a parent to wait for a child to finish running. The parent forks the child, checks that the returned PID is valid, and waits for the child. The child runs as a separate process and eventually calls exit(), which unblocks the parent.
An exit code of 0 indicates a graceful exit.
The & in wait(&status) is the address-of operator ("call by reference"): the child's exit value is copied into the parent's status variable at that address.
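A minimal fork/wait/exit sketch (assuming the child simply exits with code 0):

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        exit(0);                    /* child: exit code 0 = graceful exit */
    } else if (pid > 0) {
        int status;
        wait(&status);              /* parent blocks; child's exit info is copied into status */
        if (WIFEXITED(status))
            printf("child exited with code %d\n", WEXITSTATUS(status));
    }
    return 0;
}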
The waitpid(pid, &status, options) call
pid specifies which child to wait for, and options specifies whether the parent should block or proceed (e.g. WNOHANG to return immediately).
SIGKILL
A signal to terminate a target process immediately. A process can kill another process only if both processes belong to the same user, or if the killing process is owned by the superuser.
When a process is terminated, all of its resources are freed. If a parent terminates, its children become orphans, and their new parent becomes the Init Process.
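A small sketch of one process terminating another with SIGKILL (here the parent kills its own child, then reaps it with wait()):

#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        while (1) pause();        /* child: do nothing, just wait for signals */
    } else if (pid > 0) {
        sleep(1);                 /* give the child a moment to start */
        kill(pid, SIGKILL);       /* terminate the child immediately */
        wait(NULL);               /* reap the child so it doesn't linger as a zombie */
        printf("child %d killed and reaped\n", (int)pid);
    }
    return 0;
}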
Zombies
Come up when a child process has stopped running but the parent has not yet called wait(). This wastes resources in the kernel, since the child's PCB cannot be reclaimed until the parent calls wait(). It's analogous to un-freed heap space.
Daemon
System-level processes that perform OS-specific tasks. Their parent is the Init Process.
What are the steps for switching processes?
- P1 stops running so P2 can run.
- CPU → kernel mode. We go from user to kernel.
- OS starts running now that we are in kernel mode.
- OS copies CPU register values to P1’s PCB.
- Then the OS runs P2.
- It loads P2’s PCB values into the CPU.
- OS goes back to user mode.
- P2 starts running.
What is the life cycle of a process?
- Process is created.
- It is put into a ready queue to be run by the OS when it is ready.
- The process runs.
- The process may be blocked for some reason; once the blocking event completes, it is put back in the ready queue.
- The process terminates.
What happens when a hardware or software interrupt happens?
The OS handles the interrupt, and blocked processes may become ready. After the OS handles the external event, the scheduler may continue running the previously running process or pick another process to run.
How big of a time slice do most processes get?
100ms
What is the tradeoff for a process that gets a larger time slice than normal?
A large time slice means improved throughput, since the OS wastes less time switching, BUT it comes at the expense of longer response time, since other processes need to wait longer to run. This can make apps appear sluggish.
The sluggishness is most noticeable in interactive apps, which need quick responses to user input.
What are some of the types of processes that get higher priority?
- Interactive apps that block frequently because of system calls.
- User-defined priority, set with the nice command in Linux.
- System Daemons (typically get higher priority).
- Processes run by a user that has higher priority over other users on the computer.
What happens when fork() is called?
A new identical clone of the parent is created.
Both the parent and child become immediately ready.
The parent may not be able to run after the fork call. The scheduler will decide.
In a multicore machine, they may run concurrently.
What happens when exec() is called?
A file is opened, its contents are loaded into memory, and the process's execution context is reset. The process goes to the ready state as soon as this happens (if the executable is already cached in memory); it might be blocked while the file is being opened or loaded from disk.
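A minimal exec() sketch (using execlp to run ls; any program path would work):

#include <stdio.h>
#include <unistd.h>

int main(void) {
    printf("about to replace this process image with ls\n");

    /* Load a new program into this process: code, data, stack, and heap are
       replaced, and execution restarts at the new program's entry point. */
    execlp("ls", "ls", "-l", (char *)NULL);

    /* Only reached if exec failed -- on success, nothing below ever runs. */
    perror("execlp");
    return 1;
}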
What happens when wait() is called?
If the child IS NOT running (it has already terminated), the parent process stays in a ready state. If the child IS running, the parent process is put into a blocked state.
What happens when exit() is called from within a child?
The child process terminates, its resources are released, and its exit value is passed back to the parent, which becomes ready. If the parent was blocked in wait(), it now waits in the ready queue until it is selected by the scheduler.
What is a file descriptor?
A small integer that identifies an open file or input/output device (e.g. 0 = stdin, 1 = stdout, 2 = stderr).
What are the arguments of the read() function?
ssize_t read(int fd, void *buf, size_t count);
What are the arguments of the write() function?
ssize_t write(int fd, const void *buf, size_t count);
Does buffering improve performance when reading/writing data?
Yes. Buffering results in improved performance because data can be read in larger blocks.
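A sketch of buffered copying with read()/write() (the file name input.txt is just a placeholder):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char buf[4096];                        /* read in 4 KB blocks instead of one byte at a time */
    int fd = open("input.txt", O_RDONLY);  /* placeholder input file */
    if (fd < 0) { perror("open"); return 1; }

    ssize_t n;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        write(STDOUT_FILENO, buf, (size_t)n);   /* fd 1 = stdout */

    close(fd);
    return 0;
}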
What happens to the process state if an error occurs with I/O?
What if the file for I/O is available?
What happens if it’s not available?
If an error occurs, the call returns immediately and the process is put into a ready state.
If the file/data is available, the process is also put into a ready state.
If the data is not yet available, the process goes into a blocked state. When the data becomes available later, the process does not simply resume on its own: the completion traps into the kernel, whose handler moves the process back to the ready queue.
What does it mean to “leak processes”?
It means that new children keep being forked before old ones have terminated and been reaped (waited for), so processes pile up. This is VERY bad.
Where should malloced data be freed?
Ideally, in the same function (or module) that allocated it. Unless you have a very good reason, a function should avoid freeing memory that it didn't allocate, and should avoid allocating memory that it expects the caller to free.
Does calling different functions within a process require a context switch?
No. You are staying inside the same process so it is not necessary.
What are System Calls?
System Calls are operations that run code in the kernel. Some kernel functions can be invoked from user mode through system calls, but many more are restricted to kernel mode only.
Every System Call has limited entry points in the kernel so that they don’t disturb any other code or data.
When a System Call happens, a context switch occurs. The caller is then put in the ready queue or the blocked queue.
System Calls are more expensive than regular function calls because they require a context switch. Restarting a different process afterwards can be even more expensive, because loading a different address space likely ruins the contents of the caches and the virtual memory translations (TLB).
Lastly, System Calls usually have return values while interrupts do not.
What is a System Call Number?
A System Call Number is used by the system call dispatcher which is the gatekeeper of the system calls. A lookup table is used to match the call number with the right function (think of the TRAP tables from 593). Once the function is called, the PC gets set to the start of that function.
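A sketch of invoking a system call by its number on Linux (syscall() and SYS_getpid are Linux-specific; normally you would just call getpid()):

#include <stdio.h>
#include <sys/syscall.h>   /* defines SYS_getpid and the other call numbers */
#include <unistd.h>

int main(void) {
    /* The dispatcher uses this number to look up the kernel function to run. */
    long pid = syscall(SYS_getpid);
    printf("system call number %d (getpid) returned %ld\n", SYS_getpid, pid);
    return 0;
}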
Hardware Interrupts
While System Calls are initiated by user programs, Interrupts are initiated by the machine.
Example: there is a pending operation on the disk that is now complete. A hardware interrupt is generated and picked up by the operating system.
Each interrupt has an interrupt handler. At this point, the OS then invokes the proper user-level operation that is waiting for the signal.
They have no return values because the interrupt is often completely unrelated to the currently running process.
Common sources of hardware interrupts: input from devices, illegal instructions, and clock pulses.
How is a Hardware Interrupt processed?
- The current user program stops running.
- Trap to the kernel to run the appropriate interrupt handler to process the interrupt. This is chosen based on the Interrupt ID sent with the hardware interrupt.
- The interrupt handler might then unblock a process that was waiting for the completion of a disk operation.
- When the interrupt handler completes, the OS transitions back to user mode.
What happens in a process when a hardware interrupt occurs?
The process has no way of knowing when an interrupt will happen. The interrupt also puts no parameters on the process's stack and has no return value.
An interrupt’s parameters are passed with the signal and are of no concern to the developer.
Returning from an interrupt just means restarting the initial process where it left off. This is what interrupts and system calls have in common: they both require a context switch and saving the execution context so that the previous process can continue execution. Note, though, that the previous process may not start running first when the call/interrupt is done.
What is the general process for an interrupt?
- A process is running.
- An interrupt happens. CPU transitions from user to kernel mode.
- The interrupt signal is processed by the OS's "Interrupt Service Routine" (ISR).
- The ISR has all the info needed to process the interrupt.
- The process’ execution context is saved to the PCB.
- The interrupt ID from hardware is used by the ISR to determine the appropriate device driver to call.
- The device driver then identifies the correct handler for processing the interrupt.
- The handler finishes and a process is pulled from the ready queue, and we switch back to user mode.
Inter-process Signalling
P0 wants to send a signal to P1. The signal is delivered by going through the OS; we go through the kernel because the OS doesn't allow processes to communicate with each other directly. Processes can register their own custom signal handlers, and the OS simply relays the signal.
You can also process signals without a custom handler: the OS provides default handlers that take default actions.
What is a Signal?
A signal is an asynchronous software notification mechanism that tells a process about an event. It's like an interrupt. It can be initiated by another process or come from the OS.
The simplest form of inter-process communication.
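A minimal signal sketch (a process registers a custom handler for SIGUSR1 and sends the signal to itself; another process could send the same signal through the kernel with kill()):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

/* Custom handler: the OS calls this asynchronously when the signal is delivered. */
static void handler(int signum) {
    got_signal = signum;
}

int main(void) {
    signal(SIGUSR1, handler);   /* register our handler instead of the default action */

    /* Deliver the signal; another process would use kill(<target pid>, SIGUSR1),
       and the kernel relays it on that process's behalf. */
    kill(getpid(), SIGUSR1);

    printf("handler saw signal %d\n", (int)got_signal);
    return 0;
}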