Introduction to OS Flashcards
What are the key roles of an operating system?
- Hides the complexities of the underlying hardware from both the applications and the application developers (abstraction)
- Manages the underlying hardware on behalf of the executing applications (arbitration)
- Provides isolation and protection to multiple applications
Which of the following are components of the OS?
- file system vs file editor
- device driver or cache memory
- scheduler or web browser
Components of OS: file system, device driver, scheduler
Not components: file editor, cache memory, web browser
What’s the distinction between OS abstractions, mechanisms, and policies?
Abstractions are entities that represent other entities, often in a simplified manner, such as process, thread, file, socket, memory page
Policies are the rules for how a system is maintained and resources are utilized, such as least-recently used (LRU), earliest deadline first (EDF)
Mechanisms are the tools by which policies are implemented, e.g., create, schedule, open, write, allocate
What is the principle of separation of mechanism and policy?
The idea behind separation of mechanism and policy is that how you enforce a policy shouldn’t be coupled to the policy itself. Since a given policy is only valid in some contexts or during some states of execution, having a mechanism that only suits that policy is very brittle. Instead, we should try to make sure that our mechanism is able to support a variety of policies, as any one of these policies may be in effect at a given point in time.
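As a sketch of this principle, the hypothetical C fragment below separates a "pick the next task" mechanism from the policy that ranks tasks. All names here (`pick_next`, `sched_policy`, the policy functions) are invented for illustration, not a real kernel API.

```c
#include <stddef.h>

/* Illustrative sketch: a "pick next task" mechanism parameterized by a
 * policy, so the same mechanism can serve FIFO, priority, or any other
 * ordering rule. Names are invented, not a real kernel API. */

struct task {
    int id;
    int priority;      /* higher value = more important */
    long arrival_time; /* lower value = arrived earlier */
};

/* A policy is just a comparison: does task a beat task b? */
typedef int (*sched_policy)(const struct task *a, const struct task *b);

/* Policy 1: highest priority first. */
static int by_priority(const struct task *a, const struct task *b) {
    return a->priority > b->priority;
}

/* Policy 2: first-come, first-served. */
static int by_arrival(const struct task *a, const struct task *b) {
    return a->arrival_time < b->arrival_time;
}

/* The mechanism: scan the ready queue and return the index of the task
 * the current policy prefers. It hard-codes no rule of its own. */
static size_t pick_next(const struct task *queue, size_t n, sched_policy wins) {
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (wins(&queue[i], &queue[best]))
            best = i;
    return best;
}
```

Swapping policies requires no change to `pick_next`; that decoupling is exactly what the principle asks for.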
What is the principle of optimize for the common case?
We ensure that the most frequent path of execution operates as performantly as possible. This is valuable because:
- It’s simpler than trying to optimize across all possible cases
- It leads to the largest performance gains as you are optimizing the flow that by definition is executed the most often
We need to ask the following questions to understand the common case:
- Where will the OS be used?
- What will the user want to execute on the machine?
- What are the workload requirements?
What happens during a user-kernel mode crossing?
The CPU switches from user mode (where the application runs with limited access to system resources) to kernel mode (where the OS has full access to hardware and memory), allowing the application to execute privileged operations, e.g., accessing hardware, managing memory, or performing I/O operations, through a system call.
What are some of the reasons why user-kernel mode crossing happens?
It occurs anytime an application needs access to underlying hardware, be it physical memory, storage, or I/O devices, for example:
- read from a file
- write to a file
- listen on a socket
- allocate memory
- free memory
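As a small illustration, the C sketch below performs several of the operations listed above; each labeled call crosses from user mode into the kernel via a system call. The helper name and file path are invented for the example.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Each labeled call below crosses into the kernel: the file lives on
 * storage, so the OS must mediate every access. Illustrative sketch. */
static int round_trip(const char *path) {
    char buf[16] = {0};

    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);   /* syscall: open  */
    if (fd < 0) return 0;
    if (write(fd, "hello", 5) != 5) { close(fd); return 0; } /* syscall: write */
    if (lseek(fd, 0, SEEK_SET) != 0) { close(fd); return 0; }/* syscall: lseek */
    if (read(fd, buf, sizeof buf - 1) != 5) { close(fd); return 0; } /* read  */
    close(fd);                                               /* syscall: close */
    unlink(path);                                            /* syscall: unlink */
    return strcmp(buf, "hello") == 0;
}
```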
What is a kernel trap and why does it happen? What are the steps that take place during a kernel trap?
A kernel trap is a signal that the hardware sends to the operating system when it detects that something has occurred.
A kernel trap will occur if the hardware detects an attempt to perform a privileged operation while the privilege bit is not set on the CPU (i.e., while executing in user mode). The hardware may also issue a trap if it detects an attempt to access memory that requires special privileges.
During a kernel trap, the hardware sends a signal to the operating system, which takes control and invokes its trap handler. The operating system can then examine the process that caused the trap as well as the reason for the trap, and decide whether to allow the attempted operation or potentially kill the process.
What is a system call and how does it happen? What are the steps that take place during a system call?
A system call is an operation that the operating system exposes that an application can directly invoke if it needs the operating system to perform a specific service and to perform certain privileged access on its behalf.
Examples of system calls include open, send, mmap.
When a user process makes a system call, the operating system must switch from the context of that process to the kernel context, making sure it holds onto the arguments passed to that system call. Before the kernel can actually carry out the system call, the privilege (mode) bit on the CPU must be adjusted to let the CPU know the execution mode is kernel mode. Once this occurs, the kernel jumps to the system call handler and executes it. After the execution, the kernel resets the privilege bit and context switches back to the user process, passing back the results.
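One way to see this crossing concretely on Linux is to invoke a system call through the raw `syscall(2)` wrapper and compare it with the ordinary libc wrapper; both end up performing the same user-to-kernel trap. A minimal sketch (the helper name is invented):

```c
#include <sys/syscall.h>
#include <unistd.h>

/* On Linux, the libc syscall(2) wrapper places the system call number
 * and arguments in registers and executes the instruction that traps
 * into the kernel; the kernel runs the handler and the return value
 * comes back to user mode. Here getpid is invoked both through the
 * ordinary libc wrapper and through the raw trap, and they agree. */
static int raw_getpid_matches_wrapper(void) {
    long raw = syscall(SYS_getpid); /* explicit user->kernel crossing */
    return raw == (long)getpid();   /* libc wrapper does the same trap */
}
```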
On a 64-bit Linux-based operating system, which system call is used to:
- Send a signal to a process?
- Set the group identity of a process?
- Mount a file system?
- Read/write system parameters?
- Send a signal to a process: kill
- Set the group identity of a process: setgid
- Mount a file system: mount
- Read/write system parameters: sysctl
What is a monolithic OS? What are the design decisions and performance tradeoffs?
Monolithic OS - every possible service that any application may require, or any type of hardware will demand, is already part of the OS
- Pros: Everything included; inlining, compile-time optimizations
- Cons: No customization; Not too portable/manageable; Large memory footprint (which can impact performance)
What is a modular OS? What are the design decisions and performance tradeoffs?
Modular OS - a number of basic services and APIs are part of the OS, but everything else can be added as a module. The OS can easily be customized, e.g., in which particular file system or scheduler it uses. The OS specifies certain interfaces that any module must implement in order to be part of the OS. Depending on the workload, a module can be installed that implements the required interface.
- Pros: Maintainability; Smaller footprint; Less resource needs
- Cons: All the modularity/indirection can reduce some opportunities for optimization; Maintenance can still be an issue as modules from different codebases can be slung together at runtime
What is a microkernel OS? What are the design decisions and performance tradeoffs?
Microkernel OS - only requires the most basic primitives and can support basic services. Everything else, all other software components, applications and software that we typically think of as an OS component like file system and device driver, are run outside the OS kernel at user level (unprivileged level). This type of OS requires a lot of inter-process interactions, and therefore supports IPCs as one of its core abstractions and mechanisms, along with address space and threads.
- Pros: Size; Verifiability (great for embedded devices)
- Cons: Bad portability (often customized to underlying hardware); Harder to find common OS components due to specialized use case; Expensive cost of frequent user/kernel crossing
What is a process?
A process is the state of a program while it is executing. A program (an application on disk/flash memory/cloud; a static entity) becomes an active entity, a process, once it is loaded into memory and launched.
It represents the execution state of an active application.
What does a process look like?
A process encapsulates the following elements (process states) for running an application:
- stack: dynamic part of the address state that grows or shrinks during execution in LIFO order
- heap: dynamically created during execution
- data and text: static state when process first loads
Every single element has to be uniquely identified by its address.
What is the difference between stack and heap?
The stack is a region of memory used for managing function calls and local variables. Memory is allocated and deallocated automatically in a Last-In, First-Out (LIFO) manner. The speed is very fast due to its well-structured allocation pattern. It’s limited in size, typically defined at the start of the program. Variables exist only as long as the function they belong to is running.
The heap is a region of memory used for dynamic memory allocation. Memory is allocated and deallocated manually, e.g., malloc() and free(). Speed is slower than the stack due to fragmentation and manual management. Size is much larger than the stack but depends on available system memory. The variables persist until explicitly deallocated, making heap memory useful for objects that need to outlive function calls.
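The difference in lifetimes can be sketched in C: a stack variable dies when its function returns, while heap memory persists until explicitly freed. The helper names are illustrative.

```c
#include <stdlib.h>
#include <string.h>

/* Stack: automatic lifetime, reclaimed when the function returns. */
static int stack_sum(void) {
    int nums[3] = {1, 2, 3};        /* lives on this frame's stack */
    return nums[0] + nums[1] + nums[2];
}                                   /* nums is gone after return */

/* Heap: explicit lifetime; survives the call until free(). */
static char *heap_copy(const char *s) {
    char *p = malloc(strlen(s) + 1); /* allocated on the heap */
    if (p) strcpy(p, s);
    return p;                        /* caller must free(p) */
}
```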
What is an address space?
An address space is an OS abstraction used to encapsulate all of the process states defined by a range of addresses. It is an “in memory” representation of a process.
What is the purpose of the page tables?
A page table is a data structure used in a virtual memory system to map virtual addresses to physical addresses (in physical memory or DRAM).
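A toy single-level translation for 4 KiB pages illustrates the mapping. Real page tables are multi-level and hold more than a frame number (valid bits, permissions), so this is only a sketch:

```c
#include <stdint.h>

/* Toy translation for 4 KiB pages (12 offset bits). Real page tables
 * are multi-level; this flat array is only for illustration. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)

static uint64_t translate(const uint64_t *page_table, uint64_t vaddr) {
    uint64_t vpn    = vaddr >> PAGE_SHIFT;     /* virtual page number   */
    uint64_t offset = vaddr & (PAGE_SIZE - 1); /* offset within page    */
    uint64_t pfn    = page_table[vpn];         /* physical frame number */
    return (pfn << PAGE_SHIFT) | offset;       /* physical address      */
}
```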
If two processes, P1 and P2, are running at the same time, is it possible for them to have the same virtual address space range?
Yes. The OS underneath will map P1’s virtual addresses to one set of physical locations, and P2’s virtual addresses to another. Decoupling the virtual addresses used by the processes from the physical addresses where the data actually resides makes it possible for different processes to have the exact same address space range and to use the exact same addresses.
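This can be demonstrated on Linux/Unix with `fork`: parent and child occupy separate physical pages, yet a local variable has the same virtual address in both. A sketch; the helper name is invented:

```c
#include <stdint.h>
#include <sys/wait.h>
#include <unistd.h>

/* After fork(), parent and child have separate (copy-on-write) physical
 * copies of the same virtual address space, so the *virtual* address of
 * a local variable is identical in both processes. */
static int same_virtual_address_after_fork(void) {
    int local = 0;
    uintptr_t parent_addr = (uintptr_t)&local;
    int fds[2];
    if (pipe(fds) != 0) return 0;

    pid_t pid = fork();
    if (pid == 0) {                  /* child: report its view */
        uintptr_t child_addr = (uintptr_t)&local;
        write(fds[1], &child_addr, sizeof child_addr);
        _exit(0);
    }
    uintptr_t child_addr = 0;
    read(fds[0], &child_addr, sizeof child_addr);
    waitpid(pid, NULL, 0);
    close(fds[0]);
    close(fds[1]);
    return child_addr == parent_addr; /* same virtual address in both */
}
```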
How does the operating system know what a process is doing?
At any given point in time, the CPU knows where in the binary (the instruction sequence of the application) the process currently is via the program counter (PC). The PC is maintained in a CPU register while the process is executing. The process stack also reflects what a process is doing; the stack pointer (SP) points to the top of the stack.
For every process, the OS maintains a process control block (PCB).
What is a process control block (PCB)?
It is a data structure that the OS maintains for every one of the processes that it manages. It is created when the process itself is initially created. Certain fields of the PCB are updated whenever the process state changes, while other fields (like the program counter) change too frequently to be updated in place; those are kept in CPU registers while the process runs and saved into the PCB only on a context switch. Some fields in the PCB include:
- process state
- process number
- program counter
- registers
- memory limits
- list of open files
- priority
- signal mask
- CPU scheduling info
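A simplified C struct mirroring the fields above can make the PCB concrete. This layout is purely illustrative; a real kernel structure (e.g., Linux's `task_struct`) is far larger.

```c
#include <stdint.h>
#include <sys/types.h>

/* Illustrative-only PCB layout mirroring the flashcard's field list. */
enum proc_state { PROC_NEW, PROC_READY, PROC_RUNNING, PROC_WAITING, PROC_TERMINATED };

struct pcb {
    enum proc_state state;    /* process state */
    pid_t    pid;             /* process number */
    uint64_t program_counter; /* saved PC, valid while not running */
    uint64_t regs[16];        /* saved general-purpose registers */
    uint64_t mem_limit;       /* memory limits */
    int      open_files[16];  /* list of open file descriptors */
    int      priority;        /* scheduling priority */
    uint64_t signal_mask;     /* blocked signals */
    int      time_slice_ms;   /* CPU scheduling info */
};

/* On a context switch, the OS saves CPU state into the outgoing PCB. */
static void save_context(struct pcb *p, uint64_t pc, const uint64_t *regs) {
    p->program_counter = pc;
    for (int i = 0; i < 16; i++)
        p->regs[i] = regs[i];
    p->state = PROC_READY;
}
```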
How are PCBs used?
PCBs are used by the OS for context switching between processes. Saving and restoring process states are done via PCBs.
What is a context switch between processes?
It is the mechanism used by the OS to switch the execution from the context of one process to the context of another process.
Why is context switch between processes expensive?
- Direct costs: the number of CPU cycles that must be executed to load and store all the values of the PCBs to and from memory.
- Indirect costs: when we context switch to another process, some or even all of the data belonging to the previous process in the processor cache will be replaced to make room for the data needed by the current process. The next time the previous process is scheduled to execute, its data will not be in the cache; it will incur cache misses and spend much more time reading data from memory. This is called a cold cache.
Running with a cold cache is bad because every single data access requires a much longer trip to memory, which slows down the execution of the process. We therefore want to limit context switching between processes.