OC Flashcards
What is An Operating System? (Two functions)
An operating system is software that performs two functions:
- Extended Machine
- Resource Manager
What is extended machine?
Extends the underlying machine architecture
(instruction set, memory org., I/O, and bus structure)
– E.g., a disk driver instead of reading the 450-page SATA manual
– Provides abstractions like files and processes
What is a resource manager?
- Manages resources such as CPUs, memories, timers, disks and NICs
- Allocates resources to programs
- Manages concurrent requests to resources, e.g., time multiplexing on the CPU, space multiplexing on RAM
OS Examples UNIX
OpenBSD, FreeBSD, NetBSD, MacOS
OS Examples LINUX
- Debian (MX, Mint, Ubuntu),
- Redhat (RHEL, Fedora, CentOS)
- Arch (Manjaro)
- Gentoo (Pentoo)
- Android, …
Not Linux:
- Windows
What is the CPU? What is the loop (FDLE)?
The “brain” of the computer
In an infinite loop, it does:
* Fetch an instruction from memory
* Decode it
* Load its operands (if any) and
* Execute it
What does a CPU have?
- Each CPU has its own instruction set: your phone’s is completely different from the one in your laptop
- The CPU has a set of registers to store data about to be used
CPU Registers
- Program Counter: contains the address of the next instruction to execute
- Stack Pointer: points to the top of the stack in memory
  - The stack contains a frame for each function call that has not returned yet
  - One frame contains input parameters, local variables, …
- Program Status Word (PSW): set by comparison instructions; it contains the condition code bits
Memory hierarchy (R,C,MM,MD)
Registers, Cache, Main Memory, Magnetic Disk
What are L1 and L2 caches?
L1 and L2 cache are small, high-speed memory storage areas that are built into the processor chip of a computer. The purpose of caching is to reduce the time it takes for the processor to access data from the slower main memory, which is usually DRAM (Dynamic Random Access Memory).
L1 cache (primary) is the smallest and fastest cache on the processor chip.
L2 cache (secondary) is a larger cache that is slower than L1 cache but faster than main memory.
Disks: OS as an Extended Machine
read & write: 13 parameters in 9 bytes
1) address of the disk;
2) number of surface;
3) number of cylinder;
4) number of track;
5) kind of recording mode
(continuous, interleaf, …);
6) what to do if the data is not found;
7) …
Plus checks such as: is the motor on or off?
Disk Speed
-Platters rotate at 5400, 7200, 10800 RPM
-It takes 1 msec to move the head from one
cylinder to the next
* 5 to 10 msec to move to a random one
-It takes 5 to 10 msec for a sector to arrive under the head
-Reading speeds are in the range of 50 to
160MBps
Booting-up facts
- The PC has a parentboard
- The Basic Input/Output System (BIOS) is a program stored on the parentboard
- The BIOS has low-level software to use the keyboard, screen, disks, etc.
- The BIOS is stored in flash RAM
BIOS - 6 Steps
- Checks how much RAM is installed and whether the keyboard, etc., are working (USB, PS/2)
- Scans for PCI and PCIe devices and installs them
- Checks in CMOS memory for a list of booting devices and tries them
  - Complementary Metal–Oxide–Semiconductor (CMOS)
- Loads the first sector from the booting device
- Runs its code, which normally examines the partition table at the end of the boot sector to determine which partition is active
- This results in running a secondary boot loader from that partition, which loads the OS
OS Booting up
- Queries the BIOS for the configuration info
- Detects if the hardware configuration has changed since the last restart
- For each device, checks if it has its driver
OS Types
Mainframe OS
Goliath computers! Heavily oriented toward processing many jobs at the same time
* Offer batch, transaction processing and
timesharing services
* E.g., OS/390, descendent of OS/360
Server OS
* Serve multiple users at once (Web, file, printing)
* E.g., Linux server distros, Solaris, FreeBSD,
Windows Server
Multiprocessor OS
* Runs on multiple CPUs combined into a single system
Personal OS
* E.g., Linux, Windows, FreeBSD, OpenBSD
Handheld OS
* E.g., Android
Embedded OS
* Control non-computer devices, e.g., oven
* End-user can’t install software on it
* E.g., Embedded Linux
Sensor-Node OS
* Sensors have limited power
* Run a tiny OS, often event-driven
- Respond to external events or an internal clock
* E.g., TinyOS
Real-time OS
* Need to perform tasks within limited time
* Tasks are triggered by events
* Hard real-time OS: if an action absolutely
must occur at a certain moment or within a
certain range
-E.g., eCos
-Users can’t add their software
* Soft real-time OS: it’s acceptable, but not
desirable, to occasionally miss a deadline
What is a Process?
A process is a program in execution
A process has its address space
It also has a set of resources:
* Registers (including the program counter and stack pointer)
* List of open files
* Lists of related processes
* etc.
One main task of OS
One of the main tasks of an OS is to
decide which process runs at an instant t
* And in which processor core
-For example, you might be using Firefox
and LibreOffice at the same time on a
single-core machine
After running Firefox for a certain time, the
OS needs to pause Firefox to run LibreOffice
- In order to be able to resume Firefox once
LibreOffice’s time is over, the OS needs to
store the status of Firefox before pausing it:
* Save its address space (core image) as well as
* Other information (registers content, file pointers,
etc.)
- Process information, other than its address
space, is usually stored by the OS in a data
structure inside a process table
What is a linux time slice?
In Linux, the time slice, also known as the time quantum, is the amount of time allocated to a process for it to execute on the CPU. The time slice is determined by the kernel scheduler and can vary depending on the specific system and kernel configuration.
By default, the time slice in Linux is usually around 10 milliseconds (ms)
What happens when “gedit” is entered in shell?
This effectively asks the “shell” process to create a new process and run “gedit” in it
- In this scenario, “gedit” is a child of “shell”. “shell” is its
parent
- A process could create many processes to help it
perform a complex operation, e.g., multiplication of
large matrices
- Linux processes form a tree
- The parent and children processes need to coordinate the work
* They need to use some form of inter-process
communication, e.g., by sending signals
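A minimal C sketch of this fork/exec pattern, assuming a POSIX system with gedit installed; error handling is trimmed to the essentials:

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();               /* create a child process */
    if (pid == 0) {
        /* child: replace its image with gedit */
        execlp("gedit", "gedit", (char *)NULL);
        perror("execlp");             /* only reached if exec fails */
        exit(1);
    }
    /* parent (the "shell"): wait for the child to finish */
    int status;
    waitpid(pid, &status, 0);
    printf("child %d finished\n", (int)pid);
    return 0;
}
```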
What is a thread?
A thread is one flow (thread) of execution
within a process
A process can have one or more threads
in it
- E.g., LibreOffice
Threads share the same memory space,
making it easier for them to communicate
than different processes
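A small pthreads sketch of this (C is an assumption here, since the cards do not fix a language): two threads write into the same process-wide array, something two separate processes could only do via explicit IPC:

```c
#include <pthread.h>
#include <stdio.h>

int results[2];                       /* one address space, visible to all threads */

void *worker(void *arg) {
    int id = *(int *)arg;
    results[id] = (id + 1) * 10;      /* each thread writes to shared memory */
    return NULL;
}

int main(void) {
    pthread_t t[2];
    int ids[2] = {0, 1};
    for (int i = 0; i < 2; i++) pthread_create(&t[i], NULL, worker, &ids[i]);
    for (int i = 0; i < 2; i++) pthread_join(t[i], NULL);
    /* no pipes or shared-memory segments needed: same address space */
    printf("%d %d\n", results[0], results[1]);  /* 10 20 */
    return 0;
}
```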
Address Space
A protection mechanism is required to
prevent a program from accessing
another’s space
* Provided by the hardware, managed by OS
- Virtual memory: uses the disk as an extension of the actual physical memory
The address space is an abstraction of the physical memory
Files
An abstraction of a collection of bytes
Organised into folders
System calls to manipulate both
Notions of
* Path
* Working directory
* File descriptor
*nix special files
- Block: randomly addressable, e.g., disks
- Character: takes input or outputs a text stream
  - E.g., keyboard, modem, printers, etc.
- Under /dev (e.g., /dev/lp could be a printer)
Kernel Mode
Kernel Mode
- Code has access to all CPU instructions
- Code has access to all Memory addresses
- Only accessible to most trusted parts of the OS
User Mode
- Code does not directly access hardware or memory
- Uses the OS API to do so
- Crashes are recoverable
The Shell
Most common on Linux is bash.
Sometimes Desktops are called graphical
shells.
Layers, from top to bottom:
- application programs
- system programs
- hardware
System Calls
The lowest interface provided by the OS to
users
System calls to manipulate processes, files,
memory, inter-process communication, etc.
Process management: fork, waitpid, execve,
exit, …
File management: open, close, read, write,
lseek, stat, unlink, …
Folder management: mkdir, rmdir, mount,
umount
Memory: brk, malloc, free, …
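A hedged C sketch exercising a few of the listed file-management calls (the /tmp path and file name are purely illustrative):

```c
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>

int main(void) {
    int fd = open("/tmp/demo.txt", O_CREAT | O_RDWR | O_TRUNC, 0644);
    write(fd, "hello", 5);            /* write 5 bytes */
    lseek(fd, 0, SEEK_SET);           /* rewind to byte 0 */
    char buf[6] = {0};
    read(fd, buf, 5);                 /* read them back */
    printf("%s\n", buf);
    close(fd);
    unlink("/tmp/demo.txt");          /* delete the file */
    return 0;
}
```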
Monolithic Systems
Runs entirely in kernel mode
* Including drivers
* Any crash in a procedure brings the system down
Each procedure in the system can call any other
* No layering
* No restrictions
Can have minimum structure
* A service procedure per system call
* A service utility used by one or more service procedures
E.g., Unix, Linux, OpenVMS, MS-DOS, Windows 9x
Layered Systems (OUIOMP)
The system is divided into layers
LAYER FUNCTION
5 the operator
4 user programs
3 input/output management
2 operator-process communication
1 memory and drum management
0 processor allocation and multiprogramming
Microkernel
2-10 bugs per thousand lines of code
- Linux kernel is around 5M lines of code
- Microkernel OS minimises code that runs
in kernel mode
Hybrid Kernel
Combines Monolithic and Microkernel approaches
- Runs almost all OS services in kernel space
- Used in Windows NT (including 10)
- Linus Torvalds doesn’t find enough difference
between hybrid and monolithic kernels
Time-Multiplexing
Computer resources could be shared
between processes/users using either:
* Time-multiplexing: processes take turns at the
resource
E.g., CPU core, printer (actual printing)
* Or both (combined with space-multiplexing)
Space-multiplexing
Computer resources could be shared
between processes/users using either:
Space-multiplexing: multiple processes can
use/share the resource at the same time
- E.g., hard-drive, RAM, printer’s buffer (if available)
* Or both (combined with time-multiplexing)
Process States - Which steps result in/from calling the
scheduling algorithm?
All! Running, Ready, Blocked
I/O Operations
I/O operations are usually much slower
than CPU instructions
What does the scheduler do?
The Scheduler chooses the next process
to run.
Process Queue
* The process queue(s) can be ordered in
different ways
- FIFO
- Priority
Fairness in processes?
- Fairness: each process gets a “fair share”
Throughput
- Throughput: number of processes that
complete their execution per time unit
Turnaround time
- Turnaround time: time from submission to
completion
Waiting time
- Waiting time: the amount of time a process
spends in the ready queue
Response time
- Response time: the amount of time from
submission to the first response
Non-preemptive
- A running process will release the CPU only
when it exits or is blocked
Non-preemptive scheduling is typically more suitable for batch systems than interactive ones as jobs are submitted in advance.
If no clock interrupts are available, non-preemptive scheduling is the only option
Preemptive scheduling
- A running process can be suspended after some time
- Requires a clock interrupt at the end of time intervals (e.g., 50Hz, 60Hz)
Dispatcher
Switching the CPU from process A to a
new one
Context Switching steps
Steps
* Switch from user mode to kernel mode
* Save A’s state (e.g., registers)
* Select process B
* Load B’s state
* Start B
The time it takes is the Dispatch Latency
First-Come, First-Served (FCFS) Scheduling
The CPU is allocated to processes in their
order of arrival
* FIFO process queue
- Each process owns the CPU until it is
done or blocked
This is a non-preemptive algorithm
FCFS suitable for
* Batch systems
FCFS Advantages and Disadvantages
Advantages
* Simple to implement
Disadvantages
* Waiting time can be large if short requests
wait behind the long ones
* Not suitable for multiprogramming systems
Shortest-Job-First (SJF) Scheduling
For each process, determine the length of its
next CPU burst.
Run processes in ascending order of burst length.
SJF itself is non-preemptive; its preemptive variant is Shortest Remaining Time (SRT, below).
SJF suitable for batch systems
In SJF, if A (L=20) is available at t=0 and B (L=1)
arrives at t=1
* A will run from 0 to 20, B runs from 20 to 21
SJF Advantages and Disadvantages
Advantage
* SJF minimises the average waiting time
Disadvantages
* Difficult to get a good estimate of the burst
length
- Easier if similar tasks ran in the past
* Starvation
- The longest process(es) might never get to run
Shortest Remaining Time next (SRT)
SRT takes into consideration the fact that processes
arrive at different times
* When a new process arrives, interrupt the running process
if its remaining time is longer than the new process
- With SRT
* A runs from 0 to 1
* At 1, B arrives, B’s length (1) is shorter than A’s remaining
time (19)
* Run B from 1 to 2, Run A from 2 to 21
SRT is preemptive
Typically used in interactive systems where response time is critical
Round-Robin (RR)
The most intuitive multiprogramming
algorithm
* Split the timeline into equally long segments
1. Give each process a segment
2. Once all present processes had a segment each, go
to 1.
* The length of the segment is called quantum
- Usually 10-100ms
* At the end of the timeslot, the process is pre-empted
More suited for interactive systems
RR Facts
If the quantum is too short
* The cost of context switches becomes the
main performance criteria
If the quantum is too long
* Becomes FCFS
Performance criteria
* Length of time quantum
* Number of context switches
* Length of CPU bursts
RR Advantages & Disadvantages
Advantages
* Simple to implement (circular queue)
* Very suitable for multi-programming
Disadvantages
* Challenging to find the optimal value of the
quantum
* Assumes all processes are equally important
Priority Scheduling
Each process has a priority value
* A lower integer value means a higher priority
- The scheduler picks processes in their
descending priority order
Two variants
* Preemptive: pause the current process if a higher
priority one arrives
* Non-preemptive: current process cannot be
interrupted by higher-priority processes
Priority Scheduling Advantages & Disadvantages
Advantages
* Can define more sophisticated policies
- E.g., SJF can be seen as priority scheduling where
the priority value is the length of CPU burst time
Disadvantages
* Starvation for low-priority processes
- Can include an ageing factor: increase the priority of waiting processes over time
What is a semaphore?
Semaphores are an abstract data type used for synchronization between multiple threads or processes in a concurrent system.
Semaphores can be used to protect shared resources or critical sections of code, by ensuring that only one thread or process can access them at a time. They can also be used to signal events between threads or processes
What is true about process scheduling?
The waiting time is always larger than or equal to the response time.
What is the waiting time in process scheduling?
The time spent by a process in the ready queue waiting for CPU time
What is the response time in process scheduling?
The time between a user request and the first response produced by the system.
A page fault occurs when:
A requested virtual memory page is not present in RAM.
The address space of a process refers to the space it uses in the ______?
RAM
In an OS that implements paging, what statement is accurate?
The size of a virtual memory page has to be identical to the size of a page frame.
A binary file can be defined as:
A file that uses 8 bits out of each byte
For which of the following systems is nonpreemptive scheduling more suitable?
Batch Systems
In preemptive priority scheduling, if the current running process has a priority of 3 and a process of priority 10 arrives, then what does the scheduler do?
Nothing, the current process has a higher priority
“On average, only 50% of each memory page will contain useful data.”
Is this true?
False, only the last memory page of a process will, on average, contain 50% useful data.
Examples of “Character” special files include (select two)
- a network interface
- a terminal
Which of the following statements is correct:
1- SJF is suitable for interactive operating systems
2- SRT is preemptive
3- Both preemptive and non-preemptive variants of priority scheduling are suitable for a smartphone OS
Only 2
Is SRT non-preemptive or preemptive
preemptive
Which of the following statements is true? (select two)
- A process can have multiple threads in it
- A thread can have multiple processes in it
- Threads from the same process share the process’ local variables
- Processes from the same thread share the thread’s local variables
- A process can have multiple threads in it
- Threads from the same process share the process’ local variables
A process references memory address 2000. If it is running on an OS that uses dynamic relocation where the base address of that process is 3000 and its limit is 1000, then…
The access request will fail because 1000<2000
In dynamic relocation, the OS maps a logical address to a physical address by adding the process’s base address; here, logical address 2000 would map to physical address 5000. However, the logical address is first checked against the limit, and since 2000 exceeds the limit of 1000, the access is rejected.
A stack frame is created when…
a function is called.
In the context of file management systems, “random access” refers to
No need to read bytes 1 to n-1 before reading or writing byte n
How do you calculate the number of page frames in the physical memory?
To calculate the number of page frames in physical memory, divide the total physical memory size by the page size.
For example, with 2GB = 2^31 bytes of physical memory and 4KB = 2^12-byte pages:
2^31 / 2^12 = 2^19 = 524,288 page frames
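The same computation as a quick C check (sizes taken from the example above):

```c
#include <stdio.h>

int main(void) {
    unsigned long long mem_bytes  = 1ULL << 31;  /* 2 GB physical memory */
    unsigned long long page_bytes = 1ULL << 12;  /* 4 KB pages */
    /* number of page frames = physical memory size / page size */
    printf("page frames = %llu\n", mem_bytes / page_bytes); /* 2^19 = 524288 */
    return 0;
}
```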
Is the following statement true or false?
Using the Hoare implementation, a signal will immediately transfer ownership of the monitor to the one thread woken by the signal. Under the Mesa implementation, a signal wakes at least one waiting thread, which will then compete normally to obtain ownership of the monitor in competition with any other threads that are trying to obtain ownership of that monitor.
True
What implementation of monitors transfers ownership of the monitor immediately to the thread that is woken up by a signal?
Hoare implementation of monitors transfers ownership of the monitor immediately to the thread that is woken up by a signal. This means that the woken thread gains exclusive access to the monitor and continues executing its critical section without interference from other threads.
What implementation only wakes up one waiting thread when a signal is sent?
Mesa implementation only wakes up one waiting thread when a signal is sent.
The woken thread then competes with other threads that are waiting to acquire the monitor by using a lock or semaphore mechanism.
This means that the woken thread may not immediately gain access to the monitor and may have to wait until it acquires the lock or semaphore
The bounded buffer problem is also known as the _______________ problem.
Producer-Consumer.
The producers generate data items and add them to the buffer, while the consumers remove data items from the buffer and process them. The problem is to ensure that the producers do not add items to the buffer when it is full, and the consumers do not remove items from the buffer when it is empty.
When the method release() of the Java Semaphore class is called, it will:
sometimes unblock a waiting thread that called acquire().
What are the advantages of using the Synchronized keyword (i.e. Intrinsic Monitors) over the ReentrantLock class (i.e. Extrinsic Monitors)?
- Guarantees that ownership is released at the end of methods that invoke the mutual exclusion protection
- less error-prone.
Advantages that Monitors have over Semaphores:
- Clear recommended programming pattern
- Has separate and clear syntax for mutual exclusion
- Programming less prone to errors
- No constructor arguments are required.
What kind of storage do we need
Need:
* Long-term storage
* Bigger size
* Non-volatile
* Easier sharing
* With hardware abstraction
- Read block k
- Write block k
Files
- Files are persistent logical units of information
- Created and manipulated by processes
- The File (Management) System is the part of the OS dedicated to file structures, names, uses, etc.
- Process, Address Space and Files are the most important abstractions an OS provides
Files Systems Timeline
Big names for end-user computers include:
* 1984: FAT-16 created by IBM and Microsoft and used in MS-DOS, then in Windows 95 and 98
* 1987: Minix V1 FS created by Andrew Tanenbaum for
Minix 1.0
* 1992: ext and ext2 (1993) created by Remy Card for
Linux
* 1993: NTFS 1.0 introduced in Windows NT 3.1
* 1996: FAT-32 introduced in Windows 98
* 1999: ext3
* 2006: ext4
File Extensions
Windows
* Extensions dictate the choice of the program
to open the file, e.g., jpg, pptx, etc.
Linux
* Less binding between extensions and
programs
File Types
Files types:
* Regular: your files, ASCII or binary
* Directories (folders): contain collections of
files
* “Character” special files: model serial I/O
devices, such as terminals, printers and
networks
* “Block” special files: model disks
Under Linux, use “ls -l“
Regular and Binary Files
ASCII files
* Lines terminated by a carriage return, line feed, or
both (Windows)
Binary files start with magic numbers that identify
their formats
* 50 4B 03 04 (PK??) for zip (and derivatives, e.g.,
JAR)
* 7F 45 4C 46 (.ELF) for ELF files
* 25 50 44 46 2d (%PDF-) guess!
* 4D 5A (MZ) Windows Executables, Mark Zbikowski
* https://en.wikipedia.org/wiki/List_of_file_signatures
OS must, by default, be able to parse its own
executable formats
Files Access
Sequential: access a file byte-by-byte or
record-by-record
* Inconvenience: if you want to read the nth
byte, then you must read everything before it!
* You can rewind a file; the name comes from the magnetic-tape age!
- Before disks
Random access: access any byte/record
out of order
Attributes (aka metadata) include:
Attributes (aka metadata) include:
* Protection information
* Creator
* Creation time
* Owner
* Hidden Flag
* Time of last access
* Time of last change
* Current size
File Attribute Manipulation in Linux
Check permissions using “ls -l some_path”
* -rwxrw-r-- 1 user group 2 Jan 24 19:59 test.txt
* https://wiki.archlinux.org/index.php/File_permissions_and_attributes
- chmod [u/g/o/a][=/+/-][r/w/x] file
- chown user:group file
- lsattr and chattr to list and change attributes
Implementers focus on
- How files and folders are implemented
- How disk space is managed
- How to make the file system efficient and reliable
- How to integrate the FS with the rest of the OS
About (Magnetic) Disks
Units
A. Track
B. Geometrical sector
C. Track sector
D. Cluster
Block vs. Sector
* blockdev --getbsz /dev/sda
Sector size is:
* 512B on traditional HDDs
* 4096B on newer ones
* The smallest storage unit
- SSDs simulate sectors
Allocation of Blocks to Files
Different ways to store files in blocks
1. Contiguous allocation
2. Linked list allocation
3. Linked list with a Table
4. i-nodes
Contiguous Allocation Pros and Cons
Pros
* Easy to implement
* Fast to store a file
Cons
* Creates unused fragments of memory
Used in CD-ROMs (file sizes are known in advance)
Linked List Allocation Pros and Cons
Pros
* No more empty fragments
Cons
* Takes longer to read all blocks of a file
* Slow random (vs. sequential) access
* You can easily end up with a disk full of
fragmented files
* “Pointer” bytes wasted
- Storing a block uses a bit more than a block
Linked List with a Table Description + Pros and Cons
Store pointers in a table
for faster access
* The File Allocation Table (FAT), as implemented in MS-DOS
Pros
* Faster access
Cons
* Table size grows linearly
with disk size
* 1TB disk with 1 KB blocks
has 1 billion entries
* For performance, have the
FAT in RAM
i-nodes
An i-node stores information about file
blocks + its properties
- An i-node stores a list of block addresses
of a file
- The OS loads in RAM i-nodes of active
files
* If each file has N blocks and there are K active files, then only N*K addresses are loaded
* But N is not a constant: use the last pointer to point to another i-node
Used in
* Unix File System
- A Unix directory is a list of pairs of filenames and i-node numbers
* ext through ext4
- ls -i displays the i-node number of a file
- “struct inode” in linux/fs.h
* NTFS
Magnetic Disks
A collection of platters
Each platter is covered with a magnetic layer
* On both faces
Heads can be positioned on tracks
Disk Scheduling on HDD
Different in SSD!
Disks merge requests from all processes in one
queue
Performance Criteria: time taken to get data
1. Access time: time to get to the first byte
Seek time: taken to move the arm to the right cylinder
Rotational latency: needed to get the right sector under the
head
2. Disk bandwidth: number of transferred bytes divided
by the time between the first request and last byte
transfer
Seek time is usually the longest, so we will focus on it
Disk Scheduling on HDD: An Example
Scenario (Tanenbaum)
* Range of cylinders: 0-39
* Head starting position: cylinder 11
* Request queue: 1, 36, 16, 34, 9, 12
Disks can store pending requests to
cylinders as separate linked lists
* Run requests of a given cylinder in one go
First Come First Served
Process requests in the same order they were received in
Arm moved by (11,1, 36, 16, 34, 9, 12):
* 11-> 1: 10
* 1-> 36: 35
* 36-> 16: 20
* 16-> 34: 18
* 34-> 9: 25
* 9-> 12: 3
* Total: 111 arm moves
Shortest Seek Time First
Instead of making non-optimal arm moves
by going to the next received request
Move to the closest requested cylinder
Shortest Seek Time First
Arm moved by (11,1, 36, 16, 34, 9, 12):
* 11-> 12: 1
* 12-> 9: 3
* 9->16: 7
* 16->1: 15
* 1->34: 33
* 34->36: 2
* Total: 61
Shortest Seek Time First
Problem with SSF
* Say, during processing of request for 16 (1
waiting)
- Request comes for 8: 8 runs, 1 waits
- Request comes for 13: 13 runs, 1 waits
- Etc. and 1 waits …
* On a heavily loaded disk, the arm will tend to
be in the middle and requests on the
extremities will starve
Similar issue with elevators of tall
buildings!
Elevator Algorithm
- Maintain 1 bit that gives the direction
- Find a request in the current direction
- Direction is reversed when no more requests are left in the current direction
- Say, the bit is initially set to UP …
Elevator Algorithm Arm moved by (11,1, 36, 16, 34, 9, 12):
Arm moved by (11,1, 36, 16, 34, 9, 12):
* 11-> 12: 1
* 12-> 16: 4
* 16-> 34: 18
* 34->36: 2
* 36->9: 27
* 9->1: 8
* Total: 60
Elevator Algorithm Properties
Properties
* Usually slower than SSF, but covers the entire
disk
* Upper bound for covering all requests: twice the
size of the entire range
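A rough C simulation of the elevator algorithm on the example above (the request set, starting position and UP direction are from the card; the serving strategy of “closest pending request in the current direction” is an assumption of this sketch):

```c
#include <stdio.h>
#include <stdlib.h>

/* Head at cylinder 11, requests 1, 36, 16, 34, 9, 12, direction UP. */
int main(void) {
    int req[] = {1, 36, 16, 34, 9, 12};
    int n = 6, head = 11, up = 1, total = 0;
    int done[6] = {0};
    for (int served = 0; served < n; ) {
        int best = -1;
        for (int i = 0; i < n; i++) {          /* closest request ahead */
            if (done[i]) continue;
            if (up ? req[i] >= head : req[i] <= head)
                if (best < 0 || abs(req[i] - head) < abs(req[best] - head))
                    best = i;
        }
        if (best < 0) { up = !up; continue; }  /* nothing ahead: reverse */
        total += abs(req[best] - head);
        head = req[best];
        done[best] = 1;
        served++;
    }
    printf("total arm movement: %d\n", total); /* 60, matching the card */
    return 0;
}
```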
Circular Elevator Algorithm
* Always in the same direction
* Once no more requests in the chosen direction,
start over again
* 11, 12, 16, 34, 36, 1, 9
- instead of 9, 1 in Elevator
Review of Disk Scheduling Algorithms
On a lightly loaded hard-drive, all
algorithms “converge to” FCFS!
Elevator and derivatives perform better on
heavily loaded hard-drives
File allocation also has a say in the performance of file access
Master Boot Record (MBR)
Reminder: disks are divided into sectors
Sector Zero of a disk is the MBR
* Used to boot the computer
The partition table is at the end of the
MBR
* Gives the type, start and end of each partition
* One of the partitions is marked as active
partition
Boot process
When a machine is booted
* Executes the MBR
* The MBR program locates the active partition
* MBR program reads the first block of the
partition
Aka the Boot Block
* The Boot Block program loads the OS from
the partition
Programmer’s Wishlist
All programmers dream of a memory that
is:
* Private: their program’s memory is
inaccessible to other processes
* Infinitely large
* Infinitely fast
* Nonvolatile: does not lose content when
powered down
* Inexpensive?!
Memory Manager
Memory hierarchy: a sequence of layers of
memory chunks, the first is directly connected to
the CPU
When you move away from CPU, the memory is
* Slower, Larger and Cheaper
The memory manager is heavily involved in the
management of this hierarchy
* Allocates memory to processes, knows free memory,
etc.
Raw Memory Access: No Memory Abstraction
The simplest scenario: processes directly
access physical memory
An instruction like
* MOV R1, 4th_byte
* accesses the actual 4th byte of memory
Cannot have 2 programs in RAM at the
same time: what if both access Byte 4?
Three possible implementations (+minor
variations)
Raw Memory Access
One process in memory at a time
* (a) in original mainframes, rarely used now
* (b) the OS in Read-Only-Memory; embedded
and handheld devices
* (c) early personal computers used ROM for
the BIOS
Limitations Raw Memory Access
Only one process can run at a time
* Still can swap processes between RAM and
disk
Processes can access OS memory in
scenarios (a) and (c) and might crash it
Obsolete for computers, but still in use for
very simple systems
* Your washing machine, fridge, kitchen
machine, etc.
Raw Memory Access with Many Processes
Program Status Word: Early IBM 360
* Can swap processes between RAM and disk
* Split memory in 2KB blocks, each has a 4-bit
protection key
* Each process has a Program Status Word
* A process can access a chunk if its PSW
matches the protection key
* Only the OS can change the PSW
Program Status Word Limitations
Code in both programs refers to the same set of addresses
* JMP 28, in process 2 (white), fails in (c)
Static Relocation
Code in both programs
refer to the same set of
addresses
Solution: Static
Relocation
* Process P2 (white) is loaded at address 16384
* Each address referenced by P2 is incremented by 16384
Static Relocation Limitations
The OS must set the 4-bit key of each
block of 2KB of the process memory
* Time-consuming
In an instruction like MOV R1, 400 you
need to know if 400 is an address
Need some form of memory abstraction
* Access to abstract addresses rather than real
ones
Solving the Protection and Relocation Problems
The use of keys and PSW addresses the
protection problem
* But not the relocation one
IBM’s static relocation addresses the
relocation problem
* But slow
Both non-scalable
Address Space
An address space is an abstraction of the
physical memory
* The set of all memory addresses that a
process can access
Numbered in a different way to the actual physical
addresses being used
* Identical values of addresses from different
processes refer to different physical locations
Address 400 of P1 could be 0x10640
Address 400 of P2 could be 0x20880
Dynamic Relocation
Used in CDC 6600 and Intel 8088
Uses base and limit registers
A process’s base address is where it was loaded
The limit gives the size of the program
* Access to an @>limit generates a fault
For each memory access, add the base’s value
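A toy C illustration of base/limit translation, reusing the numbers from the dynamic-relocation quiz card earlier (base 3000, limit 1000); the helper name translate is made up for illustration:

```c
#include <stdio.h>

/* Hypothetical base/limit registers for one process. */
unsigned base = 3000, limit = 1000;

/* Translate a logical address, faulting if it exceeds the limit. */
int translate(unsigned logical, unsigned *physical) {
    if (logical >= limit) return -1;   /* protection fault */
    *physical = base + logical;        /* relocation: add the base */
    return 0;
}

int main(void) {
    unsigned phys;
    if (translate(400, &phys) == 0)
        printf("400 -> %u\n", phys);   /* 3400 */
    if (translate(2000, &phys) != 0)
        printf("2000 -> fault (2000 >= limit 1000)\n");
    return 0;
}
```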
Dynamic Relocation Limitations
The Intel 8088 did not provide a limit register
Cost
* Addition of @s to base, a bit of a problem
- Propagation of the carry bit
* Comparison to limit, OK
What if two programs, combined, don’t fit in memory?
* Two approaches: swapping and virtual memory
Swapping
Swapping simply consists of backing up a process (or more) on disk if the next process to load doesn’t fit in memory
* Note: the whole process memory is copied
Swapping Limitations: Memory holes
Swapping creates “holes” in memory:
small ranges of unused memory
Solution? Compact memory by moving all used slots up or down to remove the holes
* BUT: this is quite time-consuming
- If you have 16GB of RAM that copies 8B in 8ns, it
could take 16s
Swapping Limitations: Process Size
Processes can grow during execution
* Stack: for every function call (a return frees memory)
* Heap: for each memory allocation instruction
OS might already provide room for growth
If the process exceeds the max size, then needs to be relocated
Another Limitation: Bloatware
Virtual Memory Context
A process might not fit in memory
But also, a collection of processes might not fit in memory
* Swapping entire processes is resource-consuming
* SATA disks transfer rates of 100s MBps, takes
seconds to swap out a process of 1GB and
similar time to swap in a similar process
Overlays
Solution from the 60s
Slice a program into multiple overlays
Only load the overlay manager (OM)
* The overlay manager loads overlay 0
* Once done, asks the OM to load overlay 1
Overlays kept on disk, swapped in on
demand
Slicing a program is done by its creator
* A nightmare! Error-prone
Virtual Memory and Memory Pages
Needed to make the OS slice programs
Virtual Memory
* Each process has an address space
* The @ space is cut into chunks, aka memory pages, by the OS
* A memory page is mapped onto a contiguous
range of physical addresses
E.g., range 20480000-20484095 of virtual mem. maps
onto 10240000-10244095 of physical mem.
* When a process references a virtual address, the corresponding physical page is brought from the disk, if not already loaded in RAM
What is Memory Management Unit (MMU)
Does the translation of virtual addresses
into physical addresses
E.g., translates virtual @1 into physical @
1024000001
MMU Operations: Example
Consider a machine with
* 32KB physical memory
* 4KB pages
* Uses virtual memory to simulate a memory of 64KB
A page fits in a page frame
* 16 virtual pages
* 8 physical page frames
Sizes of memory addresses?
* 16 bit virtual, 15 bit physical
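A sketch in C of the translation the MMU performs for this example (16 virtual pages of 4KB, 8 page frames); the page-to-frame mapping table below is invented for illustration:

```c
#include <stdio.h>

#define PAGE_SHIFT 12                  /* 4 KB pages */
#define NPAGES 16                      /* 64 KB virtual space */

/* Toy page table: -1 = not in RAM, else a 3-bit frame number (0-7). */
int frame_of[NPAGES] = { 2,  1,  6,  0,  4,  3, -1, -1,
                        -1,  5, -1,  7, -1, -1, -1, -1};

int translate(unsigned vaddr, unsigned *paddr) {
    unsigned page   = vaddr >> PAGE_SHIFT;        /* top 4 bits   */
    unsigned offset = vaddr & ((1u << PAGE_SHIFT) - 1);
    if (frame_of[page] < 0) return -1;            /* page fault   */
    *paddr = ((unsigned)frame_of[page] << PAGE_SHIFT) | offset;
    return 0;                                     /* 15-bit result */
}

int main(void) {
    unsigned p;
    if (translate(0x2004, &p) == 0)    /* page 2 -> frame 6 */
        printf("0x2004 -> 0x%04X\n", p);          /* 0x6004 */
    if (translate(0x6000, &p) != 0)
        printf("0x6000 -> page fault\n");
    return 0;
}
```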
MMU Operations: Page Fault
The OS keeps track of which pages are
loaded in RAM
* A present/absent bit
If referenced address is in a page not present in RAM, the CPU traps to the OS
1. Page fault trap
2. The OS chooses a page from RAM and evicts it from its page frame
- Backs it up to disk if its content has changed
3. Brings the requested page from the disk
Working Set and Thrashing
Pages currently in RAM are the process’ working set
Thrashing when a process keeps causing page faults
Page Size
Hardware comes with its pre-defined page size, e.g., 4096B (4KB)
The OS can define its software page size to, say, 2 hardware pages
On average, half of the last page will be unused
* Internal fragmentation
* This pushes to use small page sizes
* But very small pages result in higher management cost
Paging as a Performance Bottleneck
Page information stored in a page table
* The number of a virtual page is used as index
* If the present/absent bit is 1
The cell contains the page location in RAM
* Else, not loaded
Need to access the page table for each memory reference
Worse, need to access the page table in order to read next instruction to execute
Big decision: where do we store the page
table?
* The table can be large
- 4KB pages, 32-bit address => 1 million pages
- One-page table per process!
- Naive approaches
* Store the page table in main memory
* Load the whole page table of a process into the MMU
Translation Lookaside Buffer (TLB)
Most programs access a small number of pages most of the time
Add Translation Lookaside Buffer (TLB) to MMU
* Part of paging hardware
Stores frequently accessed frames
* Acts as a cache for the page table
* Rarely more than 256 entries
If referenced memory address is in TLB, then no need to access the page table
If not, then fetch the page entry from the page table and load it into the TLB
Valid bit indicates whether page is in use or not
Effective Access Time
Effective access time assumes
* The requested virtual page is in RAM
Given
* It takes 30 nanoseconds to search the TLB
* It takes 100 nanoseconds to access memory
* An 80% hit ratio, i.e., the desired page number is in the TLB 80% of the time
Finding the Effective Access Time for a memory address
* If the page number is in the TLB: 30ns TLB + 100ns memory
* Otherwise, not in the TLB (30ns)
Search memory for the page table and frame number (+100ns)
Search the desired byte in memory (100ns)
230 nanoseconds in total
* effective access time = 0.80 x 130 + 0.20 x 230 = 150ns
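The same effective-access-time computation, as a small C check (timings taken from the card above):

```c
#include <stdio.h>

int main(void) {
    double tlb = 30.0, mem = 100.0, hit = 0.80;   /* ns, ns, hit ratio */
    double hit_time  = tlb + mem;        /* 130 ns: TLB + one memory access */
    double miss_time = tlb + mem + mem;  /* 230 ns: extra page-table access */
    printf("EAT = %.0f ns\n",
           hit * hit_time + (1 - hit) * miss_time);   /* 150 ns */
    return 0;
}
```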
Segments
Each component of the program can be given a segment
A segment is a contiguous set of addresses
Segment sizes might be different
Page Replacement Algorithms
Required to decide on which page to evict
* Should not evict a page that is going to be used soon
If a page has been modified, it must be written back to disk
Optimal Replacement Algorithms
Estimate the time until each page will next be accessed
Order the pages in ascending order of that time
* E.g., page 3 will be used next, then page 17, then page 23, …, then page 5
Evict the last page, i.e., the one that will not be used for the longest time
* E.g., page 5
Not possible unless we know when pages will be referenced (crystal ball)
Used as ideal reference algorithm
Not Recently Used
Uses two bits:
* R: recently used
* M: modified
Periodically, set R to Zero
A page belongs to one of four classes
* Class 0: not referenced, not modified
* Class 1: not referenced, modified
* Class 2: referenced, not modified
* Class 3: referenced, modified
Evict a page from the lowest class
First Come, First Served (First-In, First-Out)
Maintains a list ordered by ascending arrival time
Evict the head of the list
Problem
* If a page in the list is used again, it does not
change its position
* The oldest might be still in heavy use
Second Chance Algorithm
Similar to FIFO, but checks the R bit
If 1
* Put the page at the end of the list, set R to 0
If 0
* Evict
Least Recently Used (LRU)
“Least” not as in a counter
* Not “Least Frequently Used” (LFU)
Evict the page that has not been used for the longest time
Maintain a list of pages according to their
recency of usage
* From most recent (head) to least recent (tail)
Resource consuming, will potentially need
to frequently move pages in the queue
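A minimal C sketch of LRU using per-frame recency counters instead of a queue (a simplification; as the card notes, real systems avoid this kind of per-reference bookkeeping because it is too costly):

```c
#include <stdio.h>

#define NFRAMES 3

int frames[NFRAMES] = {-1, -1, -1};    /* loaded page numbers  */
long last_use[NFRAMES];                /* recency per frame    */
long now = 0;

/* Reference a page; returns 1 on a page fault. */
int reference(int page) {
    int lru = 0;
    for (int i = 0; i < NFRAMES; i++) {
        if (frames[i] == page) {       /* hit: refresh recency */
            last_use[i] = ++now;
            return 0;
        }
        if (last_use[i] < last_use[lru]) lru = i;
    }
    frames[lru] = page;                /* fault: evict the LRU frame */
    last_use[lru] = ++now;
    return 1;
}

int main(void) {
    int refs[] = {1, 2, 3, 1, 4, 2}, faults = 0;
    for (int i = 0; i < 6; i++) faults += reference(refs[i]);
    printf("faults: %d\n", faults);    /* 5 with this reference string */
    return 0;
}
```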
Process ID (PID)
A unique identifier for each process
Parent Process ID (PPID)
The PID of the parent process
Process State
The current state of the process (e.g., running, blocked, suspended, etc.).
Process Priority
The priority level assigned to the process by the operating system.
Process Resources
The resources allocated to the process by the operating system (e.g., memory, CPU time, I/O devices).
Thread ID (TID)
A unique identifier for each thread within a process.
Thread State
The current state of the thread (e.g., running, blocked, suspended, etc.).
Register Set
The set of registers used by the thread for storing its execution context.
Stack Pointer (SP)
The memory address pointing to the current top of the thread’s stack.
Thread-specific Data
Data structures specific to a particular thread, such as thread-local storage or thread-specific variables.
Blocked (B): (state diagram of a process)
The process is waiting for a resource (e.g., I/O) to become available, and it cannot continue until the resource is ready.
Running (R): (state diagram of a process)
The process is currently running on the CPU.
Ready (R) : (state diagram of a process)
The process is waiting for the CPU to become available, and it is ready to execute.
Blocked to Ready state of a process
A process that was waiting for an I/O operation or another event to complete becomes ready to run again once the event has finished or the resource becomes available.
Ready to Running state of a process
A process in the ready queue is selected by the scheduler to run on the CPU.
Running to Blocked state of a process
A process that needs to perform an I/O operation or another event that cannot be completed immediately must wait for the event to complete, and therefore, it becomes blocked.
Running to Ready state of a process
A process that is preempted by the scheduler or finishes its CPU burst returns to the ready queue, waiting for another opportunity to run on the CPU.
First-Come, First-Served (FCFS)
The operating system schedules processes in the order they arrive, with the first process in the queue being the first to receive CPU time.
Round Robin (RR):
The operating system allocates a fixed time slice or quantum of CPU time to each process in turn, rotating processes through the CPU in the order they joined the ready queue.
Priority-based scheduling:
The operating system assigns each process a priority level based on its characteristics, such as its level of interactivity or its importance to the system, and schedules processes with higher priorities before those with lower priorities.
What is a race condition?
A race condition is a situation that occurs when two or more processes or threads access a shared resource or data concurrently and try to modify it at the same time, leading to unpredictable or incorrect behaviour.
Proper synchronization techniques and mutual exclusion mechanisms such as locks, semaphores, or monitors can be used to prevent race conditions and ensure safe access to shared resources.
What is a critical region
A critical region is a section of code or a segment of a program where shared resources, such as variables, data structures, or I/O devices, are accessed and modified by multiple processes or threads concurrently.
Since the execution of the critical region by multiple processes or threads can lead to race conditions and inconsistencies, it is necessary to ensure mutual exclusion and synchronization of access to the shared resources.
To achieve this, critical regions are typically protected by synchronization mechanisms such as locks, semaphores, or monitors that allow only one process or thread at a time to access the critical region and ensure that the modifications to the shared resources are made atomically and in a consistent order.
The use of critical regions and synchronization mechanisms is essential to prevent race conditions, ensure data integrity and consistency, and avoid conflicts and deadlocks in concurrent software systems.
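A short pthreads sketch of a critical region in C: without the lock, the two read-modify-write sequences can interleave and lose updates (a race condition); with it, the final count is deterministic:

```c
#include <pthread.h>
#include <stdio.h>

long counter = 0;                     /* shared state */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *inc(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);    /* enter the critical region */
        counter++;                    /* read-modify-write, now atomic */
        pthread_mutex_unlock(&lock);  /* leave the critical region */
    }
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, inc, NULL);
    pthread_create(&b, NULL, inc, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("%ld\n", counter);         /* always 200000; without the lock,
                                         often less (a race condition) */
    return 0;
}
```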
What is TSL
The TSL (Test and Set Lock) instruction is a low-level synchronization primitive that is used to implement mutual exclusion and critical sections in concurrent software systems.
The TSL instruction typically takes two arguments: a memory location or a variable, and a register. The instruction atomically tests the value of the memory location or variable, and if it is zero, sets it to a non-zero value and stores the old value in the register.
The TSL instruction guarantees that the entire operation is executed atomically and is not interrupted by other threads or processes.
However, the TSL instruction is a low-level primitive and can be prone to several issues such as busy waiting, starvation, or deadlocks if not used correctly.
Therefore, higher-level synchronization primitives such as locks, semaphores, or monitors are typically used instead of the TSL instruction in modern concurrent programming.
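A sketch of a TSL-style spin lock using C11's atomic_flag, a portable stand-in for the hardware test-and-set instruction (note the busy waiting the card warns about):

```c
#include <stdatomic.h>
#include <stdio.h>

atomic_flag lock = ATOMIC_FLAG_INIT;

void acquire(void) {
    /* atomically set the flag and get its old value; keep looping
       while it was already set (busy waiting) */
    while (atomic_flag_test_and_set(&lock))
        ;
}

void release(void) {
    atomic_flag_clear(&lock);         /* set the lock back to "free" */
}

int main(void) {
    acquire();
    puts("in the critical section");
    release();
    return 0;
}
```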
Explain the abstract idea of semaphores.
Semaphores are an abstract synchronization primitive used to control access to shared resources in a concurrent system.
They provide two basic operations: wait and signal, which can be used to coordinate the execution of multiple threads or processes. Semaphores are a fundamental building block for many higher-level synchronization mechanisms such as locks, monitors, and condition variables.
How can you use a semaphore to implement mutual exclusion?
Mutual exclusion can be implemented using a semaphore, also known as a mutex.
The mutex semaphore is initialized to 1, indicating that the critical section is initially free.
When a thread wants to enter the critical section, it first executes a wait operation on the mutex semaphore.
If the value of the semaphore is 1, indicating that the critical section is free, the thread decrements the value of the semaphore to 0 and enters the critical section.
If the value of the semaphore is 0, indicating that the critical section is already being used by another thread, the wait operation blocks the calling thread until the value of the semaphore becomes 1.
When the thread leaves the critical section, it executes a signal operation on the mutex semaphore, which increments the value of the semaphore to 1 and wakes up any waiting threads. This ensures that only one thread can be in the critical section at any given time, providing mutual exclusion.
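The mutex protocol above, sketched with POSIX unnamed semaphores in C (a Linux-flavoured assumption; Java's java.util.concurrent.Semaphore offers the same acquire/release pairing):

```c
#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

sem_t mutex;                          /* binary semaphore */
int shared = 0;

void *worker(void *arg) {
    sem_wait(&mutex);                 /* wait: blocks if the value is 0 */
    shared++;                         /* critical section */
    sem_post(&mutex);                 /* signal: releases one waiter */
    return NULL;
}

int main(void) {
    sem_init(&mutex, 0, 1);           /* initial value 1: section is free */
    pthread_t t[4];
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    printf("%d\n", shared);           /* 4 */
    sem_destroy(&mutex);
    return 0;
}
```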
Concurrency definition
Concurrency refers to the property of a system in which multiple independent tasks or processes can make progress simultaneously, potentially overlapping in time.
Parallelism Definition
Parallelism refers to the execution of multiple tasks or processes simultaneously using multiple computing resources, such as multiple CPUs, cores, or nodes.
Give an example of concurrency that is not parallelism
One example of concurrency that is not parallelism is a single-threaded web server handling multiple requests from different clients.
While the server can only execute one request at a time, it can handle multiple requests concurrently by switching between them in rapid succession.
While one request is waiting (e.g., on I/O), the server can switch to another request and make progress on it. This way, the server can handle multiple requests concurrently, even though it is not using multiple processors or cores.
Explain how Latency and Throughput are related on a single core
system and how pipelining (using more than one core) can affect
that relation.
On a single-core system, latency and throughput are often inversely related: reducing latency usually comes at the cost of reducing throughput, and vice versa.
This is because increasing the speed at which tasks are executed (to reduce latency) often requires more resources, which can reduce the number of tasks that can be executed in parallel (thus reducing throughput).
Pipelining (using more than one core) can help to mitigate this by allowing multiple tasks to be executed simultaneously, potentially increasing both throughput and reducing latency.
By splitting tasks into smaller sub-tasks and assigning them to different cores, pipelining can increase the total amount of work that can be done in a given period of time (throughput) while also reducing the amount of time it takes for a task to be completed (latency).
Consider a stream between two processes: a producer process P
and a consumer process C.
What UNIX construct can be used to implement this?
The Unix construct that can be used to implement a stream between two processes is a pipe.
A pipe is a communication mechanism that allows the output of one process (the producer) to be connected to the input of another process (the consumer) so that data can be transferred between them. The pipe behaves like a unidirectional stream, with data flowing from the producer to the consumer.
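A minimal C demonstration of a pipe between a producer (the child process P) and a consumer (the parent process C):

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    int fd[2];
    pipe(fd);                         /* fd[0] = read end, fd[1] = write end */
    if (fork() == 0) {                /* child = producer P */
        close(fd[0]);
        const char *msg = "data from P";
        write(fd[1], msg, strlen(msg) + 1);
        close(fd[1]);
        _exit(0);
    }
    close(fd[1]);                     /* parent = consumer C */
    char buf[64];
    read(fd[0], buf, sizeof buf);
    printf("C received: %s\n", buf);
    close(fd[0]);
    wait(NULL);
    return 0;
}
```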
Can your pipe approach be used if more than one producer/consumer is involved?
If so, how? If not, why not?
Yes, the pipe approach can be used when more than one producer and/or consumer is involved.
Multiple producers can write data to the same pipe, which can be read by a single consumer. The consumer can use techniques such as polling or blocking to read data from the pipe as it becomes available.
Similarly, multiple consumers can read data from the same pipe, which is produced by a single producer. The producer can write data to the pipe as it becomes available, and the consumers can use techniques such as polling or blocking to read data from the pipe as it arrives.
A stream between threads can be implemented in a lock-free manner if the following conditions are met:
- The data is transferred in fixed-size chunks, with a well-defined boundary between each chunk.
- The producer and consumer threads operate independently, without requiring any synchronization or coordination between them.
- The producer thread writes data to a shared buffer, and the consumer thread reads data from the same buffer.
- The buffer is implemented using lock-free data structures, such as circular buffers or linked lists. This allows the producer and consumer threads to access the buffer without requiring any locks or other synchronization primitives.
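A sketch of such a lock-free stream: a single-producer/single-consumer circular buffer using C11 atomics (fixed-size int "chunks"; the helper names try_put/try_get are invented, and the acquire/release ordering is what lets the two threads cooperate without locks):

```c
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

#define CAP 8                              /* fixed-size chunks: one int each */

int buf[CAP];
atomic_size_t head = 0, tail = 0;          /* each written by one side only */

int try_put(int v) {                       /* producer thread only */
    size_t t = atomic_load_explicit(&tail, memory_order_relaxed);
    size_t h = atomic_load_explicit(&head, memory_order_acquire);
    if (t - h == CAP) return 0;            /* buffer full */
    buf[t % CAP] = v;
    atomic_store_explicit(&tail, t + 1, memory_order_release);
    return 1;
}

int try_get(int *v) {                      /* consumer thread only */
    size_t h = atomic_load_explicit(&head, memory_order_relaxed);
    size_t t = atomic_load_explicit(&tail, memory_order_acquire);
    if (h == t) return 0;                  /* buffer empty */
    *v = buf[h % CAP];
    atomic_store_explicit(&head, h + 1, memory_order_release);
    return 1;
}

void *producer(void *arg) {
    for (int i = 0; i < 100; i++)
        while (!try_put(i)) ;              /* spin while full */
    return NULL;
}

int main(void) {
    pthread_t p;
    pthread_create(&p, NULL, producer, NULL);
    int v = 0, got = 0;
    while (got < 100)
        if (try_get(&v)) got++;            /* consumer runs in main thread */
    pthread_join(p, NULL);
    printf("consumed %d items, last = %d\n", got, v); /* 100, 99 */
    return 0;
}
```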
A program counter is:
A register used to store the address of the next instruction to execute for a program. Each program has a program counter.
A binary file can be defined as:
A file that uses 8 bits out of each byte
The address space of a process refers to the space it uses in the:
RAM
In the context of file management systems, “random access” refers to:
No need to read bytes 1 to n-1 before reading or writing byte n
A stack frame is created
when a function is called
What makes a text file executable
chmod a+x file
For which systems is non-preemptive scheduling more suitable?
Batch
In an OS that implements paging, what is true about page frame size
The size of a virtual memory page has to be identical to the size of a page frame
A page fault occurs when
A requested virtual memory page is not present in RAM
“On average, only 50% of each memory page will contain useful data.”
False, only the last page of a process will, on average, contain 50% of useful data.
A process references memory address 2000. If it is running on an OS that uses dynamic relocation where the base address of that process is 3000 and its limit is 1000, then:
The access request will fail because 1000<2000
Which of the following statements is correct:
1- SJF is suitable for interactive operating systems
2- SRT is preemptive
3- Both preemptive and non-preemptive variants of priority scheduling are suitable for a
smartphone OS
2- SRT is preemptive
Latency Definition
Latency refers to the amount of time it takes for a single task to complete. It is the delay between the initiation of a request and the time when the response is received. It is usually measured in units of time, such as seconds, milliseconds, or microseconds.
Threads vs Processes
Threads
– Abstracts & hides CPUs
– OS shares CPUs over Threads
– Hence schedules threads to CPUs
Process
– Abstracts & hides computer resources (except CPUs)
– Shares computer resources (RAM, Disc and I/O) over processes
What is the Context of a Thread?
Thread Context
= all info needed to restore a Thread to run in a CPU
e.g.
program counter, stack pointer, thread state,
thread register values, pointer to thread code,
pointer to parent process etc.
What causes Context Switches? (Non pre-emptive Scheduler)
- Running thread calls an OS service e.g.
- I/O service
- Synchronisation primitive
- Delay function
- etc.
What causes Context Switches? pre-emptive Scheduler
Pre-emptive Scheduler
* All non pre-emptive reasons plus
interrupts e.g.
* Timer interrupt for time slicing
* I/O interrupt (e.g. disc or comms
controller has finished DMA, GUI
button pressed etc.)
* Software exception
“OS” as used in the Concurrency part of
F29OC
The code that is responsible for sharing the CPU(s) between threads.
Note that this can be:
a) Part of a standard Operating System (e.g. Linux).
b) Part of a ‘program’ e.g. part of Java’s JVM (executes in user space).
c) A combination of (a) and (b).
How does the “OS” execute on multiple CPUs?
The OS will most likely run on the last interrupted CPU or
where a Thread calls an OS service
Can a Thread execute on multiple CPUs?
Note that the OS can switch threads across CPUs (this will be dependent upon the scheduling algorithm)
* But also note that a thread will never execute on multiple CPUs at the same time
Advantages of Multithreaded Systems
- Speed of execution on multicore systems (inc. logical cores)
- Responsiveness
- Ease of programming
Disadvantages of Multithreaded Systems
Disadvantages
* You need to understand and program in a multithreaded
paradigm
Processes vs Threads (Mikes Explanation)
In a multithreaded OS:
* Threads used to share CPU(s)
* Context Switching between Threads is generally fast
* Context and status stored in TCB
* Processes used to share other resources
(e.g. memory, I/O systems etc.)
- Processes can have many threads
- Generally threads can access all of a parent process’s resources
- Switching between processes is generally much slower than a thread context switch
- Process info. is stored in its PCB (process control block)
Definition of Race Condition (Mike)
where the answer depends upon the scheduling of the
threads!!!!
(code is not “thread-safe”)
Solution: Protect with Critical Region
Critical Region (mike)
“Critical Region” = section of code that can only
be executed by one thread at a time
Multiple Critical regions
A thread that has control (ownership) of one critical region prevents access to all critical regions protected by that lock.
Multiple locks
You can use multiple locks to protect multiple critical regions
* A thread that has control (ownership) of one lock does not prevent other threads from gaining control of other locks and their critical
regions
What is a ‘Shared Resource’ or ‘Shared State’?
Any mutable object that can be accessed by more than one thread
How Critical Regions prevent races
Prevent concurrent access to shared objects and
variables
Java Extrinsic Monitors
– Use for Mutual Exclusion
– Use for thread synchronization
Classic Monitor Model
– MX Properties
Monitor (instances) contain a set of mutually exclusive methods. They are similar to, but more flexible than, critical regions.
Thus once a thread has entered a monitor (i.e. started to execute any of its methods) it is said to have ownership of the monitor.
Once it finishes executing a method – then it releases ownership.
If a thread has ownership then all other threads will be blocked when they attempt to enter the monitor.
Intrinsic Monitors
Use synchronized key word and a single implicit condition
variable per synchronised object
Extrinsic Monitors
Uses explicit locks (e.g. ReentrantLock) and explicit condition
variables that are properties of their locks
Use for Synchronisation
We can declare and use Condition variables
with locks in order to synchronise the operation
of threads
Monitor Terminology (Ownership, Signaller, Waiter)
- Ownership: the thread that has MX control of the monitor
- Signaller: a thread that calls lock.signal()
- Waiter: a thread that calls lock.await()
Interface Condition
The key property that waiting for a condition provides is that it atomically releases the associated lock and suspends the current thread.
Hoare (signaller yields) (Mike)
- OS selects (say) t1 from cv1 queue
- Signaller releases ownership to waiter (t1) and t1 gains ownership and starts running
- Note that if the signal is not at the end of the signaller’s (t3’s) method call, then ownership will revert back to the signaller (t3) at some stage (but after t1 releases ownership) so that it (t3) can finish execution of its method
– Gotcha: signaller releases MX control
and must then regain it
Mesa (signaller continues) (Mike)
- OS selects (say) t1 from cv1 wait queue
- waiter (t1) added to MX (ownership) queue and signaller (t3) continues.
- Signaller (t3) exits monitor method and releases ownership
- Next thread queued on ownership Q (say t1) moved to ready Q …
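A pthreads sketch of Mesa-style signalling in C (POSIX condition variables follow Mesa semantics: the woken waiter must re-acquire the lock and recheck its condition; the ready flag is a made-up example condition):

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cv = PTHREAD_COND_INITIALIZER;
bool ready = false;

void *waiter(void *arg) {
    pthread_mutex_lock(&m);
    while (!ready)                     /* Mesa: a wake-up is only a hint,
                                          so recheck the condition in a loop */
        pthread_cond_wait(&cv, &m);    /* atomically releases m and sleeps */
    puts("waiter: condition holds, proceeding");
    pthread_mutex_unlock(&m);
    return NULL;
}

void *signaller(void *arg) {
    pthread_mutex_lock(&m);
    ready = true;
    pthread_cond_signal(&cv);          /* wakes at least one waiter; the
                                          signaller continues for now */
    pthread_mutex_unlock(&m);
    return NULL;
}

int main(void) {
    pthread_t w, s;
    pthread_create(&w, NULL, waiter, NULL);
    pthread_create(&s, NULL, signaller, NULL);
    pthread_join(w, NULL);
    pthread_join(s, NULL);
    return 0;
}
```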
Why Use a Buffer?
- Decouple Timing of Consumer and Producer
- Act as Collector for Multiple Producers
- Act as Distributor for Multiple Consumers
Shared State?
- Pass by value to thread = not part of shared state
- Passed by reference to thread = Part of shared state
Thread States in Busy Wait
Always Runnable – in either ReadyToRun or
Running (on a CPU)
- Can use lots of processor cycles and make
system unresponsive!
For safety can add sleep into loop
* Prevents processor hogging but increases
average latency
What OS-provided software mechanisms can be used to achieve
communication between two processes? Name 3 different
concepts and explain in one sentence each what they provide
and how that can be used for communication.
Three OS-provided software mechanisms for achieving communication between two processes are:
Pipes: A pipe is a unidirectional communication channel that allows the output of one process to be connected to the input of another process. Pipes are typically used for simple, sequential communication between two processes.
Sockets: Sockets are used for communication between processes over a network. They provide a standardized interface for communication, allowing processes to communicate with each other regardless of their location or underlying hardware.
Shared memory: Shared memory allows multiple processes to access the same physical memory space, allowing for efficient communication between them. This can be used for high-performance, low-latency communication between processes that require frequent exchange of data.