Virtual Memory (10) Flashcards
Virtual Memory
- Virtual Memory: separation of logical memory from physical mem.
- The whole program doesn’t need to be in memory to be executed
- Logical address space can be in much larger than physical
- Allows address spaces to be shared by several processes
- Library code can be shared
- More concurrency
- More efficient process creation
- Less stuff to load in memory
- Less I/O needed to load or swap process
- Only parts of the process that are not yet in me
Virtual memory can be implemented with:
- Demand Paging
- Demand Segmentation

Virtual-Address Space
Logical view of how process is stored in main memory
- Usally starts at 0
- Physical memory is organized in page frames
- MMU must map logical to physical
Often designed this way:
- stack starts at the top of the logical address space
- heap grows up above data and code
- Unused address space between: hole
- No more physical memory needed until there is a hole
Enable sparse addresses spaces with holes left for:
- growth
- dynamically linked libraries
System libraries and shared memory can be accessed by mapping them into virtual address space.
Pages can be shared to speed up fork()

Demand Paging
- Doesn’t bring whole process into memory, only loads the needed page (Lazy swapper)
- Less I/O, no unnecessary I/O
- More free memory
- faster response
- Can serve more users
- Hardware support needed
- Needs to change MMU and add a bit signalling validity
- So that during address translation it can detect page faults
- v = in-memory
- i = not-in-memory -> page-fault
- So that during address translation it can detect page faults
- secondary memory (swap)
- instruction restart
- pay attention to limit cases: not all instruction can be restarted after one page fault, some may generate more than one.
- Needs to change MMU and add a bit signalling validity
- Extreme case: start process with no pages in mem. (this for every proc.)
- pure demand paging
- Instruction could access multiple pages -> more page faults
- locality reduces the stress
Handling a Page Fault (Demand Paging)
- Once in page fault, OS looks at another table to decide
- Invalid ref. -> abort
- not-in-memory
- find free frame
- swap page into frame by scheduling disk op.
- reset tables to reflect that the page is in-memory
- restart from instruction that caused page fault

Free-frame list
- Most OSes maintain a list pool of free frams to satisfy requests
- Zero-fill-on-demand: the content of the frames are zeroed-out before being allocated
- When a system starts up: all the memory is in the free frame list
- It’s very important to always have free frames
- if not we need to deal with freeing also a frame -> 2x slower

Demand paging performance
Main factors:
- Handling the interrupt: several hundred instruction if we are being optimistic (not so long)
- Read the page: very very slow
- Restart the process: small amount of time
p = Page fault rate
Effective access time = (1-p) * memory access + p ( page_fault_overhead + swap_out_time + swap_in_time)
Demand paging optimization
- Copy entire proccess image to swap space at process load time, then page in and out from there
- swap space I/O is faster than fs I/O even if it is same device
- swap is allocated in larger chunks -> less overhead
- used in older BSD Unix v.
- swap space I/O is faster than fs I/O even if it is same device
- Demand page in from program binary on disk
- this is very good because:
- most pages are read-only
- it doesn’t require to write back to disk
- We still need write to swap space
- heap and stack
- pages modified in memory but not written back to the fs
- this is very good because:
Copy-on-write
- parent and child process share the same page initially
- only if a process modifies the page it gets another copy
- process creation is more efficient
- vfork() is used to let the child use parent address space while parent suspended
- designed for child calling exec
- very efficient
*
Page replacement
- What if no free frames?
- We need to prevent over-allocation of mem. by modifying page-fault service routine to include replacement
- Use dirty bit to transfer only pages that have been modified
- Page replacement makes possible separation between physical and logical memory
- we can have logical > physical
- Process:
- no free frame?
- run page repalcement alog to select victim
- write victim to disk
- bring page into the new free frame
- update page/frames tables
- continue
- With page repl. we have 2 page transfer for page fault

Page frame Replacement Algorithms
- Frame-allocation algo:
- how many frames do I give to this process?
- which frames to I replace
- Page-replacement algo:
- want lowest possible page fault rate of access and re-access
Algos:
- FIFO:
- Belady’s anomaly: adding more frames may result in more faults
- Optimal algo:
- used as a benchmark
- impossible to implement without discovering that P=NP
- LRU: least recently used
LRU and optimal are cases of stack algo without Belady’s anomaly.

LRU Algorithm Implementation
- Counter
- every page has a counter
- on every reference of the page: copy the timestamp into it
- look for the smaller counter when you want to replace page
- Stack
- keep stack of page numbers in a double linked list
- page referenced:
- move to top
- requires 6 pointers to be changes
- each update more expensive
LRU Approximation algorithms
We need an approximation because LRU needs special hardware and it is still slow.
Reference bit:
- Used to understand if a page was used recently enough.
- RB = 1 -> page is fresh
- RB = 0 -> you can replace it
Second-chance algorithm:
- FIFO + hardware provided reference bit
- Circular list
- if page to be replaced has RB = 0 then replace
- otherwise set RB = 0, leave page in memory, move to next page
- repeat.
Basically we are setting to old fresh pages, but replacing only the ones that were already old.

Enhanced second chance algorithm
- Use reference bit and modify bit
- consider pair (rb, mb)
- (0, 0): not used recently neither modified
- best candidate
- (0, 1): modified but not used recently
- okay, but must write to memory
- (1, 0): used recently but not modified
- probably will be used soon
- (1, 1): recently used and modified
- will be used soon and need to write it
- (0, 0): not used recently neither modified
- On page replacement:
- use clock scheme
- use the four classes to replace page
- might need to search queue several times
Counting algorithms
- Keep counter of the references amde to each page
- Least frequently used (LFU): replace page with smalles count
- Most frequntly used(MFU): replace page with largest count, hypotesis that the page with lowest count was just brought in
- Not so popular
Page Buffering Algorithms
- We want to minimize the number of page faults
- We want always a pool of free frames in which to swap pages:
- We’ll have the frame when needed, not looked after the page fault occurs
- select victim to evict and add to free pool
- use dirty bit
- keep frame contents intact until eviction
- if page referenced again we don’t need to transfer page from memory
- reducec penalty if wrong victim was selected
Applications and Page Replacement
- Some applications have better knowledge than OS about what page will be needed
- Memory intensive apps can cause double buffering
- OS ees page in memory as I/O buffer
- Application does its work on their page
- Raw disk is a way for application to access disk directly bypassing
- buffering
- locking
- etc.
Frame allocation
- Each process needs a minimum number of frames
- Two main schemes:
- fixed allocation
- Equal: divide the number of frames by the number of processes
- keep some in free frames pool
- Proportional:
- allocate pages based on processes size
- changes with time due to degree of multiprogramming and process size
- Equal: divide the number of frames by the number of processes
- priority allocation
- many variations
- Global vs. Local replacement:
- Global: process can take a frame from another
- greater throughput, more common
- Local: select only from own set of frames
- underutilized memory
- more consistent per-process performance
*
- Global: process can take a frame from another
- Global vs. Local replacement:
- fixed allocation
Frame allocation: Reclaiming Page
- Strategy for global page-replacement policy:
- All memory requests are satisfied from free frame list
- if free pages number is under a certain threshold
- free up some frames so that we are above threshold

Non-uniform memory acccess
- frame allocation problem
- Many systems have a varying access speed to memory
- Optimal performance comes from:
- allocating frames near the CPU that is executing the process
- modifying scheduler to schedule thread on the same system board when possible

Thrashing
- If process doesn’t have enoguh pages, the page-fault rate gets high
- page fault to get page
- replace existing frame
- slightly after replace frame back again
- Leads to:
- Low CPU usage
- OS thinks: I need process that doesn’t do much I/O, let’s bring some processes to memory
- another process in the system -> more page faults -> less CPU utilization
Demand paging works because of locality model.
Thrashing occurs because: Σ size of locality > total memory size
We can limit thrashing by using local or priority page replacement.

Working-Set Model
- Trying to solve thrashing
- Δ ≡ working-set window ≡ a fixed number of page references
- WSSi (Working set of Pi): total number of pages referenced in the last Δ
- Δ too small: will not take full advantage of locality
- Δ too large: will include several localities
- Δ tends to infinity: will take the whole program
- D = Σ WSSi = total demand frames ≈ locality
- D > m => Thrashing
- Policy:
- if D > m: suspend or swap out one of the processes

Keeping track of working set
- Keeping track of the WSS at each page fault is too expensive
- we will have an interrupt and will need to compute the working set again
- Solution: approximate with timer interval, reference bit and previous reference bit state
- Example
- Δ = 10000
- Interrupt timer every Δ/2 time units
- at every interrupt copy and sets the values of all reference bits to 0 and update previous reference bit
- if one of the two bit is 1 => the page has been accessed in the last Δ time units
- We could improve using more bits and interrupting more frequently

Page-fault frequency
- More direct approch to solve thrashing tan WSS
- Establishes an acceptable page-fault frequency (PFF)
- if current rate high -> process gains frame
- if too low -> process loses frame
- Algo:
- at each page fault
- τ: time interval from previous page fault
- τ < c: page-fault freq. too high, add new frame to resident set of process
- τ >= c: freq. is okay:
- remove from resident set of process all pages with reference bit = 0
- clear reference bit of all other pages in resident set

Kernel memory allocation
- Treated differently (say what?!)
- Often allocated from free pages pool
- Kernel requests memory for structures of varying sizes
- Some kernel memory needs to be contiguos (for device I/O)


