Final Flashcards
What kind of scheduling policies can a scheduler have?
- Assign tasks immediately: simple, first come first serve (FCFS)
- Simple tasks first: maximize throughput, Shortest Job First (SJF)
- Assign Complex Tasks First: max system utilization (CPU, devices, memory)
When does the scheduler run?
- When CPU becomes idle
- new task becomes ready (could have higher priority task in queue)
- timeslice expires (timeout)
Design of run queue and scheduling algorithm are tightly coupled!
Describe priority scheduling
- Threads are scheduled by their priority, highest priority going first.
- This can be implemented with priority queues for each priority level (or you could use an ordered tree)
- Could lead to starvation of low-priority tasks
- This is prevented through priority aging. The longer the task is in the queue, the higher its priority. So it will eventually be run as its priority increases.
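A minimal sketch in C of one way to implement this (hypothetical names, not from the lecture): an array of FIFO queues, one per priority level, plus an aging pass that promotes tasks that have waited too long.

    #include <stddef.h>

    #define NUM_PRIOS 4          /* priority 0 = highest */
    #define AGING_THRESHOLD 10   /* ticks a task may wait before promotion */

    struct task {
        struct task *next;
        int wait_ticks;          /* how long it has sat in its queue */
    };

    struct task *runqueue[NUM_PRIOS];   /* one FIFO list per priority level */

    /* Pick the first task from the highest-priority non-empty queue. */
    struct task *pick_next(void) {
        for (int p = 0; p < NUM_PRIOS; p++) {
            if (runqueue[p]) {
                struct task *t = runqueue[p];
                runqueue[p] = t->next;
                return t;
            }
        }
        return NULL;   /* nothing runnable */
    }

    /* Aging: periodically promote tasks that have waited too long so
       low-priority tasks cannot starve forever. */
    void age_tasks(void) {
        for (int p = 1; p < NUM_PRIOS; p++) {
            struct task **pp = &runqueue[p];
            while (*pp) {
                struct task *t = *pp;
                if (++t->wait_ticks >= AGING_THRESHOLD) {
                    *pp = t->next;              /* unlink from this level */
                    t->wait_ticks = 0;
                    t->next = runqueue[p - 1];  /* push onto the next level up */
                    runqueue[p - 1] = t;
                } else {
                    pp = &t->next;
                }
            }
        }
    }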
What is priority inversion?
When a low-priority thread acquires a lock and a higher-priority thread then needs that lock, the high-priority thread is stuck waiting for the low-priority one, so the low-priority thread can effectively run (and even complete its critical section) ahead of the higher-priority thread; their priorities are inverted.
A solution is to temporarily boost the priority of the thread holding the mutex to the level of the higher-priority waiter so it runs until it releases the mutex, then lower it back to its original priority. This requires knowing which thread currently owns the mutex.
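POSIX exposes exactly this idea for pthread mutexes. A minimal sketch, assuming the platform supports the priority-inheritance protocol (error checking omitted):

    #include <pthread.h>

    pthread_mutex_t lock;

    void init_pi_mutex(void) {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        /* A higher-priority waiter temporarily boosts the current owner's
           priority until the owner unlocks the mutex. */
        pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
        pthread_mutex_init(&lock, &attr);
        pthread_mutexattr_destroy(&attr);
    }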
What is run to completion scheduling?
Once a task is scheduled, it keeps the CPU until it completes (or voluntarily yields); it is not preempted to run another task.
Describe round robin scheduling
Go through tasks in round-robin order. If a task yields (e.g., for I/O), switch to the next task and round-robin among the remaining tasks until the I/O is complete.
Can also do round robin with interleaving which is known as time slicing.
Describe Timeslicing
timeslice = max amount of uninterrupted time to give a task (aka, time quantum)
task may run for less time than the timeslice (e.g., it has to wait on I/O so it switches out, or a higher-priority task becomes runnable)
The ideal time slice length is different for I/O bound vs CPU bound tasks.
What are the costs and benefits of timeslicing?
+ short tasks finish sooner
+ more responsive
+ lengthy I/O ops initiated sooner
- interrupt, schedule, context switch = overhead for each switch
Can minimize these overheads by keeping the timeslice length much greater than (>>) the context switch time
What are the ideal timeslice lengths for cpu bound and I/O bound workloads?
CPU Bound:
- Longer timeslices lead to better throughput and lower avg. completion times, though higher avg wait time.
- Overall, user cares about throughput and how long it takes to complete. Wait time is hidden b/c there is no I/O.
I/O Bound:
- Using a smaller timeslice is better.
- I/O requests are able to get kicked off earlier
- Keeps both the CPU and I/O devices busier more often
- better user-perceived performance
How do you calculate cpu utilization?
[cpu_running_time/(cpu_running_time + context_switching_overheads)] * 100
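For example (illustrative numbers, not from the lecture): with a 10 ms timeslice and 0.1 ms of context-switch overhead per switch, utilization ≈ 10 / (10 + 0.1) ≈ 99%; with a 1 ms timeslice it drops to 1 / (1 + 0.1) ≈ 91%.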
Describe the multi-level feedback queue (MLFQ).
Have three run queues, but they are treated as a single queue data structure:
1) TS = 8ms, most I/O intensive, highest priority
2) TS = 16ms, medium I/O intensive (mix I/O and CPU)
3) TS = infinity (CPU intensive, lowest priority, FCFS)
+ get timeslicing benefits for I/O bound tasks
+ avoid timeslicing overhead for CPU bound tasks
Ultimately this becomes like a decision making pipeline
- start in 8ms queue, if it yields before timeslice is up, good choice, keep it here!
- if timeslice expires, move it to next queue. Repeat the previous step.
- If timeslice expired again, put it in bottom queue.
Can always be bumped back up if they start yielding before timeslice expires.
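A rough sketch in C of the demote/promote decision (hypothetical names; real implementations differ):

    #define LEVELS 3   /* timeslices: 8 ms, 16 ms, 'infinity' (FCFS) */

    struct task { int level; };   /* 0 = highest priority, shortest timeslice */

    /* Called when a task comes off the CPU. */
    void mlfq_update(struct task *t, int used_full_timeslice) {
        if (used_full_timeslice) {
            if (t->level < LEVELS - 1)
                t->level++;       /* too CPU-hungry for this level: demote */
        } else {
            if (t->level > 0)
                t->level--;       /* yielded early (likely I/O bound): promote */
        }
    }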
Describe the linux O(1) scheduler
- 2 arrays of run queues (active and expired), each with one queue per priority level (140 levels: 0-99 real-time, 100-139 timesharing)
- Timeslice value depends on priority: higher-priority tasks get longer timeslices
- Feedback based on sleep time: tasks that sleep more (interactive) get a priority boost, CPU-hogging tasks get their priority lowered
- O(1) because adding a task and selecting the next task both take constant time (a bitmap of non-empty queues gives the highest-priority runnable task)
- When a task uses up its timeslice it moves to the expired array; once the active array is empty, the two arrays are swapped
More @ P3L1: 18 Linux O(1)
Describe the Completely Fair Scheduler (CFS)
Scheduler for all non-real time tasks
- Runqueue = Red-Black Tree
- Tasks ordered by virtual runtime (vruntime = time spent on the CPU)
- Tasks toward the left of the tree have less vruntime, so they need to be scheduled sooner
- Run the task with the smallest vruntime until its vruntime exceeds the next smallest vruntime (checked periodically)
- Once its vruntime has moved past the next smallest one, insert the current task back into the tree at its new position and run the task that now has the smallest vruntime
- For low priority tasks, vruntime progresses faster (artificially)
- For high priority tasks, vruntime progresses slower (artificially)
- Same tree used for all levels
- Task selection = O(1)
- Add task = O(log n)
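A rough sketch of the vruntime bookkeeping (simplified; the constant and field names are illustrative, not the actual kernel code):

    /* Charge a task for delta_ns of CPU time.  Higher-weight (higher-priority)
       tasks accumulate vruntime more slowly, lower-weight tasks faster, so the
       red-black tree naturally favors higher-priority tasks. */
    #define NICE_0_WEIGHT 1024

    struct task { unsigned long long vruntime; unsigned int weight; };

    void update_vruntime(struct task *t, unsigned long long delta_ns) {
        t->vruntime += delta_ns * NICE_0_WEIGHT / t->weight;
    }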
What are the problems with the O(1) scheduler?
- poor performance for interactive tasks
- the expired list is not executed until all tasks in the active list have used up their timeslices
- fairness is not enforced in any way
Why is it important to try and schedule threads back onto the same CPU?
Cache-Affinity - likely to have everything you need in the cache so you will be more efficient.
If you swap CPUs, then you’ll have a cold cache.
What is NUMA scheduling?
Non-Uniform Memory Access
- A CPU may have faster access to one memory bank than another.
- NUMA-aware scheduling means the scheduler knows this, so it tries to keep tasks on the CPU closer to where that task's memory lies.
What is hyperthreading?
- Multiple hardware supported execution threads
- This means there are multiple sets of registers needed for an execution context
- still 1 CPU but much faster context switches because all it has to do is switch the registers being used by the CPU (not switch what’s in the registers, but which registers are being used)
Can even hide memory access latency (hundreds of cycles), because a context switch to the other hyperthreaded execution context takes on the order of cycles. So it can make sense to switch between the hyperthreaded execution contexts even during a memory access.
Why is it good to mix CPU and Memory bound tasks together on a single CPU?
Because you can get better overall utilization.
If both threads are CPU bound, only one can actually make progress at a time (G = cycle where the thread makes progress, X = cycle where it doesn't):
T1: GXGXGX
T2: XGXGXG
If both threads are Mem bound, lots of wasted cycles:
T1: GXXXGXXXG
T2: XGXXXGXXX
If mixed:
T1: GXGGGXGGGX (CPU Bound)
T2: XGXXXGXXXG (Mem Bound)
How can the CPU scheduler make informed decisions about scheduling?
By use of hardware counters - there are typically several and they are different for different architectures.
They help estimate what kind of resources a task needs (CPU, memory).
E.g., the number of cache hits/misses from a hardware counter helps guesstimate whether a workload is CPU bound or memory bound.
Why can’t CPI be used as a scheduling technique?
Because real workloads may not have enough variation in CPI to be indicative of the workload category. They could have similar averages with high or low standard deviations, which means it's not so cut and dried.
What is a TLB?
Translation Look Aside Buffer
- small cache that holds valid Virtual Address to Physical Address mappings
- Used because the VA to PA translation basically has to be done for every memory access
What does the hardware memory management unit (MMU) do?
- Translates VA to PA addresses
- Reports faults (illegal access, permissions)
- Registers:
- Pointers to page table
- base and limit size, number of segments
- Cache - TLB
- Translation
- actual PA generation done in hardware
Describe how page tables work.
- VA Pages -> Page Table Map -> DRAM
- Virtual pages and physical (DRAM) page frames are the same size
- There is a page table for each process
- So on context switch, must switch to the right page table
- Virtual Address = [VPN | offset]
- VPN is offset into page table map
- Page Table Map = [PFN | valid bit]
- Physical Address = [PFN | offset]
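A minimal sketch of a flat-table lookup for a 32-bit address with 4 KB pages (hypothetical structures, simplified):

    #include <stdint.h>

    #define PAGE_SHIFT 12                     /* 4 KB pages -> 12 offset bits */
    #define OFFSET_MASK ((1u << PAGE_SHIFT) - 1)

    struct pte { unsigned int pfn : 20; unsigned int valid : 1; };

    struct pte page_table[1u << 20];          /* one entry per virtual page */

    uint32_t translate(uint32_t va) {
        uint32_t vpn    = va >> PAGE_SHIFT;   /* index into the page table  */
        uint32_t offset = va & OFFSET_MASK;   /* unchanged by translation   */
        if (!page_table[vpn].valid)
            return 0;                         /* real hardware would trap to the OS */
        return ((uint32_t)page_table[vpn].pfn << PAGE_SHIFT) | offset;
    }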
What is allocation on first touch?
DRAM space for a page is only actually allocated the first time that page is referenced. Declaring it in the virtual address space is fine, but it won't be backed by physical memory until you do some kind of access to it.
What flags exist for a page table entry?
- P-resent (valid/invalid)
- D-irty (written to)
- A-ccessed (whether it has been accessed for read or write)
- protection bits -> r, w, etc
How are page table sizes calculated?
- 32 bit system
- Page Table Entry (PTE) = 4 bytes (32 bit)
- Number of pages = 2^32 / Page Size
- A common page size is 4kB
- (2^32 / 2^12) * 4B = 4MB
We don’t use a flat page table anymore, though.
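To see why (illustrative, ignoring that real 64-bit CPUs use fewer virtual address bits): a flat table for a full 64-bit address space with 8-byte PTEs would need (2^64 / 2^12) * 8 B = 2^55 B, i.e. tens of petabytes, per process.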
What are Hierarchical Page Tables?
- Outer Level: Page Table Directory
- elements are pointers to page tables
- Internal page tables exist only for valid virtual memory regions
- Now Virtual address looks like:
[outer page table index | internal page table index | offset]
- Could have additional layers using the same philosophy
- on malloc, new internal page table may be allocated
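A sketch of extracting the indices for the classic two-level 10/10/12 split on a 32-bit address (10-bit outer index, 10-bit internal index, 12-bit offset):

    #include <stdint.h>

    void split_va(uint32_t va, uint32_t *outer, uint32_t *inner, uint32_t *offset) {
        *outer  = (va >> 22) & 0x3FF;   /* index into the page table directory */
        *inner  = (va >> 12) & 0x3FF;   /* index into the internal page table  */
        *offset =  va        & 0xFFF;   /* offset within the 4 KB page         */
    }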
P3L2: 8 Multi Level Page Tables
What are the pros/cons of a Hierarchical Page Table?
+ Reduced page table size
+ 'Gaps' in the virtual address space that span an entire internal page table mean that inner table doesn't need to be allocated at all
- More lookups required for the translation process since there are more levels
Describe inverted page tables
P3L2: 11 Inverted Page Tables
Describe segmentation
P3L2: 12 Segmentation
What are the pros/cons of large pages?
+ Fewer page table entries, smaller page tables, more TLB hits
- internal fragmentation => wastes memory
- page size is calculated as 2^[# offset bits]
What are some challenges with memory allocation?
- external fragmentation
- enough ‘free’ pages exist, but they are not contiguous so they can’t be allocated
- When possible, the allocator will free memory to create more contiguous zones.
What is the buddy allocator?
- Subdivides the area in half over and over until it finds a memory region that is the size of the request
+ Aggregation works well and is fast
- Internal fragmentation can still exist because this only allocates by powers of 2
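For example (illustrative): to satisfy an 8 KB request from a free 64 KB region, the allocator splits 64 KB -> 32 + 32, then 32 -> 16 + 16, then 16 -> 8 + 8, and hands out one 8 KB chunk. On free, a chunk can be coalesced with its 'buddy' (the other half it was split from), whose address differs by exactly one bit, which is why aggregation is fast.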
What is the slab allocator?
Builds custom object caches on top of slabs [contiguously allocated physical memory].
Each slab is built for a specific size object, so freeing objects in slab is fine b/c they can be replaced by objects of the same size easily.
+ Internal fragmentation avoided
+ External fragmentation not really an issue
Linux uses a combination of both Slab and buddy Allocator.
Describe Demand Paging
When the valid bit is set to 0 in the page table (i.e., the page is not in memory), a request to access it becomes a trap and control passes to the OS. The OS makes a call to retrieve that page from the backing store (an I/O request). The page is placed in DRAM, the page table entry is updated with the new PA corresponding to that VA, and the valid bit is set to 1. Then control is passed back to the thread and it executes the same memory request again.
When should pages be swapped out of physical memory onto disk and which pages should be swapped out?
When:
- When memory usage is above a threshold (high watermark)
- When CPU usage is below a threshold (low watermark)
Which:
- pages that won’t be used
- LRU: uses the access bit to tell if it has been accessed
- pages that don’t need to be written out
- dirty bit to track if modified
- won’t swap non-swappable pages
Linux:
* ‘Second Chance’ variation of LRU
What is Copy on Write (COW)?
When a new process is created, instead of copying the whole original address space, we map the new process's VAs to the original pages and write-protect them. If only reads occur, this is fine; we only need one copy of the information. If a write occurs, a trap is generated, the data is copied to a new PA, and the VA mapping is updated. This ensures we only pay the copy cost when data actually has to be modified.
What is checkpointing?
- Periodically save process state during execution
- Write protect entire PA and copy everything once
- Copy diffs of ‘dirtied’ pages for incremental checkpoints
- rebuild images of process from multiple diffs, or in background aggregate diffs into a single one
More @ P3L2: 22 Failure Management Checkpoint
What kinds of message passing IPCs are there?
Pipes - bytestream from one process to another
Message Queues - just what it sounds like
Sockets - send(), recv(), socket()
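A minimal pipe sketch in C between a parent and child (error checking omitted):

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        int fds[2];
        char buf[32];

        pipe(fds);                 /* fds[0] = read end, fds[1] = write end */
        if (fork() == 0) {         /* child: writes into the pipe */
            close(fds[0]);
            write(fds[1], "hello", 6);
            close(fds[1]);
            _exit(0);
        }
        close(fds[1]);             /* parent: reads the bytestream */
        read(fds[0], buf, sizeof(buf));
        printf("parent got: %s\n", buf);
        close(fds[0]);
        return 0;
    }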
Describe Shared Memory IPC
The OS maps the same physical addresses into the virtual address spaces of two processes; the VAs need not be the same in each process.
+ System calls only for the setup phase
+ potentially reduced data copies (but not eliminated)
- must use explicit synchronization
- must have some kind of communication protocol
- shared buffer management is the programmer's responsibility
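A minimal POSIX shared memory sketch in C (one side shown; the name "/demo_shm" is arbitrary, error checking and synchronization omitted, may need -lrt on older Linux):

    #include <fcntl.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        const size_t size = 4096;

        /* Both processes open the same named object; syscalls only for setup. */
        int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, size);

        /* Map it; each process may get a different virtual address. */
        char *buf = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        strcpy(buf, "hello from one process");

        munmap(buf, size);
        close(fd);
        /* shm_unlink("/demo_shm") once both sides are done */
        return 0;
    }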
Describe Copy vs Map in message based IPC vs Shared Memory IPC.
Copy
* CPU cycles to copy data to / from port for each message
Map
* CPU cycles to map memory into address space
* CPU to copy data to channel
+ memory map is costly, but only needs to be done once. If used many times -> good payoff
+ Can still perform well for one time use, especially with large data transfers
In Windows, they support Local Procedure Calls (LPCs) which means if memory to be copied is small, it would use a messages based IPC and if it’s large, it would use shared memory (copy once, then map that to shared space for both processes).
What are spin locks?
Like a mutex, except that instead of relinquishing the CPU when it finds the lock held, the thread just sits and checks over and over until it either acquires the lock or is preempted.
The most basic synchronization construct.
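A minimal spinlock sketch using C11 atomics:

    #include <stdatomic.h>

    typedef struct { atomic_flag locked; } spinlock_t;

    spinlock_t lock = { ATOMIC_FLAG_INIT };

    void spin_lock(spinlock_t *l) {
        /* Keep testing-and-setting until we observe the flag was clear. */
        while (atomic_flag_test_and_set(&l->locked))
            ;   /* spin: burn CPU instead of blocking */
    }

    void spin_unlock(spinlock_t *l) {
        atomic_flag_clear(&l->locked);
    }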