Virtualization Flashcards
What is virtualization?
Decouple software from hardware in a way that allows running
multiple OSes on the same hardware
E.g., run both Windows and Linux on the same laptop
How is virtualization different from dual boot?
Both OSes run simultaneously
Yet, by default, they are completely isolated from each other
What’s a Hypervisor or VMM (Virtual Machine Monitor)?
The SW layer that allows several virtual machines
to run on the same physical machine
What’s a Host?
The physical machine & OS that directly controls it
Overload: sometimes we say “host” but we actually mean hypervisor
What’s a Guest (or guest OS)?
The virtual machine OS and all the applications it runs
What are the 2 hypervisor types?
- Bare-metal (Type 1)
Has complete control over HW
Doesn’t have to “fight” / co-exist with OS
- Hosted (Type 2)
Avoid functionality/code duplication (e.g., process scheduler, memory
management) – the OS already does all of that
Can run native processes alongside VMs
Familiar environment
• How much CPU and memory does a VM take? Use top!
• How big is the virtual disk? Use ls -l
• Easy management: kill/stop a VM? Sure, just SIGKILL/SIGSTOP it!
What’s a combination hypervisor?
Mostly hosted, but some parts of the hypervisor are implemented in the OS kernel for performance reasons.
What are the 4 different ways to run VMs?
- Software emulation
- Trap-and-emulate (with & without HW support)
- Dynamic binary translation
- Paravirtualization
What is SW emulation? pros and cons?
Do what CPU does but ourselves, in software
Fetch the next instruction
Decode (is it an ADD? XOR? MOV?)
Execute (using the SW emulated registers and memory)
Pro: Simple!
Con: Slow
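A minimal sketch of the fetch/decode/execute loop, assuming a made-up three-instruction ISA (the tuple format and the `emulate` helper are illustrative, not any real emulator's API):

```python
# Toy software emulator: fetch, decode, execute, all in software.
# The three-instruction ISA below is invented for illustration only.

def emulate(program, regs):
    """Run `program` (a list of (op, dst, src) tuples) on emulated registers."""
    pc = 0  # emulated program counter
    while pc < len(program):
        instr = program[pc]                      # fetch
        op, dst, src = instr                     # decode
        # A string source names a register; anything else is an immediate.
        value = regs[src] if isinstance(src, str) else src
        if op == "MOV":                          # execute on emulated state
            regs[dst] = value
        elif op == "ADD":
            regs[dst] = (regs[dst] + value) & 0xFFFFFFFF  # 32-bit wraparound
        elif op == "XOR":
            regs[dst] ^= value
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
    return regs

regs = emulate(
    [("MOV", "eax", 5), ("MOV", "ebx", 7), ("ADD", "eax", "ebx"), ("XOR", "ebx", "ebx")],
    {"eax": 0, "ebx": 0},
)
```

Every guest instruction costs many host instructions here (tuple unpacking, branching, dictionary access), which is where the slowness comes from.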
What’s trap & emulate? pros and cons?
Actually, most VM code can execute directly on CPU just fine
E.g., addl %ebx, %eax – if VM1 does this, VM2 doesn’t care
So instead of emulating this code
Let it run directly on the CPU
But some operations are sensitive,
requiring the hypervisor to explicitly lie, e.g., int $0x80; mov %rax, %cr3; I/O ops
Solution
Trap-and-emulate all these “sensitive” instructions
E.g., if guest runs INT $0x80, trap it and execute guest’s handler of
interrupt 0x80
Pro: Performance!
Con: Not all sensitive ops trigger a trap when executed in user-mode (especially in older architectures)
(We can deal with this by using HW support for virtualization, if it exists.)
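A toy model of the idea, assuming every sensitive op does trap (the ideal case). The CPU is simulated by `cpu_execute`, and all names (`Trap`, `VCpu`, `run_guest`, the op strings) are hypothetical:

```python
# Toy model of trap-and-emulate; not real hypervisor code.

SENSITIVE = {"cli", "sti", "mov_cr3"}

class Trap(Exception):
    """Raised when the modeled CPU refuses a sensitive op in de-privileged mode."""
    def __init__(self, op, arg):
        super().__init__(op)
        self.op, self.arg = op, arg

class VCpu:
    """Per-VM virtual privileged state, kept by the hypervisor."""
    def __init__(self):
        self.regs = {"eax": 0, "ebx": 0}
        self.virt_if = True   # guest's view of the interrupt flag
        self.virt_cr3 = 0     # guest's view of CR3
        self.traps = 0

def cpu_execute(vcpu, op, arg):
    """Stand-in for the real CPU: harmless ops run directly, sensitive ops trap."""
    if op in SENSITIVE:
        raise Trap(op, arg)
    if op == "add":           # e.g., addl %ebx, %eax
        vcpu.regs["eax"] += vcpu.regs["ebx"]
    elif op == "mov_imm":
        reg, value = arg
        vcpu.regs[reg] = value

def hypervisor_emulate(vcpu, trap):
    """Emulate the trapped op against the VM's virtual state only."""
    vcpu.traps += 1
    if trap.op == "cli":
        vcpu.virt_if = False
    elif trap.op == "sti":
        vcpu.virt_if = True
    elif trap.op == "mov_cr3":
        vcpu.virt_cr3 = trap.arg

def run_guest(vcpu, code):
    for op, arg in code:
        try:
            cpu_execute(vcpu, op, arg)      # fast path: no hypervisor involved
        except Trap as t:
            hypervisor_emulate(vcpu, t)     # slow path: trap, then emulate

vm = VCpu()
run_guest(vm, [("mov_imm", ("ebx", 3)), ("add", None), ("cli", None), ("mov_cr3", 0x1000)])
```

Only two of the four ops reach the hypervisor; the physical interrupt flag and CR3 are never touched by the guest.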
What’s dynamic binary translation?
Block of (VM) ops encountered for 1st time?
Translate block, on-the-fly, to “safe” code
• Similarly to JIT-ing (JIT compilation = just-in-time compilation)
Put block in “code cache” (indexed by address)
From now on when program reaches this address
• Safe code would be executed directly on CPU
How is translation done in dynamic binary translation?
Most code translates to something that does exactly the same
• E.g., movl %eax, %ebx
Sensitive ops are translated into explicit “hypercalls”
• = Calls into hypervisor
• (to ask for service)
• Implemented as trapping ops
• (unlike, e.g., POPF)
• Similar to syscall
• (call into hypervisor to request service)
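A rough sketch of the translate-once, cache, and reuse flow. Blocks are lists of op names rather than machine code, and `Translator` and `translate_block` are invented names:

```python
# Toy dynamic binary translator; block layout and op names are invented.

SENSITIVE = {"cli", "popf"}

def translate_block(block):
    """Copy harmless ops as-is; rewrite sensitive ops into explicit hypercalls."""
    return [("hypercall", op) if op in SENSITIVE else ("native", op) for op in block]

class Translator:
    def __init__(self, guest_code):
        self.guest_code = guest_code   # address -> list of guest ops (one block)
        self.code_cache = {}           # address -> translated ("safe") block
        self.translations = 0

    def fetch(self, addr):
        """Return safe code for `addr`, translating only on first encounter."""
        if addr not in self.code_cache:
            self.code_cache[addr] = translate_block(self.guest_code[addr])
            self.translations += 1
        return self.code_cache[addr]

t = Translator({0x400000: ["movl", "cli", "addl"]})
first = t.fetch(0x400000)    # translated on the fly
second = t.fetch(0x400000)   # served from the code cache
```

A real translator also has to patch branch targets so that jumps land in the code cache rather than in the original guest code; this sketch ignores that.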
What are the pros and cons of dynamic binary translation?
Pros
No hardware support required
Performance is much better than full SW emulation
Cons
Hard (!) to implement
• Hypervisor needs on-the-fly x86-to-x86 binary compiler
• Consider the challenge of getting branch target addresses right
Performance may not be as good as HW-supported trap-and-emulate
What’s paravirtualization? pros and cons?
Requires guest OS to “know” it is being virtualized
And to explicitly use hypervisor services through a hypercall
E.g., instead of doing “cli” to turn off interrupts,
guest OS should do: hypercall( DISABLE_INTERRUPTS )
Pros
No hardware support required
Cons
Requires specifically modified guest
• (Every guest OS should be modified)
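A small sketch of the cli example, assuming a hypothetical hypercall interface (`Hypercall`, `Hypervisor.hypercall`, and `ParavirtGuest` are illustrative names, not a real hypervisor ABI):

```python
from enum import Enum, auto

class Hypercall(Enum):
    DISABLE_INTERRUPTS = auto()
    ENABLE_INTERRUPTS = auto()

class Hypervisor:
    def __init__(self):
        self.virt_if = {}   # guest id -> that guest's virtual interrupt flag

    def register(self, guest_id):
        self.virt_if[guest_id] = True

    def hypercall(self, guest_id, call):
        """Service a request from a guest; only that guest's state changes."""
        if call is Hypercall.DISABLE_INTERRUPTS:
            self.virt_if[guest_id] = False
        elif call is Hypercall.ENABLE_INTERRUPTS:
            self.virt_if[guest_id] = True
        else:
            raise ValueError(f"unknown hypercall {call!r}")

class ParavirtGuest:
    """Modified guest kernel: asks the hypervisor instead of executing cli/sti."""
    def __init__(self, guest_id, hv):
        self.guest_id, self.hv = guest_id, hv

    def irq_disable(self):   # where an unmodified kernel would execute `cli`
        self.hv.hypercall(self.guest_id, Hypercall.DISABLE_INTERRUPTS)

    def irq_enable(self):    # where an unmodified kernel would execute `sti`
        self.hv.hypercall(self.guest_id, Hypercall.ENABLE_INTERRUPTS)

hv = Hypervisor()
hv.register("vm1")
hv.register("vm2")
ParavirtGuest("vm1", hv).irq_disable()
```

The guest-side change is the "con": every place the kernel used the privileged instruction must be rewritten to call the hypervisor.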
What are the 2 ways to virtualize the virtual memory?
- With HW support (EPT/NPT; E=extended; N=nested)
- With “shadow page table”
• Which requires no HW support
What’s a shadow page table? pros and cons?
Hypervisor computes/builds GVA to HPA translations on the fly
Storing them in a new set of page tables (called shadow page tables)
Which are used instead of the inner guest page tables
Pro
As noted, requires no HW support
Cons
Overwhelmingly complex
Can be slow due to all the overheads involved
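A simplified sketch of what "computing GVA to HPA translations" means, assuming flat single-level tables keyed by page number (real page tables are multi-level, and `build_shadow` and `translate` are hypothetical helpers):

```python
PAGE = 4096

def build_shadow(guest_pt, host_map):
    """Compose GVA->GPA (guest page table) with GPA->HPA (hypervisor's map)."""
    shadow = {}
    for gva_page, gpa_page in guest_pt.items():
        if gpa_page in host_map:          # skip guest pages with no host backing
            shadow[gva_page] = host_map[gpa_page]
    return shadow

def translate(shadow, gva):
    """What the MMU does with the shadow table: one lookup, GVA straight to HPA."""
    page, offset = divmod(gva, PAGE)
    return shadow[page] * PAGE + offset

guest_pt = {0: 5, 1: 6}       # maintained by the guest OS (GVA page -> GPA page)
host_map = {5: 100, 6: 42}    # maintained by the hypervisor (GPA page -> HPA page)
shadow = build_shadow(guest_pt, host_map)
hpa = translate(shadow, 1 * PAGE + 12)
```

The complexity noted in the cons comes from keeping `shadow` in sync: whenever the guest edits its own page table, the hypervisor has to notice (typically by write-protecting the guest's page tables) and update the shadow copy.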
What’s 2D/nested/extended page table?
Processor supports two levels of page tables:
Regular guest page table (GVA => GPA) maintained by guest OS
New second translation table (EPT) from guest physical address
(GPA) to host physical address (HPA), maintained by the hypervisor
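Each level of the guest page-table walk yields a GPA that must itself be translated through the EPT, so every guest step costs an entire page-table walk in the hypervisor’s tables (4 memory references with 4-level tables) instead of a single reference.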
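A back-of-the-envelope count of memory references for one 2D walk, assuming a TLB miss and no caching of intermediate walk results (`nested_walk_refs` is an illustrative helper):

```python
def nested_walk_refs(guest_levels=4, ept_levels=4):
    """Worst-case memory references to translate one GVA with nested paging."""
    refs = 0
    for _ in range(guest_levels):
        refs += ept_levels   # EPT walk: find where this guest table entry lives
        refs += 1            # read the guest page-table entry itself
    refs += ept_levels       # EPT walk for the final data page's GPA
    return refs

native = 4                   # a plain 4-level walk, for comparison
nested = nested_walk_refs()  # 4 * (4 + 1) + 4 = 24
```

With n guest levels and m EPT levels this is n*m + n + m references, versus n for a native walk or a shadow page table.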
Is it possible that using shadow PT will yield performance superior to
EPT?
Yes, if the EPT walks require a lot of memory references (e.g., a workload with many TLB misses, where each miss triggers a long 2D walk).
How can we emulate a NIC in virtualization? pros and cons?
Emulate some physical NIC in SW (all hypervisors emulate e1000, all guests have an e1000 driver)
NIC’s registers are variables in Hypervisor’s memory
Memory is read/write protected (Hypervisor reacts according to
values being written and read)
Interrupts are injected by hypervisor to guest
Pros
Unmodified guests (all OSes already have a driver for e1000)
Use only one device => robust
Portable across HW & hypervisors (and hence clouds)
Cons
Slow (traps on every register access)
Hypervisor needs to emulate overly complex HW that wasn’t
designed with virtualization in mind (can be much simpler)
How can we use paravirtualization to virtualize NIC? pros and cons?
Paravirtualization
Emulate a “new” device, which isn’t physical in any sense
Guest installs a host-specific device driver
• Denoted: paravirtual device driver
Protocol between frontend (driver installed in guest) and backend
(hypervisor) is optimized for efficiency
Pros & cons
It’s exactly like emulation
Except that it is faster
But it requires guest modification, making it less portable
• Harder to move between cloud providers
• (Every SW modification is a “risk”)
Still not as fast as using the actual HW
• Because it’s a SW indirection layer
What’s the difference between the NIC virtualization protocol in the emulation case and in the paravirtualization case?
Protocol in emulation case
Guest writes registers X, Y, then writes to register T
=> Hypervisor infers guest wants to transmit packet
Protocol in paravirtual case
Guest does a hypercall, passes it start address and length as
arguments; hypervisor knows what it should do
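A sketch contrasting the two protocols by counting exits to the hypervisor per transmitted packet. The register roles (X = buffer address, Y = length, T = trigger) and both classes are assumptions made for illustration:

```python
class EmulatedNic:
    """Hypothetical registers: X = buffer address, Y = length, T = transmit trigger."""
    def __init__(self, guest_mem):
        self.guest_mem = guest_mem
        self.regs = {"X": 0, "Y": 0, "T": 0}
        self.exits = 0       # traps into the hypervisor
        self.sent = []

    def write_reg(self, name, value):
        self.exits += 1      # every register access traps
        self.regs[name] = value
        if name == "T":      # hypervisor infers: guest wants to transmit
            start, length = self.regs["X"], self.regs["Y"]
            self.sent.append(bytes(self.guest_mem[start:start + length]))

class ParavirtNic:
    def __init__(self, guest_mem):
        self.guest_mem = guest_mem
        self.exits = 0
        self.sent = []

    def hypercall_transmit(self, start, length):
        self.exits += 1      # one explicit call carries everything
        self.sent.append(bytes(self.guest_mem[start:start + length]))

mem = bytearray(b"....hello....")

emu = EmulatedNic(mem)
emu.write_reg("X", 4)
emu.write_reg("Y", 5)
emu.write_reg("T", 1)

pv = ParavirtNic(mem)
pv.hypercall_transmit(4, 5)
```

Both send the same packet, but the emulated device needs three exits where the paravirtual one needs a single hypercall.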
What’s the direct access method for I/O virtualization? pros & cons?
Direct device assignment
Pull NIC out of host and plug it into the guest for its exclusive use
Guest accesses device directly without hypervisor intervention
Pro:
Much more performant than paravirtual I/O (which still induces
many context switches)
Cons:
Need device per guest
Plus one for host
Can’t do I/O interposition
What’s IOMMU translation?
I/O devices are given a “virtual” address to write to, and the IOMMU translates it to a physical address. Intended to support several I/O devices that belong to several different “guests”.
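A simplified model of per-device translation, assuming one flat table per device (the `Iommu` class and the device names are hypothetical; real IOMMUs use multi-level tables):

```python
PAGE = 4096

class Iommu:
    def __init__(self):
        self.tables = {}   # device id -> {I/O virtual page -> host physical page}

    def map(self, device, iova_page, hpa_page):
        self.tables.setdefault(device, {})[iova_page] = hpa_page

    def translate(self, device, iova):
        """Translate a DMA address issued by `device`; unmapped accesses fault."""
        page, offset = divmod(iova, PAGE)
        try:
            return self.tables[device][page] * PAGE + offset
        except KeyError:
            raise PermissionError(f"DMA fault: device {device!r}, address {iova:#x}") from None

iommu = Iommu()
iommu.map("nic_of_vm1", 0, 100)   # same I/O virtual page 0 ...
iommu.map("nic_of_vm2", 0, 200)   # ... backed by different host memory per guest
a = iommu.translate("nic_of_vm1", 16)
b = iommu.translate("nic_of_vm2", 16)
```

Because each device has its own table, a device assigned to one guest cannot DMA into memory that belongs to another guest or to the host.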
What’s SR-IOV? pros & cons?
SR-IOV
The ability of a device to appear to SW as multiple devices
Single root I/O virtualization
Contains a physical function controlled by the host, used to create
virtual functions
Each virtual function is assigned to a guest (like in direct assignment)
Each guest thinks it has full control of NIC, accesses registers directly
NIC does multiplexing/demultiplexing of traffic
Pro:
Nearly as fast as direct device assignment
And need only one NIC (as opposed to direct assignment)
Cons:
Emerging standard (few hypervisors/clouds fully support it)
Requires newer hardware
Can’t do I/O interposition
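A toy model of the NIC-side demultiplexing, assuming traffic is steered to virtual functions by destination MAC address (the `SriovNic` class and its methods are invented for illustration):

```python
class SriovNic:
    def __init__(self):
        self.vfs = {}   # MAC address -> receive queue of the owning virtual function

    def create_vf(self, mac):
        """Done via the physical function, which the host controls."""
        queue = []
        self.vfs[mac] = queue
        return queue    # handed to a guest, which then reads it directly

    def receive(self, dst_mac, payload):
        """Demultiplex inside the NIC; the hypervisor is not on the data path."""
        queue = self.vfs.get(dst_mac)
        if queue is None:
            return False            # no VF owns this address; dropped in this sketch
        queue.append(payload)
        return True

nic = SriovNic()
vm1_rx = nic.create_vf("aa:aa:aa:aa:aa:01")
vm2_rx = nic.create_vf("aa:aa:aa:aa:aa:02")
nic.receive("aa:aa:aa:aa:aa:02", b"for vm2")
dropped = not nic.receive("aa:aa:aa:aa:aa:99", b"nobody")
```

Since the hypervisor never sees the packets, it cannot inspect or modify them, which is why I/O interposition is lost.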