Out-of-memory Flashcards
Disk
traditional storage medium
* rotating magnetic disk
* large (random) seek times
* can overwrite in place
* cheap but slow
Solid State Drives (NAND Flash)
based on semiconductors (no moving parts), usually NAND gates
* nowadays directly attached to PCIe (M.2/U.2) through NVMe interface
* an SSD consists of many flash chips
* higher bandwidth than disk and much faster random access
NAND Flash Organization
- data is stored on pages (e.g., 4KB or 16KB)
- pages are combined to blocks (e.g., 2MB)
- not possible to overwrite a page, must erase the full block first
Zoned NameSpaces (ZNS)
- Zoned Names expose out-of-place nature of device
- device is split into zones, each zone can only be written to sequentially
- requires application to implement garage collection
- can reduce write amplification
Persistent Memory Failure Atomicity
- PMem stores are buffered in the CPU caches
- programs cannot prevent eviction
- but can force eviction using explicit cache write-back or flush instructions
FP-Tree
- persistent B-tree optimized for PMem
- inner nodes have conventional layout and are stored in DRAM
- on crash inner nodes are recovered from leaf nodes
- leaves are unsorted
- “fingerprints”: 1-byte hash at the beginning of the node speeds up search
Btree
- the standard out-of-memory data structure is the B-tree
- large fanout minimizes number of disk accesses
- usually combined with a buffer manager (i.e., page cache)
- fixed-size nodes work well with SSD/disk and cache
Log-Structured Merge Trees (LSM)
- LSM trees can improve write amplification (in particular for random-write
workloads) at the cost of read amplification - basic idea:
- out-of-place writes
- periodic merges of sorted runs
Important Optimizations
- in-memory Bloom filter for each run:
- only need to access run if Bloom filter has true or false positive
- depending on desired false positive rate, takes 4-16 bits per key
- partitioning: split runs into independent ranges
- reduces peak memory consumption
- makes large merges less invasive
- may reduce write amplification for non-random access patterns
- works for both merging policies
Why do we need cache coherency protocol?
to provide the illusion of a single main memory
what cache coherency protocol did we discuss in the lecture?
what does this acronym stand for?
MESI protocol, which has the following states:
* Modified: cache line is only in current cache and has been modified
* Exclusive: cache line is only in current cache and has not been modified
* Shared: cache line is in multiple caches
* Invalid: cache line is unused
what guarantees does C++ give you when a data race occurs?
sequencial consistency
- what are the default ordering guarantees for std::atomic
non-atomic loads and stores are not reordered around atomics
- how many bytes can be atomically handled on x86-64?
atomic operations only work on 1, 2, 4, or 8 byte data that is aligned
- what memory ordering does x86-64 use?
Total Store Order