Caching Flashcards
What are two kinds of cache locality?
spatial locality
temporal locality
What is fully associative mapping in caches? What are the advantages and disadvantages?
Fully associative mapping is when any block of memory can be loaded into any line in the cache.
Advantage: Most flexible, high hit rate
Disadvantage: Most expensive since all the tags need to be checked in parallel
What is direct mapping in caches? What are the advantages and disadvantages?
A block in memory can only be placed in 1 particular line given by (block number % no. of lines)
Disadvantage: Least flexible
Advantage: Least expensive (need to only check 1 tag per access)
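The placement rule above can be sketched in a few lines of Python. This is only an illustration: the function name direct_mapped_line and the 8-line cache are assumptions made for the example.

```python
def direct_mapped_line(block_number: int, num_lines: int) -> int:
    """Return the single cache line a block may occupy in a direct-mapped cache."""
    return block_number % num_lines


# Hypothetical 8-line cache: blocks 3, 11 and 19 all compete for the same line.
lines = [direct_mapped_line(b, 8) for b in (3, 11, 19)]
print(lines)  # [3, 3, 3]
```

Because all three blocks land on line 3, loading one evicts another, which is the inflexibility the card describes.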
How do you find the number of offset bits in the address?
no. of offset bits = log2(block size)
How do you find the number of index bits in an address?
For direct-mapped:
no. of index bits = log2(no. of lines)
For set-associative:
no. of index bits = log2(no. of sets)
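A small Python sketch of the two formulas, assuming power-of-two sizes and byte-addressable memory. The helper names (log2_exact, offset_bits, index_bits) and the example cache geometry are made up for illustration.

```python
def log2_exact(n: int) -> int:
    """log2 for a power of two; rejects other values."""
    if n <= 0 or n & (n - 1):
        raise ValueError("expected a power of two")
    return n.bit_length() - 1


def offset_bits(block_size_bytes: int) -> int:
    return log2_exact(block_size_bytes)


def index_bits(num_lines: int, ways: int = 1) -> int:
    # ways=1 is direct-mapped, so no. of sets == no. of lines.
    return log2_exact(num_lines // ways)


# Hypothetical cache: 64-byte blocks, 256 lines.
print(offset_bits(64))         # 6
print(index_bits(256))         # 8 (direct-mapped)
print(index_bits(256, ways=4)) # 6 (4-way set-associative, 64 sets)
```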
What 3 components is an address broken down into when searching caches?
- tag
- index
- offset
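One way to picture the breakdown is with shifts and masks. The sketch below assumes the usual layout (tag in the high bits, then index, then offset in the low bits); split_address and the bit widths are illustrative choices, not part of the original card.

```python
def split_address(addr: int, offset_bits: int, index_bits: int):
    """Split an address into (tag, index, offset)."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


# Hypothetical cache with 6 offset bits and 8 index bits.
tag, index, offset = split_address(0x12345678, offset_bits=6, index_bits=8)
print(hex(tag), hex(index), hex(offset))  # 0x48d1 0x59 0x38
```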
What is the correlation between the no. of pages and the size of the page table (in rows/no. of entries)?
The no. of virtual pages = no. of rows in the page table (for a single-level page table)
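As a worked example, the number of entries can be derived from the virtual address width and the page size. This sketch assumes a single-level page table and a power-of-two page size; the function name and the 32-bit / 4 KiB figures are illustrative.

```python
def page_table_entries(virtual_address_bits: int, page_size_bytes: int) -> int:
    """No. of virtual pages, i.e. rows in a single-level page table."""
    page_offset_bits = page_size_bytes.bit_length() - 1  # log2 for a power of two
    return 1 << (virtual_address_bits - page_offset_bits)


# Hypothetical machine: 32-bit virtual addresses, 4 KiB pages.
print(page_table_entries(32, 4096))  # 1048576 (2**20)
```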
What is common among blocks in a set?
(Block number % no. of sets) is the same, i.e. the block index is the same. As a result, they occupy the same set and we only need to search the ways within that set.
Explain the pseudo LRU replacement method.
There is 1 bit reserved for each way in a set. When a way is accessed, set its bit.
If all bits in a set are 1, reset all but the last accessed
Replace a block with an unset bit
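The three steps above can be sketched for a single set as follows. This is a minimal model, not a hardware description: the class name BitPLRUSet is hypothetical, and picking the lowest-numbered way with an unset bit as the victim is an assumption (the card only says "a block with an unset bit").

```python
class BitPLRUSet:
    """One cache set with a single pseudo-LRU bit per way."""

    def __init__(self, ways: int):
        if ways < 2:
            raise ValueError("need at least 2 ways")
        self.bits = [0] * ways

    def access(self, way: int) -> None:
        self.bits[way] = 1
        if all(self.bits):
            # Every bit is set: clear all but the way just accessed.
            self.bits = [0] * len(self.bits)
            self.bits[way] = 1

    def victim(self) -> int:
        # Assumption: choose the first way whose bit is unset.
        return self.bits.index(0)


s = BitPLRUSet(4)
for way in (0, 1, 2):
    s.access(way)
print(s.victim())  # 3
s.access(3)        # all bits would be 1, so only way 3 stays set
print(s.bits)      # [0, 0, 0, 1]
```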
LRU is the best cache replacement policy for _______________
small caches
LRU and random replacement are similar for __________________
large caches
Rank the replacement policies from best to worst for small caches and for large caches.
Small: LRU, FIFO, Random
Large: LRU and Random, FIFO
What are the two types of write policies? Define each one
- Write-through: Every time the cache is modified, the write is also sent to the lower level.
- Write-back: Only modify the cache and set the dirty bit. The block is written back to the lower level when replaced (only if dirty bit is set)
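The difference between the two policies can be sketched with a single cache line and a dictionary standing in for the lower level. Everything here (the Line class, the function names, the addresses) is a simplified assumption for illustration.

```python
from dataclasses import dataclass


@dataclass
class Line:
    data: int = 0
    dirty: bool = False


def write_through(line: Line, addr: int, value: int, lower: dict) -> None:
    line.data = value
    lower[addr] = value      # every write is forwarded immediately


def write_back(line: Line, value: int) -> None:
    line.data = value
    line.dirty = True        # lower level is now stale


def evict(line: Line, addr: int, lower: dict) -> None:
    if line.dirty:           # only dirty blocks are written back
        lower[addr] = line.data
        line.dirty = False


lower_level = {}
wt, wb = Line(), Line()
write_through(wt, 0x40, 1, lower_level)
write_back(wb, 2)
print(lower_level)           # {64: 1}
evict(wb, 0x80, lower_level)
print(lower_level)           # {64: 1, 128: 2}
```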
What is the advantage of write-through cache policy?
Simplifies coherency
List one way of measuring memory performance
Count the number of stalls at commit
What is a “compulsory” miss?
The first time a block is accessed, it won’t be in the cache. It will need to be added and will always be a miss
What is a “capacity” miss?
The program’s working set doesn’t fit in the cache, so blocks are evicted and later fetched again
What is a “conflict” miss?
A miss that occurs because a set holds only a finite number of blocks, so a block was evicted by another block mapping to the same set. It wouldn’t occur in a fully associative cache of the same size
(When blocks share a set)
What is a “coherence” miss?
A miss on a block that was invalidated because another core is writing/has written to it
What stages can a cache lookup be broken into in order to make a pipelined cache?
set select, tag search, data read, return data
How can increasing block size lead to cache misses?
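For a fixed cache size, larger blocks mean fewer blocks fit in the cache. This can lead to conflict misses (fewer lines, or lower associativity if the number of sets is fixed) or capacity misses (space wasted on unused words within blocks).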
How does critical word first work?
If the requested address falls in the middle of a block, the memory burst starts from the requested word and provides the rest of the block, wrapping around to the front of the block. The resulting burst order could look something like 4, 5, 6, 7, 0, 1, 2, 3
The cache then reassembles the block in order
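The wrap-around burst order and the reassembly step can be sketched like this. The function names and the 8-word block are assumptions for the example.

```python
def critical_word_first_order(requested_word: int, words_per_block: int) -> list:
    """Word indices in the order the burst delivers them."""
    return [(requested_word + i) % words_per_block for i in range(words_per_block)]


def reassemble(burst: list, order: list) -> list:
    """Place each arriving word back at its position in the block."""
    block = [None] * len(order)
    for word_index, word in zip(order, burst):
        block[word_index] = word
    return block


order = critical_word_first_order(4, 8)
print(order)  # [4, 5, 6, 7, 0, 1, 2, 3]

burst = [f"w{i}" for i in order]   # words as they arrive from memory
print(reassemble(burst, order))    # ['w0', 'w1', 'w2', 'w3', 'w4', 'w5', 'w6', 'w7']
```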
What is early restart?
The requested data is passed to the processor as soon as it arrives (while the block is still filling)
What is the most common virtual memory and cache combo?
VI/PT (Virtually Indexed, Physically Tagged)
What is a problem with VI/PT virtual memory and cache?
Synonym problem: a block shared by multiple processes is assigned different virtual addresses, so it can end up cached in more than one place
What is the homonym problem?
A problem with VI/VT caches: the same virtual tag can reference multiple memory blocks.
What is Content Addressable Memory used for? How does it work?
CAM is used for parallel tag checking in high associativity caches.
It uses matchlines that are precharged high. A search term (and its complement) is broadcast on the searchlines. If there is a mismatch, the cell discharges the matchline to ground
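A software model of the matchline behaviour may help. Note that real CAM hardware compares every row at once; the loops below only simulate that sequentially. The function name cam_search and the 4-bit tags are illustrative assumptions.

```python
def cam_search(stored_tags: list, search_tag: int, tag_bits: int) -> list:
    """Return one matchline value per row (True = still high = match)."""
    matchlines = [True] * len(stored_tags)       # precharge every matchline high
    for row, stored in enumerate(stored_tags):
        for bit in range(tag_bits):
            if ((stored >> bit) & 1) != ((search_tag >> bit) & 1):
                matchlines[row] = False          # a mismatching cell discharges the line
                break
    return matchlines


print(cam_search([0b1010, 0b0110, 0b1010], 0b1010, tag_bits=4))
# [True, False, True]
```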
What is the benefit to physically tagged, physically indexed? Drawback?
Benefit: Simple
Drawback: Slow due to translation
What is the benefit to virtually indexed, physically tagged? Drawback?
Benefit: Faster since we can do cache lookup while translating
Drawback: Synonym problem
What is the synonym problem w.r.t. virtual address translation?
Different processes sharing the same block use different virtual addresses for it, which can lead to duplicated copies of the block in the cache
How is the synonym problem solved?
Page coloring. The OS assigns virtual page numbers such that the lower-order bits used for the index are the same for all synonyms, so they map to the same set
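A rough sketch of the idea: the "color" of a page is the part of the cache index that lies above the page offset. Two virtual addresses for the same block are safe if their colors match. The function name page_color and the cache geometry (4 KiB pages, 64-byte blocks, 256 sets) are assumptions chosen for the example.

```python
def page_color(vaddr: int, page_offset_bits: int,
               index_bits: int, block_offset_bits: int) -> int:
    """Index bits that come from the virtual page number rather than the page offset."""
    color_bits = max(0, index_bits + block_offset_bits - page_offset_bits)
    return (vaddr >> page_offset_bits) & ((1 << color_bits) - 1)


# Hypothetical geometry: 12 page-offset bits, 8 index bits, 6 block-offset bits,
# so 2 index bits spill into the virtual page number.
geometry = dict(page_offset_bits=12, index_bits=8, block_offset_bits=6)

print(page_color(0x05000, **geometry))  # 1
print(page_color(0x19000, **geometry))  # 1 -> same color, maps to the same set
print(page_color(0x06000, **geometry))  # 2 -> different color, could duplicate the block
```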
What is the homonym problem?
The same virtual tag can reference multiple memory blocks
How is the homonym problem solved?
Adding an address space ID (ASID) or PID to the tags in order to differentiate them
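A toy model of the fix: the lookup key becomes the pair (ASID, virtual tag) instead of the virtual tag alone. The dictionary-based cache and the names fill and lookup are assumptions for illustration only.

```python
def fill(cache: dict, asid: int, vtag: int, data: str) -> None:
    cache[(asid, vtag)] = data


def lookup(cache: dict, asid: int, vtag: int):
    """Hit only if both the ASID and the virtual tag match; None means miss."""
    return cache.get((asid, vtag))


cache = {}
fill(cache, asid=1, vtag=0x4A, data="block of process 1")
fill(cache, asid=2, vtag=0x4A, data="block of process 2")

print(lookup(cache, 1, 0x4A))  # block of process 1
print(lookup(cache, 2, 0x4A))  # block of process 2
print(lookup(cache, 3, 0x4A))  # None
```

Without the ASID in the key, the second fill would overwrite the first and process 1 would read process 2's data.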