6.0 The memory system Flashcards
What is the memory wall?
The processor is much faster than memory, so fetching from memory becomes the bottleneck
How is memory latency hidden?
Using a memory hierarchy
What does a memory hierarchy provide?
Hide latency
Illusion of infinite capacity
Relatively low cost
What are the components in the memory hierarchy?
Register file (SRAM)
L1 cache (on-chip) (SRAM)
L2 cache (on-chip) (SRAM)
Main memory (DRAM)
Hard drive (Magnetic)
Why do memory hierarchies work?
Programs have locality
Spatial, temporal
Move frequently used data and instructions close to the processor (in registers - done by compiler, in cache - done by hardware).
What is temporal locality?
Accesses to the same location are likely to occur close together in time
What is spatial locality?
Accesses to nearby locations are likely to occur close together in time
What is a cache?
A small, fast memory that consists of a number of equally sized locations (blocks/lines)
What are the three types of caches?
Direct mapped
Fully associative
Set associative
What is a direct mapped cache?
Each block address is mapped to exactly one cache location
index = block_addr % block_count
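Tag: the remaining upper part of the block address, stored alongside the block and compared on lookup to check that the cached block is the requested one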
More efficient lookup, but more conflict misses
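The mapping rule above can be sketched as a tiny simulator. This is only an illustrative sketch: the function name `simulate_direct_mapped` and the trace are made up for this card, and the cache is modelled as a plain list of tags rather than real hardware.

```python
def simulate_direct_mapped(block_addresses, block_count):
    """Return a list of 'hit'/'miss' outcomes for a direct-mapped cache."""
    # One slot per index; each slot remembers the tag of the block it holds.
    slots = [None] * block_count
    outcomes = []
    for block_addr in block_addresses:
        index = block_addr % block_count   # the only slot this block may use
        tag = block_addr // block_count    # identifies which block is in the slot
        if slots[index] == tag:
            outcomes.append("hit")
        else:
            outcomes.append("miss")
            slots[index] = tag             # replace whatever was there
    return outcomes


# Blocks 0 and 4 both map to index 0 in a 4-block cache, so they evict each other.
trace = [0, 4, 0, 4, 1, 1]
print(simulate_direct_mapped(trace, block_count=4))
# ['miss', 'miss', 'miss', 'miss', 'miss', 'hit']
```

The repeated misses on blocks 0 and 4 are conflict misses: the cache has free slots, but both blocks compete for the same one.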
What is a fully associative cache?
All blocks can be stored anywhere in the cache.
No index
Need to search the whole cache
What is a set associative cache?
Each block can map to n locations (ways) in the cache. The index selects one set, and the tag must be compared against all n ways in that set to determine whether the address is stored in the cache
set_count = block_count / n
index = block_addr % set_count
Less prone to location conflicts than direct mapped caches
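A minimal sketch of an n-way set associative lookup follows. The function name `simulate_set_associative` is hypothetical, and LRU replacement is an assumption made here to have some eviction rule; the cards do not specify one. Real hardware compares the tags of all ways in parallel, which the `tag in ways` check only models.

```python
from collections import OrderedDict


def simulate_set_associative(block_addresses, block_count, n):
    """Return 'hit'/'miss' outcomes for an n-way set associative cache (LRU assumed)."""
    set_count = block_count // n
    # Each set holds up to n tags, ordered from least to most recently used.
    sets = [OrderedDict() for _ in range(set_count)]
    outcomes = []
    for block_addr in block_addresses:
        index = block_addr % set_count     # selects the set
        tag = block_addr // set_count      # compared against every way in the set
        ways = sets[index]
        if tag in ways:
            ways.move_to_end(tag)          # mark as most recently used
            outcomes.append("hit")
        else:
            outcomes.append("miss")
            if len(ways) == n:
                ways.popitem(last=False)   # evict the least recently used way
            ways[tag] = None
    return outcomes


# Blocks 0 and 4 share set 0, but with two ways they can both stay resident.
trace = [0, 4, 0, 4, 1, 1]
print(simulate_set_associative(trace, block_count=4, n=2))
# ['miss', 'miss', 'hit', 'hit', 'miss', 'hit']
```

With n = 1 this behaves like a direct mapped cache, and with n = block_count it behaves like a fully associative one.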
What does a memory address consist of, as seen by the cache?
Block address (tag + index)
Block offset: addresses individual bytes within a block
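As a worked illustration, the split into tag, index and block offset can be done with shifts and masks. The helper `split_address` and the chosen sizes (64-byte blocks, 128 sets) are assumptions picked for the example, and both sizes are assumed to be powers of two.

```python
def split_address(addr, block_size, set_count):
    """Split a byte address into (tag, index, offset).

    Assumes block_size and set_count are powers of two.
    """
    offset_bits = block_size.bit_length() - 1   # log2(block_size)
    index_bits = set_count.bit_length() - 1     # log2(set_count)
    offset = addr & (block_size - 1)            # byte within the block
    block_addr = addr >> offset_bits            # tag + index
    index = block_addr & (set_count - 1)        # which set (or slot) to look in
    tag = block_addr >> index_bits              # what is stored and compared
    return tag, index, offset


print(split_address(0x12345, block_size=64, set_count=128))
# (9, 13, 5)
```

Checking by hand: (9 * 128 + 13) * 64 + 5 = 74565 = 0x12345.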
What are the trade offs when choosing block size?
Large blocks:
- Greater opportunity for spatial locality
- More likely that large portions of the block are unreferenced before the block is evicted
- Take longer time to read into a cache - increased miss penalty
What is the 3C miss classification?
Compulsory misses
Capacity misses
Conflict misses
What are compulsory misses?
Misses that would still occur in an infinitely large cache.
I.e. the first access to a block
What are capacity misses?
Misses (other than compulsory ones) that would occur even if the cache were fully associative.
The amount of data accessed within a time frame is larger than the cache capacity
What are conflict misses?
Misses that occur because the cache is not fully associative.
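The three definitions can be turned into a small classifier. This is a sketch under assumptions not stated on the cards: the real cache is taken to be direct mapped, the fully associative reference cache uses LRU replacement, and the name `classify_misses` is made up for the example.

```python
from collections import OrderedDict


def classify_misses(block_addresses, block_count):
    """Count hits and 3C misses for a direct-mapped cache of block_count blocks."""
    seen = set()                    # every block referenced so far
    direct = [None] * block_count   # the real (direct-mapped) cache
    fully = OrderedDict()           # same-size fully associative cache, LRU order
    counts = {"hit": 0, "compulsory": 0, "capacity": 0, "conflict": 0}
    for block_addr in block_addresses:
        # Reference model: would a fully associative cache have hit?
        fa_hit = block_addr in fully
        if fa_hit:
            fully.move_to_end(block_addr)
        else:
            if len(fully) == block_count:
                fully.popitem(last=False)
            fully[block_addr] = None
        # Real cache lookup and fill.
        index = block_addr % block_count
        dm_hit = direct[index] == block_addr
        direct[index] = block_addr
        if dm_hit:
            counts["hit"] += 1
        elif block_addr not in seen:
            counts["compulsory"] += 1   # first access ever
        elif fa_hit:
            counts["conflict"] += 1     # only missed due to limited associativity
        else:
            counts["capacity"] += 1     # even full associativity would miss
        seen.add(block_addr)
    return counts


print(classify_misses([0, 2, 0, 1, 3, 2, 2], block_count=2))
# {'hit': 1, 'compulsory': 4, 'capacity': 1, 'conflict': 1}
```

In the trace, the second access to block 0 is a conflict miss (block 2 took its slot although the cache was not full), while the second access to block 2 is a capacity miss (three other blocks were touched in between, more than a 2-block cache can hold).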
What are separate caches?
Caches dedicated to a specific function, i.e. separate data and instruction caches
Instructions have the same format, and are only read
Often used for L1 caches. Instructions have very good locality, so a split L1 captures most of the benefit, and there is no need for split caches further down the memory hierarchy.
Avoids structural hazards for simple pipelines
What is a write buffer?
When a cache needs to write, it writes the data into the write buffer.
This write buffer can continue writing into memory in the background, so that the pipeline does not need to stall.
What do we need to think about when designing caches?
- How to deal with writes
- How to allocate space in cache for writes
- How to allocate space in cache for reads
- What block to evict on a miss
- Should we restrict what data can be placed in higher level caches?
- How many concurrent misses should we support?
What are the different ways of handling writes?
Write-back
Write-through
Write-allocate
Write-no-allocate
What is write-back?
Update cached value, if it is there, and write cache value to memory at a different time.
Bets that there will be multiple writes to the same cache line, so it is better to do one large write than multiple small ones.
Tradeoff: Saves bandwidth, but adds complexity
What is write-through?
Write updated data to lower-level memory on all writes, in addition to updating cache.
What is write-allocate?
Insert written value into cache on a write miss
What is write-no-allocate?
Bypass cache on a write miss.
If the value is not in cache, write straight to the lower level memory.
What are typical combinations of write-handling?
Write-back + write-allocate
Write-through + write-no-allocate
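To see why write-back saves bandwidth, the two typical combinations can be compared on a trace of writes. This is a deliberately simplified sketch: the trace contains only writes at block granularity, the cache is fully associative with LRU replacement, dirty blocks are flushed at the end, and the name `count_memory_writes` is hypothetical.

```python
from collections import OrderedDict


def count_memory_writes(writes, block_count, policy):
    """Count writes that reach lower-level memory for a trace of written blocks.

    policy is "write-back" (paired with write-allocate) or
    "write-through" (paired with write-no-allocate).
    """
    if policy == "write-through":
        # Every write is sent to memory, whether or not the block is cached.
        return len(writes)
    if policy != "write-back":
        raise ValueError(f"unknown policy: {policy}")
    cache = OrderedDict()   # cached blocks in LRU order; all are dirty (writes only)
    memory_writes = 0
    for block_addr in writes:
        if block_addr in cache:
            cache.move_to_end(block_addr)   # write hit: update the cache only
        else:
            if len(cache) == block_count:
                cache.popitem(last=False)   # evict LRU block ...
                memory_writes += 1          # ... and write it back, since it is dirty
            cache[block_addr] = None        # write miss: allocate the block
    # Flush the dirty blocks still in the cache at the end of the trace.
    return memory_writes + len(cache)


trace = [0, 0, 0, 1, 1, 0]
print(count_memory_writes(trace, block_count=2, policy="write-back"))     # 2
print(count_memory_writes(trace, block_count=2, policy="write-through"))  # 6
```

Repeated writes to the same block are absorbed by the cache under write-back, so only one write per dirty block reaches memory here.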