6.0 The memory system Flashcards
What is the memory wall?
The processor is much faster than memory, so fetching from memory limits performance
How is memory latency hidden?
Using a memory hierarchy
What does a memory hierarchy provide?
Hide latency
illusion of infinite capacity
relatively low cost
What are the components in the memory hierarchy?
Register file (SRAM)
L1 cache (on-chip) (SRAM)
L2 cache (on-chip) (SRAM)
Main memory (DRAM)
Hard drive (Magnetic)
Why do memory hierarchies work?
Programs have locality
Spatial, temporal
Move frequently used data and instructions close to the processor (in registers - done by compiler, in cache - done by hardware).
What is temporal locality?
Accesses to the same location are likely to occur in near time
What is spatial locality?
Accesses to nearby locations are likely to happen in near time
What is a cache?
Consists of a number of equally sized locations (blocks/lines)
What are the three types of caches?
Direct mapped
Fully associative
Set associative
What is a direct mapped cache?
Each block address is mapped to exactly one cache location
index = block_addr % block_count
Tag: the remaining part of the address, stored with the block and compared on a lookup to detect a hit
More efficient lookup, but more conflict misses
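A minimal C sketch of the direct-mapped mapping above (block size and block count are assumed values for illustration):

    #include <stdint.h>

    #define BLOCK_SIZE  64     /* bytes per cache block (assumed)         */
    #define BLOCK_COUNT 256    /* number of blocks in the cache (assumed) */

    /* Split an address into the fields a direct-mapped cache uses. */
    void direct_mapped_lookup(uint64_t addr, uint64_t *index, uint64_t *tag)
    {
        uint64_t block_addr = addr / BLOCK_SIZE;   /* drop the block offset     */
        *index = block_addr % BLOCK_COUNT;         /* exactly one possible slot */
        *tag   = block_addr / BLOCK_COUNT;         /* stored with the block and
                                                      compared on every access  */
    }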
What is a fully associative cache?
All blocks can be stored anywhere in the cache.
No index
Need to search the whole cache
What is a set associative cache?
Each block can map to n locations (ways) in the cache. The index selects a set, and the tags of all n ways in that set must be compared to know whether the address is stored in the cache
sets = block_count / n
index: block_address % set_count
less prone to location conflicts compared to direct mapped caches
What does the address used to access the cache consist of?
Block address (tag + index)
Block offset: addresses individual bytes within a block
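A small C sketch of splitting an address into block offset, index and tag (this also covers the set-associative case; the geometry values are assumptions for illustration):

    #include <stdint.h>
    #include <stdio.h>

    /* Assumed geometry: 64 B blocks, 128 sets (n-way set associative). */
    #define BLOCK_SIZE 64
    #define SET_COUNT  128

    int main(void)
    {
        uint64_t addr = 0x12345678;

        uint64_t block_offset = addr % BLOCK_SIZE;       /* byte within the block */
        uint64_t block_addr   = addr / BLOCK_SIZE;       /* tag + index           */
        uint64_t index        = block_addr % SET_COUNT;  /* selects the set       */
        uint64_t tag          = block_addr / SET_COUNT;  /* compared in every way */

        printf("offset=%llu index=%llu tag=%llu\n",
               (unsigned long long)block_offset,
               (unsigned long long)index,
               (unsigned long long)tag);
        return 0;
    }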
What are the trade offs when choosing block size?
Large blocks:
- Greater opportunity for spatial locality
- More likely that large portions of the block are unreferenced before the block is evicted
- Take longer time to read into a cache - increased miss penalty
What is the 3C miss classification?
Compulsory misses
Capacity misses
Conflict misses
What are compulsory misses?
Misses that still would occur in an infinite sized cache.
I.e. the first access to a block
What are capacity misses?
Misses that would occur even if the cache was fully associative.
The amount of data accessed within a time frame is larger than the cache capacity
What are conflict misses?
Misses that occur because the cache is not fully associative.
What are separate caches?
Used when we know the caches have specific functions (i.e. data and instruction caches)
Instructions have the same format, and are only read
Often used for L1 caches. Instructions have very good locality, so a split L1 captures most of the potential, meaning we don't need split caches further down the memory hierarchy.
Avoids structural hazards for simple pipelines
What is a write buffer?
When a cache needs to write, it writes the data into the write buffer.
This write buffer can continue writing into memory in the background, so that the pipeline does not need to stall.
What do we need to think about when designing caches?
- How to deal with writes
- How to allocate space in cache for writes
- How to allocate space in cache for reads
- What block to evict on a miss
- Should we restrict what data can be placed in higher level caches?
- How many concurrent misses should we support?
What are the different ways of handling writes?
Write-back
Write-through
Write-allocate
Write-no-allocate
What is write-back?
Update the cached value, if it is there, and write the cache value back to memory at a later time (when the block is evicted).
Betting that there will be multiple writes to the same cache line, so it is better to do one big write instead of multiple small ones.
Tradeoff: Saves bandwidth, but adds complexity
What is write-through?
Write updated data to lower-level memory on all writes, in addition to updating cache.
What is write-allocate?
Insert written value into cache on a write miss
What is write-no-allocate?
Bypass cache on a write miss.
If the value is not in cache, write straight to the lower level memory.
What are typical combinations of write-handling?
Write-back + write-allocate
Write-through + write no-allocate
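A rough C sketch contrasting the two common combinations on a write (the line structure and the helpers mentioned in the comments are assumptions, not a full cache model):

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        bool     valid;
        bool     dirty;        /* only meaningful for write-back */
        uint64_t tag;
        uint8_t  data[64];
    } cache_line_t;

    /* Write-back + write-allocate: on a miss, fetch the block into the cache,
       then update only the cache and mark the line dirty; memory is updated
       when the line is eventually evicted. */
    void write_back_allocate(cache_line_t *line, uint64_t tag, int off, uint8_t byte)
    {
        if (!line->valid || line->tag != tag) {
            /* miss: allocate - write back the old line if dirty, then fetch
               the requested block from the next level (helpers not shown) */
            line->valid = true;
            line->tag   = tag;
        }
        line->data[off] = byte;
        line->dirty = true;            /* memory is now stale until eviction */
    }

    /* Write-through + write-no-allocate: every write goes to the next level;
       the cache is only updated if the block already happens to be there. */
    void write_through_no_allocate(cache_line_t *line, uint64_t tag, int off, uint8_t byte)
    {
        if (line->valid && line->tag == tag)
            line->data[off] = byte;    /* keep the cached copy consistent */
        /* the write to the next level / write buffer happens here on every
           write (helper not shown) */
    }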
How are read allocations handled?
Allocate-on-fill
Allocate-on-miss
What is allocate on fill?
Insert the requested data into the cache, when the miss returns from lower levels.
You can still hit in all ways while the miss is being serviced (may improve performance)
Higher complexity
What is allocate on miss?
On a miss, immediately evict a block and reserve its location for the data the load will return.
What is a replacement policy, and name three examples?
Decides what to evict on a miss.
In a fully- or set-associative cache, there is more than one cache line that can be evicted.
Random, FIFO, LRU
What is the Random replacement policy?
Choose a random block
Less effective, very simple
What is the FIFO replacement policy?
Evict block inserted longest ago
More complexity: need to remember insertion order
Medium complexity, but more effective
What is the LRU replacement policy?
Evict least-recently-used
Need to remember the reuse order, which is updated on every access
When you access a block, it is moved to the top of the stack/queue. On a miss, evict the tail of the queue within the set
Most complex, most effective.
Can be expensive in high-associative caches
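A minimal C sketch of true LRU for one set, keeping the ways ordered by recency (the associativity is an assumed value):

    #include <stdint.h>

    #define WAYS 4    /* assumed associativity */

    /* Recency stack for one set: lru_order[0] is the most recently used way,
       lru_order[WAYS-1] is the least recently used (the eviction victim). */
    typedef struct {
        uint8_t lru_order[WAYS];
    } lru_state_t;

    /* On every access (hit or fill), move the touched way to the front.
       Assumes 'way' is present in the stack. */
    void lru_touch(lru_state_t *s, uint8_t way)
    {
        int i = 0;
        while (s->lru_order[i] != way)
            i++;
        for (; i > 0; i--)
            s->lru_order[i] = s->lru_order[i - 1];
        s->lru_order[0] = way;
    }

    /* On a miss, the victim is the way at the tail of the recency stack. */
    uint8_t lru_victim(const lru_state_t *s)
    {
        return s->lru_order[WAYS - 1];
    }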
What are inclusive caches?
Any block cached at level n is also cached at the level n+1 cache
Pro:
This simplifies coherence: if a block is not in level n, it is not in level n-1 either
Con:
- Reduces cache utilization, as we are in practice storing redundant copies.
- Adds some overhead
What are exclusive caches?
Any block in level n cache, is not in level n+1 cache.
Simplifies coherence - we know a block is in at most 1 level
Does not reduce cache utilization, as blocks are stored in at most 1 cache
Are all caches either inclusive or exclusive?
No, an option is to not enforce inclusivity or exclusivity.
Let the cache coherence deal with the fact that any block can be cached anywhere in the hierarchy
What are non-blocking caches?
The key component is the MSHR
MSHR:
- Miss Status/Information Holding Register
- On a miss, go into the miss handling architecture (MHA)
- Ask if there is an MSHR outstanding for this block address
- If there is, a miss for this block is already in progress, and you allocate an entry in a space called target information, recording that this request also wants a word from this cache block.
- When the miss returns, the cache goes through the target information and responds to all the recorded requests
The cache can service as many misses as there are MSHRs - selecting the number of MSHRs is critical
If the cache runs out of MSHRs it needs to block and cannot handle either hits or misses, because there is nowhere to record the new miss
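A simplified C sketch of MSHR lookup and merging on a miss (the MSHR count, target count and struct fields are assumptions; real MSHRs track more state):

    #include <stdbool.h>
    #include <stdint.h>

    #define MSHR_COUNT  8    /* assumed; choosing this number is the critical design point */
    #define MAX_TARGETS 4    /* assumed number of merged requests per MSHR */

    typedef struct {
        bool     valid;
        uint64_t block_addr;                  /* address of the outstanding miss        */
        int      n_targets;
        uint64_t target_offsets[MAX_TARGETS]; /* which words in the block to return     */
    } mshr_t;

    /* Returns true if the miss was accepted (new or merged), false if the cache
       must block because all MSHRs (or target slots) are in use. */
    bool handle_miss(mshr_t mshrs[MSHR_COUNT], uint64_t block_addr, uint64_t offset)
    {
        /* 1. Is there already an outstanding miss for this block? Then merge. */
        for (int i = 0; i < MSHR_COUNT; i++) {
            if (mshrs[i].valid && mshrs[i].block_addr == block_addr) {
                if (mshrs[i].n_targets == MAX_TARGETS)
                    return false;                       /* no room to record the target */
                mshrs[i].target_offsets[mshrs[i].n_targets++] = offset;
                return true;
            }
        }
        /* 2. Otherwise allocate a free MSHR and send the miss to the next level. */
        for (int i = 0; i < MSHR_COUNT; i++) {
            if (!mshrs[i].valid) {
                mshrs[i].valid = true;
                mshrs[i].block_addr = block_addr;
                mshrs[i].n_targets = 1;
                mshrs[i].target_offsets[0] = offset;
                return true;
            }
        }
        return false;   /* out of MSHRs: the cache has to block */
    }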
What are the benefits of non-blocking caches?
Enables servicing cache hits while a miss is being resolved (hit-under-miss policy)
Enables hiding memory latency by servicing multiple misses concurrently (memory-level parallelism, MLP)
Enables merging concurrent requests to the same cache block
What is virtual memory?
Every process sees a 32/64-bit address space. The physical address space is much smaller.
Virtual addresses need to be mapped to physical ones
Because every process has the illusion that it owns the complete address space, time sharing and demand paging are much easier.
What is time sharing?
Being able to switch between processes at runtime
What is demand paging?
Hardware figures out if a page is available in main memory or not.
If it is not, it might not have been allocated yet. In this case, the software needs to load it from disk.
This is very expensive, so instead of blocking the processor, we generate an exception (page fault).
When a page fault occurs, the OS schedules the process out, finds the page to be replaced, and loads the new page from disk. Meanwhile, as this is non-blocking I/O, other processes run.
When the page is ready the process is set as runnable again, and can be scheduled.
How does address translation work?
Take the virtual address and do the address translation in hardware.
From the address translation we get a physical address that is used to access main memory.
Memory is allocated in fixed-size chunks called pages (4KB or 8KB)
What are pages?
Memory is allocated in fixed-size chunks called pages (4KB or 8KB)
What is the page table?
Key data structure to support virtual memory.
Maintained by the OS
Two types:
- Forward page tables
- Inverted page tables
What are forward page tables?
Have a pointer from every virtual page to a physical page.
This is inefficient, as the virtual address space is much bigger than the physical one
What are inverted page tables?
One entry for each physical page in memory
What is the TLB (translation lookaside buffer)?
Hardware implementation of inverted page tables.
Small hardware structure that is highly associative, with typically 64 entries. Essentially a small cache for the page table in memory.
Often have separate I-TLBs and D-TLBs
What happens on a TLB miss?
Need to grab the address translation from the page table that is stored in memory (this can be done both in HW and SW)
Note that sometimes the page table might be too large to be kept in memory - then the OS might need to be involved to get it resolved.
What is one important thing when using virtual memory?
Important that a process cannot access physical memory that is not allocated to it.
Because of this, we must perform the address translation before we send the request to physical memory.
How does memory access work with virtual addresses to physical ones?
The tag and index of the virtual address are used to access the TLB
The TLB gives a physical page number.
Combine this physical page number with the page offset from the virtual address; this gives the physical address
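A small C sketch of the translation step (4 KB pages assumed; tlb_lookup is a hypothetical stand-in for the real TLB lookup):

    #include <stdint.h>

    #define PAGE_SIZE 4096u    /* assumed 4 KB pages */

    /* Stand-in for the real TLB: a single hard-wired mapping so the sketch is
       self-contained (real hardware does an associative lookup, and misses
       fall back to the page table). */
    static uint64_t tlb_lookup(uint64_t vpn)
    {
        return vpn + 100;      /* pretend virtual page n maps to physical page n+100 */
    }

    uint64_t translate(uint64_t vaddr)
    {
        uint64_t page_offset = vaddr % PAGE_SIZE;   /* unchanged by translation */
        uint64_t vpn         = vaddr / PAGE_SIZE;   /* virtual page number      */
        uint64_t ppn         = tlb_lookup(vpn);     /* physical page number     */
        return ppn * PAGE_SIZE + page_offset;       /* physical address         */
    }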
What does a virtual address entry contain?
Tag: Used to access TLB
Index: Used to access TLB
Page offset: Address within page
What does a physical address entry contain?
Cache Tag
Cache Index
Block offset
All used to access cache
What is memory protection?
When running multiple processes, we have pages from multiple processes in memory. We don't want processes to modify each other's memory.
The different processes might use the same virtual addresses, but the OS makes sure they map to different parts of physical memory.
Processes might share some memory (e.g. a shared library). In this case we don't want a copy in each process's memory; instead, the virtual addresses will map to the same part of memory.
What are virtually indexed, physically tagged caches?
Make sure the index bits and block offset of the physical address fall within the page offset of the virtual address.
The page offset is the same in the virtual and the physical address.
In this case we can look up the block in the cache (using the index and block offset taken from the page offset of the virtual/physical address) at the same time as the TLB lookup happens using the index and tag from the virtual address.
The physical page number from the TLB provides the physical tag. Compare this tag to the tags read from the cache and determine if we had a hit.
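A tiny C sketch of the constraint that makes virtually indexed, physically tagged caches work (the geometry values are assumptions):

    #include <assert.h>

    /* Assumed geometry. */
    #define PAGE_SIZE  4096u
    #define BLOCK_SIZE 64u
    #define SET_COUNT  64u

    int main(void)
    {
        /* VIPT requires that index + block offset fit inside the page offset,
           i.e. set_count * block_size <= page_size. Then those bits are equal
           in the virtual and the physical address, so the cache can be indexed
           with the virtual address while the TLB translates the page number in
           parallel; the TLB's physical page number is compared against the tags. */
        assert((unsigned long)SET_COUNT * BLOCK_SIZE <= PAGE_SIZE);
        return 0;
    }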
What are the three steps of memory operations?
Address calculation
Translation
Mem access
How are loads done in out-of-order processors?
Loads need to wait for the register with the address info
How are stores done in out-of-order processors?
Need to wait for 2 registers (address + value)
What is the execution flow of loads/stores in out-of-order processors?
Address calculations are done in the FUs
Second stage: Then access TLB to get translation.
Third stage:
- Load: Read value from cache/mem and write to reg
- Store: Write value to store queue. Later into store buffer on completion. Yet later written to cache/mem on retirement
If we were to use virtually indexed, physically tagged caches, we would do these checks on virtual addresses.
How does speculative execution affect loads/stores?
Loads: Do not handle page fault unless we are on correct path
Stores: Because of in-order completion, memory state is not updated speculatively
What hazards can occur on out-of-order memory operations?
RAW: loads may read data before a prior store has written its value
WAW and WAR cannot occur because writes happen in program order
How can we accelerate loads?
Load bypassing
Load forwarding
What is load bypassing?
Move load before prior stores
What is load forwarding?
Take the value being stored and forward it to the dependent load without accessing memory.
How can out-of-order stores be implemented in hardware?
Stores are first in reservation station
Stores are then issued to a store-unit
When a store has both the address and the data, the store is finished.
When the store becomes the oldest instruction in the ROB it becomes completed
Whenever the cache is ready to receive the store it gets retired, and the data and address get written to the cache.
How can out-of-order loads be implemented in hardware?
Loads are first in reservation station
Loads are then issued to a load-unit.
The load sits in the load buffer while the address is being looked up in cache.
The cache then returns with the data that is written to a register.
When the load becomes the oldest instruction in the ROB and is ready to commit, the register will change into an architecturally visible register.
Loads are often given priority above stores, as they often are on the critical path.
Why do bypassing and forwarding cause complexity?
The most recent value may not be in cache/memory, but in the store queue/buffer
Need to search the store queue/buffer for the most recent value of that memory location
If we hit in the store queue, we forward the data to the load
Needs comparators to search the store queue
Possibly multiple stores with the same address; need to pick the most recent one
This is a Content Addressable Memory (CAM)
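A simplified C sketch of the store queue search a load-forwarding implementation needs (sizes and the entry layout are assumptions; hardware does this associatively in a CAM rather than with a loop):

    #include <stdbool.h>
    #include <stdint.h>

    #define SQ_ENTRIES 16      /* assumed store queue size */

    typedef struct {
        bool     valid;
        bool     addr_known;   /* address has been computed   */
        bool     data_known;   /* value to store is available */
        uint64_t addr;
        uint64_t data;
    } sq_entry_t;

    typedef enum { SQ_NO_MATCH, SQ_FORWARD, SQ_MUST_WAIT } sq_result_t;

    /* Entries are assumed to be held in program order, index 0 oldest. Search
       from the youngest store older than the load down to the oldest; the
       first address match holds the most recent value for that location. */
    sq_result_t forward_from_store_queue(const sq_entry_t sq[SQ_ENTRIES],
                                         int youngest_older_than_load,
                                         uint64_t load_addr, uint64_t *value)
    {
        for (int i = youngest_older_than_load; i >= 0; i--) {
            if (sq[i].valid && sq[i].addr_known && sq[i].addr == load_addr) {
                if (!sq[i].data_known)
                    return SQ_MUST_WAIT;   /* match, but the data is not ready yet */
                *value = sq[i].data;       /* load forwarding: skip cache/memory   */
                return SQ_FORWARD;
            }
        }
        return SQ_NO_MATCH;                /* read from cache/memory instead       */
    }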
What is a potential problem when doing loads/stores out-of-order?
Store prior to a dependent load might not have executed
Store might be in the reservation station or in flight in execution
In these cases, the value may not yet be in queue/buffer
Even worse, the address may not even be known yet
How do we implement out-of-order loads in hardware to solve the potential problems when it is dependent on a prior store?
Add a structure called the finished load buffer
When a load has gotten the data from cache/memory, the address and data are put in the finished load buffer.
When the load completes, i.e. is the oldest instruction, we can remove it from the finished load buffer.
In the meanwhile, when stores complete we search through the finished load buffer for a load with the same address as the store.
If we find that the load was dependent on the store and has loaded the wrong data, then the load and all instructions that were dependent on it, i.e. used the loaded data, will be rolled back.
These are then re-executed with the correct data. This is expensive, so we can implement some prediction to avoid this
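A simplified C sketch of the check a completing store performs against the finished load buffer (sizes and fields are assumptions):

    #include <stdbool.h>
    #include <stdint.h>

    #define FLB_ENTRIES 16     /* assumed finished load buffer size */

    typedef struct {
        bool     valid;
        uint64_t addr;         /* address the finished load read from     */
        uint64_t seq;          /* age tag giving the load's program order */
    } flb_entry_t;

    /* Called when a store completes: if a younger load to the same address has
       already finished, it read stale data, so it and its dependents must be
       rolled back and re-executed. */
    bool store_violates_finished_load(const flb_entry_t flb[FLB_ENTRIES],
                                      uint64_t store_addr, uint64_t store_seq)
    {
        for (int i = 0; i < FLB_ENTRIES; i++) {
            if (flb[i].valid && flb[i].addr == store_addr && flb[i].seq > store_seq)
                return true;   /* ordering violation: trigger rollback/replay */
        }
        return false;
    }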