Out of Order Pipelines (Memory) Flashcards
What are the 4 sections of program data? What is stored in each of them?
i. text (program code, plus read-only literals)
ii. data (global variables)
iii. heap (dynamically allocated memory and large structures)
iv. stack (local variables AND spilled registers)
What is an address generation unit (AGU)?
It is a dedicated functional unit that is used to compute the effective address (EA) before performing loads/stores
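As a rough illustration of what an AGU computes, the sketch below assumes a base + index * scale + displacement addressing mode and a 32-bit address space. Neither is specified in these cards, and `effective_address` is a hypothetical name.

```python
def effective_address(base, index=0, scale=1, displacement=0, width=32):
    """Return base + index*scale + displacement, wrapped to `width` bits.

    The addressing mode and the 32-bit default are assumptions made for
    illustration; real ISAs differ in which terms they support.
    """
    mask = (1 << width) - 1
    return (base + index * scale + displacement) & mask


# Example: element 3 of a 4-byte array at 0x1000, plus an 8-byte field offset.
ea = effective_address(0x1000, index=3, scale=4, displacement=8)
```

A dedicated unit matters because this add-and-shift work would otherwise compete with ordinary arithmetic for the ALUs.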
How do you speed up the calculation of effective addresses in out of order pipelines?
Using an address generation unit (AGU)
How are stores carried out in out-of-order pipelines?
- Calculate EA
- Put EA + data in store buffer
- After commit, retire to D-cache when D-cache is idle
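The store steps above can be sketched as a small model. The class and method names are hypothetical, the D-cache is modelled as a plain dictionary, and "D-cache is idle" is represented simply by the caller choosing when to call `retire_one`.

```python
from collections import deque


class StoreBuffer:
    """Toy store buffer: entries are kept in program order, oldest first."""

    def __init__(self):
        self.entries = deque()

    def finish(self, ea, data):
        # Step 2: a finished store parks its EA and data in the buffer.
        self.entries.append({"ea": ea, "data": data, "committed": False})

    def commit_oldest(self):
        # Mark the oldest not-yet-committed store as committed.
        for entry in self.entries:
            if not entry["committed"]:
                entry["committed"] = True
                return True
        return False

    def retire_one(self, dcache):
        # Step 3: only a committed store at the head may write the D-cache.
        if self.entries and self.entries[0]["committed"]:
            entry = self.entries.popleft()
            dcache[entry["ea"]] = entry["data"]
            return True
        return False
```

Keeping the write out of the D-cache until after commit is what makes a finished store reversible.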
How are loads carried out in out-of-order pipelines?
- Calculate EA
- Read data from D-cache or SB
- Broadcast data on forwarding bus
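A minimal sketch of the "D-cache or SB" choice in the load steps above. It assumes the store buffer is a list of `(ea, data)` pairs for older stores in program order, that the youngest matching store wins, and that addresses match exactly (no partial overlaps). The function name and the returned source label are illustrative only.

```python
def load(ea, store_buffer, dcache):
    """Return (data, source) for a load from effective address `ea`."""
    # Search youngest to oldest so the most recent store to `ea` is used.
    for store_ea, data in reversed(store_buffer):
        if store_ea == ea:
            return data, "SB"
    # No pending store to this address: read the D-cache (0 if absent here).
    return dcache.get(ea, 0), "D-cache"


dcache = {0x20: 1, 0x30: 5}
pending_stores = [(0x20, 2), (0x20, 3)]  # oldest first
forwarded = load(0x20, pending_stores, dcache)
from_cache = load(0x30, pending_stores, dcache)
```

The SB has to be checked because a store that has finished but not yet retired is not visible in the D-cache.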
How are exceptions handled in memory operations of out of order pipelines?
When a faulting instruction reaches the head of the ROB, stores that have finished but not yet committed are flushed from the store buffer, while stores that have already committed are permitted to retire to the D-cache
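The flush can be pictured as a filter over the store buffer. In this sketch each entry is a dictionary with a `committed` flag, and `flush_uncommitted` is a hypothetical helper rather than a real hardware interface.

```python
def flush_uncommitted(store_buffer):
    """Drop finished-but-uncommitted stores; committed ones stay to retire."""
    return [store for store in store_buffer if store["committed"]]


store_buffer = [
    {"ea": 0x10, "data": 1, "committed": True},   # older than the fault
    {"ea": 0x14, "data": 2, "committed": False},  # younger, speculative
]
store_buffer = flush_uncommitted(store_buffer)
```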
What is the difference between a finish and a commit in out-of-order execution?
On a finish, the result is only broadcast on the forwarding bus. On a commit, it is saved to the destination register.
What are the steps for speculative loading?
- Calculate EA
- Use data from the most recent aliased store, i.e. a store to the same address (otherwise get data from the D-cache)
- Add to finished load buffer (FLB)
- On commit, remove from FLB
What are the steps for speculative storing?
- Calculate EA
- Put EA + data in SB
- On commit, if an aliased load is present in the FLB, squash it
- Retire to D-cache when idle
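The speculative load and store steps can be tried out together in a toy model. Everything here is an assumption made for illustration: the class and method names, the `seq` program-order numbers, exact-address aliasing, and the D-cache as a dictionary. Since loads leave the FLB when they commit, any aliased load still in the FLB when a store commits is treated as younger than that store.

```python
class SpeculativeLSU:
    def __init__(self, dcache):
        self.dcache = dcache
        self.sb = []   # finished stores
        self.flb = []  # finished, uncommitted loads

    def finish_store(self, seq, ea, data):
        self.sb.append({"seq": seq, "ea": ea, "data": data, "committed": False})

    def finish_load(self, seq, ea, tag):
        # Use the most recent older aliased store, else the D-cache.
        older = [s for s in self.sb if s["ea"] == ea and s["seq"] < seq]
        if older:
            data = max(older, key=lambda s: s["seq"])["data"]
        else:
            data = self.dcache.get(ea, 0)
        self.flb.append({"seq": seq, "ea": ea, "tag": tag})
        return data

    def commit_load(self, seq):
        self.flb = [l for l in self.flb if l["seq"] != seq]

    def commit_store(self, seq):
        store = next(s for s in self.sb if s["seq"] == seq)
        store["committed"] = True
        # Squash every aliased load still in the FLB; return their tags.
        squashed = [l["tag"] for l in self.flb if l["ea"] == store["ea"]]
        self.flb = [l for l in self.flb if l["ea"] != store["ea"]]
        return squashed

    def retire_store(self):
        # Called when the D-cache is idle: write the oldest committed store.
        committed = [s for s in self.sb if s["committed"]]
        if not committed:
            return False
        oldest = min(committed, key=lambda s: s["seq"])
        self.dcache[oldest["ea"]] = oldest["data"]
        self.sb.remove(oldest)
        return True


lsu = SpeculativeLSU({0x40: 1})
stale = lsu.finish_load(seq=2, ea=0x40, tag="p7")  # runs ahead of the store
lsu.finish_store(seq=1, ea=0x40, data=9)
squashed = lsu.commit_store(seq=1)                 # finds the aliased load
lsu.retire_store()
replayed = lsu.finish_load(seq=2, ea=0x40, tag="p7")
```

Because `commit_store` squashes on an address match alone, a load that had already picked up the store's data by forwarding would be squashed as well.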
What does the “tag” in the FLB represent?
The PRF (physical register file) number
What is the disadvantage of speculative loads?
Valid loads may be squashed
What is the disadvantage of using software prefetching?
The compiler needs to know where in the code to insert the prefetch instructions
What is streaming (with reference to prefetchers)?
Loading consecutive blocks
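A small sketch of what "consecutive blocks" means in address terms. The 64-byte block size, the prefetch depth of 2, and the function name are assumptions chosen for the example.

```python
def stream_prefetch(miss_addr, block_size=64, depth=2):
    """Return the start addresses of the next `depth` blocks after a miss."""
    block = miss_addr // block_size
    return [(block + i) * block_size for i in range(1, depth + 1)]


next_blocks = stream_prefetch(0x1234)
```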
What is striding (with reference to prefetchers)?
Loads with a regular stride (e.g. reading a column of an array stored in row-major order)
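To see where the regular stride comes from, the sketch below lists the addresses touched when walking one column of a row-major array and then checks that they are evenly spaced. The function names, base address, and array shape are all illustrative.

```python
def column_addresses(base, rows, cols, elem_size, col):
    """Addresses of array[r][col] for each row r of a row-major array."""
    return [base + (r * cols + col) * elem_size for r in range(rows)]


def detect_stride(addresses):
    """Return the constant gap between successive addresses, or None."""
    deltas = {b - a for a, b in zip(addresses, addresses[1:])}
    return deltas.pop() if len(deltas) == 1 else None


# 4 rows x 8 columns of 4-byte elements, reading column 2.
addrs = column_addresses(0x1000, rows=4, cols=8, elem_size=4, col=2)
stride = detect_stride(addrs)  # one full row: 8 * 4 bytes
```

Once the gap is known, a stride prefetcher can request the last address plus the stride ahead of the demand load.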
What is a problem with prefetching into the cache? How is it solved?
An erroneous prefetch “pollutes” the cache with unneeded data, reducing the hit rate. This can be solved by prefetching into a separate prefetch buffer and moving a block into the cache only if the program actually demands it
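The prefetch-buffer idea can be sketched with two sets of block numbers. This model assumes unlimited capacity and no eviction, which real hardware does not have, and the class and method names are hypothetical.

```python
class CacheWithPrefetchBuffer:
    def __init__(self):
        self.cache = set()
        self.prefetch_buffer = set()

    def prefetch(self, block):
        # Prefetched blocks wait outside the cache, so a wrong guess
        # cannot displace useful cache contents.
        if block not in self.cache:
            self.prefetch_buffer.add(block)

    def access(self, block):
        if block in self.cache:
            return "cache hit"
        if block in self.prefetch_buffer:
            # Demanded by the program: promote the block into the cache.
            self.prefetch_buffer.remove(block)
            self.cache.add(block)
            return "prefetch buffer hit"
        self.cache.add(block)
        return "miss"


mem = CacheWithPrefetchBuffer()
mem.prefetch(5)   # useful guess
mem.prefetch(9)   # erroneous guess, never demanded
first = mem.access(5)
second = mem.access(5)
```

After these accesses block 9 is still only in the prefetch buffer, so the erroneous guess never occupied a cache line.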