Intro Caches Theory Flashcards
What is the main goal of a cache memory?
To increase the performance of a computer through the memory system in order to:
- Give the user the illusion of using a memory that is simultaneously large and fast
- Provide data to the processor at a high rate
Why does memory hierarchy design become more crucial with multicore processors?
Because the aggregate peak bandwidth demand grows as the number of cores increases
What is temporal locality?
When a memory element is referenced, the same memory element tends to be referenced again soon
What is spatial locality?
When a memory element is referenced, other memory elements whose addresses are close by tend to be referenced soon
How do caches exploit both types of locality?
They exploit temporal locality by keeping the contents of recently accessed memory locations,
and spatial locality by fetching blocks of data around recently accessed memory locations
What is the minimum amount of data that can be copied into the cache?
The block, or cache line
What is the relationship between the block size in the cache memory and the word size in main memory, and what is the rationale behind it?
The block size in the cache memory must be a multiple of the word size in main memory. The reason is to better exploit spatial locality.
How do we calculate the number of blocks in a cache?
We divide the cache size by the block size
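As a quick check of the division above, here is a minimal sketch. The function name and the 32 KiB / 64-byte figures are illustrative assumptions, not values from these flashcards.

```python
def num_blocks(cache_size_bytes: int, block_size_bytes: int) -> int:
    """Number of blocks = cache size / block size."""
    if cache_size_bytes % block_size_bytes != 0:
        raise ValueError("cache size must be a multiple of the block size")
    return cache_size_bytes // block_size_bytes


# Hypothetical example: a 32 KiB cache with 64-byte blocks.
print(num_blocks(32 * 1024, 64))  # 512
```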
Define a cache hit
It is when the requested data is found in one of the cache blocks
Define a cache miss and how it is handled
It is when the requested data is not found in any of the cache blocks. On a miss we stall the CPU, request the block from main memory, copy the block into the cache, and repeat the cache access
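The miss-handling steps can be sketched as follows. This is a toy model under stated assumptions: the cache and main memory are plain dictionaries keyed by block address, and the name `access` is hypothetical. Stalling is only noted in a comment, since there is no real CPU here.

```python
def access(cache: dict, memory: dict, block_address: int):
    """Return (block, was_hit) for an access to block_address."""
    if block_address in cache:
        return cache[block_address], True  # hit: data found in the cache
    # Miss: the CPU would stall here while the block is requested
    # from main memory and copied into the cache.
    cache[block_address] = memory[block_address]
    # Repeat the cache access; it now hits.
    block, _ = access(cache, memory, block_address)
    return block, False


memory = {7: "block 7 contents"}  # illustrative main memory
cache = {}                        # empty cache
print(access(cache, memory, 7))   # first access misses
print(access(cache, memory, 7))   # second access hits
```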
What is the hit rate and how do we calculate it?
It is the fraction of memory accesses that find the data in the upper level, relative to the total number of memory accesses.
It is calculated as the number of hits divided by the total number of memory accesses
What is the hit time?
It is the time to access the data in the upper level of the memory hierarchy
Define the miss rate
It is the fraction of memory accesses that do not find the data in the upper level, relative to the total number of memory accesses. It is therefore calculated as the number of misses divided by the total number of memory accesses
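The two ratios can be computed as in this small sketch. The access counts are made-up numbers for illustration only.

```python
def hit_rate(hits: int, accesses: int) -> float:
    """Hits divided by total memory accesses."""
    return hits / accesses


def miss_rate(misses: int, accesses: int) -> float:
    """Misses divided by total memory accesses."""
    return misses / accesses


# Hypothetical counts: 1000 accesses, 950 of which hit.
accesses = 1000
hits = 950
misses = accesses - hits
print(hit_rate(hits, accesses))
print(miss_rate(misses, accesses))
```

Since every access is either a hit or a miss, the two rates always add up to 1.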
Define the miss time and explain how it is calculated
The miss time is the sum of the hit time and the miss penalty, where the penalty is the time needed to access the lower level and to replace the block in the upper level
What is the average memory access time and how is it calculated?
It is the sum of the hit rate times the hit time and the miss rate times the miss time (Hit Rate * Hit Time + Miss Rate * Miss Time). Since Miss Time = Hit Time + Miss Penalty and Hit Rate + Miss Rate = 1, it can be calculated as the hit time plus the miss rate times the miss penalty (Hit Time + Miss Rate * Miss Penalty)
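The sketch below computes the average memory access time in both forms, the weighted sum and the simplified formula, to show that they agree. The function names and the numbers (1-cycle hit time, 5% miss rate, 100-cycle miss penalty) are assumptions chosen for illustration.

```python
def amat_weighted(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Hit Rate * Hit Time + Miss Rate * Miss Time."""
    hit_rate = 1.0 - miss_rate
    miss_time = hit_time + miss_penalty
    return hit_rate * hit_time + miss_rate * miss_time


def amat(hit_time: float, miss_rate: float, miss_penalty: float) -> float:
    """Hit Time + Miss Rate * Miss Penalty."""
    return hit_time + miss_rate * miss_penalty


# Hypothetical values, in clock cycles.
hit_time, miss_rate, miss_penalty = 1.0, 0.05, 100.0
print(amat(hit_time, miss_rate, miss_penalty))           # about 6 cycles
print(amat_weighted(hit_time, miss_rate, miss_penalty))  # same value up to rounding
```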
Based on how the average memory access time is calculated, how can we improve cache performance?
We can improve cache performance by reducing the hit time, the miss rate, or the miss penalty
What is the rationale behind dividing the cache into an instruction cache and a data cache?
Since instructions exhibit higher spatial locality than data, the miss rate of an instruction cache is expected to be much lower than the miss rate of a data cache
Define the validity bit (and its relation to the bootstrap process)
The validity bit indicates whether the content of a cache block (line) is valid or not. At bootstrap, all the entries in the cache are marked as invalid.
Define the cache tag
It contains the value that uniquely identifies the memory address corresponding to the stored data
Where can a block be placed in the upper level? Given the address of the block in main memory, where can the block be placed in the cache?
The correspondence between the memory address and the cache location depends on the cache structure, which can be direct mapped, fully associative, or N-way set associative
Describe the direct-mapped cache and how it works. Describe how the memory address is composed.
In a direct-mapped cache, each memory location corresponds to one and only one cache location.
The block address is composed of a tag and an index. The full memory address also contains a block offset, which selects the specific data requested within the block.
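To make the address composition concrete, here is a minimal sketch that splits a byte address into tag, index, and block offset. It assumes the block size and the number of blocks are powers of two; the function name and the example parameters (16-byte blocks, 64 blocks) are illustrative.

```python
def split_address(address: int, block_size: int, num_blocks: int):
    """Split a byte address into (tag, index, offset) for a direct-mapped cache.

    Assumes block_size and num_blocks are powers of two.
    """
    offset_bits = block_size.bit_length() - 1   # log2(block_size)
    index_bits = num_blocks.bit_length() - 1    # log2(num_blocks)
    offset = address & (block_size - 1)                   # lowest bits
    index = (address >> offset_bits) & (num_blocks - 1)   # middle bits
    tag = address >> (offset_bits + index_bits)           # remaining high bits
    return tag, index, offset


# Hypothetical example: address 0x1234, 16-byte blocks, 64 blocks.
print(split_address(0x1234, 16, 64))  # (4, 35, 4)
```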
Describe how a given address is checked in a direct-mapped cache
First, the cache uses the index of the requested memory address to select the one and only cache line that can hold it. Then it compares the tag stored in that line with the tag of the requested address to check that the stored data corresponds to it. Then it checks the validity bit of the line, and finally it uses the block offset to extract the specific requested data from the block.
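The lookup steps can be sketched with a small software model. This is an illustration under assumptions, not a hardware description: the class and method names are hypothetical, `fill` stands in for the block copy done on a miss, and `read` returns `None` to signal a miss.

```python
from dataclasses import dataclass


@dataclass
class Line:
    valid: bool = False   # validity bit, cleared at start-up
    tag: int = 0
    data: bytes = b""


class DirectMappedCache:
    def __init__(self, num_blocks: int, block_size: int):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.lines = [Line() for _ in range(num_blocks)]

    def _split(self, address: int):
        offset = address % self.block_size
        block_address = address // self.block_size
        index = block_address % self.num_blocks
        tag = block_address // self.num_blocks
        return tag, index, offset

    def fill(self, address: int, block: bytes) -> None:
        """Copy a block into the single line its address maps to."""
        tag, index, _ = self._split(address)
        self.lines[index] = Line(True, tag, block)

    def read(self, address: int):
        tag, index, offset = self._split(address)
        line = self.lines[index]             # 1. index selects the line
        if line.tag == tag and line.valid:   # 2. tag match, 3. validity bit
            return line.data[offset]         # 4. offset selects the byte
        return None                          # miss


cache = DirectMappedCache(num_blocks=4, block_size=4)
print(cache.read(0x24))                      # None: nothing is valid yet
cache.fill(0x24, bytes([10, 11, 12, 13]))
print(cache.read(0x25))                      # 11
print(cache.read(0x14))                      # None: same index, different tag
```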
Describe a fully associative cache and its memory address composition
In a fully associative cache, a memory block can be placed in any position of the cache; therefore, all the cache blocks must be checked when searching for a block.
The memory address is composed of the block address, which consists entirely of the tag used to verify the memory correspondence (there is no index), and the block offset
Describe how a given address is checked in a fully associative cache
The fully associative cache first compares the tag of the requested memory address with the tags of all the cache lines, then it checks the validity bit of the matching line, and finally it uses the block offset to extract the specific data requested
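A minimal sketch of this search follows. It assumes each cache line is modeled as a `(valid, tag, data)` tuple, and the function name is hypothetical. Real hardware compares all tags in parallel; the loop here only models the result.

```python
def fully_associative_lookup(lines, address: int, block_size: int):
    """Return the requested byte, or None on a miss.

    lines: list of (valid, tag, data) tuples, one per cache line.
    """
    tag = address // block_size    # the whole block address is the tag
    offset = address % block_size
    for valid, line_tag, data in lines:      # every line must be checked
        if line_tag == tag and valid:        # tag match, then validity bit
            return data[offset]              # offset selects the byte
    return None


# Hypothetical 2-line cache with 4-byte blocks; only the second line is valid.
lines = [(False, 0, b""), (True, 9, bytes([10, 11, 12, 13]))]
print(fully_associative_lookup(lines, 0x26, 4))  # 12
print(fully_associative_lookup(lines, 0x00, 4))  # None: tag matches an invalid line
```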
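Describe the N-way set associative cache and its memory address composition
The cache is composed of sets, each containing N blocks; the number of sets depends on the cache size and the block size (number of sets = cache size / (block size * N)). A memory block can be placed in any block of the set it maps to, and therefore the search must be done on all the blocks of that set. The memory address is composed of a tag, an index that selects the set, and a block offset.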
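As a quick illustration of how the number of sets follows from the cache size, block size, and associativity, here is a short sketch. The function name and the 32 KiB / 64-byte figures are assumptions.

```python
def num_sets(cache_size_bytes: int, block_size_bytes: int, ways: int) -> int:
    """Number of sets = cache size / (block size * N)."""
    return cache_size_bytes // (block_size_bytes * ways)


# Hypothetical 32 KiB cache with 64-byte blocks (512 blocks in total).
print(num_sets(32 * 1024, 64, 4))    # 128 sets when 4-way set associative
print(num_sets(32 * 1024, 64, 1))    # 512 sets: one block per set, i.e. direct mapped
print(num_sets(32 * 1024, 64, 512))  # 1 set: every block in one set, i.e. fully associative
```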
Describe how a given address is checked in an N-way set associative cache
The cache uses the index in the block address of the requested memory address to identify the set in which that address could be stored. Then, inside this set, it compares the tag of every cache line with the requested tag, checks the validity bit of the matching line, and finally uses the block offset to extract the specific data requested
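These steps can be sketched as below, assuming the cache is modeled as a list of sets, each set being a list of `(valid, tag, data)` tuples. The function name and the example contents are hypothetical.

```python
def set_associative_lookup(sets, address: int, block_size: int):
    """Return the requested byte, or None on a miss.

    sets: list of sets; each set is a list of N (valid, tag, data) tuples.
    """
    num_sets = len(sets)
    offset = address % block_size
    block_address = address // block_size
    index = block_address % num_sets     # index selects the set
    tag = block_address // num_sets
    for valid, line_tag, data in sets[index]:  # search only within that set
        if line_tag == tag and valid:          # tag match, then validity bit
            return data[offset]                # offset selects the byte
    return None


# Hypothetical 2-way cache with 2 sets and 4-byte blocks.
sets = [
    [(False, 0, b""), (False, 0, b"")],
    [(True, 5, bytes([1, 2, 3, 4])), (False, 0, b"")],
]
print(set_associative_lookup(sets, 0x2E, 4))  # 3
print(set_associative_lookup(sets, 0x24, 4))  # None: same set, different tag
```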