Cache and Cache Coherency Flashcards
Shared Memory vs Distributed Memory
Shared Memory:
Pros:
1. Implicit coordination of processors through shared data structures.
2. Appealing programming model for many programmers.
3. Generally suitable for systems with a small number of processors.
Cons:
1. Scaling interconnect can be costly.
2. Conflicts over access to the bus increase dramatically with more processors.
3. Large crossbars, while efficient, are expensive.
Distributed Memory:
Pros:
1. Relatively inexpensive interconnects like hypercube and toroidal mesh.
2. Well-suited for systems with thousands of processors.
3. Better for problems requiring vast amounts of data or computation.
Cons:
1. Requires explicit message passing for coordination.
2. More complex programming model for many programmers.
3. Not as suitable for small-scale systems with few processors.
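The contrast between the two programming models can be sketched in Python: threads coordinate implicitly through a shared, locked data structure, while a distributed-memory style is simulated here with explicit messages through a queue (all names are illustrative, not from the flashcards).

```python
import threading
import queue

# Shared memory: threads coordinate implicitly through a shared
# data structure (a counter guarded by a lock).
counter = 0
lock = threading.Lock()

def shared_worker(n):
    global counter
    for _ in range(n):
        with lock:
            counter += 1

workers = [threading.Thread(target=shared_worker, args=(1000,)) for _ in range(4)]
for t in workers:
    t.start()
for t in workers:
    t.join()

# Distributed memory: no shared structure; coordination happens
# only through explicit messages (simulated here with a queue).
mailbox = queue.Queue()

def sender(n):
    for _ in range(n):
        mailbox.put(1)   # explicit message
    mailbox.put(None)    # sentinel: end of stream

def receiver():
    total = 0
    while (msg := mailbox.get()) is not None:
        total += msg
    return total

t = threading.Thread(target=sender, args=(1000,))
t.start()
received = receiver()
t.join()
print(counter, received)
```

The shared-memory half never sends a message; the distributed half never touches a shared variable, which is exactly the trade-off the two lists above describe.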
Modified State
- cache block contains the current value
- all other copies are invalid (stale)
Shared State
- cache block has not been modified
- cache block contains the current value
- all other copies also contain the current value
Invalid State
- cache block does not contain the most recent value of the memory block
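The three states and the invariant they imply (at most one Modified copy, and never a Modified copy alongside a Shared one) can be captured in a small sketch; the enum and helper names are illustrative.

```python
from enum import Enum

class MSI(Enum):
    MODIFIED = "M"   # this copy is current; every other copy is invalid
    SHARED = "S"     # this copy is current and unmodified; others may share it
    INVALID = "I"    # this copy is stale

def invariant_ok(copies):
    """Check the MSI invariant across all cached copies of one block:
    at most one copy in M, and never M alongside S."""
    m_count = sum(1 for s in copies if s is MSI.MODIFIED)
    any_shared = any(s is MSI.SHARED for s in copies)
    return m_count <= 1 and not (m_count == 1 and any_shared)
```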
Bus Read
Generated by a read operation on a memory block that is not in the local cache
Bus Read Exclusive
- Generated by a write operation to a memory block that is not in the local cache, or is in the local cache but not in state M
- Memory provides the most recent value
- All other copies are marked invalid
Write Back
The cache controller writes a block marked M back to main memory
Local Write from M, Cache Miss
- Processor Write (PrWr)
- Cache Miss (the local block being replaced is in state M)
- Flush your modified values back to main memory (Write Back)
- Bus Read Exclusive (BusRdX) fetches the block from main memory
- Perform the write; the block is now in state M
Local Write from I, Cache Miss
- Processor Write (PrWr)
- Cache Miss
- Flush the holder's values to main memory
- Mark the holder Invalid
- Bus Read Exclusive (BusRdX) fetches the block from main memory
- Perform the local write
- Mark the current processor's copy Modified (M)
Read from I, Cache Miss
- Processor Read (PrRd)
- Cache Miss
- Flush the holding processor's block to main memory
- Mark that processor's copy and this one Shared (S)
- Bus Read (BusRd) fetches the block
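Taken together, the transition cards above can be sketched as a single-block MSI simulator. All names are illustrative; since the model tracks only one block, the eviction write-back from the "Local Write from M" card is folded into the flush helper rather than modeled separately.

```python
from enum import Enum

class St(Enum):
    M = "M"
    S = "S"
    I = "I"

class System:
    """Single-block MSI simulator following the flashcard transitions."""
    def __init__(self, n_cores):
        self.mem = 0
        self.state = [St.I] * n_cores
        self.val = [None] * n_cores

    def _flush(self, p):
        # Write Back: the M holder writes its value to main memory
        self.mem = self.val[p]

    def read(self, p):
        if self.state[p] is St.I:          # Read from I: cache miss
            for q, s in enumerate(self.state):
                if s is St.M:
                    self._flush(q)         # flush the holding processor
                    self.state[q] = St.S   # ... and make it Shared
            self.state[p] = St.S           # BusRd: this copy becomes Shared
            self.val[p] = self.mem
        return self.val[p]

    def write(self, p, v):
        if self.state[p] is not St.M:      # write miss / upgrade: BusRdX
            for q, s in enumerate(self.state):
                if q != p and s is not St.I:
                    if s is St.M:
                        self._flush(q)     # the holder flushes its values
                    self.state[q] = St.I   # ... and is marked Invalid
        self.val[p] = v
        self.state[p] = St.M               # the writer's copy becomes M
```

A short trace, matching the cards: core 0 writes (I to M), core 1 reads (core 0 flushes, both become S), core 1 writes (core 0 invalidated, core 1 becomes M).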
Directory Based Cache Coherence
- Directory-based cache coherence protocols attempt to solve the scalability problem of broadcast-based coherence through the use of a data structure called a directory.
- The directory stores the status of each cache line.
- Typically, this data structure is distributed; in our example, each core/memory pair might be responsible for storing the part of the structure that specifies the status of the cache lines in its local memory.
- Thus when a line is read into, say, core 0’s cache, the directory entry corresponding to that line would be updated, indicating that core 0 has a copy of the line.
- When a variable is updated, the directory is consulted, and the cache controllers of the cores that have that variable’s cache line in their caches will invalidate those lines.
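A minimal sketch of the directory idea, using one flat dict instead of a structure distributed across core/memory pairs (all names are illustrative):

```python
directory = {}                    # line address -> set of cores holding a copy
caches = {0: {}, 1: {}, 2: {}}    # per-core cache: line address -> value
memory = {0x40: 10}               # main memory

def dir_read(core, line):
    # Reading a line into a cache updates its directory entry,
    # recording that this core now has a copy.
    caches[core][line] = memory[line]
    directory.setdefault(line, set()).add(core)
    return caches[core][line]

def dir_write(core, line, value):
    # On an update, consult the directory and invalidate the copy
    # held by every other core listed in the entry.
    for other in directory.get(line, set()) - {core}:
        caches[other].pop(line, None)
    directory[line] = {core}
    caches[core][line] = value
    memory[line] = value
```

Because only the cores named in the directory entry are contacted, no bus-wide broadcast is needed, which is what makes the scheme attractive at scale.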
Snooping Cache Coherence
The idea behind snooping comes from bus-based systems: When the cores share a bus, any signal transmitted on the bus can be “seen” by all the cores connected to the bus. Thus when core 0 updates the copy of x stored in its cache, if it also broadcasts this information across the bus, and if core 1 is “snooping” the bus, it will see that x has been updated, and it can mark its copy of x as invalid. This is more or less how snooping cache coherence works.
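The broadcast-and-snoop interaction described above can be sketched as follows (illustrative names; real controllers snoop in hardware, not via callbacks):

```python
class Bus:
    def __init__(self):
        self.snoopers = []

    def broadcast(self, sender, addr):
        # Any signal on the bus is "seen" by every attached cache.
        for cache in self.snoopers:
            if cache is not sender:
                cache.snoop(addr)

class SnoopyCache:
    def __init__(self, bus):
        self.lines = {}              # addr -> value
        self.bus = bus
        bus.snoopers.append(self)    # attach to the shared bus

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.broadcast(self, addr)   # announce the update on the bus

    def snoop(self, addr):
        self.lines.pop(addr, None)       # mark our copy of addr invalid
```

When core 0 writes x, core 1's snoop handler fires and drops its now stale copy, exactly the sequence the paragraph above walks through.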