L20 & L21 Flashcards
How have performance optimisations, such as out-of-order execution (OOE), affected the correctness of a program?
Loads and stores that are issued out of order reach memory out of order. As a result, different cores can observe updates in different orders, so memory becomes inconsistent across cores.
What causes memory inconsistency in optimised code?
Out-of-order stores
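A toy illustration of the two cards above, not a model of any real processor. It enumerates every interleaving of a writer core (store data, then store flag) and a reader core (load flag, then load data). The names WRITER, READER, interleavings and outcomes are made up for this sketch, and the only reordering it assumes is that the writer's two stores may swap.

```python
from itertools import permutations

# Writer core, in program order: data = 1, then flag = 1.
WRITER = [("store", "data", 1), ("store", "flag", 1)]
# Reader core, in program order: read flag, then read data.
READER = [("load", "flag"), ("load", "data")]

def interleavings(a, b):
    """Yield every merge of a and b that keeps each sequence's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def outcomes(writer_orders):
    """Collect every (flag, data) pair the reader can observe."""
    seen = set()
    for writer in writer_orders:
        for schedule in interleavings(writer, READER):
            memory = {"data": 0, "flag": 0}
            observed = {}
            for op in schedule:
                if op[0] == "store":
                    memory[op[1]] = op[2]
                else:
                    observed[op[1]] = memory[op[1]]
            seen.add((observed["flag"], observed["data"]))
    return seen

in_order = outcomes([WRITER])                        # stores reach memory in program order
out_of_order = outcomes(list(permutations(WRITER)))  # stores may also swap

print((1, 0) in in_order)      # False
print((1, 0) in out_of_order)  # True
```

The outcome (flag = 1, data = 0) means the reader saw the "ready" flag but stale data. It only becomes possible once the stores are allowed to reach memory out of order.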
What are the 3 components of a memory consistency model?
Clear definition of how stores become visible
Clear definition of what is allowed and what is not
Clear context for HW and SW optimisations
Explain what the release consistency model is.
Release consistency: contract between programmer and system
* Shared access is protected by synchronisation
* Critical section is assumed atomic
* The order of operations inside a critical section has no externally visible effects
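A minimal sketch of the contract above, assuming Python's threading.Lock as a stand-in for the acquire and release synchronisation operations. SharedCounter and run are hypothetical names for this example. Every shared access sits inside the lock, so other threads only ever see the critical section as a whole.

```python
import threading

class SharedCounter:
    """Shared state that is only touched while holding the lock."""

    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def increment(self):
        with self.lock:        # acquire on entry, release on exit
            self.value += 1    # critical section: read, add, write back

def run(num_threads=4, iterations=10_000):
    counter = SharedCounter()

    def worker():
        for _ in range(iterations):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

total = run()
print(total)  # 40000
```

Because the programmer keeps their side of the contract (all shared accesses are synchronised), the system is free to reorder operations inside the critical section without changing the result.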
What are the 2 types of memory fence/barrier?
Full memory fence: all memory operations are complete when the fence completes
Store memory fence: all store operations are complete when the fence completes; loads are not constrained
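The difference between the two fence types can be sketched by counting which reorderings of a small program each fence still permits. This is a simplified model with two assumptions that are not in the notes: a core may otherwise reorder its operations freely, and "complete when the fence completes" is modelled as "may not move across the fence". legal_orders and the operation tuples are hypothetical names.

```python
from itertools import permutations

def legal_orders(program, blocked_kinds):
    """Return the reorderings of `program` in which no operation whose kind
    is in `blocked_kinds` moves across a fence."""
    fence_at = [i for i, op in enumerate(program) if op[0] == "fence"]
    legal = []
    for perm in permutations(range(len(program))):
        # Map each original index to its position in the reordered program.
        pos = {idx: p for p, idx in enumerate(perm)}
        ok = all(
            (i < f) == (pos[i] < pos[f])   # same side of the fence as before
            for f in fence_at
            for i, op in enumerate(program)
            if op[0] in blocked_kinds
        )
        if ok:
            legal.append([program[i] for i in perm])
    return legal

FULL = {"load", "store"}    # full fence: orders loads and stores
STORE_ONLY = {"store"}      # store fence: orders stores only

no_fence = [("store", "data"), ("store", "flag")]
fenced = [("store", "data"), ("fence",), ("store", "flag")]
mixed = [("store", "data"), ("load", "x"), ("fence",), ("store", "flag")]

print(len(legal_orders(no_fence, FULL)))      # 2: the stores may swap
print(len(legal_orders(fenced, FULL)))        # 1: program order only
print(len(legal_orders(mixed, FULL)))         # 2: the load stays before the fence
print(len(legal_orders(mixed, STORE_ONLY)))   # 4: the load may move anywhere
```

In this model the store fence keeps the two stores in order but leaves the load free to cross it, which is why it permits more orderings than the full fence.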
What are the 2 different ways to code a memory fence/barrier?
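DMB = data memory barrier (an explicit barrier instruction on Arm)
LDAR = load-acquire / STLR = store-release (ordering attached to the load or store instruction itself)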
What are the differences between coherence and consistency?
Coherence:
Single memory location
All cores see the same updates
Not concerned with the relative ordering of updates to different locations
Done automatically by hardware
Consistency:
About the overall system state: the order in which accesses to different memory locations become visible
Not concerned with keeping a single location coherent
A contract between programmer, compiler, and hardware
What is a way to enforce partial ordering?
Memory fence/barrier
What is functional programming? What is an important property of it?
Functional programming: transforms inputs to outputs by applying a set of functions
Referential transparency is an important property of functional programming
A name can be replaced with its value anywhere in the program without changing the result
No updatable state
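A short sketch of referential transparency, using hypothetical functions area and make_counter. The first is pure, so a call can be swapped for its value. The second keeps updatable state, so the same swap would change the program.

```python
def area(width, height):
    """Pure: the result depends only on the arguments."""
    return width * height

# Referentially transparent: area(3, 4) can be replaced by 12 everywhere.
with_calls = area(3, 4) + area(3, 4)
with_values = 12 + 12
print(with_calls == with_values)  # True

def make_counter():
    """Not pure: the returned function updates hidden state on each call."""
    state = {"n": 0}

    def next_value():
        state["n"] += 1
        return state["n"]

    return next_value

next_value = make_counter()
first = next_value()
second = next_value()
print(first == second)  # False: the same expression gave two different values
```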
What is a higher-level language whose programs can be translated to a dataflow representation?
SISAL: Streams and Iteration in a Single Assignment Language
What are the components of the dataflow representation?
Token queue: buffering + inserting new data
Matching store: tokens matched with instructions
Instruction store: coding of the dataflow graph
Processor bank: a number of processors
How can a dataflow program be represented as a graph?
Instructions = nodes
Data flow = edges
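The four components and the graph view can be tied together in a small simulation. This is a sketch under stated assumptions, not the design of any real dataflow machine: INSTRUCTIONS, run, the token layout (node, port, value) and the "out" sink are all invented for the example, and a single loop plays the role of the processor bank. The graph computes (a + b) * (c - d).

```python
from collections import deque
import operator

# Instruction store: the dataflow graph. Each node holds its operation,
# its number of inputs, and its outgoing edges as (node, port) pairs.
INSTRUCTIONS = {
    "add": (operator.add, 2, [("mul", 0)]),
    "sub": (operator.sub, 2, [("mul", 1)]),
    "mul": (operator.mul, 2, [("out", 0)]),
}

def run(initial_tokens):
    token_queue = deque(initial_tokens)  # buffers tokens, accepts new ones
    matching_store = {}                  # node -> {port: value} awaiting a match
    results = []
    while token_queue:
        node, port, value = token_queue.popleft()
        if node == "out":                # sink: collect the final value
            results.append(value)
            continue
        operands = matching_store.setdefault(node, {})
        operands[port] = value
        func, arity, destinations = INSTRUCTIONS[node]
        if len(operands) == arity:       # all inputs present: the node fires
            del matching_store[node]
            result = func(*(operands[p] for p in range(arity)))  # "processor"
            for dest_node, dest_port in destinations:
                token_queue.append((dest_node, dest_port, result))
    return results

# a = 1, b = 2, c = 7, d = 3
tokens = [("add", 0, 1), ("add", 1, 2), ("sub", 0, 7), ("sub", 1, 3)]
print(run(tokens))  # [12]
```

An instruction runs as soon as its operands have arrived, not when a program counter reaches it, so feeding the same tokens in a different order gives the same result.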
What are examples of pure (“coarse-grained”) data flow?
Task-based computing
OpenMP tasking
OpenStream
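The task-based idea can be sketched in Python with the standard-library graphlib module (Python 3.9+). This is only an analogy for how task dependencies drive execution, not OpenMP or OpenStream code. run_tasks and the task names are hypothetical. Each task is a whole function rather than a single instruction, and it becomes ready once all the tasks it depends on are done.

```python
from graphlib import TopologicalSorter

def run_tasks(tasks):
    """tasks maps a name to (function, list of dependency names)."""
    sorter = TopologicalSorter({name: deps for name, (_, deps) in tasks.items()})
    sorter.prepare()
    results, order = {}, []
    while sorter.is_active():
        # get_ready() returns the tasks whose inputs are all available.
        for name in sorter.get_ready():
            func, deps = tasks[name]
            results[name] = func(*(results[d] for d in deps))
            order.append(name)
            sorter.done(name)   # may make dependent tasks ready
    return results, order

tasks = {
    "load_a": (lambda: 4, []),
    "load_b": (lambda: 6, []),
    "sum": (lambda a, b: a + b, ["load_a", "load_b"]),
    "double": (lambda s: 2 * s, ["sum"]),
}

results, order = run_tasks(tasks)
print(results["double"])  # 20
```

The two load tasks are ready at the same time, so a real runtime could run them in parallel; this sketch runs them one after another for simplicity.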