Intro To The Work-Span Model Flashcards
Dynamic Multithreading Model
Two parts:
- A computation can be represented by a directed acyclic graph, in which each node is an operation and each edge a dependence.
- Pseudocode notation, or programming model, for writing down the algorithm. Designed so that “when you execute one of these algorithms, it generates a DAG, at least conceptually.”
From the point of view of exploiting parallelism, what is the key property of a “good” DAG?
It has few dependencies.
What separation does the pseudo code notation do?
It separates how to produce work (algorithm) from how to schedule and execute it (hardware).
What is a scheduling problem?
The problem of how to take free units of work and assign them to processors.
Given a complete binary tree with n leaves, how many levels does it have?
log(n)
What is work?
Total number of nodes/verticies in a multithreaded DAG model
What is span?
The length of the longest path through the multithreaded DAG model (from start to end). Also called “depth”.
Notated D(n)
What is another name for the longest path through a DAG?
Critical path
How much time does it take to execute all operations in DAG with one processor, assuming all ops have unit cost?
W(n), i.e., total work
How much time does it take to execute all operations in DAG with infinite processors, assuming all ops have unit cost?
D(n)
Average available parallelism
W(n)/D(n)
Average work per critical path vertex. This is the amount of parallelism we can take advantage of at each critical path vertex, and is a good starting point for choosing an appropriate number of processors.
Span Law
T_p(n) >= D(n)
i.e., Span is the lower bound on the execution time given p processors.
Work Law
T_p(n) >= ceiling(W(n)/p)
Another lower bound on the execution time given p processors. This time it’s total work divided by number of processors.
Work-Span Law
Both the Work Law and Span Law must be true simultaneously, so we can combine them.
T_p(n) >= max{D(n), ceiling(W(n)/p)}
This gives a lower bound on execution time.
How do you derive Brent’s Theorem (which provides an upper bound on execution time of a DAG)?
1) Break up the DAG into k phases, where each phase meets three conditions:
- Each phase has STRICTLY one critical path vertex.
- All vertices within a phase that are NOT on the critical path must be independent, i.e., have no dependencies between them.
- Each vertex must belong to some phase.
2) Each phase k has a number of vertices in it, W_k (including critical path vertex)
3) By condition three of our phase conditions, the sum of W_k for all k = W, total work for the DAG.
4) How long does it take to execute phase k? t_k = ceiling(W_k/p)
5) 4 implies that the total time T_p = sum from k=1 to D of ceiling(W_k/p)
6) We can algebraically rearrange the ceiling of the quotient in 5 to turn it into a floor: total time T_p = sum from k=1 to D of floor( (W_k-1) / p) + 1
7) floor(x) is always <= x, so T_p <= sum from k=1 to D of (W_k-1)/p + 1
8) 7 implies that T_p <= (W - D)/p + D! Brent’s Theorem!