Structure of work, patterns, paradigms Flashcards
task
A task ti is a sequential computation (strand) that will not be analyzed further. A task requires input data (from other tasks) and produces output data (to other tasks). A task cannot be executed before the input data are available.
Task data dependencies are captured by a DAG
ready node
node in dag without predecessor
schedule
assignment of tasks to processors (scheduling is np-hard)
span or depth
Definition: “Span” or “depth” is
T∞(n) = ∑heaviest path: Wi(n)
Lower bounds on achievable parallel time
- Work law: Tp(n) ≥ W(n)/p (≥ Tseq(n)/p)
- Depth law: Tp(n) ≥ T∞(n)
- Parallelism (speed-up range, max speed-up): W(n)/T∞(n)
Sp(n) = Tseq(n)/Tp(n) ≤ Tseq(n)/T∞(n) ≤ W(n)/T∞(n)
Higher lower bound: Work-depth analysis:
One processor working on T∞, the remainder p-1 on the remaining, perfectly parallelizable work
Tp(n) ≥ (W(n)-T∞(n))/(p-1)
greedy dag scheduler
Greedy DAG scheduler:
Assign as many tasks as possible in each step to the available
processors.
Assume p processors are given. A step in which at least p tasks are ready or being executed is called a complete step. A step in which less that p tasks are ready or being executed is called incomplete.
Any greedy scheduler executes a DAG with work W and depth T∞ in Tp ≤ W/p+T∞ time steps
Corollary:
The running time Tp achieved by a greedy scheduler is within a factor of 2 from the time that can be achieved by an optimal schedule
Bernstein’s Independence
Bernstein’s Independence (sufficient) Conditions:
The fragments Pi and P j are independent and can be
executed in parallel if
I input, O output
- Oi intersection Ij = Ø (Flow dependency, true dependency)
- Ii intesection Oj = Ø (Anti dependency, Pj overrides input of Pi)
- Oi intersection Oj= Ø (output dependency)
Dependencies between loop iterations are called loop carried dependencies.
todo
ab s74