Tree Computations Flashcards
How do you represent a tree using the array pool we used for linked lists?
- Number the nodes and convert the undirected edges to directed edges that point towards node parents
- Create an array P which stores the parent of each node (if node 1’s parent is 8, then P[1] = 8). P[root] = NULL
Describe the sequential algorithm for finding the root of a tree, assuming you have the array pool representation?
if n < 1: return 0
node <- pick_any_node(P[:])
while P[node] != 0:
node = P[node]
return node
i.e., Pick a random node, then follow the pointers to the root (node with no parent).
What is the worst case running time of the sequential algorithm for finding the root of a tree?
O(n)
It’s linear because in the worst case we could have an extremely unbalanced tree, where each node has one child, and we might randomly pick the very bottom node in the tree.
Describe the algorithm for finding the root of a tree in parallel, given an array pool representation?
Inspired by jump lists, explore all nodes in parallel and change each node to point to its grandparent (rather than parent) if the grandparent exists. Iterating this process, eventually all nodes will point to their root.
We need two auxiliary functions:
hasGrandparent(k, P[:])
return (k > 0) && (P[k] > 0) && (P[P[k]] > 0)
i.e., make sure that neither node k, k’s parent, nor k’s grandparent are a NULL, i.e., k has a grandparent
adopt(P[:], G[:])
par-for i=1:n
if hasGrandparent(i, P[:])
G[i] = P[P[i]]
else
G[i] = P[i]
i.e., the jump step. for every node, check if it has a grandparent. if yes, set G[node] to that grandparent, otherwise set to parent
Overall algorithm:
findRoots(P[1:n], R[1:n])
P_cur = P
P_next = empty list of size n
for i=1 to ceil(log n) }
adopt(P_cur, P_next)
P_cur = P_next
}
R = P_cur
When finding the root of a tree in parallel, why can’t you immediately terminate the search as soon as you realize some node has no grandparent (i.e., its parent is the root)?
There may be more than one root! i.e., this algorithm works on forests, not just trees.
Is the parallel root-finding algorithm work-optimal?
No! The adopt() function requires O(n) work, and it is repeated ceil(log n) times. So I guess the overall algorithm is O(n log n).
Given a linked list of size n, Wyllie’s algorithm for parallel list-ranking has work O(n log n). We can try to make this work-optimal (get it down to O(n)) by “shrinking” the list to size m < n then running Wyllie on that. This gives us work O(m log m).
What choice of m would make O(m log m) = O(n)?
n/logn
We can prove this by plugging it in:
m log m = (n/logn) log(n/logn) = (n/logn) (log n - log^2 n)
= n - (log^2n / logn) = O(n)
What is an independent set (with respect to a linked list)?
A set of nodes such that none of them are successors of each other, i.e.,
A set I of vertices such that i in I => N[i] is NOT in I
Describe the randomized algorithm for finding independent sets in parallel
Basically, flip a coin for each node (in parallel of course). Then, for each node, check if its successor has the same value (heads or tails): if so, change it (symmetry breaking)! Bam, done.
ParIndSet(N[1:n], I[:])
let C[1:n] and C_hat[1:n] be space for coins
par-for i=1 to n:
C[i] = coin flip (H or T)
C_hat = C (make a copy)
par-for i=1 to n:
if (C_hat[i] == H) and (N[i] != NULL) and (C_hat[N[i]] == H):
C[i] = T
gatherIf(1:n, C[1:n]) (where heads counts as TRUE, i.e. we gather all heads that remain after symmetry breaking)
What are the work and span of our algorithm for finding independent sets in parallel?
W(n) = O(n) (each step is O(n))
D(n) = O(log n) (each step is ultimately a par-for, which is O(log n))
Given a list of size n, how many nodes wind up in the independent set on average?
n/4
My reasoning: On average, half the nodes will be heads when you flip the coins. Then for each of those nodes that came up the heads, the probability that their successor is heads is ALSO 1/2. 1/2 * 1/2 = 1/4.
Prof’s reasoning: For each coin there are four possible outcomes for itself and it’s successor. These are HH, HT, TH, TT. In the case of HH we will change it to TH. So that leaves only one case out of four in which a node will be heads at the end of the process.
What is the (three-step) “trick” for work-optimal parallel list-scan/prefix-sum?
1) Shrink list to size m = O(n/logn)
2) Run Wyllie’s algorithm on the shrunken list: O(m log m) = O(n)
3) Restore full list and ranks
How do we shrink the list (for work-optimal list scan)?
Use the independent set!
1) Initialize our rank array like for Wyllie’s, 0 for the list head and 1 for everything else.
2) Then identify an independent set (by flipping coins and removing double heads)
3) Remove the independent set by rewiring nodes around those in the independent set (jump list!) and updating ranks accordingly, propagating the ranks of nodes in the ind set to their successors.
4) Repeat the last two steps until you have m = n / log n nodes.
5) Now, all remaining nodes should have the correct ranks! Run the above process backwards to fill in the ranks of other nodes (see lecture notes for pseudocode and more explanation!)
How many times do you have to shrink the list to get it to the desired size, O(n / log n)?
O(log log n)
Our method for finding an independent set finds a set that is roughly size n/4. After we remove this set, that leaves a set with an expected size of 3/4 * n. So we can express the expected size of the remaining set after k passes as (3/4)^k * n, and we want to find k such that:
(3/4)^k *n <= n / log n
This implies (3/4)^k <= 1 / log n
Which implies log n <= (4/3)^k
and log log n <= roughly k
Describe the sequential postorder numbering algorithm
First of all, in postorder traversal is when you visit the left subtree, then the right subtree, then the root.
The algorithm is recursive. We need the root, a vector V to hold the numbering, and an initial count v_0 (probably zero). Basically do DFS and count the nodes as you go, starting with the bottom left most node.
postorder(root, V[1:n], v_0):
v = v_0
foreach child of root:
v = postorder(child, V[:], v) + 1
V[root] = v
return v
(You should add preorder traversal when you have time)