Final Study Flashcards
Describe the issues with having an unbalanced tree
The operations become more slow than the expected as the level of unbalance increases, with the worst case for search requiring O(n) time, and O(log n) for a balanced tree.
A plain BST does not ensure balance. The order in which keys are inserted into the tree will affect the balance.
How is balance evaluated?
With Height. Every node in the tree, the heights of the node’s left and right subtrees differs by at most 1.
The BALANCE FACTOR is used to determine whether or not the subtree rooted at that node is height balanced. It is the difference in height between the nodes right and left subtrees.
balanceFactor(N) = height(N.right) - height(N.left)
By convention, the height of a node with no subtrees is -1. This ensures that their parent node will have a balance factor of 0 if it has two children; (-1–1=0)
balanceFactor < 0 is LEFT HEAVY
balanceFactor > 0 is RIGHT HEAVY
AVL Rotations
Rebalance is required in response to an insertion or removal of an element.
Rebalancing starts from the bottom (at the point of change) and proceeds upwards towards the root.
Specifically, after every insertion into or removal from an AVL tree, the tree retraces the path it took to find the location at which to insert or remove a node upwards, towards the root. At each node encountered during this upward traversal of the tree, the AVL tree re-computes the balance factor and performs a rotation if needed:
Moves one node upwards and another downwards in order to restore balance.
Each rotation has a center and a direction.
ROTATION CENTER is the point at which the rotation is performed
LEFT ROTATION (aka R-R rotation) is counterclockwise with the center moving down and the nodes to it’s right moving upwards
RIGHT ROTATION moves clockwise, with the center moving downwards and the nodes to its left moving upwards
Determining when a single AVL rotation is necessary
N = an unbalanced node
C = n’s heavier child
C will have either a balance factor of -1, or 1
If N and C are heavy in the same direction (balance factor has the same sign for both), then a single rotation is needed around N in the opposite direction of N’s heaviness.
These are the situations needing a single rotation:
Determining when a double AVL rotation is necessary
N = an unbalanced node
C = n’s heavier child
C will have either a balance factor of -1, or 1
If N and C are heavy in the opposite directions (balance factor has opposite signs), then a double rotation is necessary.
Pointer updates for AVL single rotations
N is the center of rotation
C is the current unbalanced child of N
N moves down, opposite the direction of rotation, becoming C’s child
This causes C to effectively move into N’s place
N adopts the child of the side that it took from C
- Note that this child will be located on the opposite side of n, compared to where it was on C
Example:
In a right rotation around N, the following things will happen:
N will become the right child of its current left child C.
C’s current right child will become N’s left child.
If N has a parent PN, then C will replace N as PN’s child. Otherwise, if N was the root of the entire tree, C will replace N as the root.
Pointer updates for double rotations
N is the center of rotation
C is the current unbalanced child of N
Importantly, each rotation in a double rotation is always applied in the direction opposite to the heaviness at the rotation’s center. In other words:
To fix a left-right imbalance, we apply a left rotation around C followed by a right rotation around N.
To fix a right-left imbalance, we apply a right rotation around C followed by a left rotation around N.
The first rotation is simply ment to align imbalances on the same side, i.e., to create a L-L or a R-R imbalance
Necessary properties of an AVL node:
key - an integer that the nodes are sorted by
value - the value held by the node
height - an integer representing the height of the node
left - the left child Node
right - the right child Node
parent - the parent Node
Importantly, just as we use None to indicate when a node does not have a child, we will also use None to indicate when a node does not have a parent. Specifically, the root node of the tree will always have a None parent pointer.
rotatLeft and rotateRight
rotateLeft(n):
c ← n.right
n.right ← c.left
if n.right is not NULL:
n.right.parent ← n
c.left ← n
n.parent ← c
updateHeight(n)
updateHeight(c)
return c
And here’s the code for a right rotation around N:
rotateRight(n):
c ← n.left
n.left ← c.right
if n.left is not NULL:
n.left.parent ← n
c.right ← n
n.parent ← c
updateHeight(n)
updateHeight(c)
return c
updateHeight, avlInsert, avlRemove, and rebalance
updateHeight(n):
n.height ← MAX(height(n.left), height(n.right)) + 1
Here, n’s new height is one more than the larger of the heights of its left and right subtrees (we add 1 to account for N itself). The height() function here simply returns a node’s height, or −1 if that node is NULL.
avlInsert(tree, key, value):
insert key, value into tree like normal BST insertion
n ← newly inserted node
p ← n.parent
while p is not NULL:
rebalance(p)
p ← p.parent
avlRemove(tree, key):
remove key from tree like normal BST removal
p ← lowest modified node (e.g. parent of removed node)
while p is not NULL:
rebalance(p)
p ← p.parent
rebalance(n):
if balanceFactor(n) < -1:
if balanceFactor(n.left) > 0:
n.left ← rotateLeft(n.left)
n.left.parent ← n
newSubtreeRoot ← rotateRight(n)
newSubtreeRoot.parent ← n.parent
n.parent.left or n.parent.right ← newSubtreeRoot
else if balanceFactor(n) > 1:
if balanceFactor(n.right) < 0:
n.right ← rotateRight(n.right)
n.right.parent ← n
newSubtreeRoot ← rotateLeft(n)
newSubtreeRoot.parent ← n.parent
n.parent.left or n.parent.right ← newSubtreeRoot
else:
updateHeight(n)
The if and else if conditions check if height balance is lost at n.
The inner if statements check whether a double rotation is needed. Specifically, they check whether n’s child is imbalanced in the opposite direction of n’s imbalance. If so, the double rotation’s first rotation, around n’s child, is performed, with parent status updated appropriately.
Any time a rotation is performed, the node returned by the rotation function is used to update parent status appropriately.
Note that newSubtreeRoot will become either the left or right child of n’s old parent, replacing n.
If no rotation at all is performed (i.e., if we enter the else block), we still need to make sure to update n’s subtree height, since the tree underneath n may have changed
time complexity of AVL operations
Rebalance = O(1)
Insert and Remove operations each have an overall complexity of O(log n)
What is a priority Queue?
The priority queue is an ADT that associates a priority value with each element. The element with the highest priority is the first one dequeued. Typically, this corresponds to the element with the lowest priority value.
A heap data structure is typically a complete binary tree, in which every node’s value is less than or equal to the values of its children. This version of the heap is specifically called a minimum binary heap, or just “min heap”. We can also have a “max heap”, where each node’s value is greater than or equal to the values of its children
Typical interface of a priority queue
insert() – insert an element with a specified priority value
first() – return the element with the lowest priority value (the “first” element in the priority queue)
remove_first() – remove (and return) the element with the lowest priority value
typically implemented with the heap data structure
typically there is a priority value that is used to organize the heap, but there will usually also be some sort of corresponding data associated with the value.
Necessary to track:
- last filled position
- next open spot
Heap Operation (with Priority Queue) - Addition of a node
- Place the node into the next open spot
- Percolate the node up the heap until its priority value is less than (or greater than in the case of a max heap) both of its children.
Example with Dynamic Array implementation:
- Put the new element at the end of the array.
- Compute the inserted element’s parent index, (i − 1) / 2.
- Compare the value of the inserted element with the value of its parent.
- If the value of the parent is greater than the value of the inserted element, swap the elements in the array and repeat from step 2.
- Do not repeat if the element has reached the beginning of the array.
Time Complexity up to O(log n)
This comes from the fact that after placing the new element at the bottom of the heap, it may need to be “percolated” or “heapified” up, swapping with its parent nodes until the heap property is restored. This percolation can occur up the tree height, which is O(log n) for a heap.
Heap Operation (with Priority Queue) - Removal of a node
Removal of the root node using the remove_first() operation:
-replace the root node with the last node that was added
-percolate the new root downwards until its parent meets the priority queue criteria
Example with dynamic array implementation:
1. Remember the value of the first element in the array (to be returned later).
2. Replace the value of the first element in the array with the value of the last element, and remove the last element.
3. If the array is not empty (i.e., it started with more than one element), compute the indices of the children of the replacement element (2 * i + 1 and 2 * i + 2).
If both of these elements fall beyond the bounds of the array, we can stop here.
4. Compare the value of the replacement element with the minimum value of its two children (or possibly one child).
5. If the replacement element’s value is greater than its minimum child’s value, swap those two elements in the array, and repeat from step 3.
Heap Implementation
We can implement the complete binary tree representation of a heap using a dynamic array.
if the binary tree is not complete, big gaps would result in the dynamic array
Heap - formula for children of a node at index i, parent of a node at index i
left child = 2i + 1
right child = 2i+2
parent = (i-1)//2 (floor division)
Building a heap from an unsorted array
- Find the first non-leaf element in the array - located at n // 2 -1 (floor division)
- percolate this element down
- proceed to the next non-leaf element and repeat the percolation process
- do this until we get to the root, once it is percolated down, we will have a proper heap.