Union Find Flashcards
What are two data structures needed in Union-Find (disjoint data set)?
Data Structures Needed in Union-Find
1. Size (or Rank) Array:
* This array keeps track of the number of elements (or rank) in the set for which a node is a root.
* Helps in optimizing merging (union) by attaching smaller trees to larger trees.
2. Parent Dictionary (or Array):
* This keeps track of the parent node of each node.
* If a node is its own parent, it is the root of a set.
class UnionFind:
def __init__(self, n):
self.parent = {i: i for i in range(n)} # Each node is its own parent initially
self.size = {i: 1 for i in range(n)} # Each set starts with size 1
def find(self, node): if self.parent[node] != node: # Path compression self.parent[node] = self.find(self.parent[node]) return self.parent[node] def union(self, node1, node2): root1 = self.find(node1) root2 = self.find(node2) if root1 != root2: # Only union if they are in different sets if self.size[root1] > self.size[root2]: # Union by size self.parent[root2] = root1 self.size[root1] += self.size[root2] else: self.parent[root1] = root2 self.size[root2] += self.size[root1]
What are two methods needed in a Union-Find (disjoint data set)?
Methods of Union-Find
1. Union (merge two sets)
* Unites two nodes by making one the parent of the other.
* Uses the size or rank heuristic (Union by Size or Rank) to attach smaller trees to larger trees for efficiency.
2. Find (find the root of a node)
* Finds the root representative of a node.
* Uses path compression to make future lookups faster by directly linking nodes to the root.
class UnionFind:
def __init__(self, n):
self.parent = {i: i for i in range(n)} # Each node is its own parent initially
self.size = {i: 1 for i in range(n)} # Each set starts with size 1
def find(self, node): if self.parent[node] != node: # Path compression self.parent[node] = self.find(self.parent[node]) return self.parent[node] def union(self, node1, node2): root1 = self.find(node1) root2 = self.find(node2) if root1 != root2: # Only union if they are in different sets if self.size[root1] > self.size[root2]: # Union by size self.parent[root2] = root1 self.size[root1] += self.size[root2] else: self.parent[root1] = root2 self.size[root2] += self.size[root1]
What’s the TC of Union Find?
Time Complexity
* Find with path compression: O(α(n)), nearly constant time.
* Union with size/rank heuristic: O(α(n)), nearly constant time.
* α(n) is the inverse Ackermann function, which grows extremely slowly.
What’s SC of union find?
Space Complexity Breakdown
1. Parent Array/Dictionary:
* Stores the parent of each node → O(n)
2. Size (or Rank) Array/Dictionary:
* Stores the size or rank of each set → O(n)
Since we store at most O(n) + O(n) = O(n) additional space, the overall space complexity is:
Space Complexity: O(n)
* This is optimal and scales linearly with the number of elements.
Can Union-Find Be Space Optimized?
* Without the Size/Rank Array: We can remove the size array if we don’t use union by size/rank, but it may lead to inefficient merging, increasing the time complexity.
* Path Compression Only: With path compression, we don’t need explicit rank storage, but the parent array is still O(n).
What’s wrong with this code and how to debug it?
class Union: def \_\_init\_\_(self, n): self.rank = [1 for i in range(n)] self.root = [i for i in range(n)] def union(self, a, b): root_a = self.find(a) root_b = self.find(b) # print(f"root of {a} is {root_a}") # print(f"root of {b} is {root_b}") if root_a == root_b: return False if self.rank[root_a] < self.rank[root_b]: self.rank[root_b] += self.rank[root_a] self.root[root_a] = root_b else: self.rank[root_a] += self.rank[root_b] self.root[root_b] = root_a return True def find(self, node): if self.root[node] != node: node = self.root[node] self.find(node) return node class Solution: def validTree(self, n: int, edges: List[List[int]]) -> bool: if len(edges) != n - 1: return False union = Union(n) for a, b in edges: #print(union.root) if not union.union(a, b): return False #print(union.root) return True
- the recursion inside def find(self, node) need to :
self.find(node) if self.root[node] != node:
- to debug it, we can add print
– inside methods
– inside loops
– at the beginning and end of the code block - No need to add self arguments when we call a method of an instance