Use Examples
Initialization
In the disjoint set data structure, each element starts in its own subset. This can be represented by initializing an array where each element points to itself.
Pseudo Code:
function Initialize(n):
parent = array of size n
rank = array of size n
for i from 0 to n-1:
parent[i] = i
rank[i] = 0
Example:
For a set with 5 elements (0 through 4), the initial arrays are:
parent=[0,1,2,3,4]
rank=[0,0,0,0,0]
Find Operation
The Find operation locates the representative (or root) of the subset containing a particular element. This can be optimized with path compression to flatten the structure and speed up future operations.
Pseudo Code:
function Find(x):
if parent[x] != x:
parent[x] = Find(parent[x]) # Path compression
return parent[x]
Example:
Consider the following unions have been performed:
→ Union(0, 1)
→ Union(1, 2)
The parent array might look like:
parent=[0,0,0,3,4]
To find the representative of element 2:
Find(2)→parent[2]=0
Union Operation
The Union operation merges two subsets into a single subset. Union by rank ensures that the smaller tree is always attached under the root of the larger tree, keeping the structure balanced.
Pseudo Code:
function Union(x, y):
rootX = Find(x)
rootY = Find(y)
if rootX != rootY:
if rank[rootX] > rank[rootY]:
parent[rootY] = rootX
elif rank[rootX] < rank[rootY]:
parent[rootX] = rootY
else:
parent[rootY] = rootX
rank[rootX] += 1
Example:
Continuing with the previous example, perform the union of elements 3 and 4:
→ Union(3, 4)
The updated arrays might look like:
parent=[0,0,0,3,3]
rank=[1,0,0,1,0]
Use Cases
→ Kruskal's Algorithm: Used to find the minimum spanning tree of a graph. The union-find structure helps efficiently manage the merging of connected components.
→ Network Connectivity: To determine if there is a path between two nodes in a network. The Find operation helps check if two nodes belong to the same connected component.
→ Image Processing: For segmenting an image into connected components, where each pixel can be connected to its neighbors.
→ Dynamic Equivalence: Managing dynamic connectivity in systems where connections between components are frequently added or queried.
→ Merging Accounts: In applications like social networks, where it is necessary to merge user accounts and manage connections between users dynamically.
Advanced Techniques and Optimizations
Path Compression
Path compression is an optimization technique used in the Find operation to make the tree representing each subset as flat as possible. Whenever Find is called, it flattens the structure by making every node point directly to the root. This ensures that future operations are faster.
Pseudo Code:
function Find(x):
if parent[x] != x:
parent[x] = Find(parent[x]) # Path compression
return parent[x]
Union by Rank/Size
Union by rank or size is used during the Union operation to keep the tree balanced. The idea is to always attach the smaller tree under the root of the larger tree. This helps in keeping the tree shallow, which in turn keeps the Find operations efficient.
Pseudo Code:
function Union(x, y):
rootX = Find(x)
rootY = Find(y)
if rootX != rootY:
if rank[rootX] > rank[rootY]:
parent[rootY] = rootX
elif rank[rootX] < rank[rootY]:
parent[rootX] = rootY
else:
parent[rootY] = rootX
rank[rootX] += 1