The disjoint set is a very cool data structure
Disjoint set ADT
given a set
S, a disjoint set maintains a partition
Where there is a group of subsets
... U Sn
(That is: a union
of each of the subsets will produce the set S)
2. The intersection
= the null set
when i is not equal to j
The disjoint set supports two operations.
- returns the subset Si
which contains x
- merges 2 subsets containing x and y
Notes on the disjoint set
1. Usually the sets are managed by integers
. That is, objects are stored in an arbitary data structure which can be indexed with integers (usually an array
) which store the specified objects at the specified index.
2. Set names are not important. A set can be named based on any of its elements and be unique since no two subsets have the same item.
Let's say we have a set of n computers represented by a unique integer value, and we are given a set of connections between then.
1 is connnected to 5
8 - 6
2 - 10
5 - 6
2 - 6
And we want to know is one computer connected to the other? (This is assuming the connections are two way
We can solve this problem with the disjoint set by doing the following:
Put each computer in its own set
for each connection (x,y) perform union(x,y)
and for each question is a connected to b, we check the equality of the sets returned by a find(a) and find(b). That is: find(a) == find(b).
For example, is 1 connected to 10? Yes, 1 is connected to 5 is connected to 6 is connected to 2 is connected to 10. In this case our problem solving method would (after all unions) result in a single set which means all computers are connected. This problem may seem trivial
given a small number of computers but in a massive network
or any generic problem with a large number of relationships, determining a solution becomes complex
This problem is a specific example of the more generic dynamic equivalence problem
which disjoint sets are perfectly designed for solving. Note that problems to be solved by a disjoint set must
satisfy the dynamic equivalence properties