(Algorithms):
An abstract data type for online building and testing of an equivalence relation.

What it means in plain English

Union find supports 2 operations of a "world" of objects (their identity is unimportant). I use the connected component (graph theoretic) formulation here; it's easy to rephrase with equivalence relations

union(A,B)
Add an edge (undirected) to the graph between A and B.
find(A)
Find the connected component to which A belongs. That is, return an object which "represents" the component of A, in the sense that if B also belongs to the component of A then find(A)==find(B).
This data type is "online": you may freely mix the operations "union" and "find", and each "find" query will return an answer correct at the time of issuing.

Union find answers questions of the form "do I have a path from A to B in my graph right now?" very quickly. Storing the graph normally, such a query takes time linear in the number of objects. With union find, it takes the time to perform 2 finds (plus the time to check equality, typically taken to be O(1)).

Tarjan's algorithm union find with path compression achieves nearly constant amortized time complexity. The time to perform any sequence of n "union" or "find" operations is O(nβ(n)), where β(n) is a (known) function growing slower than the inverse of Ack(k,.) for any value of k. This is very slow indeed (slower than log or log*); a value of n for which β(n) > 6β(1) is unimaginably large!

A less optimal algorithm takes time O(n log n) for the same n operations, which is not nearly as good. However, if coded correctly for an application which performs many "find" operations, it may be faster!

Log in or register to write something here or to contact authors.