Theorem: Any comparison sort must, in the worst case, make Ω(n log n) comparisons to sort n keys.

Proof:
A sorting algorithm, viewed abstractly, takes a sequence of keys k1, k2, ... kn, and outputs the permutation that will sort them.

For a given n, we can represent any comparison-based algorithm as a binary decision tree: the internal nodes are comparisons of two keys, and at the leaves are the correct permutations. Sorting a particular sequence is equivalent to walking the tree from the root to a leaf. The number of comparisons in the worst case is the height of the tree.
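To make the tree concrete, here is a hand-written decision tree for n = 3, sketched in Python (the function name sort3 is purely illustrative): each if is an internal node, each return is a leaf, there are 3! = 6 leaves, and the deepest root-to-leaf path makes 3 = ceil(log 6) comparisons.

```python
# Hand-written decision tree for three keys a, b, c (illustrative sketch).
# Each "if" is a comparison (an internal node); each return is a leaf
# holding the sorted order. There are 3! = 6 leaves, and the longest path
# makes 3 comparisons, matching ceil(log2(6)) = 3.
def sort3(a, b, c):
    if a <= b:
        if b <= c:
            return (a, b, c)
        elif a <= c:
            return (a, c, b)
        else:
            return (c, a, b)
    else:
        if a <= c:
            return (b, a, c)
        elif b <= c:
            return (b, c, a)
        else:
            return (c, b, a)

assert sort3(2, 3, 1) == (1, 2, 3)
```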

Now, there are n! permutations of a given sequence, so the tree has n! leaves. Then its height must be at least ceil(log n!). But we know from Stirling's formula that n! ≥ (n/e)^n = n^n · e^(-n), which is equivalent to log(n!) ≥ n log n - n log e. So the worst case ≥ ceil(log n!) ≥ n log n - n log e = Ω(n log n). ∎
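As a quick numeric sanity check on the Stirling step (just a sketch, not part of the proof), the exact quantity ceil(log2(n!)) can be compared against the estimate n log2 n - n log2 e:

```python
import math

# Compare the exact tree-height bound ceil(log2(n!)) with the
# Stirling-based estimate n*log2(n) - n*log2(e) used above.
for n in (4, 16, 64, 256):
    exact = math.ceil(math.log2(math.factorial(n)))
    stirling = n * math.log2(n) - n * math.log2(math.e)
    print(n, exact, round(stirling, 1))
```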

One way of thinking about this result is from an information theory viewpoint: there are n! possible permutations of the sequence, so pinning down which one we are looking at requires up to log(n!) bits of information. But by comparing two keys, we can extract at most one bit of information.
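One way to see this empirically (a rough sketch; the wrapper class and counter names are made up for illustration) is to count every key comparison a real sort makes and set it against log2(n!):

```python
import functools
import math
import random

comparisons = 0

@functools.total_ordering
class CountingKey:
    """Wraps a value and counts every '<' comparison made on it."""
    def __init__(self, value):
        self.value = value
    def __eq__(self, other):
        return self.value == other.value
    def __lt__(self, other):
        global comparisons
        comparisons += 1          # each comparison yields at most one bit
        return self.value < other.value

n = 1000
data = [CountingKey(x) for x in random.sample(range(10 * n), n)]
sorted(data)
print("comparisons made:", comparisons)
print("log2(n!) bits needed:", math.ceil(math.log2(math.factorial(n))))
```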

A counterexample for sorts that are not comparison-based: radix sort runs in O(n) time if we restrict the keys to a fixed range (say, bounded-width integers).
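Here is a minimal radix sort sketch, assuming non-negative integer keys below a fixed bound (here 2^16, sorted in two 8-bit passes); since the key range is fixed, the number of passes is a constant, and no key is ever compared against another:

```python
# LSD radix sort sketch for non-negative integers below 2**key_bits.
# With the key range fixed, the pass count is a constant, so the total
# work is O(n). No comparisons between keys are made.
def radix_sort(keys, key_bits=16, radix_bits=8):
    radix = 1 << radix_bits
    for shift in range(0, key_bits, radix_bits):
        buckets = [[] for _ in range(radix)]
        for k in keys:
            buckets[(k >> shift) & (radix - 1)].append(k)
        keys = [k for bucket in buckets for k in bucket]
    return keys

assert radix_sort([300, 7, 65535, 0, 42]) == [0, 7, 42, 300, 65535]
```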

Just a little nitpicking: the decision-tree-based proof is a lower bound proof, thus the proper notation above should be Ω(n log n). To prove that the *problem* of sorting is O(n log n), we can just look at one algorithm such as merge sort, and show that it is O(n log n) in the worst case.
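For that upper-bound half, a merge sort sketch makes the O(n log n) worst case easy to see: merging two runs of total length m takes at most m - 1 comparisons, and there are about log2 n levels of merging.

```python
# Merge sort sketch: each level of merging uses at most n - 1 comparisons,
# and there are ceil(log2(n)) levels, so the worst case is O(n log n).
def merge_sort(keys):
    if len(keys) <= 1:
        return list(keys)
    mid = len(keys) // 2
    left, right = merge_sort(keys[:mid]), merge_sort(keys[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

assert merge_sort([5, 1, 4, 2, 3]) == [1, 2, 3, 4, 5]
```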

Proving the lower bound is more involved, as shown in the writeup above. As a consequence of knowing that the problem of sorting is O(n log n) and Ω(n log n), we know that the problem of sorting is Θ(n log n).

Again, a caveat as above: the proof is only for comparison-based sorting. Bin sort (also known as bucket sort) is Θ(n), but you need to reserve space for all possible keys.
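A bin sort sketch under the same assumption (integer keys from a known range 0..key_bound-1; the names are just illustrative): one bin is reserved per possible key, which is where the extra space goes, and the running time is Θ(n + key_bound).

```python
# Bin sort (counting sort) sketch for integer keys in 0..key_bound-1.
# One counter per possible key is reserved up front (the space caveat);
# the running time is Theta(n + key_bound), with no key comparisons.
def bin_sort(keys, key_bound):
    counts = [0] * key_bound
    for k in keys:
        counts[k] += 1
    out = []
    for value, count in enumerate(counts):
        out.extend([value] * count)
    return out

assert bin_sort([3, 0, 2, 2, 5], key_bound=6) == [0, 2, 2, 3, 5]
```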
