In this writeup, I assume that the name of the array to be sorted is A, that
it is N entries long, that the entries are indexed 0 to (N - 1)
inclusive, that element number i of the array is referenced by
A{i}, and that the array is to be sorted so for all integers
i and j between 0 and (N - 1) incusive,
(A{i} ≤ A{j}) iff (i ≤
j) (i.e. A{0} contains the lowest element and the values of
successive entries in A are nondecreasing).
The basic idea of the quicksort is to divide and conquer the problem of sorting a
large array into sorting two smaller arrays. This process eventually reaches a base case
because an array of zero or one elements is automatically sorted. To divide the problem, a
'pivot value' is chosen. This pivot value must be between the lowest value in the array
and the highest value in the array (inclusive), it would be desirable if the pivot value
were actually the value of some element of the array (this of course automatically
satisfting the first criterion), and in fact the best possible pivot value would be the
median of the array. In general, the effort required to find the median is more than it's worth, so usually the goal is to get an approximation or a good guess as to what it is.
Once a pivot value is chosen, the array is shuffled so no value less than the pivot is at a
higher index than any value greater than the pivot, and vice-versa. This is accomplished
by initialising a 'left pointer' to 0, initialising a 'right pointer' to
(N - 1), then moving the pointers towards each other. If the value at the left
pointer is greater than the pivot, or if the left pointer crosses the right pointer, the
left pointer stops; if the value at the right pointer is less than the pivot, or the right
pointer crosses the left pointer, the right pointer stops. If the pointers stop before
they cross each other, then the values at the pointers are swapped and they are moved
towards each other again in the same fashion - eventually, they will cross (they start a finite distance apart, and each pointer moves at least one step closer each time).
After you've done all this, you know that from index 0 up to [but not including] the left pointer contains only
values less than or equal to the pivot (elements already there which were less than or equal to the pivot were skipped over by the left pointer, elements which were greater were swapped with an element less than the pivot from the right section), and from index (N - 1) down to [but not
including] the right pointer contains only values greater than or equal to the pivot (by similar logic).
Thus, when the pointers do cross, the array is now in two smaller parts such that if each
is sorted then the entire array is sorted - this is where the divide & conquer comes
into play. Code for this algorithm, using the simplest choice for a pivot value (the first
element of the array), looks like this:
define quicksort (
A,
leftmost,
rightmost) as:
if ((
rightmost -
leftmost) > 0) then
pivot :=
A{
leftmost}
left :=
leftmost
right :=
rightmost
while (
left ≤
right)
while ((
left ≤
right) ∧ (
A{
left} ≤
pivot))
left := (
left + 1)
end while
while ((
right ≥
left) ∧ (
A{
right} ≥
pivot))
right := (
right - 1)
end while
if (
left <
right) then
swap
A{
left} and
A{
right}
end if
end while
quicksort (
A,
leftmost,
right)
quicksort (
A,
left,
rightmost)
end if
end quicksort
A quicksort on the entire array is invoked by quicksort(A, 0,
(N - 1)). [Typically, a different sort is chosen when the size of the array
is small enough, but this is just example code to show the ideas behind this particular
sort so I haven't included that particular optimisation.] One problem with this
algorithm is that it gives very bad run-times when used with a sorted or nearly-sorted
array. A better pivot point, in those cases, would be the middle value in the array.
Another approach is to take the first, middle, and last elements in the array, sort them in
place (bubble sort is entirely suitable for this part because it's only 3 elements), then
take the middle as the pivot point. Another optimisation for arrays which contain many
duplicated values is to allow the pointers to cross by more than one index. This makes the
divide of the divide and conquer even more effective and efficient. A final
optimisation is to change the tail-recursion to iteration, and to recurse on the
smaller of the two sections and iterate on the bigger.
define quicksort (
A,
leftmost,
rightmost)
while ((
rightmost -
leftmost) > 0)
mid := floor((
leftmost +
rightmost) / 2)
if (
A{
leftmost} >
A{
mid}) then
swap
A{
leftmost} and
A{
mid}
end if
if (
A{
mid} >
A{
rightmost}) then
swap
A{
mid} and
A{
rightmost}
end if
if (
A{
leftmost} >
A{
mid}) then
swap
A{
leftmost} and
A{
mid}
end if
pivot :=
A{
mid}
left :=
leftmost
right :=
rightmost
while (
left ≤
right)
while ((
left ≤
rightmost) ∧ (
A{
left} ≤
pivot))
left := (
left + 1)
end while
while ((
right ≥
leftmost) ∧ (
A{
right} ≥
pivot))
right := (
right - 1)
end while
if (
left <
right) then
swap
A{
left} and
A{
right}
end if
end while
if ((
right -
leftmost) ≤ (
rightmost -
left)) then
quicksort (
A,
leftmost,
right)
leftmost :=
left
else
quicksort (
A,
left,
rightmost)
rightmost :=
right
end if
end while
end quicksort
Note that in the above code, the expression ((N ≤ (N - 1))
∧ (A{N} ≤ pivot)) should evaluate to false, with no
error; likewise, ((-1 ≥ 0) ∧ (A{-1} ≥ pivot)) should
evaluate to false with no error, even though A{-1} and
A{N} are undefined. As for the quicksort being unstable, there
is a simple modification which will fix that. Create a second array parallel to the first,
initialised with the values of each entry equal to the index. When swapping elements in
the sort array, swap the corresponding elements in the index array. If comparing two
elements in the sort array gives equality, then compare the values in the index array.
Always sort so that when the sort array values are equal, the index array values are in
ascending order. This will make the quicksort stable. The index array can be discarded
as soon as the sort is finished. Of course, putting so strict a condition on the
ordering, the left and right pointers will never cross further than 1 place unless the
pivot winds up unmoved, in which case they will only cross by 2 entries.
define stablequicksort(
A,
leftmost,
rightmost)
i :=
leftmost
while (
i ≤
rightmost)
B{
i} :=
i
i := (
i + 1)
end while
quicksort2(
A,
leftmost,
rightmost,
B)
end stablequicksort
define quicksort2 (
A,
leftmost,
rightmost,
B)
while ((
rightmost -
leftmost) > 0)
mid := floor((
leftmost +
rightmost) / 2)
if ((
A{
leftmost} >
A{
mid}) ∨ ((
A{
leftmost} =
A{
mid}) ∧ (
B{
leftmost} >
B{
mid}))) then
swap
A{
leftmost} and
A{
mid}
swap
B{
leftmost} and
B{
mid}
end if
if ((
A{
mid} >
A{
rightmost}) ∨ ((
A{
mid} =
A{
rightmost}) ∧ (
B{
mid} >
B{
rightmost}))) then
swap
A{
mid} and
A{
rightmost}
swap
B{mid} and
B{
rightmost}
end if
if ((
A{
leftmost} >
A{
mid}) ∨ ((
A{
leftmost} =
A{
mid}) ∧ (
B{
leftmost} >
B{
mid}))) then
swap
A{
leftmost} and
A{
mid}
swap
B{
leftmost} and
B{
mid}
end if
pivotA :=
A{
mid}
pivotB :=
B{
mid}
left :=
leftmost
right :=
rightmost
while (
left ≤
right)
while ((
left ≤
rightmost) ∧ ((
A{
left} <
pivotA) ∨ ((
A{
left} =
pivotA) ∧ (
B{
left} ≤
pivotB))))
left := (
left + 1)
end while
while ((
right ≥
leftmost) ∧ ((
A{
right} >
pivotA) ∨ ((
A{
right} =
pivotA) ∧ (
B{
right} ≥
pivotB))))
right := (
right - 1)
end while
if (
left <
right) then
swap
A{
left} and
A{
right}
swap
B{
left} and
B{
right}
end if
end while
if ((
right -
leftmost) ≤ (
rightmost -
left)) then
quicksort2 (
A,
leftmost,
right,
B)
leftmost :=
left
else
quicksort2 (
A,
left,
rightmost,
B)
rightmost :=
right
end if
end while
end quicksort2