quicksort (idea) by hobyrne

In this writeup, I assume that the name of the array to be sorted is A, that it is N entries long, that the entries are indexed 0 to (N - 1) inclusive, that element number i of the array is referenced by A{i}, and that the array is to be sorted so for all integers i and j between 0 and (N - 1) incusive, (A{i} ≤ A{j}) iff (i ≤ j) (i.e. A{0} contains the lowest element and the values of successive entries in A are nondecreasing).

The basic idea of the quicksort is to divide and conquer the problem of sorting a large array into sorting two smaller arrays. This process eventually reaches a base case because an array of zero or one elements is automatically sorted. To divide the problem, a 'pivot value' is chosen. This pivot value must be between the lowest value in the array and the highest value in the array (inclusive), it would be desirable if the pivot value were actually the value of some element of the array (this of course automatically satisfting the first criterion), and in fact the best possible pivot value would be the median of the array. In general, the effort required to find the median is more than it's worth, so usually the goal is to get an approximation or a good guess as to what it is.

Once a pivot value is chosen, the array is shuffled so no value less than the pivot is at a higher index than any value greater than the pivot, and vice-versa. This is accomplished by initialising a 'left pointer' to 0, initialising a 'right pointer' to (N - 1), then moving the pointers towards each other. If the value at the left pointer is greater than the pivot, or if the left pointer crosses the right pointer, the left pointer stops; if the value at the right pointer is less than the pivot, or the right pointer crosses the left pointer, the right pointer stops. If the pointers stop before they cross each other, then the values at the pointers are swapped and they are moved towards each other again in the same fashion - eventually, they will cross (they start a finite distance apart, and each pointer moves at least one step closer each time).

After you've done all this, you know that from index 0 up to [but not including] the left pointer contains only values less than or equal to the pivot (elements already there which were less than or equal to the pivot were skipped over by the left pointer, elements which were greater were swapped with an element less than the pivot from the right section), and from index (N - 1) down to [but not including] the right pointer contains only values greater than or equal to the pivot (by similar logic). Thus, when the pointers do cross, the array is now in two smaller parts such that if each is sorted then the entire array is sorted - this is where the divide & conquer comes into play. Code for this algorithm, using the simplest choice for a pivot value (the first element of the array), looks like this:

define quicksort (A, leftmost, rightmost) as:
  if ((rightmost - leftmost) > 0) then
    pivot := A{leftmost}
    left := leftmost
    right := rightmost
    while (left ≤ right)
      while ((left ≤ right) ∧ (A{left} ≤ pivot))
        left := (left + 1)
      end while
      while ((right ≥ left) ∧ (A{right} ≥ pivot))
        right := (right - 1)
      end while
      if (left < right) then
        swap A{left} and A{right}
      end if
    end while
    quicksort (A, leftmost, right)
    quicksort (A, left, rightmost)
  end if
end quicksort

A quicksort on the entire array is invoked by quicksort(A, 0, (N - 1)). [Typically, a different sort is chosen when the size of the array is small enough, but this is just example code to show the ideas behind this particular sort so I haven't included that particular optimisation.] One problem with this algorithm is that it gives very bad run-times when used with a sorted or nearly-sorted array. A better pivot point, in those cases, would be the middle value in the array. Another approach is to take the first, middle, and last elements in the array, sort them in place (bubble sort is entirely suitable for this part because it's only 3 elements), then take the middle as the pivot point. Another optimisation for arrays which contain many duplicated values is to allow the pointers to cross by more than one index. This makes the divide of the divide and conquer even more effective and efficient. A final optimisation is to change the tail-recursion to iteration, and to recurse on the smaller of the two sections and iterate on the bigger.

define quicksort (A, leftmost, rightmost)
  while ((rightmost - leftmost) > 0)
    mid := floor((leftmost + rightmost) / 2)
    if (A{leftmost} > A{mid}) then
      swap A{leftmost} and A{mid}
    end if
    if (A{mid} > A{rightmost}) then
      swap A{mid} and A{rightmost}
    end if
    if (A{leftmost} > A{mid}) then
      swap A{leftmost} and A{mid}
    end if
    pivot := A{mid}
    left := leftmost
    right := rightmost
    while (left ≤ right)
      while ((left ≤ rightmost) ∧ (A{left} ≤ pivot))
        left := (left + 1)
      end while
      while ((right ≥ leftmost) ∧ (A{right} ≥ pivot))
        right := (right - 1)
      end while
      if (left < right) then
        swap A{left} and A{right}
      end if
    end while
    if ((right - leftmost) ≤ (rightmost - left)) then
      quicksort (A, leftmost, right)
      leftmost := left
    else
      quicksort (A, left, rightmost)
      rightmost := right
    end if
  end while
end quicksort

Note that in the above code, the expression ((N ≤ (N - 1)) ∧ (A{N} ≤ pivot)) should evaluate to false, with no error; likewise, ((-1 ≥ 0) ∧ (A{-1} ≥ pivot)) should evaluate to false with no error, even though A{-1} and A{N} are undefined. As for the quicksort being unstable, there is a simple modification which will fix that. Create a second array parallel to the first, initialised with the values of each entry equal to the index. When swapping elements in the sort array, swap the corresponding elements in the index array. If comparing two elements in the sort array gives equality, then compare the values in the index array. Always sort so that when the sort array values are equal, the index array values are in ascending order. This will make the quicksort stable. The index array can be discarded as soon as the sort is finished. Of course, putting so strict a condition on the ordering, the left and right pointers will never cross further than 1 place unless the pivot winds up unmoved, in which case they will only cross by 2 entries.

define stablequicksort(A, leftmost, rightmost)
  i := leftmost
  while (i ≤ rightmost)
    B{i} := i
    i := (i + 1)
  end while
  quicksort2(A, leftmost, rightmost, B)
end stablequicksort

define quicksort2 (A, leftmost, rightmost, B)
  while ((rightmost - leftmost) > 0)
    mid := floor((leftmost + rightmost) / 2)
    if ((A{leftmost} > A{mid}) ∨ ((A{leftmost} = A{mid}) ∧ (B{leftmost} > B{mid}))) then
      swap A{leftmost} and A{mid}
      swap B{leftmost} and B{mid}
    end if
    if ((A{mid} > A{rightmost}) ∨ ((A{mid} = A{rightmost}) ∧ (B{mid} > B{rightmost}))) then
      swap A{mid} and A{rightmost}
      swap B{mid} and B{rightmost}
    end if
    if ((A{leftmost} > A{mid}) ∨ ((A{leftmost} = A{mid}) ∧ (B{leftmost} > B{mid}))) then
      swap A{leftmost} and A{mid}
      swap B{leftmost} and B{mid}
    end if
    pivotA := A{mid}
    pivotB := B{mid}
    left := leftmost
    right := rightmost
    while (left ≤ right)
      while ((left ≤ rightmost) ∧ ((A{left} < pivotA) ∨ ((A{left} = pivotA) ∧ (B{left} ≤ pivotB))))
        left := (left + 1)
      end while
      while ((right ≥ leftmost) ∧ ((A{right} > pivotA) ∨ ((A{right} = pivotA) ∧ (B{right} ≥ pivotB))))
        right := (right - 1)
      end while
      if (left < right) then
        swap A{left} and A{right}
        swap B{left} and B{right}
      end if
    end while
    if ((right - leftmost) ≤ (rightmost - left)) then
      quicksort2 (A, leftmost, right, B)
      leftmost := left
    else
      quicksort2 (A, left, rightmost, B)
      rightmost := right
    end if
  end while
end quicksort2

Sorting Algorithms	Quick Sort	Peter Landin	stupid-sort
stooge sort	C.A.R. Hoare	Merge Sort	bubble sort
Heap Sort	Wet T-shirts and hot summers: a fifteen year old's definition of love	Heapsort	Cool Mint Listerine PocketPaks
stable sort	Haskell	Time complexity	Selection Sort
Insertion Sort	binary search tree	The Future Sound of London	introspection sort
SAS	Median	Big-oh notation