Iterative Quicksort: removing recursion via explicit stack

Quicksort

In my previous article on quick sort I covered in detail the recursive implementation of the famous sorting algorithm. A downside of Quick sort is that it has the potential to encounter worst-case quadratic complexity, which means that in the worst case the recursive version can reach a very steep stack depth leading to high memory usage. One way to avoid this scenario is to replace the call stack with an explicit stack thus removing the recursion all together. Many library implementations of quick sort, including glibc’s qsort() do exactly this.

Removing Recursion

Every recursive algorithm has its iterative analog, and Quicksort is no different. As is the case with mergesort, the recursive version of Quicksort has a very simple algorithm which leads alot of developers to not want to tinker with its elegant simplicity.

While removing the recursion in mergesort is done via a simple nested loop, with Quicksort it is a bit more involved. because of the way the l and r parameters are determined we need a way to save them for use in the partitioning function. A simple stack data structure is used to accomplish this. Both Java and C++ have well known stack data structures as part of their standard libraries, But to use them would be over kill and lead to code bloat, but theres a solution to this, thanks to the dragons. GLIBC uses simple macro definitions to implement a lightweight stack, but you can get even more lightweight than that. A stack can be made from little more than an array and an integer pointing to the top item yielding a VERY simple and performant way to keep track of the array boundaries being sorted. The basic technique then follows:

Other optimizations

With the recursion removed there are still a few more optimizations we can use to squeeze out a few more percentage points of performance.  One such optimization is pushing the smaller array to the stack first so that the larger of the two subarrays gets sorted before the smaller. This optimization is covered in Sedgewick’s “Algorithms in C++”.

It should be noted that most optimizations that work for the recursive implementation are directly transferable to the iterative version: Insertion sorting arrays below a certain size (typically 8 – 12) being the ticket to getting the most performance you can out of Quicksort. Proper pivot selection also goes a very long way to avoiding worst case behavior: Randomized pivot selection, Median of three, and Median of Medians  can all be dropped right in. 

The following code is an example of iterative Quicksort with randomized pivot selection, small array insertion sort, and larger subarray sorted first. 

One curious implication of choosing iteration over recursion is that while implementing Introsort from recursive Quicksort is a straightforward and well-studied optimization, the same cannot be said for the iterative approach. Introsort works by switching to heapsort when the stack depth indicates that Quicksort is approaching its worst case. I do not know of any research into whether this is a worthwhile tactic for iterative Quicksort (it could be done by checking the size of the explicit stack).

Perhaps some future benchmarking is required…

Leave a Reply

Your email address will not be published. Required fields are marked *