32.1 Quicksort Flavors vs. MergeSort
Quicksort Flavors: Motivation
Before we dive into the wonderful ideas of how to improve the Quicksort Algorithm, here's a review on how the Quicksort (or Partition Sort) works.
Recall, our version of Quicksort has the following properties:
Leftmost item is always the pivot
Our partitioning algorithm always preserves the relative order of <= and >= items.
Therefore, a natural question to ask is: What can we do to avoid running into the worst case for Quicksort? So it can beat MergeSort for sure? This will be the main motivation of exploration for this section.
Quicksort Flavors: Philosophies
Here are some general philosophies to follow to brainstorm ways to improve Quicksort
Philosophy 1: Randomness
Some possible reasons that the rare worst case runtime still happen in practice include:
Bad ordering: Arrays already in sorted order
Bad elements: Arrays with all duplicates
In both of the examples above, assuming we are using the strategy that the leftmost item is always the pivot, each partition of the elements around the pivot fails to reduce the size of the resulting subarrays significantly, which leads to the consequence of going through the items many many times.
One solution to the bad ordering problem is randomness, which is encoded by these two strategies below:
Pick pivots randomly.
Shuffle items before sort.
Turns out, this randomized version of Quicksort is the best sort you can get with extremely good performance.
Just like the care-free, motorcycle rider from the 1992 TV Show Renegade, this version of the Quicksort algorithm is able to thrive with a devil-may-care attitude, even without the luxurious hair.
Philosophy 2: Smarter Pivot Selection
Another approach to the problem is to choose a "good" pivot effectively, somehow.
2a. Constant time pivot pick
One of such approach is to pick a few items, and then among them, choose the "best" one as the pivot. This type of approach exemplifies a type of pivot selection procedure that is both deterministic and constant time.
However, a big drawback of such procedure is that the resulting Quicksort has a family of "dangerous inputs"---inputs that will lead to the worst case runtime or break the algorithm---that an adversary could easily generate.
2b. Linear time pivot pick
Since we are partioning multiple times, always selecting the median as the pivot seems like a good idea, because each partition will allow us to cut the size of the resulting subarrays by half.
Therefore, to improve upon the idea from part a, we can calculate the median of a given array, which can be done in linear time, then select that as the pivot.
Philosophy 3: Introspection
This philosophy simply relies on introspecting the recursive depth of our algorithm, and switch to MergeSort once we hit the depth threshold.
Although this is a reasonable approach, it is not common to use in practice.
Quicksort Flavors vs. MergeSort: Who's the best?
Candidate One: Quicksort L3S
Quicksort L3S is the Quicksort that is introduced last time, with the following properties:
Always leftmost pivot selection
Partioning
Shuffle before starting
Unfortunately, 0 : 1 with Mergesort in the lead.
Candidate Two: Quicksort LTHS
Quicksort LTHS is very similar to Quicksort L3S, except with a different partioning scheme: Tony Hoare's In-Place Partioning Scheme.
Tony Hoare's Partioning
Imagine two pointers, one at each end of the items array, walking towards each other:
Left pointer loves small items.
Right pointer loves large items.
Big idea: Walk towards each other, swapping anything they don’t like; stop when the two pointers cross each other.
Left pointer hates larger or equal items
Right pointer hates smaller or equal items
New pivot = Item at right pointer.
End result is that things on left of pivot (originally the leftmost item) are “small” and things on the right of pivot are “large”.
TH Partioning: An Intuitive Analogy
Intuitively, you can think of each pointer as a little "police" for the "less than pivot" territory and "greater than pivot" territory. And think of the pivot as the "border line" between the two territories.
Their job is to find and arrest any "suspects", or items that don't belong to their respective territories, and trade the arrested suspects with the "police" on the other side.
Specifically, the left pointer is the "police" for the "less than pivot" territory, looking for any "suspects" (items greater than or equal to the pivot) that are not supposed to be on this side of the border line. Analogously, the right pointer looks for "suspects" (items less than or equal to the pivot) in the "greater than territory".
When the two "polices" meet, mission is done as they both have sufficiently searched across their representative territories.
All this work for nothing?
Not at all! All this work with a new partioning scheme did pay off, and using the TH partitioning scheme yields a very fast Quicksort: one that is faster than MergeSort!
It's important to also note that:
Faster schemes have been found since.
Overall runtime still depends crucially on pivot selection strategy!
1 : 1 with a win for QuickSort Flavors!
Candidate Three: QuickSort PickTH
Median Identification: Linear Time
Will this improved version of Exact Median Quicksort perform better than Mergesort?
Sadly, no. And it was terrible! Likely for the following reasons:
Cost to compute medians is too high.
1 : 2 with Mergesort as the winner.
Last updated