35.4 Summary

Terminology.

  • Radix - just another word for ‘base’ as in the base of a number system. For example, the radix for words written in lowercase English letters is 26. For number written in Arabic numerals it is 10.

  • Radix sort - a sort that works with one character at a time (by grouping objects that have the same digit in the same position).

  • Note: I will use ‘character’ and ‘digit’ interchangably in this study guide.

Counting Sort. Allows you to sort NN keys that are integers between 00 and R1R-1 in Θ(N+R)\Theta(N+R)time. Beats linearithmic lower bound by avoiding any binary compares. This is a completely different philosophy for how things should be sorted. This is the most important concept for this lecture.

LSD. In the LSD algorithm, we sort by each digit, working from right to left. Requires examination of Θ(WN)\Theta(WN)digits, where WWis the length of the longest key. Runtime is Θ(WN+WR)\Theta(WN+WR), though we usually think of RR as a constant and just say Θ(WN)\Theta(WN). The Θ(WR)\Theta(WR) part of the runtime is due to the creation fo length RR arrows for counting sort. We usually do LSD sort using counting sort as a subroutine, but it’s worth thinking about whether other sorts might work as well.

LSD vs Comparison Sorting. Our comparison sorts, despite requiring Θ(NlogN)\Theta(N*logN) time, can still be faster than LSD sort. For extremely large N, LSD sort will naturally win, but log N is typically pretty small. Know which algorithm is best in the two extreme cases of very long dissimilar strings and very long, nearly equal strings.

MSD. In MSD sorting, we work from left to right, and solve each resulting subproblem independently. Thus, for each problem, we may have as many as RR subproblem. Worst case runtime is exactly the same as LSD sort, Θ(WN+WR)\Theta(WN+WR), though can be much better. In the very best case, where we only have to look at the top character (only possible for R>NR>N), we have a runtime of Θ(N+R)\Theta(N+R).

Last updated