
34.3 Theoretical Bounds on Sorting

What is the best time we can get for sorting?

So far, the fastest sorts we have covered have a worst-case runtime of $\Theta(N \log N)$. This raises the question: can we find a sorting algorithm with a better asymptotic runtime?

Let's say that we have "The Ultimate Comparison Sort", known as TUCS, which is the fastest possible comparison sort. We don't know the details of this yet-to-be-discovered algorithm. Let's additionally say that $R(N)$ is the worst-case runtime of TUCS, and give ourselves the challenge of finding the best $O$ and $\Omega$ bounds for $R(N)$. (This might seem a little odd since we don't know how TUCS works, but we can still bound $R(N)$.)

As a starting point, we can give $R(N)$ an upper bound of $O(N \log N)$. Why? TUCS is the fastest comparison sort, so it must be at least as fast as the algorithms we already have. In particular, it can be no slower than mergesort's $\Theta(N \log N)$ worst case; otherwise mergesort would beat it, contradicting the claim that this algorithm is TUCS.

What about a starting point for the $\Omega$ bound of $R(N)$? Since we don't know anything about how TUCS works, we can start by giving it the best runtime imaginable, $\Omega(1)$. But can we make this tighter?

It turns out we can: if the algorithm sorts $N$ elements, it must examine all $N$ of them at some point (if it never looked at some element, it could not know where that element belongs). Thus, we arrive at a tighter bound of $\Omega(N)$. But can we make it tighter still? The answer is yes, but to see why, we have to consider the following game.

A Game of Cat and Mouse (and Dog)

Let's say we place Tom the Cat, Jerry the Mouse, and Spike the Dog in opaque, soundproof boxes labeled A, B, and C. We want to figure out which is which using only a scale, knowing that the mouse weighs less than the cat, which weighs less than the dog.

Suppose we find that Box A weighs less than Box B, and Box B weighs less than Box C. Could we tell which is which?

The answer turns out to be yes! Box A must hold the mouse, Box B the cat, and Box C the dog.

We found the sorted order by considering just these two inequalities. What if Box A weighs more than Box B, and Box B weighs more than Box C? Could we still figure out which is which?

The answer is again yes: this time C holds the mouse, B the cat, and A the dog. So far, we have solved the game with just two weighings! Let's try a third scenario. Could we know which is which if Box A weighs less than Box B, but Box B weighs more than Box C?

The answer turns out to be no. There are two possible orderings consistent with these weighings:

  • a: mouse, c: cat, b: dog (sorted order: acb)

  • c: mouse, a: cat, b: dog (sorted order: cab)

So while we were on a great streak of solving the game with only two weighings, we need a third to handle every possibility. If we add the comparison between A and C (say, learning that a < c), the ambiguity goes away and the game becomes solvable, as the sketch below shows.
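
To make the decision procedure concrete, here is a minimal Java sketch (our own illustration; the names `BoxGame` and `orderOfThree` are invented) that identifies the order of three weights using at most three weighings. Each `if` is one level of the decision tree, and each `return` is one leaf:

```java
public class BoxGame {
    /** Returns the box labels in increasing order of weight, using at
     *  most three pairwise weighings (comparisons). Each return is one
     *  of the 3! = 6 leaves of the decision tree. */
    public static String orderOfThree(double a, double b, double c) {
        if (a < b) {
            if (b < c) {
                return "abc";       // two weighings sufficed
            } else if (a < c) {
                return "acb";       // the ambiguous case: a third weighing decides
            } else {
                return "cab";
            }
        } else {
            if (c < b) {
                return "cba";       // two weighings sufficed
            } else if (a < c) {
                return "bac";
            } else {
                return "bca";
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(orderOfThree(1.0, 5.0, 3.0));  // prints "acb"
    }
}
```

Note that the `acb` versus `cab` split is exactly the ambiguous case from the game: knowing a < b and b > c is not enough, and the a < c weighing decides between the two remaining orderings.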

Now that we've worked through these cases, we can organize them into a decision tree, where each node asks one weighing question and each branch follows one outcome, to solve this sorting game of cat and mouse (and dog).

But how can we generalize this to all sorting? Each leaf in the tree is one unique possible answer, i.e., one way the boxes could be ordered, so the tree must have one leaf per possible ordering. That begs the question: how many ways can $N$ boxes be ordered? The answer is $N!$, since there are $N!$ permutations of a set of $N$ elements.

So how many questions do we need to ask in order to know how to sort the boxes? That is the number of levels we must descend to reach a leaf. Since each question has two outcomes, the tree is binary, and a binary tree with $N!$ leaves must have at least $\lg(N!)$ levels. (Note that $\lg$ means log base 2.)
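
As a sanity check (a worked step of our own, not from the original text), this bound agrees with the three-box game:

$$\lceil \lg(3!) \rceil = \lceil \lg 6 \rceil = \lceil 2.585 \rceil = 3,$$

so some input must require three weighings, exactly as we discovered above.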

So, using this game, we have found that we must ask at least $\lg(N!)$ questions about inequalities to determine the sorted order. Thus, our lower bound for $R(N)$ is $\Omega(\lg(N!))$.
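
To get a feel for how large this bound is, here is a small, self-contained Java program (our own illustration; the class and method names are invented) that computes $\lceil \lg(N!) \rceil$ by summing logarithms, so that $N!$ itself never has to be computed:

```java
/** Prints the information-theoretic lower bound ceil(lg(N!)) on the
 *  worst-case number of comparisons any comparison sort must make. */
public class SortingLowerBound {
    /** Computes lg(N!) as lg(2) + lg(3) + ... + lg(N). */
    public static double lgFactorial(int n) {
        double sum = 0;
        for (int k = 2; k <= n; k += 1) {
            sum += Math.log(k) / Math.log(2);   // lg(k) = ln(k) / ln(2)
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 10, 100, 1000}) {
            System.out.printf("N = %4d: at least %d comparisons%n",
                              n, (long) Math.ceil(lgFactorial(n)));
        }
    }
}
```

For $N = 3$ this prints 3, matching the game; for $N = 1000$ it prints 8530, which is the same order of magnitude as $N \lg N \approx 9966$.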