15.4 Mergesort


In our last example, we'll analyze merge sort, another cool sorting algorithm.

First, let's remind ourselves of selection sort, which we will initially use as a building block for merge sort.

Selection sort works off two basic steps:

  • Find the smallest item among the unsorted items, move it to the front, and ‘fix’ it in place.

  • Sort the remaining unsorted/unfixed items using selection sort.
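
To make those two steps concrete, here is a minimal selection sort sketch in Java. It's an illustrative, in-place version for `int` arrays, not code from the course projects.

```java
public static void selectionSort(int[] arr) {
    for (int i = 0; i < arr.length; i += 1) {
        // Find the smallest item among the unsorted items (indices i and beyond).
        int smallest = i;
        for (int j = i + 1; j < arr.length; j += 1) {
            if (arr[j] < arr[smallest]) {
                smallest = j;
            }
        }
        // Move it to the front of the unsorted region, 'fixing' it in place.
        int temp = arr[i];
        arr[i] = arr[smallest];
        arr[smallest] = temp;
    }
}
```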

Let's introduce one other idea here: arbitrary units of time. While the exact time something takes will depend on the machine, on the particular operations, and so on, we can get a general sense of time through our arbitrary units (AU).

Hold onto this thought for later analysis.

Now that we have selection sort, let's talk about merging.

Say we have two sorted arrays that we want to combine into a single big sorted array. We could append one to the other, and then re-sort it, but that doesn't make use of the fact that each individual array is already sorted. How can we use this to our advantage?

It turns out, we can merge them more quickly using the sorted property. The smallest element must be at the start of one of the two lists. So let's compare those, and put the smallest element at the start of our new list.

Now, the next smallest element has to be at the new front of one of the two lists. We can keep comparing the front elements of the two lists and moving the smaller one into place until one list is empty, then copy the rest of the other list over into the end of the new list.
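
Here is a minimal sketch of that merge step in Java. It's an illustrative version for sorted `int` arrays; the method name and signature are my own, not from the course code.

```java
/** Merges two sorted arrays a and b into one sorted array. */
public static int[] merge(int[] a, int[] b) {
    int[] result = new int[a.length + b.length];
    int i = 0;
    int j = 0;
    // Each element is written into result exactly once, so merging is linear.
    for (int k = 0; k < result.length; k += 1) {
        // Take from b if a is used up, or if b's front item is smaller.
        if (i >= a.length || (j < b.length && b[j] < a[i])) {
            result[k] = b[j];
            j += 1;
        } else {
            result[k] = a[i];
            i += 1;
        }
    }
    return result;
}
```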

Selection sort is slow, and merging is fast. How do we combine these to make sorting faster?

Suppose we have an array of 64 items. Selection sorting the whole thing takes ~2048 AU, but selection sorting each half takes only ~512 AU apiece. Having two sorted halves is a good step, but we still need to put them together. Luckily, we have merge. Merge, being of linear runtime, only takes ~64 AU. So in total, splitting the array in half, selection sorting each half, then merging takes 512 + 512 + 64 = 1088 AU, which is faster than selection sorting the whole array. But how much faster?
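
Where do those numbers come from? If we assume selection sorting $N$ items costs roughly $\frac{N^2}{2}$ AU (which matches the ~2048 and ~512 figures used in this chapter) and merging costs ~$N$ AU, then for a 64-item array:

$$
\underbrace{\frac{32^2}{2}}_{\text{sort left half}} + \underbrace{\frac{32^2}{2}}_{\text{sort right half}} + \underbrace{64}_{\text{merge}} = 512 + 512 + 64 = 1088 \text{ AU}.
$$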

Exercise: Show that if we halve each half again (giving four sub-lists of 16 items), selection sorting each piece and then merging everything back into one array takes only ~640 AU.

What if we halved it again? And again? And again?

Eventually we'll reach lists of size 1. At that point, we don't even have to use selection sort, because a list with one element is already sorted.

This is the essence of merge sort:

  • If the list is size 1, return. Otherwise:

  • Mergesort the left half

  • Mergesort the right half

  • Merge the results
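
Putting the pieces together, here is a minimal recursive mergesort sketch in Java. It is illustrative only: it reuses the `merge` helper sketched earlier and copies subarrays instead of sorting in place.

```java
import java.util.Arrays;

public static int[] mergeSort(int[] arr) {
    // A list of size 1 (or 0) is already sorted.
    if (arr.length <= 1) {
        return arr;
    }
    int mid = arr.length / 2;
    // Mergesort the left half and the right half.
    int[] left = mergeSort(Arrays.copyOfRange(arr, 0, mid));
    int[] right = mergeSort(Arrays.copyOfRange(arr, mid, arr.length));
    // Merge the results.
    return merge(left, right);
}
```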

So what's the running time of merge sort? Let's stick with our 64-item example and count the merging work done at each layer:

  • To get the top layer: merge ~64 elements = 64 AU

  • Second layer: merge ~32 elements, twice = 64 AU

  • Third layer: merge ~16 elements, four times = 64 AU

  • ...

In general, for a list of length N:

  • The top level takes ~N AU.

  • Next level takes ~N/2 + ~N/2 = ~N.

  • One more level down: ~N/4 + ~N/4 + ~N/4 + ~N/4 = ~N.
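
For the 64-item example, adding up the layers (one merge of 64, two merges of 32, four merges of 16, and so on down to thirty-two merges of 2) gives:

$$
\underbrace{64}_{1 \times 64} + \underbrace{64}_{2 \times 32} + \underbrace{64}_{4 \times 16} + \underbrace{64}_{8 \times 8} + \underbrace{64}_{16 \times 4} + \underbrace{64}_{32 \times 2} = 6 \times 64 = 384 \text{ AU}.
$$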

Let's now pin these costs down more precisely. If we analyze selection sort, we see that its runtime is $\Theta(N^2)$.

Exercise: To convince yourself that selection sort has $\Theta(N^2)$ runtime, work through the geometric approach (try drawing out the state of the list at every sort call), or count the operations.

If we run an $N=6$ selection sort, and the runtime is of order $N^2$, it will take ~36 AU to run. If $N=64$, it'll take ~2048 AU to run. Now we don't know if that's 2048 nanoseconds, or seconds, or years, but we can get a relative sense of the time needed for each size of $N$.

To see an animation of this idea, go here.

What is the runtime of the merge operation? We can use the number of "write" operations to the new list as our cost model, and count the operations. Since we have to write each element of each list only once, the runtime is $\Theta(N)$.

We noticed earlier that doing selection sort on an $N=64$ list will take ~2048 AU. But if we sort a list half that big, $N=32$, it only takes ~512 AU. That's four times as fast, for a list only half the size! So making the arrays we sort smaller yields big time savings.

Now, AUs aren't real units, but they're sometimes easier and more intuitive than looking at the runtime. The runtime for our split-in-half-then-merge-them sort is $N + 2(\frac{N}{2})^2$, which is about half of $N^2$ for selection sort. However, they are still both $\Theta(N^2)$.
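
Expanding that formula makes the claim explicit:

$$
N + 2\left(\frac{N}{2}\right)^2 = N + \frac{N^2}{2} \approx \frac{N^2}{2} \text{ for large } N, \quad \text{which is still } \Theta(N^2).
$$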

What if we halved the arrays again? Will it get better? Yes! If we do two layers of merges, starting with lists of size $\frac{N}{4}$, the total time will be ~640 AU.

We know merge itself is order $N$, so we can find merge sort's total cost by looking at each layer of merging, as in the layer-by-layer breakdown above: every layer of the 64-item example costs ~64 AU.

Overall runtime in AU is ~64 * k, where $k$ is the number of layers. Here, $k = \log_2(64) = 6$, so the overall cost of mergesort is ~384 AU.

Now, we saw earlier that splitting the array into more pieces (but only a fixed number of them) made sorting faster, but the result was still order $N^2$. Is merge sort itself faster than $N^2$?

Yes! Mergesort has worst case runtime $\Theta(N \log N)$.

Why? As the general breakdown above showed, every level of merging does ~N work in total. Thus, the total runtime is ~Nk, where $k$ is the number of levels.

How many levels are there? We split the array until it is length 1, so $k = \log_2(N)$. Thus the overall runtime is $\Theta(N \log N)$.
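
Written as a sum over the levels of merging (assuming, for simplicity, that $N$ is a power of two so every split is exact):

$$
\sum_{i=0}^{\log_2 N - 1} \underbrace{2^i}_{\text{merges at level } i} \cdot \underbrace{\frac{N}{2^i}}_{\text{size of each merge}} = \sum_{i=0}^{\log_2 N - 1} N = N \log_2 N.
$$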

Exercise: Use exact counts to argue for $\Theta(N \log N)$. Account for cases where we cannot divide the list perfectly in half.

So is $\Theta(N \log N)$ actually better than $\Theta(N^2)$? Yes! It turns out $\Theta(N \log N)$ is not much slower than linear time.
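
To get a feel for the gap, here is some illustrative arithmetic (counts of abstract steps, not measured runtimes):

$$
N = 10^3: \quad N \log_2 N \approx 10^4 \ll N^2 = 10^6, \qquad N = 10^6: \quad N \log_2 N \approx 2 \times 10^7 \ll N^2 = 10^{12}.
$$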

[Figure: timing table for runtimes]

[Figure: Mergesort basics: merging two sorted lists]

[Figure: A closer look at Mergesort]