CS61B Textbook

28.1 Topological Sorts and DAGs



We have covered a tremendous amount of material so far: programming practices, using an IDE, designing data structures, asymptotic analysis, implementing a ton of different abstract data types (e.g., using a BST, Trie, or HashTable to implement a map, and heaps to implement a Priority Queue), and finally algorithms on graphs.

Why is this knowledge useful?

You may have heard people say that CS 61B teaches much of what you need to solve standard interview questions at tech companies - but why do companies seek candidates with this specific knowledge?

One major reason is that many real world problems can be formulated in such a way that they're solvable with the data structures and algorithms we've learned. This chapter is about working through some tricky problems using the tools we have already learned.

Topological Sorting

Suppose we have a collection of different tasks or activities, some of which must happen before others. How do we sort the tasks such that for each task v, the tasks that must happen before v come earlier in our sort?

We can first view our collection of tasks as a graph in which each node represents a task. An edge v→w indicates that v must happen before w. Now our original problem is reduced to finding a topological sort.

Topological Sort: an ordering of a graph's vertices such that for every directed edge u→v, u comes before v in the ordering.

Question 1.1: What are some valid topological orderings of the above graph?

Answer: Valid orderings include: [D,B,A,E,C,F] and [E,D,C,B,A,F].

An important note is that it only makes sense to topologically sort certain types of graphs. To see this, consider the following graph:

What is a valid topological sorting of this?

There isn't one! D comes before B but B comes before C, E, D. Since we have a cycle, topological sort is not defined. We also can't topologically sort an undirected graph since each edge in an undirected graph creates a cycle.

So topological sorts only apply to directed, acyclic (no cycles) graphs - or DAGs.

Topological Sort: an ordering of a DAG's vertices such that for every directed edge u→v, u comes before v in the ordering.

For any topological ordering, you can redraw the graph so that the vertices are all in one line. Thus, topological sort is sometimes called a linearization of the graph. For example, here's the earlier example linearized for one of the topological orderings.

Notice that the topological sort for the above DAG has to start with either D or E and must end with F or C. For this reason, D and E are called sources, and F and C are called sinks.

Finding a Topological Sort

How can we find a topological sort? Take a moment to think of graph algorithms you already know that could be helpful in solving this problem.

Topological Sort Algorithm:

  • Perform a DFS traversal from every vertex in the graph, not clearing markings in between traversals.

  • Record DFS postorder along the way.

  • Topological ordering is the reverse of the postorder.

Why it works: Each vertex v gets added to the end of the postorder list only after considering all descendants of v. Thus, when any v is added to the postorder list, all its descendants are already on the list. Thus reversing this list gives a topological ordering.

Since we're simply using DFS, the runtime of this is O(V+E) where V and E are the number of nodes and edges in the graph respectively.

Pseudocode

topological(DAG):
    initialize marked array
    initialize postOrder list
    for all vertices in DAG:
        if vertex is not marked:
            dfs(vertex, marked, postOrder)
    return postOrder reversed

dfs(vertex, marked, postOrder):
    marked[vertex] = true
    for neighbor of vertex:
        if neighbor is not marked:
            dfs(neighbor, marked, postOrder)
    postOrder.add(vertex)
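
Here is a minimal Java sketch of this DFS-based approach. It assumes the DAG is given as an adjacency list (a List<List<Integer>> where adj.get(v) holds every w with an edge v→w); that representation and the class name are illustrative assumptions, not a fixed API from this course.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class TopologicalSort {
    /** Returns a reverse-DFS-postorder (topological) ordering of the DAG
     *  described by adj, where adj.get(v) lists every w with an edge v -> w. */
    public static List<Integer> order(List<List<Integer>> adj) {
        boolean[] marked = new boolean[adj.size()];
        List<Integer> postOrder = new ArrayList<>();
        for (int v = 0; v < adj.size(); v += 1) {
            if (!marked[v]) {
                dfs(v, adj, marked, postOrder);
            }
        }
        Collections.reverse(postOrder);   // reverse postorder = topological order
        return postOrder;
    }

    private static void dfs(int v, List<List<Integer>> adj,
                            boolean[] marked, List<Integer> postOrder) {
        marked[v] = true;
        for (int w : adj.get(v)) {        // visit all unmarked descendants first
            if (!marked[w]) {
                dfs(w, adj, marked, postOrder);
            }
        }
        postOrder.add(v);                 // v is added only after all of its descendants
    }
}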

(Out of scope) Extra question: How could we implement topological sort using BFS? Hint 1: We'd definitely need to store some extra information. Hint 2: Think about keeping track of the in-degrees of each vertex.

Solution:

  1. Calculate in-degree of all vertices.

  2. Pick any vertex v which has an in-degree of 0.

  3. Add v to our topological sort list. Remove the vertex v and all edges coming out of it. Decrement the in-degrees of all neighbors of v by 1.

  4. Repeat steps 2 and 3 until all vertices are removed.

How can we accomplish Step 2 efficiently? We can use a min Priority Queue of vertices with priority equal to the in-degrees.
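
As a concrete illustration, here is a Java sketch of this in-degree-based approach (often called Kahn's algorithm). It reuses the assumed adjacency-list representation from the earlier sketch, and for simplicity it uses a plain FIFO queue of vertices whose in-degree has dropped to 0 rather than a priority queue over all vertices; either choice works, and the queue version runs in O(V+E).

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class KahnTopologicalSort {
    /** Topologically sorts the DAG described by adj, where adj.get(v)
     *  lists every w with an edge v -> w, by repeatedly removing sources. */
    public static List<Integer> order(List<List<Integer>> adj) {
        int V = adj.size();

        // Step 1: compute the in-degree of every vertex.
        int[] inDegree = new int[V];
        for (int v = 0; v < V; v += 1) {
            for (int w : adj.get(v)) {
                inDegree[w] += 1;
            }
        }

        // Steps 2-4: repeatedly remove a vertex whose in-degree is 0.
        Queue<Integer> ready = new ArrayDeque<>();
        for (int v = 0; v < V; v += 1) {
            if (inDegree[v] == 0) {
                ready.add(v);
            }
        }
        List<Integer> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            int v = ready.remove();
            order.add(v);
            for (int w : adj.get(v)) {    // "remove" v's outgoing edges
                inDegree[w] -= 1;
                if (inDegree[w] == 0) {
                    ready.add(w);
                }
            }
        }
        return order;   // holds all V vertices exactly when the graph is a DAG
    }
}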

Review

  • Topological sorts are a way of linearizing Directed, Acyclic Graphs (DAGs).

  • We can find a topological sort of any DAG in O(V+E) time using DFS (or BFS).
