15.6 Big O vs. Worst Case

A short digression on asymptotics

Consider the following statements about BSTs. Which of the following are true?

  1. The worst-case height of a BST is Θ(N)\Theta(N).

  2. BST height is O(N)O(N).

  3. BST height is O(N2)O(N^2).

The answer is that all three statements are true. BSTs always have a height that is linear or better, and a linear height is obviously "less than" the quadratic upper bound in the last point.

However, a more tricky question is which of the three statements is the most informative.

The answer here is the first statement. Theta is always a more informative statement than big O.

Think about theta as similar to ==. When we say some function F(N)theta(N)F(N) \in theta(N) we mean it grows exactly linearly.

By contrast, you should think of OO as <=<=. If we say some function f(N)O(N)f(N) \in O(N), the we mean it grows at a linear rate or slower.

As a specific example, suppose we know that f(N)f(N) is one of these functions: 3N3N, 4logN+64log N + 6. If I tell you that f(N)O(N)f(N) \in O(N) you have no idea which function we're talking about. But if I say f(N)theta(N)f(N) \in theta(N), you know immediately which one we mean.

For an analogy, consider the following statements about the worst-case cost of a hotel room:

  1. The most expensive room is $639/night.

  2. The most expensive room is less than or equal to $2000/night.

Here, we see that the first statement gives us exact information, whereas the second statement does not. In the second statement, the most expensive room could be $2000, $10, or anywhere in between.

However, both are statements about the worst case. Applying this to asymptotic notation, this means that we can refer to the worst case with Θ\Theta, OO, or even Ω\Omega. Big O is not the same as the worst case!

Using Big O

If Θ\Theta is always more informative than OO, then why do we bother using Big O notation at all? There are several reasons:

  • We can make broader statements. For example, saying "binary search is O(logN)O(\log N) is correct, but saying "binary search tree is Θ(logN)\Theta(log N)" would not be correct, since it can be constant in certain scenarios.

  • Sometimes, it is not possible or extremely difficult to determine the exact runtime. In such cases, we would still like to provide a generalized upper bound.

Last updated