A Developer Introduction to Big O

A Simple Guide to Complexity

Charting relative complexities

Aim

This article acts as an introduction to complexities for engineers. However, it will be most useful to developers who already have a rough grasp of the area, and are looking for more of a refresher on some of the core techniques and terminology.

Argument

‘Big O’ is a measurement of how time relates to input variables. To explain, let’s take an example. We have an algorithm that walks through an array, printing each element. If n is the size of the array, then we will take n steps. Therefore our complexity is O(n).

Let’s take another algorithm which generates all pairs of items in two arrays. Array one has length m, and array two has length n. For each element in array one we would need to look at each element in array two, meaning we would take m multiplied by n steps. The complexity is O(m * n)

The variables can be anything, we just need to define them. The aim is to measure how time scales with respect to altering their values.

There are some rules to go along with these concepts.

  • Drop constants. An O(2n) algorithm is the same as an O(n) algorithm.
  • Drop non-dominant terms. The dominant term is the term with the greatest affect on complexity. A complexity of O(n^2 + n) becomes O(n^2).

These factors define our notation as asymptotic. This is because as n gets sufficiently large, the original function will tend towards the simpler form.

The final rule is.

  • All log terms with a constant base are equivalent. A log(n) term where the base is 2, is equivalent to log(n) with any other constant base.

The graph at the start of this article is a great guide for helping decide which is the dominant term in your complexity, as well as the order of complexities. However, it is tricky to get all of them in one, clear, visual representation. The most common ones in order of speed are:

  1. Constant O(c)
  2. Logarithmic O(log(n))
  3. Linear O(n)
  4. Linearithmic O(n log(n))
  5. Polynomial O(n^c)
  6. Polynomial multiplying log O(n^c log(n)) *
  7. Exponential O(c^n)
  8. Factorial O(n!)

*This is dependent on the value of c, we have O(n^b log(n)) < O(n^c) < O(n^c log(n)) , as long as b < c.

In reality there are three types of asymptotic notation.

  1. Big-θ (Big-Theta) notation
  2. Big-O notation
  3. Big-Ω (Big-Omega) notation

To better understand these concepts we can use the below diagram.

Charting our complexity function

Big-θ (Big-Theta) is the idea that as n gets large we have the property

k_1 * f(n) < Run Time < k_2 * f(n)

Where k_1 and k_2 are some constants. Big Theta requires a degree of accuracy, in later examples we will trade this off.

Big O only worries about having an asymptotic upper bound (the purple line in the diagram). That is we have the property

Run Time < k_2 * f(n)

Using this definition we have that an algorithm that is O(n) is also O(n^2), as that still provides an upper bound on our run time. We know our run time will always be smaller than our function.

Big-Ω (Big-Omega) only provides a lower bound (the green line in the diagram). This is equivalent to

k_1 * f(n) < Run Time

The purpose of this measurement is we know our run time will always be larger than our function.

We can use these concepts to define asymptotic relationships between functions. For example we know as n increases n^k will be less than c^n, where k and c are both constants. Therefore we can say n^k is O(c^n).

Conclusion

In this brief article we have explored the definition of complexity and laid out a number of useful tools to employ it in our day to day programming.

Principal Software Engineer at the BBC