# big O notation for the rest of us

Have you ever stumbled upon something like *O(N log N)* and wondered what it means or what it is good for? I have, and of course I wanted to find out. From the context I figured it must have something to do with algorithms and their execution times. So what did I do? I went to Wikipedia, and yes, I found a page about Big O notation there. But as I'm not that into mathematics, I didn't understand much. I guess the same goes for you, or you wouldn't be here but on Wikipedia instead ;-)

The article looks complex, although the notation itself is not complex at all. It turned out that Big O notation is a way to describe how an algorithm's running time grows with the size of its input, and with that, how fast the algorithm is. If you want, you can think of it as a rating system. It is used mostly where large amounts of data are involved, e.g. in database systems.

I'll show you some common forms of the notation. The following examples are in Ruby, but I think you will get the point even if you're not that into Ruby. Note that the notation usually describes the worst case; execution can be faster under good circumstances (e.g. on early returns).
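To make the "early returns" point concrete, here is a small sketch (the `contains?` method and the sample names are my own illustration, not from any library): in the worst case it has to look at every element, but it can stop as soon as it finds a match.

[code lang="ruby"]
# Worst case: the name is missing and every element is checked.
# Best case: the name is first and we return immediately.
def contains?(names, wanted)
  names.each do |name|
    return true if name == wanted # early return on a match
  end
  false
end
[/code]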

## O(1)

Algorithms marked with O(1) always execute in the same time; the size of the dataset does not influence the execution time at all.

[code lang="ruby"]
def replace_first(ar, new_value)
  ar[0] = new_value
  ar
end
[/code]

## O(N)

O(N) means the algorithm takes linear time: if it takes 1/1000 sec. for 1 item, it will take 1 sec. for 1000 items. An example function iterates over the dataset like this:

[code lang="ruby"]
def greet(names)
  names.each do |name|
    puts "hello, #{name}"
  end
end
[/code]

## O(N^2)

O(N^2) means the function's running time grows proportionally to the square of the input size. This is often the case when nested loops run over the input data, performing some task on it. A commonly known example is Bubblesort, which uses two nested loops to iterate over the dataset.

[code lang="ruby"]
class Array
  def bubblesort!
    length.times do |j|           # outer loop over the dataset
      for i in 1...(length - j)   # nested inner loop
        if self[i] < self[i - 1]
          self[i], self[i - 1] = self[i - 1], self[i]
        end
      end
    end
    self
  end
end
[/code]

## O(2^N)

With O(2^N) algorithms, the execution time doubles with each new element. Beware of these, as they can get very costly very fast! A classic example is the naive recursive Fibonacci: every method call (for n > 1) spawns two new calls to the method itself, which illustrates nicely why the cost explodes:

[code lang="ruby"]
def fibonacci(n)
  return n if n < 2                    # base cases: 0 and 1
  fibonacci(n - 1) + fibonacci(n - 2)  # two recursive calls per call
end
[/code]

## O(log N)

Up to here I think the notations were pretty easy to understand, but logarithms are not that easy to explain, so picture the graph of log N first.

The curve starts pretty steep but flattens more and more for higher values. That's why O(log N) functions are very effective on large datasets: doubling the dataset adds only a constant amount of work. A very good example of this is the Binary Search algorithm, which you might have heard of.
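Here is a minimal sketch of Binary Search (written by me for illustration; Ruby also ships its own `Array#bsearch`). It requires a sorted array and halves the remaining search range on every step, so it needs at most about log2(N) comparisons:

[code lang="ruby"]
# Returns the index of value in the sorted array, or nil if absent.
def binary_search(sorted, value)
  low = 0
  high = sorted.length - 1
  while low <= high
    mid = (low + high) / 2
    return mid if sorted[mid] == value
    if sorted[mid] < value
      low = mid + 1   # discard the lower half
    else
      high = mid - 1  # discard the upper half
    end
  end
  nil
end
[/code]

For an array of one million items, that is at most around 20 comparisons instead of one million.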

And that's it! These are the commonly used big O notations, and if you keep an eye out, you will find them just about everywhere :-)