Master Theorem

Tags: CS161, Complexity

Recurrence relationships and motivation

We define $T(n)$ to be the time that a program takes to run. When dealing with divide and conquer algorithms, this $T$ can be defined as a recurrence relation, like

$$T(n) = 2T\left(\frac{n}{2}\right) + n$$

This is reminiscent of merge sort, which divides the problem into two separate parts and performs a merging operation on the order of $n$. But how do we find a closed-form solution for $T(n)$?

The old method: the tree

We might use the tree method, which is to look at how the work varies with the depth $t$. This depth starts at $0$ and moves down to $\log_2 n$; we know this because we divide the problem size by $2$ every time. Layer $t$ of the tree yields $2^t$ problems, because we double the number of problems on every layer. Note that when $t = 0$, we get $1$ problem, which makes sense.

We also know that every problem on layer $t$ has size $\frac{n}{2^t}$, because each recursive call receives a problem that has been cut in half. Again, note how layer $0$ behaves.

Putting this together, we get that every layer $t$ does $2^t \cdot \frac{n}{2^t} = n$ steps of work. There are $\log_2 n + 1$ layers, so we get $T(n) = n(\log_2 n + 1) = \Theta(n \log n)$. Neat!
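As a quick sanity check (a sketch, assuming the base case $T(1) = 1$ and $n$ a power of two), the recursion matches $n(\log_2 n + 1)$ exactly:

```python
import math

def T(n):
    # Merge-sort-style recurrence: T(n) = 2*T(n/2) + n, with T(1) = 1
    return 1 if n == 1 else 2 * T(n // 2) + n

for n in (2, 8, 64, 1024):
    assert T(n) == n * (math.log2(n) + 1)  # exact for powers of two
```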

One different example

But do all recurrence relationships behave the same way? Well, let's try $T(n) = T(n/2) + n$.

Here, every layer has one problem of size $n/2^t$. When we do a geometric summation from $t = 0$ to $\log_2 n$, we get that

$$\sum_{t=0}^{\log_2 n} \frac{n}{2^t} = n\left(2 - \frac{1}{n}\right) = 2n - 1$$

So it’s $T(n) = 2n - 1$﻿. Interesting!
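The same kind of check works here (again a sketch assuming $T(1) = 1$ and $n$ a power of two):

```python
def T(n):
    # T(n) = T(n/2) + n, with T(1) = 1
    return 1 if n == 1 else T(n // 2) + n

for n in (2, 16, 256):
    assert T(n) == 2 * n - 1  # linear, not n log n
```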

And another different example

So we saw that $T(n) = 2T(n/2) + n$ behaves differently than $T(n) = T(n/2) + n$. What about $T(n) = 4T(n/2) + n$?

Every layer has $4^t$ problems with $n/2^t$ steps per problem, so layer $t$ does $4^t \cdot \frac{n}{2^t} = 2^t n$ steps of work. When we compute a summation, we get

$$\sum_{t=0}^{\log_2 n} 2^t n = n\left(2 \cdot 2^{\log_2 n} - 1\right) = 2n^2 - n = \Theta(n^2)$$

Interesting!
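A quick check of the closed form $2n^2 - n$ (a sketch assuming $T(1) = 1$ and $n$ a power of two):

```python
def T(n):
    # T(n) = 4*T(n/2) + n, with T(1) = 1
    return 1 if n == 1 else 4 * T(n // 2) + n

for n in (2, 16, 256):
    assert T(n) == 2 * n * n - n  # Θ(n^2) closed form
```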

The upshot

So what we’ve noticed here is that with three different constants in front of the recurrence relationship, we are able to get drastically different behavior. Can we unify this behavior under a general framework?

Master Theorem: generalizing the tree

Let’s define this:

$$T(n) = a \, T\!\left(\frac{n}{b}\right) + C n^d, \qquad T(1) \le C$$

Here, $C$ can be any constant. We recognize that the base case may not always be trivial, but it will not depend on $n$; $C$ works because it can be arbitrarily large. We are also using $n^d$ for the polynomial term. Bear in mind that we could use $O(n^d)$ instead, but then the Master Theorem won’t hold in big Theta; it will only hold in big O.

Do we have a general formula for this?

The derivation

At layer $t$, we have $a^t$ problems, and the problems are of size $\frac{n}{b^t}$. Each of these problems takes $C\left(\frac{n}{b^t}\right)^d$ time to do. There are $\log_b(n) + 1$ layers (from $t = 0$ to $t = \log_b n$). Taken together, we have that the total runtime is

$$T(n) = \sum_{t=0}^{\log_b n} a^t \cdot C\left(\frac{n}{b^t}\right)^d = C n^d \sum_{t=0}^{\log_b n} \left(\frac{a}{b^d}\right)^t$$
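We can sanity-check this layer-by-layer sum against the raw recursion; the sketch below assumes $C = 1$, $T(1) = 1$, and $n$ a power of $b$:

```python
import math

def T(n, a, b, d):
    # T(n) = a*T(n/b) + n**d, with T(1) = 1
    return 1 if n == 1 else a * T(n // b, a, b, d) + n ** d

def layer_sum(n, a, b, d):
    # Sum over layers t = 0 .. log_b(n) of a**t * (n / b**t)**d
    layers = round(math.log(n, b))
    return sum(a ** t * (n // b ** t) ** d for t in range(layers + 1))

assert T(64, 3, 2, 1) == layer_sum(64, 3, 2, 1)  # a > b^d example
assert T(64, 2, 2, 1) == layer_sum(64, 2, 2, 1)  # merge sort
```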

There are three possible cases:

Case 1: $a = b^d$

When $a = b^d$, every term of the summation is $1$, so the summation becomes $\log_b(n) + 1$ and the final complexity is on the order of $\Theta(n^d \log n)$. Note that we drop the base of the log because all log bases are equivalent up to a constant factor inside big Theta.

Case 2: $a < b^d$

When $a < b^d$, it means that the summation is a geometric series with ratio less than $1$. We can set an upper bound using the infinite sum:

$$n^d \sum_{t=0}^{\log_b n} \left(\frac{a}{b^d}\right)^t \le n^d \sum_{t=0}^{\infty} \left(\frac{a}{b^d}\right)^t = \frac{1}{1 - a/b^d} \cdot n^d$$

And as you can see, it is $\Theta(n^d)$, because the leading factor is a constant that doesn’t depend on $n$.

Intuitively, you can also understand that the summation is dominated by its first term, $1$ (the zero exponent), so the complexity is on the order of $\Theta(n^d)$. We are legally allowed to do this because the remainder of the sum can be bounded by a constant scaling of $1$ that is not dependent on $n$.

Case 3: $a > b^d$

We can also do the summation trick, although it’s easier to disregard everything except for the last term, because the terms keep getting larger. This yields the following:

$$n^d \left(\frac{a}{b^d}\right)^{\log_b n} = \frac{n^d \, a^{\log_b n}}{\left(b^{\log_b n}\right)^d} = a^{\log_b n}$$

We do the classical base-exponent swap, which gets us

$$a^{\log_b n} = n^{\log_b a}$$

Now, taken together, we have

$$n^d \left(\frac{a}{b^d}\right)^{\log_b n} = a^{\log_b n} = n^{\log_b a}$$

which gets us $\Theta(n^{\log_b a})$. Again, like before, we are allowed to chop up the sum like that because we can express the sum residual as a constant scaling of $n^{\log_b a}$.
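Putting the three cases together, here is a small sketch of a classifier (the function name and output format are my own, not standard) that reports the Θ-class for a given $a$, $b$, $d$:

```python
import math

def master_theorem(a, b, d):
    # Θ-class of T(n) = a*T(n/b) + Θ(n**d), by comparing a to b**d
    if a == b ** d:
        return f"Θ(n^{d} log n)"  # balanced: every layer does ~n^d work
    if a < b ** d:
        return f"Θ(n^{d})"        # top-heavy: the root layer dominates
    exponent = math.log(a, b)     # bottom-heavy: the leaves dominate
    return f"Θ(n^{exponent:g})"

print(master_theorem(2, 2, 1))  # merge sort: Θ(n^1 log n)
print(master_theorem(1, 2, 1))  # Θ(n^1)
print(master_theorem(4, 2, 1))  # Θ(n^2)
```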

Intuition

The basic idea is that we have two things in tension. If we have a thin tree, then the work at the top outweighs the work at the bottom. If we have a fat tree, then the work at the bottom outweighs the work at the top. Merge sort has a complexity that balances both, which leads to $n$ steps on each layer. In other cases, however, there might not be a balance, which leads to either the first or the last term taking the lead.

Pedantry

We assume that $n$ divides evenly at every step, which is not strictly true. However, it can be proven that this doesn’t really matter (to do: CLRS 4.6.2).

When does it fail?

The classical failure mode is when the subproblems are of different sizes, like $T(n) = T(n/2) + T(n/3) + O(n^2)$.