Master Theorem


Recurrence relationships and motivation

We define $T(n)$ to be the time that a program takes to run. When dealing with divide-and-conquer algorithms, $T$ can be defined by a recurrence relation, like

$$T(n) = 2T\left(\frac{n}{2}\right) + n, \qquad T(1) = 1$$

This is reminiscent of merge sort, which divides the problem into two separate parts and performs a merging operation on the order of $n$. But how do we find a closed-form solution for $T(n)$?

The old method: the tree

We might use the tree method, which is to look at how the work varies with $t$, the depth. This depth starts at $0$ and goes down to $\log_2 n$. We know this because every time, we divide the problem size by $2$. In between, layer $t$ of the tree yields $2^t$ problems, because we double the number of problems on every layer. Note that when $t = 0$, we get $1$ problem, which makes sense.

We also know that every problem on layer $t$ has size $\frac{n}{2^t}$. This is because every time, we feed in a problem that has been cut in half. Again, note how layer $0$ behaves.

Putting this together, every layer $t$ does $2^t \cdot \frac{n}{2^t} = n$ total work. There are $\log n + 1$ layers, so $T(n) = n(\log n + 1)$. Neat!
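As a quick sanity check (a sketch of mine, not part of the original derivation), we can evaluate the recurrence directly in Python and confirm the closed form $n(\log_2 n + 1)$ for powers of two:

```python
# Hypothetical check: evaluate T(n) = 2 T(n/2) + n with T(1) = 1 directly,
# and compare against the closed form n (log2(n) + 1) for n a power of two.

def T(n: int) -> int:
    """Merge-sort-style recurrence, defined for n a power of two."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

for k in range(11):
    n = 2 ** k
    assert T(n) == n * (k + 1)  # closed form: n (log2(n) + 1)
```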

One different example

But do all recurrence relations behave the same way? Well, let’s try $T(n) = T(n/2) + n$.

Here, every layer has one problem, and the problem at depth $t$ has size $n/2^t$. When we do a geometric summation from $t = 0$ to $\log n$, we get

$$\sum_{t=0}^{\log n}\frac{n}{2^t} = 2n - 1$$

So it’s $T(n) = 2n - 1$. Interesting!
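Again as a sanity check (my own sketch, not from the derivation), the recurrence matches $2n - 1$ exactly for powers of two:

```python
# Hypothetical check: T(n) = T(n/2) + n with T(1) = 1 should equal
# 2n - 1 for n a power of two.

def T(n: int) -> int:
    if n == 1:
        return 1
    return T(n // 2) + n

for k in range(11):
    n = 2 ** k
    assert T(n) == 2 * n - 1  # geometric series closed form
```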

And another different example

So we saw that $T(n) = 2T(n/2) + n$ behaves differently than $T(n) = T(n/2) + n$. What about $T(n) = 4T(n/2) + n$?

Every layer has $4^t$ problems and $n/2^t$ steps per problem. This yields $4^t \cdot \frac{n}{2^t} = 2^t n$ steps per layer. When we compute the summation, we get

$$n \sum_{t=0}^{\log n} 2^t = n \left(2^{\log n + 1} - 1\right) = 2n^2 - n = \Theta(n^2)$$
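One more sanity check (my own sketch): for powers of two, this recurrence works out to exactly $2n^2 - n$, which is on the order of $n^2$:

```python
# Hypothetical check: T(n) = 4 T(n/2) + n with T(1) = 1 should equal
# 2n^2 - n for n a power of two, i.e. Theta(n^2).

def T(n: int) -> int:
    if n == 1:
        return 1
    return 4 * T(n // 2) + n

for k in range(11):
    n = 2 ** k
    assert T(n) == 2 * n * n - n  # exact closed form
```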


The upshot

So what we’ve noticed is that with three different coefficients in front of the recurrence, we get drastically different behavior. Can we unify this behavior under a general framework?

Master Theorem: generalizing the tree

Let’s define this:

$$T(n) = aT\left(\frac{n}{b}\right) + n^d, \qquad T(1) = C$$

Here, $C$ can be any constant. We recognize that the base case may not always be trivial, but it will not depend on $n$, so we write it as an arbitrary constant $C$. We are also using $n^d$ for the polynomial term. Bear in mind that we could use $O(n^d)$ instead, but then the Master Theorem won’t hold in big $\Theta$; it will only hold in big $O$.

Do we have a general formula for this?

The derivation

At layer $t$, we have $a^t$ problems, each of size $\frac{n}{b^t}$. Each of these problems takes $\left(\frac{n}{b^t}\right)^d$ time. There are $\log_b(n)$ layers. Taken together, the total runtime is

$$n^d \sum_{t = 0}^{\log_b(n)} \left(\frac{a}{b^d}\right)^t$$

There are three possible cases.

Case 1: $a = b^d$

When $a = b^d$, every term of the summation equals $1$, so the summation becomes $\log_b(n) + 1$ and the final complexity is on the order of $\Theta(n^d \log n)$. Note that we drop the base of the log because all logarithm bases are equivalent in big $\Theta$.

Case 2: $a < b^d$

When $a < b^d$, the summation is a convergent geometric series. We can set an upper bound with the infinite sum:

$$n^d \sum_{t = 0}^{\log_b(n)} \left(\frac{a}{b^d}\right)^t \leq n^d \sum_{t = 0}^{\infty} \left(\frac{a}{b^d}\right)^t = n^d \frac{1}{1 - \frac{a}{b^d}} = n^d \frac{b^d}{b^d - a}$$

And as you can see, this is $\Theta(n^d)$: the upper bound has a leading constant that doesn’t depend on $n$, and the $t = 0$ term alone already gives a matching lower bound of $n^d$.

Intuitively, you can also understand that the summation becomes $\Theta(1)$ (only the zero-exponent term matters), so the complexity is on the order of $\Theta(n^d)$. We are allowed to do this because the remainder of the sum is bounded by a constant scaling of $1$ that does not depend on $n$.

Case 3: $a > b^d$

We could do the summation trick again, although it’s easier to disregard everything except for the last term, because the terms keep getting larger. This yields the following:

$$\left(\frac{a}{b^d}\right)^{\log_b n} = \frac{a^{\log_b n}}{b^{d \log_b n}} = \frac{a^{\log_b n}}{n^d}$$

We do the classical base-exponent swap $a^{\log_b n} = n^{\log_b a}$, which gets us

$$n^{\log_b(a) - d}$$

Now, taken together, we have

$$n^d \cdot n^{\log_b a - d} = n^{\log_b a}$$

which gets us $\Theta(n^{\log_b(a)})$. Again, like before, we are allowed to keep only the last term because the sum residual can be expressed as a constant scaling of $n^{\log_b(a)}$.
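To tie the three cases together, here is a small Python sketch (my own naming and formatting, not standard library code) that classifies $T(n) = aT(n/b) + n^d$ by the case analysis above:

```python
import math

def master_theorem(a: float, b: float, d: float) -> str:
    """Return the Theta class for T(n) = a T(n/b) + n^d."""
    if math.isclose(a, b ** d):            # case 1: every layer contributes equally
        return f"Theta(n^{d:g} log n)"
    if a < b ** d:                         # case 2: the top layer dominates
        return f"Theta(n^{d:g})"
    return f"Theta(n^{math.log(a, b):g})"  # case 3: the bottom layer dominates

# The three examples from above:
print(master_theorem(2, 2, 1))  # merge sort: Theta(n^1 log n)
print(master_theorem(1, 2, 1))  # Theta(n^1)
print(master_theorem(4, 2, 1))  # Theta(n^2)
```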

The key here is that you DON’T memorize this stuff! You should be able to derive it ad hoc.

General master theorem


We are not in big-$\Theta$ notation when we don’t assume the polynomial factor is exactly $cn^d$. In that more general case, where the extra work is only $O(n^d)$, the Master Theorem can only give an upper bound.


The basic idea is that we have two things in tension. If we have a thin tree, then the tasks at the top outweigh the tasks at the bottom. If we have a fat tree, then the tasks at the bottom outweigh the tasks at the top. Merge sort has a recurrence that balances both, which leads to $n$ steps on each layer. In other cases, however, there might not be a balance, in which case either the first or the last term of the summation dominates.


We assume that $n$ divides evenly at every step, which is not really true. However, it can be proven that this doesn’t really matter (to do: CLRS 4.6.2).

When does it fail?

The classical failure mode is when the subproblems have different sizes, like $T(n) = T(n/2) + T(n/3) + O(n^2)$.