Master Theorem


Recurrence relationships and motivation

We define $T(n)$ to be the time that a program takes to run. When dealing with divide-and-conquer algorithms, $T$ can be defined by a recurrence relation, like

$$T(n) = 2T\left(\frac{n}{2}\right) + n, \qquad T(1) = 1$$

This is reminiscent of merge sort, which divides the problem into two separate parts and performs a merging operation on the order of $n$. But how do we find a closed-form solution for $T(n)$?

The old method: the tree

We might use the tree method, which is to look at how the work varies with $t$, the depth. This depth starts at $0$ and goes down to $\log_2 n$. We know this because every time, we divide the problem size by $2$. In between, layer $t$ of the tree yields $2^t$ problems, because we double the number of problems on every layer. Note that when $t = 0$, we get $1$ problem, which makes sense.

We also know that every problem on layer $t$ has size $\frac{n}{2^t}$. This is because every time, we feed in a problem that has been cut in half. Again, note how layer $0$ behaves.

Putting this together, every layer $t$ does $2^t \cdot \frac{n}{2^t} = n$ total work. There are $\log n + 1$ layers, so $T(n) = n(\log n + 1)$. Neat!
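As a quick sanity check (a sketch of mine, not part of the original derivation), we can evaluate the recurrence directly in Python and confirm the closed form $n(\log_2 n + 1)$ for powers of two:

```python
# Hypothetical check: evaluate T(n) = 2 T(n/2) + n with T(1) = 1 directly,
# and compare against the closed form n (log2(n) + 1) for n a power of two.

def T(n: int) -> int:
    """Merge-sort-style recurrence, defined for n a power of two."""
    if n == 1:
        return 1
    return 2 * T(n // 2) + n

for k in range(11):
    n = 2 ** k
    assert T(n) == n * (k + 1)  # closed form: n (log2(n) + 1)
```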

One different example

But do all recurrence relations behave the same way? Well, let’s try $T(n) = T(n/2) + n$.

Here, every layer has one problem, and the problem at depth $t$ has size $n/2^t$. When we do a geometric summation from $t = 0$ to $\log n$, we get

$$\sum_{t=0}^{\log n}\frac{n}{2^t} = 2n - 1$$

So it’s $T(n) = 2n - 1$. Interesting!
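Again as a sanity check (my own sketch, not from the derivation), the recurrence matches $2n - 1$ exactly for powers of two:

```python
# Hypothetical check: T(n) = T(n/2) + n with T(1) = 1 should equal
# 2n - 1 for n a power of two.

def T(n: int) -> int:
    if n == 1:
        return 1
    return T(n // 2) + n

for k in range(11):
    n = 2 ** k
    assert T(n) == 2 * n - 1  # geometric series closed form
```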

And another different example

So we saw that $T(n) = 2T(n/2) + n$ behaves differently than $T(n) = T(n/2) + n$. What about $T(n) = 4T(n/2) + n$?

Every layer has $4^t$ problems and $n/2^t$ steps per problem. This yields $4^t \cdot \frac{n}{2^t} = 2^t n$ steps per layer. When we compute the summation, we get

$$n \sum_{t=0}^{\log n} 2^t = n \left(2^{\log n + 1} - 1\right) = 2n^2 - n = \Theta(n^2)$$
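One more sanity check (my own sketch): for powers of two, this recurrence works out to exactly $2n^2 - n$, which is on the order of $n^2$:

```python
# Hypothetical check: T(n) = 4 T(n/2) + n with T(1) = 1 should equal
# 2n^2 - n for n a power of two, i.e. Theta(n^2).

def T(n: int) -> int:
    if n == 1:
        return 1
    return 4 * T(n // 2) + n

for k in range(11):
    n = 2 ** k
    assert T(n) == 2 * n * n - n  # exact closed form
```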


The upshot

So what we’ve noticed is that with three different coefficients in front of the recurrence, we get drastically different behavior. Can we unify this behavior under a general framework?

Master Theorem: generalizing the tree

Let’s define this:

$$T(n) = aT\left(\frac{n}{b}\right) + n^d, \qquad T(1) = C$$

Here, $C$ can be any constant. We recognize that the base case may not always be trivial, but it will not depend on $n$, so we write it as an arbitrary constant $C$. We are also using $n^d$ for the polynomial term. Bear in mind that we could use $O(n^d)$ instead, but then the Master Theorem won’t hold in big $\Theta$; it will only hold in big $O$.

Do we have a general formula for this?

The derivation

At layer $t$, we have $a^t$ problems, each of size $\frac{n}{b^t}$. Each of these problems takes $\left(\frac{n}{b^t}\right)^d$ time. There are $\log_b(n)$ layers. Taken together, the total runtime is

$$n^d \sum_{t = 0}^{\log_b(n)} \left(\frac{a}{b^d}\right)^t$$

There are three possible cases.

Case 1: $a = b^d$

When $a = b^d$, every term of the summation equals $1$, so the summation becomes $\log_b(n) + 1$ and the final complexity is on the order of $\Theta(n^d \log n)$. Note that we drop the base of the log because all logarithm bases are equivalent in big $\Theta$.

Case 2: $a < b^d$

When $a < b^d$, the summation is a convergent geometric series. We can set an upper bound with the infinite sum:

$$n^d \sum_{t = 0}^{\log_b(n)} \left(\frac{a}{b^d}\right)^t \leq n^d \sum_{t = 0}^{\infty} \left(\frac{a}{b^d}\right)^t = n^d \frac{1}{1 - \frac{a}{b^d}} = n^d \frac{b^d}{b^d - a}$$

And as you can see, this is $\Theta(n^d)$: the upper bound has a leading constant that doesn’t depend on $n$, and the $t = 0$ term alone already gives a matching lower bound of $n^d$.

Intuitively, you can also understand that the summation becomes $\Theta(1)$ (only the zero-exponent term matters), so the complexity is on the order of $\Theta(n^d)$. We are allowed to do this because the remainder of the sum is bounded by a constant scaling of $1$ that does not depend on $n$.

Case 3: $a > b^d$

We could do the summation trick again, although it’s easier to disregard everything except for the last term, because the terms keep getting larger. This yields the following:

$$\left(\frac{a}{b^d}\right)^{\log_b n} = \frac{a^{\log_b n}}{b^{d \log_b n}} = \frac{a^{\log_b n}}{n^d}$$

We do the classical base-exponent swap $a^{\log_b n} = n^{\log_b a}$, which gets us

$$n^{\log_b(a) - d}$$

Now, taken together, we have

$$n^d \cdot n^{\log_b a - d} = n^{\log_b a}$$

which gets us $\Theta(n^{\log_b(a)})$. Again, like before, we are allowed to keep only the last term because the sum residual can be expressed as a constant scaling of $n^{\log_b(a)}$.
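To tie the three cases together, here is a small Python sketch (my own naming and formatting, not standard library code) that classifies $T(n) = aT(n/b) + n^d$ by the case analysis above:

```python
import math

def master_theorem(a: float, b: float, d: float) -> str:
    """Return the Theta class for T(n) = a T(n/b) + n^d."""
    if math.isclose(a, b ** d):            # case 1: every layer contributes equally
        return f"Theta(n^{d:g} log n)"
    if a < b ** d:                         # case 2: the top layer dominates
        return f"Theta(n^{d:g})"
    return f"Theta(n^{math.log(a, b):g})"  # case 3: the bottom layer dominates

# The three examples from above:
print(master_theorem(2, 2, 1))  # merge sort: Theta(n^1 log n)
print(master_theorem(1, 2, 1))  # Theta(n^1)
print(master_theorem(4, 2, 1))  # Theta(n^2)
```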

The key here is that you DON’T memorize this stuff! You should be able to derive it ad hoc.

General master theorem


We are not in big-$\Theta$ notation when we don’t assume the polynomial factor is exactly $cn^d$. In that more general case, where the extra work is only $O(n^d)$, the Master Theorem can only give an upper bound.


The basic idea is that we have two things in tension. If we have a thin tree, then the tasks at the top outweigh the tasks at the bottom. If we have a fat tree, then the tasks at the bottom outweigh the tasks at the top. Merge sort has a recurrence that balances both, which leads to $n$ steps on each layer. In other cases, however, there might not be a balance, in which case either the first or the last term of the summation dominates.


We assume that $n$ divides evenly at every step, which is not really true. However, it can be proven that this doesn’t really matter (to do: CLRS 4.6.2).

When does it fail?

The classical failure mode is when the subproblems have different sizes, like $T(n) = T(n/2) + T(n/3) + O(n^2)$.