Min Cut Max Flow

TagsCS161Graphs

Cuts and graphs, again

Remember that a is the separation of edges that partitions the vertices into two non-empty groups. These vertices need not be connected.

s-t graphs

Say that we had a directed graph, with a special source ss with only outgoing edges and a special sink tt with only incoming edges. This is what we define as an s-t graph. The weights on the edges are capacities. We will talk more about these later.

s-t cuts

an s-t cut is a cut that separates ss from tt. We call an edge as crossing the cut if it goes from ss to tt. Note that this doesn’t include edges that go the other way. The capacity of a cut is the sum of capacities that cross the cut (again, not including things that go the other way)

Key question: can we find a cut that has minimum capacity? Historically, this has uses in the military. Given a graph of railroad connections, how many do we have to interrupt before we seal off a city? (if we interpret a greater capacity as harder to interrupt)

Flows

We can understand the capacity of each edge as the limit of transportation (like the railroad example), we can define a flow of an edge as how much of this capacity is already taken over. There are two key rules of flows

  1. The flow on an edge must be less than or equal to its capacity
  1. The flow going into a vertex must be the same as the flow going out of a vertex. This is the formalism:

The maximum flow is a flow that shuttles as much “things” from ss to tt as possible.

Max flow min cut

Here is a key weird observation that we will spend some time proving: the maximum flow is also the minimum capacity that you need to cut. We call this the max-flow min-cut theorem

Intuitively, this makes sense. In the maximum flow situation, you are limited by the bottleneck on some graph. When you want to minimize the capacity in a cut, you would be going for the bottleneck.

Lemma 1: max flow \leq min cut

If we make any s-t cut, then we separate out a group of vertices including ss, from another group of vertices including tt. Any sort of flow must go from this group SS to the group TT (capital letters denote the groups). The highest flow would be the saturation of these edges. This is the value of the cost of the cut.

So, the key upshot is that the flow across any s-t cut is at most the cost of the cut. Therefore, we conclude that at the min-cut, the maximum flow (which is just another flow) just be at most the cost of the min-cut.

How do we show equality?

Well, we can do this through the Ford Fulkerson algorithm!

Ford-Fulkerson algorithm

Our intuition is this: just take the largest possible flow at each step from ss to tt. While this might work in some cases, the greedy approach can sometimes fail to converge at the right solution, because sometimes you need to walk back on certain flow to get a net gain in flow.

Residual graph

The FF algorithm relies on the concept of a residual graph. In the image below, the left is the normal graph (with the flows marked) and the right is the residual graph.

This residual graph GfG_f has the same vertices as the original, but there is a forward + backward edge for every directed edge. The backward edges represent how much flow you need to take away to get zero flow in the original graph. The forward edges represent how much flow you need to add to maximize the flow in the original graph edge.

The key insight

if there exists a path from ss to tt in the residual graph, then we can further augment the total flow in the original graph

This is easy to understand if every edge on the path in GfG_f is a forward edge, because that increase the flow on all the edges, including the edge that begins on ss and the edge that ends on tt,.

But this is still the case if there are some backward edges. Because a path from ss to tt necessairly has an outgoing edge from ss and an incoming edge on tt. So we are increasing the net flow, because that’s how we are measuring it. Running through a backward edge means walking back on a past choice for the greater good.

The algorithm

Basically, you want to find a path in GfG_f and increase/decrease the flow by the smallest limit in the path. We call this the augmenting path

Termination and cuts (proving max-flow min-cut)

Here, we try to prove the max-flow min-cut theorem.

We can terminate the algorithm once ss and tt are no longer in the same connected component.

Let’s cut the graph into two vertices: things reachable from ss in GfG_f and things not reachable from ss, which includes tt.

All of the flow must cross this cut if we are to get to tt. But more importantly, the edges in this cut are full. Why? Because the forward edges are non-existent in GfG_f, else these components wouldn’t be disconnected!

Note that the example above made the cut at tt. This may not be the case, and some incoming edge of tt may not be completely full. In this case, the cut exists elsewhere in the graph. Furthermore, not all full edges are part of the cut. You can have two full edges in a row, and only one (if any) will be cut.

From these two points, we must conclude that the cost of the cut is the same as the maximum flow, as desired.

Dealing with multiple sources and sinks

Just make a “master” source and a “master” sink that connects to all the sources and sinks, respectively. The weights between the master and source/sinks should be \infty

Complexity of Ford Fulkerson

If you picked arbitrary paths you can actually get into a very long loop and you end up augmenting by one step at a time. No matter what, FF algorithm will terminate correctly, but some choices are slower than others.

Here’s another interesting observation: if all the capacities are integers, then all the max flows are integers. This is because we only add and subtract integers

Therefore, in the most naive case, we have the complexity as O(mf)O(m|f|), where f|f| is the maximum flow. This is because the BFS takes O(m)O(m) time, and we augment the flow by at least one each time.

Now, the problem comes on how we can design an algorithm that doesn’t depend on f|f| in its runtime

Edmonds-Karp

We can use BFS to find the nodes reachable from ss. The idea is that you always pick the shortest path. You have O(m)O(m) for a BFS, and thanks to a nice lemma, the total number of times an edge can disapear from GfG_f is at most n/2n/2. Therefore, the total runtime is O(m2n)O(m^2n), where the mm comes from the BFS, the other mm comes from the total edges, and nn comes from the previous lemma.

Application: maximum matching

Simple assignment

Say that we had a bipartite graph with students on one side and preferred outfits on the other. If we can only give one outfit to each student, how can you maximize happiness? D

Here, the easiest solution is to solve it through maximum flow, using a universal source and sink. All edges have capacity 1

The capacity of 1 from the ss make sure that each student can only take one clothing. The capacity of 1 to the tt make sure that each clothing can be assignied to only one student

Constrained assignment

There people and ice cream flavors. Each ice cream ii has kik_i scoops. Each person jj can consume at most mjm_j scoops. Each person selects a set of ice cream they want, and they also upper-bound the ice cream flavors. For example, a person can want 10 scoops but they can only tolerate 2 scoops of Chocolate.

We can represent this as a max-flow problem too! The edges from ss represent consumption limits, and the edges to tt represent the ice cream capacity. Each edge from person to ice cream has a capacity, which is the maximum they want to consume. An edge exists from person to ice cream if this person has a non-zero tolerance for it.

What problems can be solved through maximum flow?

Typically, these problems deal with “more is good” but there is a capacity on the “more”. In other words, there must be a constraint that prevents people from being totally happy, and we are trying to optimize based on this constraint.

More examples

Key takeaway: use edges to represent some sort of limit. Don’t be afraid to use \infty-weighted edges.