Gale-Shapley optimal matching

TagsCS161Graphs

What is greedy?

A greedy algorithm makes a choice and never moves back. However, this is just the basic greedy approach. We can refine our greedy approach as we did in the Ford-Fulkerson algorithm. By reframing the problem, we were able to walk back on our previous assignments in order to improve flow.

In our problem today, we will be doing something similar. We want to revert any choices if there exists a better solution. This isn’t backtracking per se, because it doesn’t require enumerating all possible solutions. It’s just a smarter greedy.

The problem at hand

We used max flow to match people who had unranked preferences, with items that didn’t have preferences at all. But what if we had both people and items with ranked preferences?

The traditional setting for this problem is the hospital residency program, where doctors submit an ordered list of hospitals and hospitals submit an ordered list of doctors.

Stable matchings

Define a blocking pair if a doctor ii and a hospital jj if they are mutually higher on their ranked list than their current assignment. In terms of marriage, we would say that this is a rogue couple, as they are happier together than they are with their assigned spouses

We define a matching MM as a stable match if there are no blocking pairs

Equivalently, we think about it this way: if we are a stable match, then hospital jj must be lower on the list then ii’s current assignment, or/and doctor ii must be lower on the list than jj’s current assignment. In other words, a swap would move the list in opposite directions, or uniformly in the worse. We would never have a swap that moved up the list for both parties.

Some non-working solutions

Hungarian approach

Create a bipartite graph with the weights on the edges being some function of preferences between the doctors and hospitals. For example, you might add together the preference rankings on doctor dd of hospital hh, and hospital hh of doctor dd. We can run something called the hungarian algorithm , but does this work?

Why this doesn’t work

  1. What is the function that maps preferences to edges? This seems arbitrary and hacky
  1. Doctors are incentivized to misreport their preferences. If a doctor really wants a hospital, they would rank all other hospitals really low. Then, the hungarian algorithm must pick the one that the doctor ranked highest, or else the total cost will be high
  1. Your algorithm isn’t all-binding. If a doctor and hospital both prefer each other more outside of what the algorithm gives them, then they will pair up outside of the algorithm. So the algorithm is only good if no swaps lead to higher satisfaction. Unfortunately, we can’t guarentee this the hungarian approach

Straight greedy

A simple greedy algorithm will match all the doctor-hospital pairs that have each other as their top choice. Obviously, this step doesn’t rule out any solution, but what about the next step? Here we get stuck

The tiebreaker

Match every doctor to their favorite hospital, and if there are more than one request per hospital, then pick according to hospital preference

Why this method doesn’t work

The problem is if we have two good hospitals, but one that everyone ranks as their first choice. Then, any candidate (even one at the bottom of the hospital rankings) can pick the other good hospital. Because nobody selected this as the first choice, this suboptimal candidate will get the second good hospital, even though other people were ranked higher. And we can’t walk back on our decision, so it’s locked this way. The tl;dr is that if we make the choice upfront, we prevent second and third choices, etc, from happening accurately.

Gale Shapely

This last approach was onto something, but it wasn’t complete. What if we were allowed to walk back on our assignments?

The intuition

Try to match each doctor to their top choice. If you discover a blocking pair, switch the matching. In reality, this means the following:

  1. Have all doctors “propose” to the hospitals.
  1. Each hospital picks the candidate highest on its list
  1. Have all non-assigned doctors propose to the hospital next on their list
  1. If the hospital gets a candidate that is higher on its list, it will boot the previous candidate out and pick the new one
  1. keep on doing this until everyone is assigned

This can cause a lot of movement, but at the end, things do settle!

The pseudocode

Complexity

The input space is O(n2)O(n^2), because each doctor and hospital have a list of size nn.

We must look at O(n)O(n) doctors each “cycle” of proposals, and we iterate through O(n)O(n) cycles (lest we reach the end of the preferences list!)

Proof of correctness

There are three claims that help us prove it works

  1. At every iteration, the current match is stable WRT the assigned doctors and hospitals.

To prove this, we use contradiction. Suppose that we had a blocking pair h,dh, d. Currently, dd is assigned to hh’, and hh is assigned to dd’. By the algorithm, we know that dd picked hh’ before hh, as the doctors read from highest to lowest in their proposals. Therefore, we reach a contradiction, because a blocking pair necessarily requires hh to be higher on dd’s list, and dd to be higher on hh’s list.

  1. Once a hospital is matched, it remains matched

This is trivial from the algorithm’s definition

  1. At the end of algorithm, every doctor and hospital is matched

Let’s do a proof by contradiction. Suppose that at the end of the algorithm, there is a doctor dd and hospital hh still free. However, if we are at the end of the algorithm, dd has already tried to match hh. If hh was empty, then it would have accepted. By claim 2, it couldn’t have been de-assigned during the algorithm, so this is a contradiction.

Detecting multiple stable matchings

Here’s a quick trick

  1. try doing optimal hospital matching
  1. try doing optimal doctor matching
  1. if these differ, you have at least two different matchings
  1. If these are the same, you only have one. This is because both the doctors and hospitals are at the optimal, meaning that any swap will uniformly decrease both sides, and so it is not desired

More formalism

Look at the final stable matchings. Multiple stable matchings exist if you are able to swap two pairs and increase the rank in one of the two pairs and decrease the rank in the other one. This must be the case for both pairs you swap. In this case, the new matching is equally stable, because there is essentially a zero-sum game. We call this a rotation, and it’s actually a little more complicated than this, but that’s a general idea.

However, swapping no longer guarantees that the algorithm is doctor optimal (see below). It just guarantees stability.

Cool things about Gale Shapely

The matching returned by this algorithm is doctor-optimal. This makes sense, because the doctors actively seek out the hospitals, while the hospitals just review proposals.

As such, the order of trying out free doctors in the algorithm doesn’t matter.

Furthermore, doctors can’t gain from misreporting their preferences. This is because doctor-optimal means that the doctor gets the best possible result given the other doctors. So even if a doctor ranks the other hospitals as very low, if they are unassigned after the first cycle, they must go propose to the other hospitals. Philosophically speaking, if you get what you want, there’s no sense in faking what to you want!

However, this matching is also hospital-worst, meaning that every hospital is matched to the least-favorite doctor possible in any stable matching. This is not immediately obvious why it’s the case, but it’s because the hospital takes things as they come. Interestingly, hospitals can gain by falsely reporting their preferences! This is again because they aren’t forced to read down a list until they are assigned.