Dijkstra’s Algorithm



Suppose that we wanted to find the shortest path in a graph that had weights. What do we do?

Well, there’s a really dumb way of doing it. We just connect the nodes with lengths of string that are proportional to the values of the edge weights, and then lift the root slowly.

The reason why this works is because the string forces a shortest weighted path. But this is kinda dumb because we don’t really gain any algorithmic insight. Can we do better?

The algorithm

Let’s talk about it informally and then we will give the proof.

  1. Store a distance from the source node ss in each node. This is always an overestimate of the true distance, so we initialize all of them to infinity. We will denote this as d[u]d[u]. We know that d[u]d(s,u)d'[u] \geq d(s, u)
  1. take the node uu with the smallest distance measurement (you might use something like a heap; we’ll talk later), and then update all of its neighbors vv using this formula:
d[v]=min(d[v],d[u]+eu,v)d[v] = \min(d[v], d[u] + e_{u,v})

(where eu,ve_{u, v} is the edge weight between u,vu, v.)

Why? Well, we can say that d[v]d[v] is the length of the shortest current path, but if coming from uu proposes something better, we update this value to reflect this change.

  1. When we are done enumerating all the neighbors, we mark the uu as complete.

The algorithm box

Here, the FF is the set of nodes to explore, and the DD is the set of nodes that we are done with. Note that it’s not obvious why when we are done exploring a node, its distance doesn’t get updated again. We will look at the formalism below

Why does it work?

Theorem: at the end of the algorithm, d[v]d[v] is the actual distance d(s,v)d(s, v) .

To prove this, we prove two claims

  1. for all vv, d[v]d(s,v)d[v] \geq d(s, v)
  1. when a vertex is done exploring, d[v]=d(s,v)d[v] = d(s, v)

This proves the theorem because we know that once we are done exploring, we hit this value from (1), we know that it never goes beyond. Whats more, all vertices are done exploring at the end.

Lemma 1

If we have a shortest path from sus → u and vv lies on that path, then this is also the shortest path from svs→v.

we prove this by contradiction. If there exists a shorter path between svs→v, then we can just swap out the svs→v subpath in the original sus→u to form a shorter path. But this is a contradiction because we’ve established that the original sus→u is the shortest path.

Claim 1

This is basically saying that we are always over-estimating. This is pretty obvious from the equation

d[v]=min(d[v],d[u]+eu,v)d[v] = \min(d[v], d[u] + e_{u,v})

again, we understand that d[v]d[v] is the length of the path we have in mind, which is obviously at most the length of the shortest path.

Claim 2

This is more nuanced. We will show this by induction

Now, the inductive step. Suppose that we just added vv to the “done” pile, and we are just about to add uu to the done list. We want to show that d[u]=d(s,u)d[u] = d(s, u). We will use the term good to refer to when this situation is true.

We want to show that uu is good. Let’s do this by contradiction. Consider the true shortest path to uu. Let zz be the last vertex that is good, where zuz \neq u. Let zz’ be the node after zz, which can be uu but doesn’t have to be

Now, what can we say about this whole thing? Well, because zz is good, we know that

d[z]=d(s,z)d[z] = d(s, z)

Because zz is on the path to uu, it must be the case that

d(s,z)d(s,u)d(s, z) \leq d(s, u)

Now, as we have previously claimed, uu is not good, which means that

d(s,u)<d[u]d(s, u) < d[u]

From all of this and the knowledge that uu is not good , we know that, d[z]<d[u]d[z] < d[u].

We also know, because we picked the uu such that d[u]d[u] was the smallest and because d[z]<d[u]d[z] < d[u], then zz MUST be done! (i.e. not up for grabs anymore).

But then, if zz is done, then we’ve already updated zz’ by the rules of

d[z]d[z]+ez,z=d(s,z)+ez,z=d(s,z)d[z'] \leq d[z] + e_{z, z'} = d(s, z) + e_{z, z'} = d(s, z')

(the second part of the equality comes from the subpath thing). If there were a shorter path from ss to zz’ other than szzs→z→z’, then d(s,z)d(s, z) would have to be shorter.

However, by claim (1), we know that d[z]d(s,z)d[z’] \geq d(s, z’). As such, we are forced to conclude that d[z]=d(s,z)d[z’] = d(s, z’). What does that tell us? It tells us that zz’ is good. And this is a contradiction to what we originally claimed, which is that zz is the farthest good node, which completes the proof by contradiction

This is a general framework of the proof

  1. assume that zz is good
  1. show that it must be done
  1. show that zz’ must be good as well, which is a contradiction

Where do you start?

If the graph is SCC, then after running Djikstra’s algorithm, we will get the minimum distance between the source and all the other vertices

However, if the graph is not SCC, you will get the information for a subset of the nodes. At the end of the day, Djikstra computes the shortest distance from a source node ss, and the choice of source node can really influence the distance (and even if you are able to reach certain nodes on directed graphs.)

How fast is it?

Well, let’s think about the algorihtm in general. We needto

  1. Find the minimum value of a map of numbers
  1. Remove the minimum value of a map of numbers
  1. Update the key-value pair of a map of numbers

More specifically, we need to peel off the vertices by its minimum value, so if there are nn vertices and mm edges, we need to find/remove the minimum nn times, and update the key mm times. This because for every edge, we will be propagating some message down it

Proposed structures

If we use an array, then our final complexity is

If we us a RB tree, we get

IF we use a heap, we can have similar performance

If we use a Fibonacci heap, we can actually insert insert, find minimum, decrease key in O(1)O(1) time and delete the minimum in O(logn)O(\log n) “amortized” time, meaning that things happen over successive calls to these operations. In this case, the total runtime is O(m+nlogn)O(m + n\log n)

What’s wrong with Dijkstra?

It can’t handle negative edge weights. Why? Well, it makes one key assumption: when you pick a node XX, it means that this node is the smallest distance from the source. it also means that it is impossible to make it shorter, as there are no shorter nodes to expand from. Therefore, after you explore its neighbors, you “lock” XX.

While this is very true for positive weights, it is not true for negative weights. It may be possible that a current node YY has a higher distance but it has a negative edge between YY and XX, which may make the final distance of XX smaller.

Feel free to trace through Dijkstra’s algorithm on the graph below, starting at AA. You will see that CC gets too high of a final distance.

Challenge question

If we know that the algorithm is currently on CC, what can we tell about the upper and lower bounds of the true distance from AA? Well, we know the upper bounds from the dd. But what about the lower bounds?

This was very tricky reasoning!

Challenge question part 2

Basically, the algorithm modification is as follows: instead of picking the dd with the smallest value, pick the lowest CC that is still not “done”, and then pick the dd within that CC that has the smallest value. The reason is that you want to explore the clusters in a strict order because the edges may be negative.