brunomont's blog

By brunomont, history, 46 minutes ago, In English

Hello, Codeforces!

A couple of years ago, a friend of mine told me that two cyclists had passed through every single street in the center region of the city I live in (Belo Horizonte, southeastern Brazil), on the third try.

First try
Second try

As a good computer scientist, I wondered what the minimum route that passes through every street was...

This is exactly the Chinese Postman Problem. In this blog, I will tell the tale of how I was able to run every single street in the center region of my city. To get to that, we will define Chinese Postman Problem and discuss an exact solution: the Edmonds-Johnson algorithm.

Step one: creating the graph

Issue 1: what is a street?

It is not trivial to define what a street is. I ended up going off "vibes", and decided the following arbitrary rules.

  1. If cars pass through it, then it is a street (as long as it is not just a parking space);
  2. If it connects two intersections, then it is a street;
  3. If there are two parallel streets that are too close to each other (such as sidewalks on different sides of a large street), then they are the same street;
  4. Streets that "close" (its access is not always open for the public, for instance, inside parks) are not streets.

After discussing with friends, our first thought to create the graph was to download it from a Python library that downloads the map from OpenStreetMap, a free map database. However, there were issues.

Issue 2: cars vs pedestrians

When you download the map, you can set it as "car mode" or "pedestrian mode". I first tried pedestrian mode, but it included way too many paths that are not streets, such as parking lots, parks, and two sides of sidewalks for larger streets.

This would be very hard to clean up. The car graph had similar problems. When you arrive at an intersection with a car, you can not turn to every street that arrives at that intersection. To model this, the map has several nodes at the same intersection, with very complicated directed structures.

After trying a few things, the solution that best worked was not pretty. Using a website, I was able to click on the map and generate a polyline with the coordinates of the points. After that, using a simple Python script I could merge close points into one and consider adjacent points in the polyline as my graph edges.

Anyway, at the end of the day I finally had the graph to work on.

Step two: computing the minimum route

This is the algorithmic part of this post. Let's state the problem formally.

Given a weighted undirected graph $$$G$$$ with weights $$$w : E(G) \rightarrow \mathbb{R}^+$$$, we want to compute a closed walk $$$W = e_1, e_2, \dots, e_k$$$ that contains (passes through) every edge in $$$E$$$. A closed walk means that it starts and ends at the same vertex. The reason to have it as a closed walk is that the solution becomes bit more simple.

We define the cost $$$c$$$ of the closed walk as the sum of the weights of the edges in the walk.

$$$ c = \sum_{i=1}^{k} w(e_i). $$$

Out of all such walks, we want to find that of minimum cost.

The first observation is that a solution exists if, and only if, the graph is connected (assuming there are no isolated vertices). From now on, we assume the graph is connected.

Ideally, we wish to go through every edge once. Let's discuss when it is possible to do so. If it is, we know the solution is optimum, because the sum of the weights of the edges is a lower bound for the minimum cost.

$$$ c \geq \sum_{e \in E} w(e). $$$

Eulerian circuits

We call a graph eulerian if there is a walk that starts and ends at the same vertex and visits every edge exactly once.

If we want to pass through every edge exactly once, every time the walk enters a vertex, it needs to exit it. Therefore, we can only do so when the degree of every vertex is even (this includes the starting vertex, since we need to end the walk there). It turns out that this condition is sufficient.

Lemma 1: a graph is eulerian if, and only if, it is connected and every vertex has even degree.

Proof

This constructive proof gives us a polynomial time algorithm to compute such walk. With this knowledge, if all vertices of $$$G$$$ have even degree, we can solve the problem. But what if that is not the case?

Odd degree vertices

If there are vertices with odd degree, the graph is not eulerian and we must repeat edges in our walk.

Lemma 2: There is an even number of odd degree vertices

Proof

Let $$$W$$$ be an optimum walk. Consider the edges traversed by $$$W$$$ more than once. We will create a new multigraph, with every edge having multiplicity equal to the number of times it is used in $$$W$$$.

Let $$$r(e) \geq 1$$$ be the number of times edge $$$e \in E(G)$$$ is used in $$$W$$$. Define a multiset $$$F$$$:

$$$ F = \{ e \in E(G), \text{ with multiplicity equal to $$$r(e) - 1$$$} \}, $$$

that is, the repeated edges we used in $$$W$$$. Now, define the multigraph $$$H = (V(G), E(G) \cup F)$$$. From construction, we have that $$$H$$$ is an eulerian multigraph (the definition is analogous for that of a simple graph).

We will then optimize over $$$F$$$ to find the optimum walk. The goal is to find such a multiset of minimum cost that will make $$$H$$$ eulerian (that is, it will fix the parity of the degrees properly). Since $$$H$$$ is eulerian, all degrees are even, so the vertices with odd degrees in $$$F$$$ are exactly those in $$$G$$$.

Lemma 3: Any multigraph with $$$2t$$$ odd degree vertices can be decomposed into $$$t$$$ paths and some number of cycles, such that the endpoints of the paths are precisely the odd degree vertices.

Proof

We then apply this lemma for $$$F$$$. But note that cycles do not affect the parity of the degrees, so they can be deleted without changing whether $$$H$$$ is eulerian. We are then left with paths between the odd degree vertices. To minimize the cost, we must choose shortest paths. All of this leads us to the following.

Theorem 1: the optimum $$$F$$$ is precisely the optimum way to pair the odd degree vertices in $$$G$$$ with shortest paths.

Final algorithm

The algorithm (described first by Edmonds and Johnson [1]) is described by the steps below.

  1. Find the set of odd degree vertices $$$O$$$ in the graph $$$G$$$;
  2. Compute the pairwise shortest paths $$$d(i, j)$$$ between vertices of $$$O$$$;
  3. Compute the minimum cost perfect matching between vertices in $$$O$$$ using cost $$$d$$$;
  4. Recover the minimum paths associated with the perfect matching and add the edges back into the graph, creating a multigraph $$$H$$$;
  5. Compute an eulerian circuit/walk in $$$H$$$.

All steps can be done in polynomial-time, but step (3) is the most complicated by far. It requires a weighted variant of an already complicated algorithm (Weighted Blossom), but there are implementations out there in around $$$\mathcal{O}(n^2 m)$$$ time, and this will be the bottleneck of the whole procedure.

Step three: improving the route

Ok, after stealing some code for the Weighted Blossom from the internet [2], I implemented the algorithm and got an optimum route out of it. And...

n: 591
m: 1128
sum of edges: 132,608
final ans: 145,063

145 km! This is far less than the 216 km the cyclists took to do it (although these values are not really comparable; one is a theoretical distance measured on a map, and another is GPS distance over many hours, subject to errors and such), and it only has a ~13 km repeated edge cost (around 10%)!

However... notice that this route is kind of trash. It would be extremely hard to follow this with a map on hand, mainly because of how many turns it takes. It would be much better if the route had fewer turns.

It turns out that there are many optimum routes. After adding the shortest path edges back, there are a lot of eulerian circuits we could choose from. Therefore, we can apply some greedy strategy to try to get a better route.

What I did was: we will construct an eulerian circuit iteratively. When we are at some vertex, sort the outgoing edges by the angle it makes with the edge we took to get to this vertex (we want to go straight, with the smallest angle change possible). Then, greedily choose the edge that yields the smallest angle change, as long as the remaining graph remains eulerian (this can be checked by looking at the degrees).

Applying this trick, I was able to compute a much more manageable route, and even one that ends with a lap around the region, because why not?

Step four: running!

Now, we are left with the easiest part, running an ultramarathon of around 100 miles following the optimum route I calculated (lol).

I put the route into another software, and expected the actual distance to be around 160 km, still a major improvement from the "yolo" 216 km the cyclists used. I was able to convert the route into a gpx file and plug it into my watch, so it would be relatively easy to follow.

And then I was off! After 34 hours, it was done.

And here is my final art, my magnum opus:

Final thoughts

In the end, the GPS tracking gave a resulting 178 km. The fact that this is was so much higher than the expected 160 km can be explained by GPS errors and by the fact that I made some mistakes on the run, and had to trace back.

Anyways, this was very, very fun. I want to thank all my friends that participated in this adventure in various ways: Bruno brunodemattos Nogueira, Alan Prado, Bernardo bernardo_amorim Amorim, Marcelo Chaves, Marina Marques, Larissa Firace, Emerson Francisco, Flávio Abras, Davi Netto, Naldinho, Álvaro.

All of the code I used is in a very messy repo [3], if you want to check it out. Also, I made a website to visualize the final route, as well as the minimum shortest-path matching.

References

[1] Edmonds, Jack, and Ellis L. Johnson. "Matching, Euler tours and the Chinese postman." Mathematical programming 5.1 (1973): 88-124.
[2] https://judge.yosupo.jp/submission/231089
[3] https://github.com/brunomaletta/contorno

  • Vote: I like it
  • +27
  • Vote: I do not like it

»
32 minutes ago, hide # |
 
Vote: I like it +3 Vote: I do not like it

This is pretty cool, I just did this one on my own neighborhood since I wanted to walk all the streets in the minimum amount of time. but dang $$$145$$$ $$$km$$$ is crazy, mine was not even $$$5$$$. I also think that it's cool that you were able to automatically generate the graph from the map, I ended up just manually using google maps to calculate the distances between intersections (nodes).

»
23 minutes ago, hide # |
 
Vote: I like it +2 Vote: I do not like it

»
17 minutes ago, hide # |
 
Vote: I like it +2 Vote: I do not like it

That's so cool!!!!! And very crazy at the same time.