Hi Community!

Recently, I learned the Simulated Annealing (SA) technique for the Travelling Salesman Problem (TSP) from qwerty787788's blog. I used the 2-opt method with SA and got a good score. I was wondering how to use the SA for m-TSP. Here only difference is we have m Salesman instead of one. We have to visit all the (n) cities.

I was unable to come up with the state I should keep and how to modify it. How do we precisely allocate cities to salesmen, and how and when do we decide to make a "bad" move like we do in SA? Let's assume we have m <= 100 salesman and n <= 10,000 cities. For simplicity,

Let's assume we have grid, so my idea was to allocate cities <= THRESHOLD distances to nearest salesman in such a way that all cities are allocated. This will give us m lists each lists having the cities assigned to corresponding salesman. Now we do SA on these m-lists one by one using 2-opt. This approach doesn't seem to give good results, as we fixed the city-salesman relationship before doing SA.

Some more questions I would love to get clarity on:

How do we approach such a problem in general where SA can't be done on a complete problem directly?

How do you decide the state? Some general stuff that goes into your mind when deciding the states?

How do we change the approach? For example there are two versions I encountered for m-TSP problem:

Minimize the sum of the total distance traveled by all salesmen.

Minimize the maximum total distance traveled by a salesman.

Would be happy to learn and try out similar "standard" optimization problems using SA like assignment problem, vertex cover etc, if there are any resources or problem links you have where we can test against the various other solutions in terms of score. I'm aware of AHC (Atcoder Heuristics Contest) (and Google Hashcode, which I guess is no longer there now).

Also just a bit out of context, in case someone is not aware Psyho has this twitter thread with awesome tips on heuristic problems.

I haven't tried to solve this specific problem, but you can probably add more transitions to your state. For example, with 50% probability, you try to update a random list with 2-opt. With another 50% probability, you pick a random city and move it to a random position in a random other list.

Also, if the number of cities is big (10'000), doing random moves may not work well. Instead, you can pick a random city, greedily find the best position (in all lists) for this city, and move it there.