We've started playing a new board game. An old board game, actually; it seems like it's originally from 1975 and has gone through a few iterations.
221B Baker Street comes with a deck of 75 unique, original Sherlock Holmes mysteries and a dense book of clues. At the start of the game, one mystery is selected at random from the deck and read out to everybody. Players then start at Baker Street on this board and roll dice to travel around the board to other locations, such as the museum, the theatre, Scotland Yard, the docks and the Boar's Head pub. There are 15 locations in all. At each location, there is a unique clue. When a player reaches the location, they open the clue book, secretly read that single clue, take notes, and then put the book back. The objective is to gather enough clues that you think you can solve the mystery... then swiftly return to Baker Street, announce your solution, and check it against the actual solution to see if you win or lose.
In my opinion, this is a really great adaptation of the Sherlock Holmes premise to a new format. The setting and the mysteries are standalone, timeless and accessible in the same way that the original Sherlock Holmes stories are. You get a new story every time and the way to succeed at the game is to invest personally in that story. There's a luck/skill/judgement element — Do you think you have enough information to guess what really happened? Was eight clues enough? Or should you visit one more location, just in case there's something critical you're missing? But what if someone else beats you back to Baker Street? Can you chance a wild guess? Have you checked the facts? Are you sure?
But what I actually want to talk about is of course the travelling salesman problem.
What is the optimal way to start at Baker Street, visit all 15 locations on the board, gathering all 15 clues, and return to Baker Street?
The game is afoot!
There are a few wrinkles here.
First: there are two entrances to the docks, and three entrances to the park. After visiting those locations, you may leave via any exit. Looking at the board, you can see that, for example, the optimal route from Baker Street to Scotland Yard is actually through the park.
Second: fast travel. The carriage depot always has a clue, but it also allows you to travel instantly to any other location without rolling dice. If you don't stop to collect the carriage depot clue, you can even go straight to that other location and get that clue in your current turn. As well as further complicating the shortest route calculations, this means that the graph is asymmetric!
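To illustrate with made-up numbers (these are not the real board distances), here's a sketch of folding the depot shortcut into an ordinary walking-distance matrix, measured in raw steps for simplicity:

    // Purely illustrative walking distances between four locations,
    // where index 3 stands in for the carriage depot
    const walk = [
      [0, 9, 7, 2],
      [9, 0, 4, 8],
      [7, 4, 0, 6],
      [2, 8, 6, 0]
    ]
    const DEPOT = 3

    // To get from i to j you can either walk directly, or walk to the depot
    // and take a carriage the rest of the way, all in the same turn
    const dist = walk.map((row, i) =>
      row.map((w, j) => (i === j ? 0 : Math.min(w, walk[i][DEPOT])))
    )

    // dist[0][1] is 2 (via the depot) but dist[1][0] is 8

Location 0 is close to the depot, so leaving it is cheap in every direction, but getting back to it isn't, and the resulting matrix is no longer symmetric.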
Third: dice. Just measuring step count isn't actually what we want here. The game is played with a single D6 which, to be honest, makes navigating the game board quite excruciating. We commonly play with 2D6. And once you arrive at a location, your turn is over; you don't get to keep moving. So we need to optimise for the expected number of turns, which isn't exactly the same thing as step count, especially over shorter distances.
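For a single leg of the journey, the expected number of turns can be worked out with a little dynamic programming. This is just a sketch, and it assumes the simplest reading of the movement rules: you always move your full roll unless the destination is within reach, in which case you stop there and your turn ends.

    // Probability of each total when rolling `dice` six-sided dice
    const rollDistribution = dice => {
      let totals = new Map([[0, 1]])
      for (let i = 0; i < dice; i++) {
        const next = new Map()
        for (const [sum, p] of totals) {
          for (let face = 1; face <= 6; face++) {
            next.set(sum + face, (next.get(sum + face) ?? 0) + p / 6)
          }
        }
        totals = next
      }
      return totals
    }

    // Expected number of turns to cover `steps` squares
    const expectedTurns = (steps, dice = 2) => {
      const dist = rollDistribution(dice)
      const e = [0] // e[d] = expected turns with d squares still to go
      for (let d = 1; d <= steps; d++) {
        e[d] = 1
        for (const [roll, p] of dist) {
          if (roll < d) e[d] += p * e[d - roll]
        }
      }
      return e[steps]
    }

With 2D6 and these assumptions, a location two squares away always takes exactly one turn, while a location fifteen squares away takes about 2.6 turns on average, which is the kind of distinction a raw step count can't express.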
All of this means that setting up the correct initial distances for TSP is actually quite tricky in its own right.
But after that, we need some code. I wanted an exact solution, and brute force on 16 cities (16! = 20,922,789,888,000 possible routes) was not practical. So I decided to implement the Held–Karp algorithm.
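In outline, Held–Karp looks something like the following. This is a minimal textbook sketch rather than my actual code: d is an n-by-n cost matrix, d[i][j] is the cost of travelling from city i to city j (not necessarily equal to d[j][i]), and city 0 is 221B Baker Street.

    // Exact TSP via Held-Karp: O(2^n * n^2) work instead of O(n!)
    const heldKarp = d => {
      const n = d.length
      const numSets = 1 << (n - 1) // bitfields representing subsets of cities 1 to n - 1

      // len[S][u]: cheapest way to leave city 0, visit exactly the cities in S,
      // and finish at city u + 1. prev[S][u]: the city visited before that.
      const len = Array.from({ length: numSets }, () => new Array(n - 1).fill(Infinity))
      const prev = Array.from({ length: numSets }, () => new Array(n - 1).fill(-1))

      for (let u = 0; u < n - 1; u++) {
        len[1 << u][u] = d[0][u + 1]
      }

      // Build up larger subsets from smaller ones
      for (let S = 1; S < numSets; S++) {
        for (let u = 0; u < n - 1; u++) {
          if (!(S & (1 << u)) || len[S][u] === Infinity) continue
          for (let v = 0; v < n - 1; v++) {
            if (S & (1 << v)) continue
            const S2 = S | (1 << v)
            const cand = len[S][u] + d[u + 1][v + 1]
            if (cand < len[S2][v]) {
              len[S2][v] = cand
              prev[S2][v] = u
            }
          }
        }
      }

      // Close the tour back at city 0, then walk `prev` backwards to recover it
      const full = numSets - 1
      let best = Infinity, last = -1
      for (let u = 0; u < n - 1; u++) {
        const cand = len[full][u] + d[u + 1][0]
        if (cand < best) {
          best = cand
          last = u
        }
      }
      const cycle = [0]
      for (let S = full, u = last; u !== -1;) {
        cycle.push(u + 1)
        const p = prev[S][u]
        S &= ~(1 << u)
        u = p
      }
      cycle.push(0)
      return { cost: best, cycle: cycle.reverse() }
    }

Calling heldKarp on a small matrix like the one sketched earlier gives back the cheapest cycle and its cost; for the real board the input would be the 16-by-16 matrix of expected turns described above.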
This was interestingly challenging. My initial implementation, which I won't be sharing, was extremely rough and ready. But it worked. I got an optimal TSP route for n = 16 in about 6.5 seconds. (221B Baker Street, park, hotel, carriage depot, bank, Scotland Yard, Boar's Head via carriage depot, docks, museum, theatre, tobacconist, pawnbroker, locksmith, newsagent, chemist, 221B Baker Street.)
And at this point my interest in the problem kind of split into two separate interests: interest in the game of 221B Baker Street, and interest in the code.
We still play the game. There are plenty of mysteries still to solve. But I actually don't use the optimal route all that often. It makes the game a little too predictable. Also, it commonly involves ignoring information from the mystery. For example, if an actor has been murdered, forget the park, the theatre is the first place to go. (Although, the fastest route from 221B Baker Street to the theatre is via the park and the carriage depot...) In fact, the navigation mechanic is tedious enough that sometimes we even ignore the dice and just take turns magically hopping around the board from location to location, gathering clues. This speeds things up a lot and makes the game more enjoyable, but it renders this project moot. And, of course, it's usually far more advantageous to terminate one's search for clues before visiting all 15 locations...
But separately from this, I wanted to improve my code. I wanted to turn it into a useful, reusable module, not just a rough one-off script. And especially I wanted to work on its performance.
I tinkered with the JavaScript on and off for a while. But while I was doing this, two different thoughts occurred to me. First: JavaScript is not the programming language you choose for high performance work, unless the performance you're looking for is speed of implementation of new UIs. JavaScript is a high-level interpreted language designed for manipulating the DOM. It's fine for that purpose, it's honestly pretty powerful these days and it's much nicer to use now than it was in, say, 2015. And, on the browser side, despite several challengers, it remains the only game in town. But if you have computation to do, and a free choice of programming languages, JavaScript is not the language you choose.
Second: it had been too long since I last learned a new programming language. I have been using JavaScript for server-side and client-side code in my day job, nearly exclusively, for several years now, and I chose it for this project originally because it was the fastest option for me as the implementer. It was time for something new. Something low-level and compiled, with different paradigms. Something which would make me think, and learn. I'd been wanting to learn some Rust for a long time. I will, one day. For this, though, I picked WebAssembly. Handwritten WebAssembly text.
I had never touched any assembly language before this. So, this was hugely educational and very difficult! WebAssembly is a massive gear shift. It involved learning how to think about programming all over again. I spent a lot of time in the MDN documentation re-learning how to do exceedingly basic things like adding two variables together. It is an intentionally incredibly sparse language, possibly even relative to other assembly languages. It's also a much less popular language than JavaScript, especially in its handwritten form, which means that guidance and advice is harder to find online.
Just getting anything to work was challenging. I found debugging my code very difficult. You can hook console.log in, but there are no strings, only numbers! You certainly can't just log a chunk of memory out and try to figure out what's wrong with it. I built a battery of unit tests and re-ran them after even the slightest modification to the code.
Thinking hard about performance also became challenging. I actually ported a few learnings from my WebAssembly implementation back to the JavaScript. Most notably, in JavaScript I had been using two 2D (rectangular) arrays, len[S][u] and prev[S][u], to store intermediate results. Here, S is a set of cities in the form of an integer bitfield, and u is a city number from 0 to n - 2 inclusive. But WebAssembly of course has no conception of 2D arrays (or even 1D arrays), only contiguous blocks of memory with computed offsets. A major performance jump in my JavaScript implementation came when I started using two large 1D arrays with the same computed offsets: len[(n - 1) * S + u] and prev[(n - 1) * S + u].
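Concretely, the flattened version looks something like this. It's a sketch; typed arrays are one obvious way to do it, though the same offset trick works with plain arrays too.

    const n = 16
    const size = (1 << (n - 1)) * (n - 1)

    // One flat, contiguous table per array, instead of an array of arrays
    const len = new Float64Array(size).fill(Infinity)
    const prev = new Int32Array(size).fill(-1)

    // index of the (S, u) entry: S is the bitfield, u is a city from 0 to n - 2
    const at = (S, u) => (n - 1) * S + u

    // so, for example, the base cases become len[at(1 << u, u)] = d[0][u + 1]

Keeping the whole table in one contiguous allocation, rather than a fresh sub-array per subset, seems to be much friendlier to the engine.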
I left the less performance-intensive parts of the algorithm as relatively unmodified JavaScript. The main purpose of the WebAssembly is to write a large amount of intermediate results out to memory at high speed. Once this is done, a simple O(n) loop written in JavaScript reads the optimal cycle out from these results — this was not worth attempting to optimise.
My current implementation solves the 221B Baker Street problem, with 16 cities, in around 32.5 milliseconds, or about 200 times faster than my first attempt.
Overall, I had a really good time. It was nice to develop some new programming muscles. You can find both my JavaScript and WebAssembly implementations of Held–Karp here, or install them as an npm package using npm install held-karp.
But there is a mildly surprising ending here.
WebAssembly makes more efficient use of memory: in the same amount of memory, it can compute TSP on one more city than the JavaScript can. This is as I expected.
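For scale, with one entry per (S, u) pair as above, each table needs (n - 1) * 2^(n - 1) entries: 15 * 32,768 = 491,520 at n = 16, and slightly more than double that for every city you add. So squeezing an extra city out of the same memory budget is a meaningful difference.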
But, somehow, it is also a few percent slower than the JavaScript?
At least, it is on my machine. I would be very curious to hear what other people find.
This was actually quite disappointing. I was expecting a pretty substantial performance improvement. That was half of the point of doing this!
There are several possible explanations for this. My WebAssembly is poorly optimised? Seems probable, but I would have hoped that even inefficient WebAssembly would be at least an order of magnitude faster than mere JavaScript. And I would love to learn more about making it faster.
WebAssembly is generally not as efficient as I thought? I heard it was meant to have near-native performance.
Or maybe... Node.js/V8/TurboFan are much more efficient for pure computation tasks than we give them credit for? I said JavaScript isn't a language you choose for performance work. This was common wisdom for a long time. Is it actually still true? What's going on here?