During an internal performance evaluation, we ran into a strange situation. Our bulk insert performance using the Node.js API was significantly worse than the performance of other clients. In particular, when we compared it to the C# version, the gap was far larger than we expected.
To be fair, this comparison is made between our C# client, which has been through the wringer in terms of optimization and attention to performance, and the Node.js client. The focus of the Node.js client was on correctness and usability.
It isn’t fair to expect the same performance from Node.js and C#, after all. However, that difference in performance was annoying enough to make us take a deeper look into what was going on.
Here is the relevant code:
const store = new DocumentStore('http://localhost:8080', 'bulk');
store.initialize();
const bulk = store.bulkInsert();
for (let i = 0; i < 100_000_000; i++) {
    await bulk.store(new User('user' + i));
}
await bulk.finish();
As you can see, the Node.js numbers are respectable. Running at a rate of over 85,000 writes per second is nothing to sneeze at.
But I also ran the exact same test with the C# client, and I got annoyed. The C# client was able to hit close to 100,000 more writes per second than the Node.js client. And in both cases, the actual limit was on the client side, not on the server side.
For fun, I ran a few clients and hit 250,000 writes/second without really doing much. The last time we properly tested ingest performance for RavenDB we achieved 150,000 writes/second. So it certainly looks like we are performing significantly better.
Going back to the Node.js version, I wanted to know what exactly the problem was there. Why are we so much slower than the C# version? It’s possible that this is simply a limit of the Node.js platform, but you have to check to know.
Node.js has an --inspect flag that you can use, and Chrome has a built-in profiler (reachable via chrome://inspect) that can attach to it. Using the DevTools, you can capture a CPU profile of a running Node.js process.
I did just that and got the following numbers:
That is… curious. Really curious, isn’t it?
Basically, none of my code shows up here at all; most of the time is spent dealing with the async machinery. If you look at the code above, you can see that we issue an await for each document we store.
The idea with bulk insert is that under the covers, we split the work between writing to an in-memory buffer and flushing that buffer to the network. In the vast majority of cases, we won’t perform any async operation in the store() call. Only when the buffer is full do we need to flush it to the network, and that may force an actual await. In Node.js, however, awaiting an async function that doesn’t actually perform any async operation appears to be super expensive.
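To make the cost concrete, here is a minimal, self-contained sketch (an illustration, not the client code) that contrasts awaiting an async function which completes synchronously with a plain synchronous call. The function names and the 1,000,000 iteration count are arbitrary; the point is that the awaited version pays for a promise allocation and a microtask on every single call:
async function storeAsync(buffer, doc) {
    buffer.push(doc); // no real async work, but callers still pay for the await
}
function storeSync(buffer, doc) {
    buffer.push(doc);
}
async function main() {
    const buffer = [];
    console.time('await per call');
    for (let i = 0; i < 1_000_000; i++) {
        await storeAsync(buffer, { id: i });
    }
    console.timeEnd('await per call');
    buffer.length = 0;
    console.time('plain sync call');
    for (let i = 0; i < 1_000_000; i++) {
        storeSync(buffer, { id: i });
    }
    console.timeEnd('plain sync call');
}
main();
That promise-and-microtask bookkeeping, rather than the function body itself, is what dominates such a loop, which lines up with what the profile shows.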
We threw around a bunch of ideas on how to resolve this issue. The problem is that Node.js has no equivalent to C#’s ValueTask. We also have a lot of existing code out there in the field that we must remain compatible with.
Our solution to this dilemma was to add another function that you can call, like so:
for (let i = 0; i < 100_000_000; i++) {
    const user = new User('user' + i);
    const id = "users/" + i;
    if (bulk.tryStoreSync(user, id) == false) {
        await bulk.store(user, id);
    }
}
The idea is that if you call tryStoreSync() we’ll try to do everything in memory, but it may not be possible (e.g. if we need to flush the buffer). In that case, you’ll need to call the async function store() explicitly.
Given that the usual reason for using the dedicated bulk insert API is performance, this looks like a reasonable thing to ask, especially when you can see the actual performance results. We are talking about an over 55% (!!!) improvement in bulk insert performance.
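To make the shape of the API clearer, here is a rough sketch of the pattern. It is only an illustration under assumed names and buffer sizes, not the actual RavenDB client implementation: the synchronous fast path appends to an in-memory buffer, and only the flush to the network goes through an async call.
class BufferedWriter {
    constructor(stream, limit = 1024 * 1024) {
        this.stream = stream;   // anything with an async write(), e.g. a socket wrapper
        this.limit = limit;
        this.chunks = [];
        this.size = 0;
    }
    // Synchronous fast path: returns false when the buffer needs flushing first.
    tryStoreSync(doc) {
        const json = JSON.stringify(doc);
        if (this.size + json.length > this.limit) {
            return false;
        }
        this.chunks.push(json);
        this.size += json.length;
        return true;
    }
    // Slow path: flush to the network, then store (assumes a single document fits in the buffer).
    async store(doc) {
        await this.flush();
        this.tryStoreSync(doc);
    }
    async flush() {
        if (this.size === 0) return;
        await this.stream.write(this.chunks.join(',')); // the only real async work
        this.chunks = [];
        this.size = 0;
    }
}
The point of the split is that the happy path never creates a promise; callers only pay for the async machinery on the rare iteration that actually touches the network.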
It gets even better. That was just the mechanical fix to avoid generating a promise per operation. While we were addressing this performance issue, we found a few other pieces of low-hanging fruit that could improve bulk insert performance in Node.js.
For example, it turns out that we pay a hefty cost to generate the metadata for all those documents (runtime reflection cost, mostly). We can generate it once and be done with it, like so:
const bulk = store.bulkInsert();
const metadata = {
    "@collection": "Users",
    "Raven-Node-Type": "User"
};
for (let i = 0; i < 100_000_000; i++) {
    const user = new User('user' + i);
    const id = "users/" + i;
    if (bulk.tryStoreSync(user, id, metadata) == false) {
        await bulk.store(user, id, metadata);
    }
}
await bulk.finish();
And this code in particular gives us:
That is basically close enough to the C# client’s speed that I don’t think we need to pay more attention to performance here. Overall, that was time very well spent on making things go fast.