Optimizing JavaScript and solving the halting problemPart I
RavenDB is a JSON document database, and the natural way to process such documents is with JavaScript. Indeed, there is quite a lot of usage of JS scripts inside RavenDB. They are used for ETL, Subscription filtering, patching, resolving conflicts, managing administration tasks and probably other stuff that I’m forgetting.
The key problem that we have here is that some of the places where we need to call out to JavaScript are called a lot. A good example would be a request to patch request on a query. The result can be a single document modified of 50 millions. If this is the later case, given our current performance profile, it turns out that the cost of evaluating JavaScript is killing our performance.
We are using Jint as the environment that runs our scripts. It works, it is easy to understand and debug and it is an interpreter. That means that it is more focused on correctness then performance. Over the years, we were able to extend it a bit to do all sort of evil things to our scripts ,but the most important thing is that it isn’t actually executing machine code directly, it is always Jint code running and handling everything.
Why is that important? Well, these scripts that we are talking about can be quite evil. I have seen anything from 300 KB of script (yes, that is 300 KB to modify a document that was considerably smaller) to just plain O(N^6) algorithms (document has 6 collections iterate on each of them while iterating on each of the others). These are just the complex ones. The evil ones do things like this:
We have extended Jint to count the number of operations are abort after a certain limit has passed as well as prevent stack overflow attacks. This means that it is much easier to deal with such things. They just gonna run for a while and then abort.
Of course , there is the perf issue of running an interpreter. We turned out eyes to Jurrasic, which took a very different approach, it generate IL on the fly and then execute it. That means that as far as we are concerned, most operations are going to end up running as fast as the rest of our code. Indeed, benchmarks show that Jurrasic is significantly faster (as in, order of magnitude or more). In our own scenario, we saw anything from simply doubling the speed to order of magnitude performance improvement.
However, that doesn’t help very much. The way Jurrasic works, code like the one above is going to hang or die. In fact, Jurassic documentation calls this out explicitly as an issue and recommend dedicating a thread or a thread pool for this and calling Thread.Abort if need. That is not acceptable for us. Unfortunately, trying to fix this in a smart way take us to the halting problem, and I think we already solved that mess.
This limiting issue was the reason why we kept using Jint for a long time. Today we finally had a breakthrough an were able to fix this issue. Here is the PR, but it tells only part of the story, the rest I’ll leave for tomorrow’s post.
Comments
I'm a bit dissatisfied with the availability of CLR scripting languages - the only reasonable ones are Javascript interpreters, but JS is difficult to integrate with .Net programs and doesn't support .Net types in a straightforward way. Boo is a brilliant tool but it's development has almost stopped and it lacks behind the framework more and more with every release, which is a pity because it has both good performance and lots of flexibility. I'd be happy to have Boo scripts/DSLs everywhere, but with better handling of dynamic typing because now it's full of traps and riddles which makes it so difficult for end users to understand. Anyway, will be interested in your ideas for Jint use as an application scripting/DSL tool.
From quickly looking at the code it doesn't seem to handle recursion yet.
In Jurrasic, are you OK with not handling all possible cases? If someone manually writes a loop, or recursion that takes a long time, there would be no way to end that.
Author of Jint here. Speed will really depend on the script and how you can reuse the compiled/parsed expressions. So if you run computing-heavy scripts, or you never change them, then compiling is much better for performance. For small scripts and the ones that don't rely too much on recursion Jint will be faster. There is a benchmark application with the source code that lets you compare Jint, Jurassic and IronJS for this purpose. Right now, improving Jint's speed would probably rely on creating a VM and some bytecode.
Something I would consider also is the ECMA Script compliance. There are so many edge cases in JavaScript that are not handled in the standard unit tests that you will probably break some existing scripts by switching engines. Sometime for good but from my experience I would expect that Jint is more compliant than the other engines. Unless you are based on one using V8/Chakra internally. Take this for instance, run it on different engines: String(14.915832707045631) === '14.915832707045631'. You can probably fix all these cases in Jurassic, maybe by porting how it's done in Jint.
Rafal, One of the major consideration when talking about scripting is what kind of scripts are we talking about? For example, if I'm letting my users customize my application, then something that is easy to write is a priority, and ease of integration is very important. In that aspect, it means that we need to be able to call our code as easily as possible and things like performance and debugging are of utmost consideration.
On the other hand, if you are accepting untrusted input, then you really need to consider a vastly different array of problems. You want to reduce what can be done, and have much tighter control over it.
At that point, there aren't a lot of stuff that you can do. JavaScript is pretty much the standard at this point. With Lua as a very strong contender but with dramatically lower following.
Pop, There is a separate limit on recursion, which takes care of this.
tobi, Remember that there is a limit to what we are doing here. I'm assuming that this is your server and that my role is to protect you from accidents and mishaps. I'm pretty sure that if someone spent enough time and effort, they might be able to get the script into a funny state, but that is pretty manageable.
We haven't been able to see how you can escape it. There is a limit on the amount of code that you can put into a script, the amount of loop iterations we allow, the depth of recursion and you are limited to the JS standard library without being able to call outside of it. Beyond that, I can't really think of anything else that even a malicious actor can do.
Note that a large part of this comes from not needing to do much here, because we are using this for very specific purposes.
Sebastien, In our case, it would be very common to execute the same script 10 million times. Setup times aren't that important for us (and are cached anyway), but actual execution time is critical.
It is a really good point about portability. My saving grace here is that we have already declared 4.0 to not be backward compatabile, so that helps a lot.
Wondering if you've looked at ClearScript as another alternative for running JavaScript?
David, It doesn't support CoreCLR, so that was a breaking deal for us. We actually ported Jurassic tot CoreCLR, so that isn't out of the question, but they are using stuff that isn't available on the platform at all: https://github.com/Microsoft/ClearScript/issues/9
Comment preview