The siren song of the framework
In a recent blog post, I talked about a problem we ran into with NLog in our usage scenario, and I mentioned that we'll probably end up with our own solution to fit our needs. Eric Smith replied with:
So I guess there is no chance of contributing to the NLog project to improve it instead of creating yet another logging framework?
And the short answer is that there is very little chance of that, primarily because what we are doing is not really suitable for consumption elsewhere.
Let us consider something that we have found to be very useful. Metrics.NET is a great way to gather various metrics, and we have been using it (and contributing to it) for several years. But as part of our 4.0 work, we looked at what it would take to support metrics, and we had to flat out reject Metrics.NET for our use. Why is that?
Primarily because it does too much. It is extremely flexible and can do quite a lot, but we don't need it to do a lot, and we don't want to pay for the stuff we don't use. Take histograms, for example. The problematic code for us was here:
public void Update(long value, string userValue = null)
{
    this.last = new UserValueWrapper(value, userValue);
    this.reservoir.Update(value, userValue);
}
Which ends up calling:
public void Update(long value, string userValue = null)
{
    long c = this.count.Increment();
    if (c <= this.values.Length)
    {
        values[(int)c - 1] = new UserValueWrapper(value, userValue);
    }
    else
    {
        long r = NextLong(c);
        if (r < values.Length)
        {
            values[(int)r] = new UserValueWrapper(value, userValue);
        }
    }
}
So that means we'll have two allocations every time we update the histogram, which can happen a lot. We don't want to pay that price.
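To make the tradeoff concrete, here is a minimal sketch (not our actual code; the SlimReservoir class and NextLong helper are made up for the example) of how a trimmed-down reservoir can avoid those per-update allocations by dropping the userValue feature and storing raw longs:

using System;
using System.Threading;

public class SlimReservoir
{
    private readonly long[] values;
    private long count;

    [ThreadStatic]
    private static Random random;

    public SlimReservoir(int size)
    {
        values = new long[size];
    }

    public void Update(long value)
    {
        long c = Interlocked.Increment(ref count);
        if (c <= values.Length)
        {
            // The reservoir is not full yet, just take the value.
            values[(int)c - 1] = value;
        }
        else
        {
            // Classic reservoir sampling: keep the new value only if a random
            // slot in [0, c) lands inside the reservoir, otherwise drop it.
            long r = NextLong(c);
            if (r < values.Length)
            {
                values[(int)r] = value;
            }
        }
    }

    private static long NextLong(long max)
    {
        // Simplified stand-in for a proper thread-local random source.
        if (random == null)
            random = new Random();
        return (long)(random.NextDouble() * max);
    }
}

The sampling logic is unchanged from the original; only the per-call wrapper allocations are gone.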
Now, we can certainly modify Metrics.NET to meet our needs. In fact, we have done just that, and ended up with about six files that we needed after we cleaned them up according to our current requirements. Contributing them back wouldn't make much sense; they are full of constraints that are meant to allow us to run a database server for months on end with predictable and reliable performance under very high load. Those constraints don't make sense in many other scenarios, and they can be incredibly limiting.
For a logging framework, we could certainly add another option to NLog to do what we consider the right thing, but at this point, that wouldn't really make sense. We don't use a lot of the features, and reducing the surface area would mean reducing complexity. We can target something very narrow and get very high performance (in both CPU utilization and memory costs) much more easily and cheaply than by trying to convert an existing codebase.
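As a purely illustrative sketch of what "narrow" means here (hypothetical names, not our actual logging code), the entire contract can be as small as:

// Purely illustrative, the interface and member names are made up.
public interface IMinimalLog
{
    // Lets callers skip building the message entirely when logging is off,
    // so a disabled log statement costs a single property read.
    bool IsInfoEnabled { get; }

    void Info(string message);
}

Everything else (targets, layouts, reconfiguration) is decided once, at startup, which is the kind of simplification a general-purpose framework cannot afford to make.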
This is not a new concept. There is a very real cost to making something reusable. And in our case, RavenDB doesn't need frameworks internally; we have a very good grasp of what is going on, and we want to have full ownership of every bit that is executed.
Comments
That's a good decision. There is no reason to pay for flexibility if you don't need it. Sometimes there are ways to get the best of all worlds (overriding methods, for instance). In the case of a database, as a user I don't really care how it's implemented, I just want the best performance / throughput I can get.
long r = NextLong(c);
if (r < values.Length)
{
    values[(int)r] = new UserValueWrapper(value, userValue);
}
I do not understand the code above. Is there an ELSE part missing, or is the "if (r < values.Length)" check redundant?
Moreover, are this.values.Length and values.Length the same?
Is NextLong a random number from ThreadLocalRandom.NextLong? Is any protection against overwrite needed?
"we want to have full ownership of every bit that is executed." I always doubt how true this is when the product is built on managed platform.
@Zdeslav, generally you're correct, but the product has a lot of unmanaged code, and the platform gives you rock-solid infrastructure. So you get the best of both worlds - performance and stability.
@Uri I just agree with @Zdeslav. For a database, there is no "a lot of unmanaged code". Today, when fast is not fast enough, you can sell managed "ownership" only to morons stuck in the enterprisey mindset of "pay more, get more". But I agree with the main thought too - a small lib, even a self-made one, is always better than any "rule-them-all-somehow" framework. And yes, business rarely pays for altruism. Unfortunately.
Zdeslav, you can say that about the C standard library too, or about running on anything but a unikernel you have personally customized. Except that this doesn't actually work. Being able to control what is going on in my environment is important, and choosing a platform is one of the ways you do that.
Well, C standard lib is limited too, but to a lesser extent.
In C/C++ you can replace every library function with your own optimized function, and you can even write assembly embedded within your C/C++ code, so this is as close to controlling every single bit as you can get, since you talk directly to the CPU (let's not nitpick :). Or you can write a custom allocator and work directly with memory pages, or ensure that memory is not paged out. These are my actual experiences, not made-up examples. You can still say that it is working on virtual memory so it is not really bare metal, but I would call that nitpicking too - we are not talking about custom hardware/OS :)
Actually, controlling "every bit" on a platform where you can't know exactly when the GC will run is by definition impossible (though scheduling is impossible to control even with a native implementation on a non-RTOS). Even if you go as far as providing a custom implementation of CLR hosting interfaces like IHostMemoryManager/IHostMalloc, some things are near impossible. And that's just one area where it lacks control.
Of course, I am not saying that writing C or assembly is the only valid approach - most of the time it is dumb, and I am actually a fan of .NET, which has lots of advantages over native implementations. I am currently working on software that was written in C++ for "maximum performance", and it would probably perform the same if it were written in .NET, with less code and fewer bugs. My point is that "maximum control" is often used to justify NIH while not actually delivering maximum control.
Zdeslav, you don't need real time for a database system. You don't need to stop the garbage collector from running, or control it to the nanosecond. The presence of a garbage collector doesn't mean you can't get high throughput. You simply need to design the application so that the garbage collector uses very few hardware resources. Native or managed, in a high-throughput, long-running server application you still need to control allocations, use resource pools and buffers, and in general be very wary of allocations (allocations, managed or native, can ruin throughput and performance). The mere presence of a garbage collector does not mean you automatically get free performance by switching to a native alternative.
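As a rough illustration of the pooling pattern described above (hypothetical code, not taken from RavenDB), renting buffers instead of allocating them per request keeps the garbage collector largely idle:

using System;
using System.Buffers;

public static class RequestHandler
{
    // Hypothetical method; the point is the rent/return pattern instead of
    // allocating a new byte[] on every call.
    public static void ProcessRequest(byte[] payload)
    {
        byte[] buffer = ArrayPool<byte>.Shared.Rent(payload.Length);
        try
        {
            Buffer.BlockCopy(payload, 0, buffer, 0, payload.Length);
            // ... work with 'buffer'; no per-request array allocation means
            // far less garbage for the collector to deal with ...
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer);
        }
    }
}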
@PopCatalin
that's a straw man :)
Of course you don't need RT for a database. I was talking about maximum control, which is harder on a non-RTOS and even harder on .NET - that's why I think the "maximum control" argument doesn't make sense in this context: replacing NLog with your own implementation is not that, though it makes perfect sense to do it.
I did my share of optimizing .NET memory usage patterns in server apps, so I am aware of the things you describe. Actually, if you read the last paragraph of my previous comment, I believe that using .NET instead of a native implementation can even be beneficial for such applications. However, there is no doubt that a native implementation gives you more opportunity to control and optimize memory usage - and the starting point of my comment was the alleged need for maximum control. 99% of the time what you need is enough control, which is orders of magnitude less than maximum, even if we ignore extremes like custom silicon and special unikernels and stick to common HW and OS.
@Zdeslav Not entirely true, leaving memory bugs aside.
You can control the assembly you want to emit when it really makes a difference. Our implementations of the xxHash64 and Metro128 hash functions are such examples. We even got Microsoft to write specific JIT morphers to generate very tuned code for some primitives (essentially doing a far better job of optimizing those hot paths). Moreover, those improvements made a big, measurable difference in SHA implementations too.
With high-quality platforms like .NET, having control where it matters is more about the tools at your disposal than the framework you are running on. I could sort far more data from my .NET application using a sorting network on a GPU than you can sort with a very tuned SIMD assembly implementation on the CPU (even paying the cost of moving the data to the GPU). Is that the "future" for RavenDB? We don't know, but in the direction of "controlling" the environment, eventually even those questions will beg for answers.
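As a small illustration of steering the generated code from C# (far more modest than the JIT changes mentioned in the comment above, and the Bits class here is a made-up example), marking a hot-path helper for aggressive inlining is one of the tools available:

using System.Runtime.CompilerServices;

public static class Bits
{
    // Rotations like this are typical building blocks of hash functions such
    // as xxHash64; asking the JIT to inline the helper keeps the hot path free
    // of call overhead.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static ulong RotateLeft(ulong value, int bits)
    {
        return (value << bits) | (value >> (64 - bits));
    }
}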