1. 45
    1. 9

      He has a rant in there about go interfaces being hard to satisfy. Yep. All I want is to know where in the stdlib or the current codebase there are types that implement an interface. It shouldn’t be that crazy difficult.

    2. 7

      I still think the best one of these is Ben Hoyt’s word counter, which specifically has two versions for each language: the “I just wrote this idiomatically” version and the “optimized for performance” version.

    3. 3

      C#/Go relative performance surprises me. I’d always thought these were very similar languages in terms of runtime characteristics (value types, GC, AOT-ish compilation model, user-land concurrency), with C# being significantly more mature.

      C#’s throughput and memory usage for the heavy case seem worse than Go’s, which is unexpected; I’d expect C#’s codegen and GC to be marginally better.

      For the light case, C#’s latency percentiles are better, and that’s very surprising, since that seems like exactly the case Go is optimizing for.

      What am I missing from my Go/C# model?

      1. 9

        Glancing quickly at the code, the C# is very unidiomatic (not surprising, since the author admits they’re new to .NET). A lot of that’s stylistic, but the heavy use of mutable structs with getters, rather than readonly structs with direct access, is gonna result in some poor cache performance, and may account for a lot of the discrepancy. (I also suspect some alternative data structures would perform better, but I’d need to check first.)

      2. 3

        At first glance, I had the same thought. But looking closely, it seems that C# and Go had very similar actual performance and throughput results, with worse outliers for C# (GC related, I would guess).

        The weird outliers to me were Swift and Scala, which both seem to get panned quite regularly by people actually trying to use them for the first time. Yet what’s weird (logically irreconcilable) to me is how people who use them all of the time seem to have no major complaints at all.

      3. 2

        I always thought that these are very similar languages in terms of runtime characteristics

        That seems to be reflected by the data. I didn’t have a look at the code, but judging from the readme the OP doesn’t seem to have much experience with .NET at least, so the result might reflect the level of experience rather than the achievable performance. Anyway, even between different versions of the CLR there are significant performance differences (see e.g. https://www.quora.com/Is-the-Mono-CLR-really-slower-than-CoreCLR/answer/Rochus-Keller), and if you add the framework there are even bigger differences.

      4. 2

        What am I missing from my Go/C# model?

        I think what’s missing is that the repo got traction at somewhat of an unfortunate time, with a few optimization tweaks already made to the rust, Go, and Elixir versions, and none yet to C# or Scala. I first posted it here after completing my naive, unidiomatic implementations in every language, but it didn’t take off.

        I’d say with that version of the code the dotnet and Go implementations were roughly comparable, as was their performance, with dotnet having a slight edge. That’s why I posted on /r/rust asking why dotnet was beating rust.

        Following that discussion I made two main changes to the rust which nearly doubled the performance and shot rust to the top: 1) in the response handler, having the TripResponse and ScheduleResponse use a &str of the underlying Trip and StopTime data, rather than cloning strings, and 2) having the TripResponse vec and nested ScheduleResponse vec initialize with the correct capacity which is known ahead of time, rather than starting empty and growing as I appended new items.

        (Oh, and enabling LTO was another boost.)

        Since I program day-to-day in Elixir and typescript, it didn’t really occur to me just how impactful initializing the Vec with its full capacity would be, since that’s not even a thing you can do in those languages. After slapping my forehead and seeing its effect on the rust performance, I made the same change to Go, and it shot up some 30% in requests per second.
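        The Go version of that change is small. Here’s an illustrative sketch (not the repo’s actual code; the TripResponse/ScheduleResponse shapes are hypothetical stand-ins) showing both the outer and nested slices being pre-sized so append never has to reallocate and copy:

        ```go
        package main

        import "fmt"

        // Hypothetical response types, loosely modeled on the ones described above.
        type ScheduleResponse struct {
        	StopID string
        }

        type TripResponse struct {
        	TripID    string
        	Schedules []ScheduleResponse
        }

        // buildResponses pre-sizes every slice it creates: the outer slice gets
        // capacity len(trips), and each nested schedule slice gets the known
        // per-trip count, so the backing arrays are allocated exactly once.
        func buildResponses(trips []string, schedulesPerTrip int) []TripResponse {
        	out := make([]TripResponse, 0, len(trips)) // capacity known up front
        	for _, id := range trips {
        		scheds := make([]ScheduleResponse, 0, schedulesPerTrip)
        		for i := 0; i < schedulesPerTrip; i++ {
        			scheds = append(scheds, ScheduleResponse{StopID: fmt.Sprintf("stop-%d", i)})
        		}
        		out = append(out, TripResponse{TripID: id, Schedules: scheds})
        	}
        	return out
        }

        func main() {
        	resp := buildResponses([]string{"a", "b"}, 3)
        	fmt.Println(len(resp), len(resp[0].Schedules)) // 2 3
        }
        ```

        Without the capacity argument, append grows the backing array geometrically, re-copying every element on each growth; with it, the hot path is a single allocation per slice.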

        That’s when someone else re-posted the repo to HN and it took off.

        I expect once I make that same change to C# and re-benchmark, the numbers will be roughly comparable once again. So I think what you’re seeing is the Go and C# implementations have that pretty significant difference right now.

        1. 1

          Aha, thanks, that indeed explains it!

          I’d say with that version of the code the dotnet and Go implementations were roughly comparable, as was their performance, with dotnet having a slight edge

          is what I’d expect, and it looks like that’s exactly what happened here! That’s a very interesting data point for “time-to-performance”.

      5. 2

        My guess is that C#, being object oriented, likes to use references where they aren’t necessary, which has a significant effect on both time and memory usage.

      6. 1

        Both Go and C# have value types, which, when their address isn’t taken, can stay off the heap and so incur no GC penalty, at the cost of some copying overhead. Performance in both languages can vary significantly depending on the degree to which you make use of them.
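        A minimal Go sketch of that tradeoff (illustrative only; the `point` type is made up for the example). A slice of values is one contiguous allocation the GC scans once, while a slice of pointers is one allocation per element, scattered in memory:

        ```go
        package main

        import "fmt"

        type point struct{ x, y float64 }

        // sumByValue iterates a contiguous []point; each iteration copies a
        // small 16-byte struct, and the GC sees only the one backing array.
        func sumByValue(ps []point) float64 {
        	var s float64
        	for _, p := range ps { // p is a copy of the element
        		s += p.x + p.y
        	}
        	return s
        }

        // sumByPointer chases a pointer per element; each point is a separate
        // heap object the GC must track, and the loads are scattered, which
        // tends to hurt cache behavior on large inputs.
        func sumByPointer(ps []*point) float64 {
        	var s float64
        	for _, p := range ps {
        		s += p.x + p.y
        	}
        	return s
        }

        func main() {
        	vals := []point{{1, 2}, {3, 4}}
        	ptrs := []*point{{1, 2}, {3, 4}}
        	fmt.Println(sumByValue(vals), sumByPointer(ptrs)) // 10 10
        }
        ```

        Both functions compute the same sum; the difference shows up in allocation count, GC pressure, and cache locality, not in the result.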

    4. 1

      I’d love to see some optimisation experts on the various languages/frameworks/platforms have a go at these, with a view to - of course - seeing how fast they can make them, but also how fast they can be while keeping the code idiomatic. I can’t say for sure, but my feeling is that there may be more ‘modern’ idiomatic ways of writing the C# code that could bring its speed up significantly.

      1. 2

        The result of this line of thinking is the Benchmarks Game, which is equally useless because of the unreasonably large amount of effort put into each solution. In fact, engaging an “optimization expert” at all means the code you’re working on has unusually high value.

    5. 1

      It’s an order of magnitude slower than the apps which keep everything in memory

      FWIW (on my system) I was unable to see this kind of performance, even when working with an on-disk database. The longest time I saw was 4.6ms, but I typically saw 1.6-3.2ms. IMO the data set is just too small to really form a good benchmark, as the whole thing trivially fits in memory (the indices are only 35M).

      The following version of the query is marginally faster (and easier to read IMO):

      select count(*) from stop_times join trips using (trip_id) where route_id = 'Red';
      

      For importing stop_times I counted manually (yes, so take that time with a grain of salt) while running:

      Oddly, .timer on doesn’t work for this. Perhaps because it’s a dot command.