1. 7
  1.  

    1. 1

      Does anyone know whether this sort of optimisation is something a modern compiler would do? If you have an array of structs, would a modern compiler decide it’s best to regroup the data in a different order to make memory access more efficient?

      It seems like it should be helpful in some cases. Does it happen?

      1. 3

        Rust, with its default representation, reorders fields to reduce padding. In C and C++ the compiler isn’t allowed to reorder fields, but there is -Wpadded, which warns when padding is inserted into a struct. I don’t know of any languages that reorder based on profiling. I would be very surprised if there’s anything that can detect false sharing and split a struct across different cache lines to avoid it, though it would be cool.
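
        To make the padding point concrete, here is a minimal C sketch. The struct and function names are made up for illustration, and the sizes in the comments are what a typical 64-bit ABI gives, not something the C standard guarantees.

        ```c
        #include <assert.h>
        #include <stddef.h>
        #include <stdint.h>

        /* Hypothetical struct with fields in a padding-heavy order. */
        struct loose {
            uint8_t  tag;   /* 1 byte, typically followed by 7 bytes of padding */
            uint64_t id;    /* 8 bytes */
            uint8_t  flag;  /* 1 byte, typically followed by 7 bytes of tail padding */
        };

        /* Same fields reordered by hand, widest first. */
        struct tight {
            uint64_t id;
            uint8_t  tag;
            uint8_t  flag;  /* typically followed by 6 bytes of tail padding */
        };

        /* Bytes saved per array element by the manual reordering. */
        size_t bytes_saved(void) {
            return sizeof(struct loose) - sizeof(struct tight);
        }

        /* C keeps fields in declaration order, so these always hold. */
        void check_declaration_order(void) {
            assert(offsetof(struct loose, tag) < offsetof(struct loose, id));
            assert(offsetof(struct loose, id) < offsetof(struct loose, flag));
        }
        ```

        Compiling this with -Wpadded should flag the padding in `struct loose`. A compiler that is free to reorder, like rustc with the default representation, can arrive at something like `struct tight` without the programmer doing anything.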

        1. 4

          > I would be very surprised if there’s anything that can detect false sharing and split a struct across different cache lines to avoid it, though it would be cool.

          I had an MPhil student implement this in a JVM some years ago, using dynamic profiling to detect which fields were accessed from which threads. It got a 50% speedup in contrived microbenchmarks and made no measurable difference in anything else (presumably because programmers have already worked around the problem in places where it would be performance critical).
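
          For reference, the manual workaround usually looks something like the C11 sketch below. It is not the JVM transform, just the hand-written equivalent, and the 64-byte cache line size and all names are assumptions for illustration.

          ```c
          #include <assert.h>
          #include <stdalign.h>
          #include <stddef.h>

          #define CACHE_LINE 64  /* assumption: 64-byte cache lines */

          /* Hypothetical counters, each written by a different thread.
             Both fields most likely land on the same cache line. */
          struct shared_counters {
              long hits;    /* written by thread A */
              long misses;  /* written by thread B */
          };

          /* Manual fix: force each field onto its own cache line. */
          struct split_counters {
              alignas(CACHE_LINE) long hits;
              alignas(CACHE_LINE) long misses;
          };

          static_assert(sizeof(struct split_counters) >= 2 * CACHE_LINE,
                        "each counter should occupy its own cache line");

          /* Distance in bytes between the two counters after the fix. */
          size_t counter_gap(void) {
              return offsetof(struct split_counters, misses)
                   - offsetof(struct split_counters, hits);
          }
          ```

          The cost is memory: the split version spends two full cache lines on two `long` values, which is why people only do it where the contention actually shows up.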

          This kind of transform is often not possible because the layout can leak into the abstract machine. In C/C++, even before offsetof, you could take the address of fields and subtract them and get offsets. More importantly, any language that has separate compilation has to agree on the layout between compilation units (which may come from different compilers). Languages that don’t expose offsets into the abstract machine and implementations that do JIT compilation are best suited to this kind of thing.
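
          A small C sketch of how the layout leaks, with a made-up struct and function names. Any code can compute a field offset by subtracting addresses, so a compiler that silently reordered fields would change observable behaviour.

          ```c
          #include <assert.h>
          #include <stddef.h>

          /* Hypothetical struct used only for illustration. */
          struct point3 { int x; int y; int z; };

          static_assert(offsetof(struct point3, x) == 0,
                        "the first member sits at offset 0");

          /* The pre-offsetof way: subtract the struct's address from the field's. */
          size_t z_offset_by_subtraction(const struct point3 *p) {
              return (size_t)((const char *)&p->z - (const char *)p);
          }

          /* Returns 1 if both methods agree and x still precedes y,
             which is exactly what a reordering would break. */
          int layout_is_observable(void) {
              struct point3 p = {1, 2, 3};
              return z_offset_by_subtraction(&p) == offsetof(struct point3, z)
                  && offsetof(struct point3, x) < offsetof(struct point3, y);
          }
          ```

          The same offsets get baked into every compilation unit that includes the struct definition, which is the separate-compilation problem in miniature.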