
Feature request: skip methods for precompilation? #12

Closed
henry2004y opened this issue Jun 19, 2022 · 10 comments

Comments

@henry2004y
Contributor

henry2004y commented Jun 19, 2022

Hi,

I have been looking at a very strange allocation in `searchsorted` or `insorted` used in my package Vlasiator.jl. I noticed that it somehow relates to the usage of PrecompileSignatures.jl: if I manually precompile the methods detected by PrecompileSignatures.jl instead of relying on `@precompile_signatures(Vlasiator)`, the strange allocation of tiny vectors is gone.

I checked the methods detected for precompilation by `PrecompileSignatures.precompilables(Vlasiator)`: the allocations disappear when I skip the ones that look like

 Tuple{Vlasiator.var"##fillmesh#173", Bool, Bool, typeof(Vlasiator.fillmesh), MetaVLSV, Vector{String}}
 Tuple{Vlasiator.var"##write_vlsv#182", Bool, typeof(write_vlsv), String, String, Vector{Tuple{VecOrMat, String, VarInfo}}}
 Tuple{Vlasiator.var"##write_vtk#180", Vector{String}, Bool, Bool, Bool, Bool, typeof(write_vtk), MetaVLSV}
 Tuple{typeof(Vlasiator.__init__)}

It may be hard for me to fully understand the situation, but my feeling is that PrecompileSignatures.jl aggressively searches for methods that can be precompiled, which occasionally leads to unexpected behavior. Is it possible to skip certain methods during precompilation? For example, I don't know why `__init__` needs to be precompiled.

For now I may simply list explicitly all the methods suggested by `precompilables` that I think are valid. If you have any insights into this issue, please let me know!
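As a temporary workaround, one could filter the detected signatures before issuing precompile directives. This is only a sketch: the `skip` helper and the hardcoded signature strings below are hypothetical illustrations, not PrecompileSignatures API; in the package one would iterate over `PrecompileSignatures.precompilables(Vlasiator)` instead.

```julia
# Hypothetical filter over detected signatures: drop keyword-body methods
# (the var"##..." entries) and __init__ before precompiling.
skip(sig) = occursin("var\"##", sig) || occursin("__init__", sig)

# Illustrative stand-ins for stringified signatures from `precompilables`.
sigs = [
    "Tuple{typeof(Vlasiator.load), String}",
    "Tuple{Vlasiator.var\"##fillmesh#173\", Bool, Bool, typeof(Vlasiator.fillmesh), MetaVLSV, Vector{String}}",
    "Tuple{typeof(Vlasiator.__init__)}",
]

kept = filter(!skip, sigs)  # only the first signature survives the filter
```

In a real package the kept entries would then be passed to `precompile`; here the point is only the filtering step.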

@rikhuijzer
Owner

rikhuijzer commented Jun 19, 2022

Hi Hongyang,

> I have been looking at a very strange allocation in searchsorted or insorted used in my package Vlasiator.jl. I notice that this somehow relates to the usage of PrecompileSignatures.jl: if I manually precompile the methods detected by PrecompileSignatures.jl instead of relying on @precompile_signatures(Vlasiator), the strange allocation of tiny vectors is gone.

Could you provide a MWE? If `precompile` manages to create a different method than what is created by running via the just-in-time compiler, then that would be a Julia bug.

> It may be hard for me to fully understand the situation, but my feeling is that PrecompileSignatures.jl is aggressively searching for methods that can be precompiled that occasionally leads to unexpected behaviors. Is it possible to skip certain methods for precompilation?

Preferably, I would like to fix those problems instead of asking users to fine-tune around them. In theory, `precompile` shouldn't have negative effects apart from some allocations.

@henry2004y
Contributor Author

henry2004y commented Jun 19, 2022

I tried to come up with a MWE, but I couldn't reproduce the error without using the method in my package. And what makes it even stranger is that for small data files I don't see allocations; it only happens for large data sizes.

Here is how I reproduce the issue:

  1. Check out the current master branch of Vlasiator:
git clone https://github.com/henry2004y/Vlasiator.jl.git
  2. After the end of the precompilation statements, add the following:
# Extra allocation triggered by PrecompileSignatures
precompile(fillmesh, (MetaVLSV, String))
precompile(fillmesh, (MetaVLSV, Vector{String}))
  3. Download the test file from my Google Drive
  4. Run the following Julia script:
using Vlasiator

file = "bulk.0000003.vlsv"
meta = load(file)
@time data, vtkGhostType = Vlasiator.fillmesh(meta, ["proton/vg_blocks"]; verbose=true, skipghosttype=false);

On my Linux laptop,

# Master
julia> @time data, vtkGhostType = Vlasiator.fillmesh(meta, ["proton/vg_blocks"]; verbose=true, skipghosttype=false);
[ Info: scanning AMR level 0...
[ Info: reading variable proton/vg_blocks...
[ Info: scanning AMR level 1...
[ Info: reading variable proton/vg_blocks...
  0.389661 seconds (468.68 k allocations: 45.322 MiB, 1.75% gc time, 85.16% compilation time)

julia> @time data, vtkGhostType = Vlasiator.fillmesh(meta, ["proton/vg_blocks"]; verbose=true, skipghosttype=false);
[ Info: scanning AMR level 0...
[ Info: reading variable proton/vg_blocks...
[ Info: scanning AMR level 1...
[ Info: reading variable proton/vg_blocks...
  0.078479 seconds (496 allocations: 20.267 MiB, 12.25% gc time)
# Master + two extra precompilations of fillmesh
julia> @time data, vtkGhostType = Vlasiator.fillmesh(meta, ["proton/vg_blocks"]; verbose=true, skipghosttype=false);
[ Info: scanning AMR level 0...
[ Info: reading variable proton/vg_blocks...
[ Info: scanning AMR level 1...
[ Info: reading variable proton/vg_blocks...
  0.689447 seconds (3.64 M allocations: 137.799 MiB, 2.88% gc time, 87.91% compilation time)

julia> @time data, vtkGhostType = Vlasiator.fillmesh(meta, ["proton/vg_blocks"]; verbose=true, skipghosttype=false);
[ Info: scanning AMR level 0...
[ Info: reading variable proton/vg_blocks...
[ Info: scanning AMR level 1...
[ Info: reading variable proton/vg_blocks...
  0.100166 seconds (2.30 M allocations: 67.053 MiB, 15.35% gc time)

Note that on current master I removed

using PrecompileSignatures: @precompile_signatures
@precompile_signatures(Vlasiator)

I know that the extra allocations come solely from this line. `isempty(searchsorted(ids, id))` is the same as `!insorted(id, ids)` from sorted.jl. Alternatively, I can use `in`: that doesn't trigger extra allocations, but is slower in all cases.
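For reference, the equivalence mentioned above can be checked in plain Julia, with no dependency on Vlasiator (the `ids` vector is made up for the example):

```julia
ids = [1, 3, 5, 7, 9]  # must be sorted for searchsorted/insorted to be valid

# isempty(searchsorted(ids, id)) and !insorted(id, ids) agree for any id:
# both do a binary search over the sorted vector.
for id in 0:10
    @assert isempty(searchsorted(ids, id)) == !insorted(id, ids)
end

# `in` gives the same answer but does a linear scan instead of a binary search,
# which matches the observation that it is slower on large inputs.
@assert (4 in ids) == insorted(4, ids)
@assert (5 in ids) == insorted(5, ids)
```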


I agree that this seems to be a Julia bug that is unveiled by PrecompileSignatures.jl.

@rikhuijzer
Owner

Thanks for the extensive information! Before I download the files and set everything up, could you double-check that the problem persists when you benchmark with `@time @eval` instead of `@time`? As described in the docstring of `@time`:

> In some cases the system will look inside the @time expression and compile some of the called code before execution of the top-level expression begins. When that happens, some compilation time will not be counted. To include this time you can run @time @eval ....
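A minimal, self-contained illustration of the difference (the function `f` and the input are made up for the example):

```julia
# A small function to time; sum of squares over a vector.
f(x) = sum(abs2, x)
x = rand(10^6)

@time f(x)        # compilation of f may happen before timing starts
@time @eval f(x)  # the whole top-level expression, including compilation, is timed
```

The first measurement can under-report compilation time; the second forces the expression through `eval`, so compilation happens inside the timed region.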

@rikhuijzer
Owner

And which Julia version are you running?

@henry2004y
Contributor Author

I am running on Julia v1.7.3. Results with @time @eval:

# Master 1st time
1.288102 seconds (3.02 M allocations: 179.112 MiB, 4.44% gc time, 93.05% compilation time)
# Master 2nd time
0.107818 seconds (609 allocations: 20.273 MiB, 19.43% gc time)
# Master + 2lines 1st time
1.105118 seconds (3.66 M allocations: 138.418 MiB, 2.61% gc time, 90.55% compilation time)
# Master + 2lines 2nd time
0.117375 seconds (2.30 M allocations: 67.058 MiB, 18.13% gc time)

@rikhuijzer
Owner

Hmm, there is definitely something interesting going on there. Thanks for letting me know. Unfortunately, I will likely not investigate this further. The first timings are as expected, namely a large decrease in allocations and a small decrease in running time. The running time should become much lower once Julia caches binaries too (JuliaLang/julia#44527).

Until then, what you can do is add a workload to your package too, as described in https://timholy.github.io/SnoopCompile.jl/stable/snoopi_deep_parcel/. See also the Makie example in the README of this repository. The workload can trigger compilation of methods that PrecompileSignatures can't hit, and PrecompileSignatures can trigger compilation of methods that the workload doesn't hit.
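For context, such a workload is just representative code executed at the bottom of the module body, so it runs during package precompilation and the compiled methods get cached. A minimal sketch, with a hypothetical module and function:

```julia
# Hypothetical package module illustrating the workload pattern.
module MyPkg

export process

# A representative function of the package.
process(v::Vector{Float64}) = sum(abs2, v)

# Workload: this call executes while the package precompiles, so the
# compiled method for process(::Vector{Float64}) is cached without
# needing an explicit precompile directive.
let
    process([1.0, 2.0, 3.0])
end

end # module
```

Unlike a `precompile` directive, the workload actually runs the code, so it should only contain calls whose side effects are acceptable at precompile time.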

@henry2004y
Contributor Author

Thanks for the tips! My gut feeling is that somehow PrecompileSignatures in my case accidentally triggers an undesired precompilation that interferes with the runtime workload. I will look at SnoopCompile a bit more carefully and see what I can achieve.

However, for now it may still be useful to skip certain methods in `@precompile_signatures` as a temporary solution. I still have no idea what the real cause is.

@rikhuijzer
Owner

> I will look at SnoopCompile a bit more careful and see what I can achieve.

Yes, SnoopCompile is useful, and generic profiling can also help a lot via StatProfilerHTML.jl.

Also try Julia 1.8; it's very likely that the problem is solved there. The compiler team is really doing great stuff on the language side.

@henry2004y
Contributor Author

Indeed! I just tried Julia v1.8.0-rc1:

# master + 2lines for precompilation
julia> @time data, vtkGhostType = Vlasiator.fillmesh(meta, ["proton/vg_blocks"]; verbose=true, skipghosttype=false);
[ Info: scanning AMR level 0...
[ Info: reading variable proton/vg_blocks...
[ Info: scanning AMR level 1...
[ Info: reading variable proton/vg_blocks...
  0.440889 seconds (932.38 k allocations: 68.000 MiB, 8.31% gc time, 89.19% compilation time)

julia> @time data, vtkGhostType = Vlasiator.fillmesh(meta, ["proton/vg_blocks"]; verbose=true, skipghosttype=false);
[ Info: scanning AMR level 0...
[ Info: reading variable proton/vg_blocks...
[ Info: scanning AMR level 1...
[ Info: reading variable proton/vg_blocks...
  0.048697 seconds (457 allocations: 20.265 MiB)

Can't wait for the official release of Julia v1.8!


One more question: what is the difference between the SnoopCompile way of forcing precompilation by executing code, and precompile as either written explicitly or generated by PrecompileSignatures?

@rikhuijzer
Owner

> Indeed! I just tried Julia v1.8.0-rc1:

So, let's close the issue? :) I don't think that adding a filter is the right way forward. It shouldn't be necessary.

> Can't wait for the official release of Julia v1.8!

I fully agree!

> One more question: what is the difference between the SnoopCompile way of forcing precompilation by executing code, and precompile as either written explicitly or generated by PrecompileSignatures?

Calling `precompile` doesn't actually run the function; it only starts the type inference process and tries to work out what to compile that way. This is very nice for functions with side effects, such as writing to disk, since the side effects do not happen. The drawback is that type inference will not always be able to figure out the types. Running code, on the other hand, will always figure out all the types and can therefore find more methods to compile.
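The distinction can be demonstrated with a toy function that has an observable side effect (the counter and function are made up for the example):

```julia
hits = Ref(0)                      # counts how often g actually runs
g(x::Int) = (hits[] += 1; x + 1)

# A precompile directive compiles g(::Int) via type inference;
# the function body is never executed, so the counter stays at zero.
precompile(g, (Int,))
@assert hits[] == 0

# Running the code as a workload executes the body, side effect included.
result = g(1)
@assert hits[] == 1
@assert result == 2
```

This is why `precompile` directives are safe for functions that write to disk or talk to the network, while an executed workload finds more methods but actually triggers those effects.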
