Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🧂 more precompile for dynamic functions! #1978

Merged
merged 12 commits into from
Mar 24, 2022
Merged

Conversation

fonsp
Copy link
Owner

@fonsp fonsp commented Mar 12, 2022

No description provided.

Repository owner deleted a comment from github-actions bot Mar 12, 2022
@fonsp
Copy link
Owner Author

fonsp commented Mar 12, 2022

@rikhuijzer i realised i don't really know what i'm doing 😅 can you take a look at adding more precompiles for our dynamic things?

@rikhuijzer
Copy link
Collaborator

@rikhuijzer i realised i don't really know what i'm doing sweat_smile can you take a look at adding more precompiles for our dynamic things?

Yeah sure. Gladly. Can you tell give in code or pseudocode are list of methods are called? Or, I can just see what the firebasey sends to the backend and trigger those?

@fonsp
Copy link
Owner Author

fonsp commented Mar 13, 2022

Firebasey.diff gets called on two outputs of notebook_to_js, and it recursively calls itself on their children to compare. Run the Firebasey notebook to see what I mean

src/webserver/Dynamic.jl Outdated Show resolved Hide resolved
@@ -169,6 +171,7 @@ function notebook_to_js(notebook::Notebook)
"cell_execution_order" => cell_id.(collect(topological_order(notebook))),
)
end
precompile(notebook_to_js, (Notebook,))
Copy link
Collaborator

@rikhuijzer rikhuijzer Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight improvement:

Without precompile

$ julia --project -ie 'using Pluto'

julia> @time @eval Pluto.notebook_to_js(Pluto.Notebook([Pluto.Cell("1")]));
  3.078880 seconds (6.69 M allocations: 376.505 MiB, 3.90% gc time, 91.05% compilation time)

With precompile

julia> @time @eval Pluto.notebook_to_js(Pluto.Notebook([Pluto.Cell("1")]));
  3.039852 seconds (6.00 M allocations: 338.446 MiB, 5.37% gc time, 91.06% compilation time)

src/webserver/MsgPack.jl Outdated Show resolved Hide resolved
end
precompile(unpack, (Vector{UInt8},))
Copy link
Collaborator

@rikhuijzer rikhuijzer Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
precompile(unpack, (Vector{UInt8},))

Minor effect

Without precompile

$ julia --project -ie 'using Pluto'
...

julia> @time @eval Pluto.MsgPack.unpack([0x00]);
  0.511228 seconds (1.27 M allocations: 67.586 MiB, 24.02% gc time, 99.97% compilation time)

With precompile

julia> @time @eval Pluto.MsgPack.unpack([0x00]);
  0.454856 seconds (1.24 M allocations: 65.688 MiB, 15.59% gc time, 99.98% compilation time)

@rikhuijzer
Copy link
Collaborator

rikhuijzer commented Mar 14, 2022

I've been going deeper on Pluto.run since we should be able to shave some time off there. So, similarly to #1934, I've used SnoopCompile to figure out where the time is spent. That gives this flamegraph:

image

I've obtained this by running the following notebook:

begin
    using Pkg
    
    Pkg.activate(; temp=true)
    Pkg.add([
        "SnoopCompile",
        "ProfileSVG"
    ])
    Pkg.develop(; path="/home/rik/git/Pluto.jl")
end
begin
    using SnoopCompile
    using ProfileSVG
    using Pluto
end
tinf = @snoopi_deep begin
    port = 13435
    options = Pluto.Configuration.from_flat_kwargs(; port, launch_browser=false, workspace_use_distributed=true, require_secret_for_access=false, require_secret_for_open_links=false)
    🍭 = Pluto.ServerSession(; options)
    server_task = @async Pluto.run(🍭)
    
    retry(; delays=ExponentialBackOff(n=5, first_delay=0.5)) do
        HTTP.get("http://localhost:$port/edit").status == 200
    end
    server_task
end
let
     g = flamegraph(tinf)
     ProfileSVG.view(g)
end

This flamegraph is good news. All inference is rooted on the same block, so if we can get the right precompile statement somewhere on from_flat_kwargs, then we can possibly get some proper improvements. I'll fiddle with it and report back here once I got it.

EDIT: precompile(from_flat_kwargs, ()) is great

Without precompile

$ julia --project -ie 'using Pluto'
...

julia> @time @eval Pluto.Configuration.from_flat_kwargs();
  2.174783 seconds (3.60 M allocations: 194.278 MiB, 8.07% gc time, 99.96% compilation time)

With precompile

julia> @time @eval Pluto.Configuration.from_flat_kwargs()
  1.395046 seconds (2.64 M allocations: 143.641 MiB, 12.13% gc time, 99.92% compilation time)

This takes roughly 0.8 seconds off the time for Pluto.run on my system. I'll add it to #1983. I have hard time measuring this via @time @eval Pluto.run() because via that call I have to pass keyword arguments.

@rikhuijzer
Copy link
Collaborator

rikhuijzer commented Mar 14, 2022

As another possibility, maybe we should just build PackageCompiler.jl into Pluto? Like,

julia> using Pluto

julia> Pluto.compile()

julia> Pluto.run()

to have a super speedy Pluto. Only need to call Pluto.compile once after updating Pluto. And to have the normal Pluto, don't call Pluto.compile() after updating Pluto.

julia> using Pluto

julia> Pluto.run()

EDIT: Nope. Let's hope for JuliaLang/julia PR 44527 and the successors to be merged in newer Julia versions.

@@ -67,6 +67,7 @@ function run(; kwargs...)
options = Configuration.from_flat_kwargs(; kwargs...)
run(options)
end
precompile(run, ())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one works. Although I cannot measure Pluto.run() easily, I can measure Configuration.from_flat_kwargs which is the biggest source of latency according to SnoopCompile

Without precompile(run, ())

$ julia --project -ie 'using Pluto'
...

julia> @time @eval Pluto.Configuration.from_flat_kwargs();
  2.158083 seconds (3.60 M allocations: 194.337 MiB, 7.19% gc time, 99.96% compilation time)

With precompile(run, ())

julia> @time @eval Pluto.Configuration.from_flat_kwargs();
  1.431652 seconds (2.64 M allocations: 143.595 MiB, 11.35% gc time, 99.92% compilation time)

@rikhuijzer
Copy link
Collaborator

rikhuijzer commented Mar 15, 2022

With JET.jl, I've been digging down in why SessionActions.open keeps taking > 10 seconds even though precompile is called on it. I've found one example of a problem. The compiler has a really bad time with maybe_macroexpand:

julia> using Pluto

julia> precompile(Pluto.ExpressionExplorer.maybe_macroexpand, (Expr,))
true

julia> @time @eval Pluto.ExpressionExplorer.maybe_macroexpand(:(x = 1))
  0.213755 seconds (397.54 k allocations: 20.237 MiB, 31.72% gc time, 100.06% compilation time)

julia> @time @eval Pluto.ExpressionExplorer.maybe_macroexpand(:(x = 1))
  0.000607 seconds (75 allocations: 3.484 KiB)

So that's more than 100% compilation time 🤣 😬 JET shows 120 lines of runtime dispatches and optimization failures.

EDIT: Fixed most of it. Got this now in a fresh session by fixing some type inference issues:

julia> using Pluto

julia> @time @eval Pluto.ExpressionExplorer.maybe_macroexpand(:(@time 1))
  0.084008 seconds (86.97 k allocations: 4.649 MiB, 99.64% compilation time)

PR at #1991.

@fonsp
Copy link
Owner Author

fonsp commented Mar 23, 2022

AMAZING 🌟🌟

@fonsp
Copy link
Owner Author

fonsp commented Mar 23, 2022

fonsp and others added 3 commits March 23, 2022 12:06
@fonsp fonsp changed the title 🧂 some random precompile everywhere 🧂 more precompile for dynamic functions! Mar 23, 2022
@fonsp fonsp marked this pull request as ready for review March 23, 2022 11:08
@fonsp fonsp merged commit 0c5ce9d into main Mar 24, 2022
@fonsp fonsp deleted the random-precompile-statements branch March 24, 2022 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants