Faster incremental rebuilds of (user-specific) sysimgs #46045

Closed
wants to merge 33 commits into from

@petvana petvana commented Jul 15, 2022

This is part of my GSoC'22 project "Automatic system images based on a specific user workflow," under the supervision of @ianatol. The project's main idea is to introduce a new library that collects precompile statements during regular use of Julia and then builds a system image (sysimage) customized for that specific user. This makes it crucial to speed up sysimage builds, which are currently quite slow (more than 2 min even for an empty one). This PR therefore rebases #40414 (initially proposed by @Keno) onto Julia v1.9-dev, with some modifications:

  • It stores some extra information in the precompile field of CodeInstance.
    • precompile & 2 - CodeInstance is loaded from a current sysimage.
    • precompile & 4 - CodeInstance should be emitted into the incremental sysimage.
    • These bits are only temporary and not serialized into sysimage when produced.
  • It speeds up building system image by emitting code only for the tagged instances (by precompile & 4).
    • jl_precompiles_for_sysimage function is introduced to tag only user-specific precompile statements. Precompiles generated before this function is called will be ignored.
  • New detection of whether a CodeInstance is already in the sysimage. Originally, this relied only on the naming convention. Now, an instance is not included if it is found in ReverseLocalSymbolTable, as checked by lookup_sysimage_fname, which seems a more robust and systematic approach.

Given these changes, it's unnecessary to emit binary code for all precompile() statements defined in loaded libraries, resulting in relatively fast compilation times.
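
To illustrate the tagging scheme described above, here is a minimal sketch in C. It is not the code from this PR: the struct, the flag names, and the helper functions are hypothetical stand-ins, and only the bit values 2 and 4 come from the description. Bit 1 is assumed to hold the pre-existing precompile flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; the description only specifies the bit values. */
enum {
    FLAG_FROM_SYSIMAGE    = 2, /* precompile & 2: loaded from the current sysimage */
    FLAG_EMIT_INCREMENTAL = 4  /* precompile & 4: emit into the incremental sysimage */
};

/* Simplified stand-in for a CodeInstance, not the real runtime struct. */
typedef struct {
    const char *name;
    uint8_t precompile; /* bit 1 assumed to be the pre-existing precompile flag */
} code_instance_t;

/* Mark an instance as coming from the currently loaded sysimage. */
static void tag_from_sysimage(code_instance_t *ci) {
    ci->precompile |= FLAG_FROM_SYSIMAGE;
}

/* Mark an instance as a user-specific precompile to be emitted. */
static void tag_for_emission(code_instance_t *ci) {
    ci->precompile |= FLAG_EMIT_INCREMENTAL;
}

/* Code is emitted only for instances carrying the emission bit. */
static bool should_emit(const code_instance_t *ci) {
    return (ci->precompile & FLAG_EMIT_INCREMENTAL) != 0;
}

/* The temporary bits are cleared so they never reach the serialized sysimage. */
static uint8_t serialized_precompile(const code_instance_t *ci) {
    return ci->precompile & (uint8_t)~(FLAG_FROM_SYSIMAGE | FLAG_EMIT_INCREMENTAL);
}
```

In this sketch, an instance that starts with precompile == 1 and receives both tags holds 7 in memory, while serialized_precompile still yields 1.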

In the future, I'll investigate why there is a greater overhead for Plots (not yet clear to me).

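| library | Nempty | Nload | Nwork | SysImg | Sempty | Sload | Swork |
|---|---|---|---|---|---|---|---|
| OhMyREPL | 0.09 | 0.37 | 0.35 | 7.71 | 0.10 | 0.12 | 0.12 |
| DataFrames | 0.10 | 3.28 | 5.29 | 18.63 | 0.11 | 0.12 | 0.16 |
| Plots | 0.12 | 6.34 | 15.45 | 51.10 | 0.23 | 0.22 | 1.38 |
| GLMakie | 0.25 | 10.46 | 77.5 | 113.3 | 0.36 | 0.21 | 3.37 |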

All times are in seconds, N - normal run, S - using chained sysimage, SysImg - create chained sysimage, load - call using, work - time to first work (TTFX). Sources for the comparison are in BenchmarkIncrementalSysimage.
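
As a rough way to read the table, the sketch below computes two derived numbers that are not in the benchmark itself: the TTFX speedup (Nwork / Swork) and the number of sessions after which the sysimage build time is paid back. The function names are hypothetical, and the break-even figure assumes that each session saves exactly the difference in the work column, which is a simplification.

```c
/* Ratio of time to first work without and with the chained sysimage. */
static double ttfx_speedup(double n_work, double s_work) {
    return n_work / s_work;
}

/* Sessions needed to pay back the build time, assuming each session
   saves (n_work - s_work) seconds and nothing else differs. */
static double break_even_sessions(double sysimg, double n_work, double s_work) {
    return sysimg / (n_work - s_work);
}
```

For the Plots row, ttfx_speedup(15.45, 1.38) is roughly 11, and break_even_sessions(51.10, 15.45, 1.38) is between 3 and 4 sessions.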

There are probably some edge cases, but it seems to work fine for now. In the future, I'd like to try disabling precompile() entirely for the included libraries; that would probably save some loading time because the sysimage would contain fewer inferred CodeInstances.

This PR also introduces a CI test for building an almost empty sysimage. It seems possible to make the test multi-platform by using LLVM_full_jll library that contains all the necessary utils.

Solved issues

  • It may produce multiple definitions (and thus fails if ld --allow-multiple-definition is not used). Solved by initializing globalUniqueGeneratedNames, see 431b6ba.
  • Archive sys-o.a (207MB) is not currently shipped with julia, but it can be re-created easily together with sys.so. This is necessary only once.
  • 32-bit version seems fixed.
  • Most of the CI passes (only for Linux)

Known issues (to be solved) / TODOs

  • It interferes with Base.compilecache. Precompilation does not work when building an incremental sysimage, so the packages need to be already precompiled.
  • Chained sysimages for Windows are not yet supported.

Open questions and ideas for improvement

  • Would it be possible to copy compiled code directly from the loaded sysimage when outputting the archive?
  • Is it worthwhile to support chains of multiple sysimages (to enable incrementally adding precompile statements)?

@petvana petvana added performance Must go faster building Build system, or building Julia or its dependencies compiler:precompilation Precompilation of modules labels Jul 15, 2022

@test success(`$(Base.julia_cmd()) --sysimage-native-code=chained --startup-file=no --sysimage=$sysimg --output-o chained.o.a -e $source_txt`)
@test success(`ar x chained.o.a`) # Extract new sysimage files
@test success(`ld -shared -o chained.$(Libdl.dlext) text.o data.o text-old.o`)

We now have a LLD_jll that we can add as a stdlib to provide LD.

@petvana petvana Aug 6, 2022

Yes, I'll update the test to be system independent, but I'd probably need LLVM_full_jll because of llvm-objcopy. However, this is only a test, and not a dependency for building Julia itself.

vchuravy commented Aug 6, 2022

Very exciting!

This sounds very close to what @timholy and I proposed in #44527

Maybe it would make sense to have a meeting together with @ianatol to discuss the best way forward between these two approaches!

petvana commented Jan 18, 2023

I'm closing this PR because pkgimages have already been implemented and merged to master. Moreover, it seems sysimage generation will be parallelized soon.

@petvana petvana closed this Jan 18, 2023
@vchuravy

Thanks @petvana for your work and the great discussions this summer about this :)
