Skip to content

Conversation

@nojb
Copy link
Collaborator

@nojb nojb commented Jul 25, 2025

Use copy_file_range when supported (Linux kernel >= 5.19). We check the kernel version at runtime (but we still require the copy_file_range function to be available at compile-time; we could avoid this by loading the symbol with dlsym).

Fixes #12071

Copy link

@Janno Janno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for implementing this! I can't yet try it out because of unrelated build issues but I will give it a go once those are resolved.

Comment on lines 120 to 127
if (kernel_version() < KERNEL_VERSION(5, 19, 0)) {
dune_copyfile = &dune_sendfile;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding was that there was no reason to not use FICLONE starting from kernel version 4.5. I don't know how relevant these kinds of systems are but if it is easy enough to add it would be great to include that case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, copy_file_range will do a COW copy if the file system supports it, but otherwise do a normal copy. FICLONE on the other hand, will fail if the file system does not support COW, which would require yet another fallback in those cases. Am I missing anything?

Copy link

@Janno Janno Jul 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% sure about that. Given the things I read about copy_file_range and how it changed significantly before 5.19 it is possible that there could be versions between 4.5 and 5.19 where FICLONE is strictly better. But I can't point to code to substantiate that.

EDIT: Sorry, I did not really answer your question. Yes, FICLONE may fail AFAICT. Given your comment below it seems like so can copy_file_range so I am not sure that this property should influence the decision of whether the code should try FICLONE on kernels between 4.5 and 5.19.

@nojb
Copy link
Collaborator Author

nojb commented Jul 25, 2025

Note that judging by https://lwn.net/Articles/846403/, copy_file_range can fail on certain files/file systems, so the code should fall back to some other copying mechanism in those cases.

@rgrinberg
Copy link
Member

(but we still require the copy_file_range function to be available at compile-time; we could avoid this by loading the symbol with dlsym).

What about using a compile time check for the availability of this function? This might require updating the bootstrap script, but I could help with that.

@nojb
Copy link
Collaborator Author

nojb commented Jul 26, 2025

(but we still require the copy_file_range function to be available at compile-time; we could avoid this by loading the symbol with dlsym).

What about using a compile time check for the availability of this function? This might require updating the bootstrap script, but I could help with that.

Adding this check would allow compiling Dune in old systems that do not have the function, but it would not allow compiling the binary in an old system and have it use the new function it in a newer system (I guess such scenarios are not so uncommon with the advent of binary distributions of Dune?).

Instead, in 775b4f1, I delay resolving the copy_file_range symbol until runtime (using dlsym).

Unrelatedly, in 87fb574 I introduce a fallback from copy_file_range() to sendfile() in case the former fails.

@nojb
Copy link
Collaborator Author

nojb commented Jul 26, 2025

It would be good to have at least one confirmation that this is working as expected (I cannot test easily myself at the moment). @Janno: would you be able to test and report back here?

@rgrinberg
Copy link
Member

rgrinberg commented Jul 26, 2025

Checking for things at runtime is fine as well. I'm wondering if it's going to work for statically linked libc's though (e.g. musl). In any case, we don't have to support everything right away

@Janno
Copy link

Janno commented Jul 28, 2025

It would be good to have at least one confirmation that this is working as expected (I cannot test easily myself at the moment). @Janno: would you be able to test and report back here?

Thank to @Alizter's help in #12075 I was able to test this and it does indeed work correctly. Awesome!

If anyone is interested in my test setup: I found https://github.com/pwaller/fienode which can tell if files are reflinks of each other. I created a test project that builds a single binary with lots of hello world prints in it. I first built it with the dune version I have installed, used fclones group to find copies of _build/default/foo.exe in the dune cache, and ran fienode on the original file and all copies in the cache and found no sharing. Then I removed one of the hello world lines of my code and rebuilt with the branch of this PR and reran the fclones and filenode steps. This time the copies are actually reflinks.

@rgrinberg
Copy link
Member

@Janno are you able to observe any performance benefits? I did manage to observe a speed boost when I experimented with this.

@Janno
Copy link

Janno commented Jul 30, 2025

@Janno are you able to observe any performance benefits? I did manage to observe a speed boost when I experimented with this.

No. In fact, the opposite is the case for me. Compiling rocq, for example, consistently yields a ~4% slowdown.

$ hyperfine -w 1 \
  -p "bash -c 'rm -rf ~/.cache/test1/* && rm -rf _build'" \
  "bash -c 'ulimit -s unlimited && DUNE_CACHE_ROOT=/home/janno/.cache/test1 ~/src/dune/dune.exe b'" \
  "bash -c 'ulimit -s unlimited && DUNE_CACHE_ROOT=/home/janno/.cache/test1 ~/src/dune-copy_file_range/dune.exe b'"
Benchmark 1: bash -c 'ulimit -s unlimited && DUNE_CACHE_ROOT=/home/janno/.cache/test1 ~/src/dune/dune.exe b'
  Time (mean ± σ):     11.271 s ±  0.073 s    [User: 159.119 s, System: 27.324 s]
  Range (min … max):   11.196 s … 11.390 s    10 runs
 
Benchmark 2: bash -c 'ulimit -s unlimited && DUNE_CACHE_ROOT=/home/janno/.cache/test1 ~/src/dune-copy_file_range/dune.exe b'
  Time (mean ± σ):     11.769 s ±  0.093 s    [User: 147.380 s, System: 26.430 s]
  Range (min … max):   11.636 s … 11.942 s    10 runs
 
Summary
  bash -c 'ulimit -s unlimited && DUNE_CACHE_ROOT=/home/janno/.cache/test1 ~/src/dune/dune.exe b' ran
    1.04 ± 0.01 times faster than bash -c 'ulimit -s unlimited && DUNE_CACHE_ROOT=/home/janno/.cache/test1 ~/src/dune-copy_file_range/dune.exe b'

For reference, here are the commits used:

$ git -C ~/src/dune log | head -n 1
commit 6c8a5a339c7cdbcda6cc70ba0556fc5eec765977 
$ git -C ~/src/dune-copy_file_range/ log | head -n 1
commit 0ffab48c941064672e24041034f6e13af8820aaa
$ git -C ~/src/rocq/ log | head -n 1
commit 81ded19eab08cfba8d5065b189f7ed80657b3d83

If anybody wants to repeat the benchmark: rocq needs make dunestrap before it can be built with dune.

I have rechecked the files in the cache to make sure they are still reflinks (and that they are only reflinks with the branch of this PR) and that part still works fine.

@Janno
Copy link

Janno commented Aug 7, 2025

I have now tried benchmarking some other projects for which I already had the dependencies installed (all with dune b --release in case that matters) and I am seeing mixed but overall positive results:

  1. a 3-4% speedup on ocaml-re,
  2. a very slight 0-1% speedup on memtrace_viewer (tag v0.16.0),
  3. a 1% speedup on bonsai, (tag v0.16.0
  4. and a 2-3% slowdown on base (tag v0.16.0).

So it seems the performance impact strongly depends on the project that is being built (and possibly many other factors). Personally, I don't think any of these performance changes should matter for getting this MR merged given that it can drastically reduce the disk space used by dune.

@nojb
Copy link
Collaborator Author

nojb commented Aug 11, 2025

Thanks for the testing @Janno! Do you have an intuition/explanation for why the changes in this PR could result in a slowdown?

@Janno
Copy link

Janno commented Aug 12, 2025

No, but I think it's possible my benchmarks are not really measuring the right thing. I have started adding calls to sync to the warmup script and also to the end of the command(s) being timed and it has reversed some of the relative measurements as well as generally reduced the magnitude of the speedups/slowdowns. I do not see deltas above 1% +/-1% any more which probably means I am getting close to reading tea leaves. (Of course, sync is impacted by whatever else is going on on the machine and I have not actually killed all irrelevant processes before running these benchmarks. The results are still reproducible, though, whatever that might be worth..)

I don't see anything in the change itself that would suggest a regression but I am also not an expert in systems programming.

@nojb
Copy link
Collaborator Author

nojb commented Aug 19, 2025

Thanks for the feedback @Janno. Based on what I understand from your comments, the change here has no impact on time, but should provide nice savings on space (on supported filesystems). So it seems that we should go ahead and merge.

@rgrinberg: as you approved some weeks ago, I suspect you agree.

I will wait a day or two and then merge. Thanks everyone for your participation!

nojb added 6 commits August 20, 2025 14:10
Signed-off-by: Nicolás Ojeda Bär <[email protected]>
Signed-off-by: Nicolás Ojeda Bär <[email protected]>
Signed-off-by: Nicolás Ojeda Bär <[email protected]>
Signed-off-by: Nicolás Ojeda Bär <[email protected]>
Signed-off-by: Nicolás Ojeda Bär <[email protected]>
@nojb nojb force-pushed the copy_file_range branch from 0ffab48 to 847174b Compare August 20, 2025 12:10
Signed-off-by: Nicolás Ojeda Bär <[email protected]>
@nojb nojb merged commit 13da9c7 into ocaml:main Aug 20, 2025
23 of 24 checks passed
@nojb nojb deleted the copy_file_range branch August 20, 2025 13:08
Leonidas-from-XIV pushed a commit that referenced this pull request Dec 9, 2025
Before this change, it was producing the category subheadings at the
wrong level (level 2), instead of level 3.

An example of the corrected changelog produced:

```
3.21.0 (2025-12-08)
--------------------

### Fixed

- Fix `include_subdirs qualified` incorrectly picking the furthest module
  instead of the closest when resolving module name ambiguities. (#12587,
  @ElectreAAS and @Alizter)

- Fix: include the module alias in the transitive dependency closure with
  `(include_subdirs qualified)`. (#12299, @anmonteiro)

- Pass private modules with -H when this is available (#12666, @rgrinberg)

- Allow multiple modules in `(modules_flags ...)`, in `coq.theory` (#12733, @rlepigre)

- Improve error message for invalid version formats in both `(lang dune ...)` and
  `(using extension ...)` declarations. Changes "Atom of the form NNN.NNN expected"
  to "Invalid version. Version must be two numbers separated by a dot." (#12833, @benodiwal)

- Fix crash when running `dune build @check` on a library with virtual modules.
  (#12644, fixes #12636, @Alizter)

- Provide a more informative error message when `(pkg enabled)` is put in
  `dune-project` instead of `dune-workspace`. (#12802, fixes #12801,
  @benodiwal)

- Improve error message when invalid version strings are used in `dune-project`
  files. Non-ASCII characters and malformed versions now show a helpful hint
  with an example of the correct format. (#12794, fixes #12751, @benodiwal)

- Stop hiding the `root_module` from the include path (#12239, @rgrinberg)

- Allow `$ dune init` to work on absolute paths (#12601, fixes #7806,
  @rgrinberg)

- `(include_subdirs qualified)`: Add missing alias dependency to module group.
  (#12530, @anmonteiro)

- Add Melange compilation to the `@all` alias in libraries (#12628,
  @anmonteiro)

- melange support: don't emit empty JavaScript modules for generated module
  aliases. (#12464, @anmonteiro)

### Added

- (Experimental): Introduce the `library_parameter` stanza. It allows users to
  declare a parameter when using the OxCaml compiler.
  (#11963, implements #12084, @maiste) 


- Added the ability to scroll horizontally in TUI. (#12386, @Alizter)

- Feature: Include shell command that was executed when a cram test has
  occurred in the error message (#12307, @rgrinberg)

- Add support for `%{cmt:...}` and `%{cmti:...}` variables to reference
  compiled annotation files (.cmt and .cmti) containing typed abstract syntax
  trees with location and type information. (#12634, grants #12633, @Alizter)

- Add `$ dune describe tests` to describe the tests in the workspace
  (@Gromototo, #12545, fixes #12030)

- Allow `dune runtest` to properly run while a watch mode server is running.
  (#12473, grants #8114, @gridbugs and @ElectreAAS)

- Use copy-on-write (COW) when copying files on filesystems that support it
  (Btrfs, ZFS, XFS, etc), under Linux. (#12074, fixes #12071, @nojb)

- Add support for Tangled ATproto-based code repositories (#12197, @avsm)

- Add support for instantiating OxCaml parameterised libraries.
  (#12561, @art-w)
```

Note that the `Fixed` and `Added` headers are at level 3.

Compare with the previous, incorrect logic, produced initially in the
current release branch:
https://github.com/shonfeder/dune/blob/c5af78ff322225de3e839982600a1180caa951bf/CHANGES.md?plain=1

This incorrect formatting caused the documented release process to
produce release with no changelog (e.g.,
https://github.com/ocaml/dune/releases/tag/3.21.0_alpha0)

Signed-off-by: Shon Feder <[email protected]>
shonfeder added a commit to shonfeder/opam-repository that referenced this pull request Dec 15, 2025
CHANGES:

### Fixed

- Fix `include_subdirs qualified` incorrectly picking the furthest module
  instead of the closest when resolving module name ambiguities. (ocaml/dune#12587,
  @ElectreAAS and @Alizter)

- Fix: include the module alias in the transitive dependency closure with
  `(include_subdirs qualified)`. (ocaml/dune#12299, @anmonteiro)

- Improve error messages for invalid version formats containing non-ASCII
  characters. Previously, non-ASCII characters in version strings (e.g., `(lang
  dune è)` or `(using menhir π3.14)`) would fail with a generic "Invalid file"
  error. Now they display a clear message: "Invalid atom: contains non-ASCII
  character(s). Atoms must only contain ASCII characters." The fix is
  implemented at the lexer level, providing consistent error handling across all
  s-expression parsing. (ocaml/dune#12844, fixes ocaml/dune#12836, @benodiwal)

- Pass private modules with -H when this is available (ocaml/dune#12666, @rgrinberg)

- Allow multiple modules in `(modules_flags ...)`, in `coq.theory` (ocaml/dune#12733, @rlepigre)

- Improve error message for invalid version formats in both `(lang dune ...)` and
  `(using extension ...)` declarations. Changes "Atom of the form NNN.NNN expected"
  to "Invalid version. Version must be two numbers separated by a dot." (ocaml/dune#12833, @benodiwal)

- Fix crash when running `dune build @check` on a library with virtual modules.
  (ocaml/dune#12644, fixes ocaml/dune#12636, @Alizter)

- Provide a more informative error message when `(pkg enabled)` is put in
  `dune-project` instead of `dune-workspace`. (ocaml/dune#12802, fixes ocaml/dune#12801,
  @benodiwal)

- Improve error message when invalid version strings are used in `dune-project`
  files. Non-ASCII characters and malformed versions now show a helpful hint
  with an example of the correct format. (ocaml/dune#12794, fixes ocaml/dune#12751, @benodiwal)

- Stop hiding the `root_module` from the include path (ocaml/dune#12239, @rgrinberg)

- Allow `$ dune init` to work on absolute paths (ocaml/dune#12601, fixes ocaml/dune#7806,
  @rgrinberg)

- `(include_subdirs qualified)`: Add missing alias dependency to module group.
  (ocaml/dune#12530, @anmonteiro)

- Add Melange compilation to the `@all` alias in libraries (ocaml/dune#12628,
  @anmonteiro)

- Fix greedy version location in lang declarations. Previously, error locations for
  invalid lang versions would span multiple bytes for multi-byte UTF-8 characters,
  causing carets to appear misaligned and seemingly include the closing
  parenthesis. Now, error locations for ASCII strings show the full length (e.g.,
  "Ali" shows `^^^`), while non-ASCII strings show only the first byte (e.g., "è"
  shows `^`) to avoid multi-byte character display issues. (ocaml/dune#12869, fixes ocaml/dune#12806,
  @benodiwal)

- melange support: don't emit empty JavaScript modules for generated module
  aliases. (ocaml/dune#12464, @anmonteiro)

### Added

- (Experimental): Introduce the `library_parameter` stanza. It allows users to
  declare a parameter when using the OxCaml compiler.
  (ocaml/dune#11963, implements ocaml/dune#12084, @maiste)

- Added the ability to scroll horizontally in TUI. (ocaml/dune#12386, @Alizter)

- Feature: Include shell command that was executed when a cram test has
  occurred in the error message (ocaml/dune#12307, @rgrinberg)

-  support expanding variables in `(promote (into ..))` (ocaml/dune#12832, fixes ocaml/dune#12742,
   @anmonteiro)

- Add support for `%{cmt:...}` and `%{cmti:...}` variables to reference
  compiled annotation files (.cmt and .cmti) containing typed abstract syntax
  trees with location and type information. (ocaml/dune#12634, grants ocaml/dune#12633, @Alizter)

- Add `$ dune describe tests` to describe the tests in the workspace
  (@Gromototo, ocaml/dune#12545, fixes ocaml/dune#12030)

- Add `argv`, the process environment, and the dune version to the config event
  in the trace (ocaml/dune#12909, @rgrinberg)

- Allow `dune runtest` to properly run while a watch mode server is running.
  (ocaml/dune#12473, grants ocaml/dune#8114, @gridbugs and @ElectreAAS)

- Use copy-on-write (COW) when copying files on filesystems that support it
  (Btrfs, ZFS, XFS, etc), under Linux. (ocaml/dune#12074, fixes ocaml/dune#12071, @nojb)

- Add support for Tangled ATproto-based code repositories (ocaml/dune#12197, @avsm)

- Add support for instantiating OxCaml parameterised libraries.
  (ocaml/dune#12561, @art-w)

- Add a `(conflict_markers error|ignore)` option to the cram stanza. When
  `(conflict_markers error)` is set, the cram test will fail in the presence of
  conflict markers. Git, diff3 and jujutsu conflict markers are detected.
  (ocaml/dune#12538, ocaml/dune#12617, ocaml/dune#12655, fixes ocaml/dune#12512, @rgrinberg, @Alizter)

- Introduce a `%{ppx:lib1+..+libn}` stanza to make it possible to refer to ppx
  executables built by dune. This is useful for writing tests (ocaml/dune#12711,
  @rgrinberg)

- Introduce a `(dir ..)` field on packages defined in the `dune-project`. This
  field allows to associate a directory with a particular package. This makes
  dune automatically filter out all stanzas in this directory and its
  descendants with `--only-packages`. All users are recommended to switch to
  using this field. (ocaml/dune#12614, fixes ocaml/dune#3255, @rgrinberg)

- Add support for `DUNE_ROOT` environment variable, similar to the existing
  `--root` CLI parameter. (fixes ocaml/dune#12399 @sir4ur0n)

- Introduce an `unused-libs` alias to detect unused libraries.
  (ocaml/dune#12623, fixes ocaml/dune#650, @rgrinberg)

- Add `--files` flag to `dune describe opam-files` to print only the names of
  the opam files line by line. (ocaml/dune#9793, @reynir and @Alizter)

- `dune exec` now accepts absolute paths inside the workspace.
  (ocaml/dune#12094, @Alizter)

- Add `coqdoc_header` and `coqdoc_footer` fields to the `coq` field of the
  `env` stanza, and to the `coq.theory` stanza, allowing to configure a
  custom header or footer respectively in the HTML output of `coqdoc`.
  (ocaml/dune#11131, @rlepigre)

- Allow `dune fmt` to properly run while a watch mode server is running.
  Note that the `--preview` flag is not supported in this mode.
  (ocaml/dune#12064, @ElectreAAS)

- Support for generating `_CoqProject` files for `coq.theory` stanzas.
  (ocaml/dune#11752, @rlepigre)

- Added `(files)` stanza, similar to `(dirs)` to control which files are visible
  to Dune on a per-directory basis. (ocaml/dune#12879, @nojb)
- Add support for %{ocaml-config:ox} (ocaml/dune#12236, @jonludlam)

- Introduce `dune promotion show` command to display the contents of corrected
  files that are ready for promotion. This allows users to preview changes
  before running `dune promote`. The command accepts file arguments to show
  specific files, or displays all promotable files when called without
  arguments. (ocaml/dune#12669, fixes ocaml/dune#3883, @MixiMaxiMouse)
- New `(lang rocq)` build mode for Rocq 9.0 and later. This new mode
  is very similar to the existing `(lang coq)`, except that it doesn't
  need the `coq*` compatibility wrappers. As of today `(lang rocq)`
  doesn't support yet composed builds with Rocq itself, this will be
  added later.  `(lang coq)` is deprecated, development is frozen, and
  will be removed at some point in the future. (ocaml/dune#12035, @ejgallego,
  @Lysxia, fixes ocaml/dune#11572)

### Changed

- Don't run `ocamldep` to compute false dependencies on the `root_module`
  (ocaml/dune#12227, @rgrinberg)

- `dune format-dune-file` now uses the syntax version of the Dune project that
  contains the file being formatted (if any) instead of using the latest version
  available, which remains the default if there is no Dune project in scope.
  (ocaml/dune#11865, @nojb)

- Persistent DB and process events have been slightly modified. Persistent
  DB events have more concise names and job events always include full
  information. (ocaml/dune#12867, @rgrinberg)

- Removed the `--trace-extended` flag. Its functionality is always enabled when
  tracing is active (ocaml/dune#12908, @rgrinberg)

- The `test/dune` file generated by `dune init proj` now depends on the project library. (ocaml/dune#12791, @shonfeder)

- Starting with version 3.21 of the Dune language, Dune no longer changes the
  default set of compiler warnings. For users that would like to keep the old
  behaviour, the variable `%{dune-warnings}` can be used in an `(env)` stanza in
  a top-level Dune file: `(env (dev (flags :standard %{dune-warnings})))`.
  (ocaml/dune#12766, @nojb)
- Fix: stop generating `cmt` files for cinaps binaries (ocaml/dune#12530, @rgrinberg)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cache: Support Copy on Write (cp --reflink)

4 participants