Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert all dir functions to just use libuv with windows long path support #1780

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

MasterDuke17
Copy link
Contributor

This is #1745 + #1776 to see what the combination looks like in CI.

MasterDuke17 and others added 9 commits November 26, 2023 23:47
instead of doing our own `#define`s for Windows support.

Almost all the logic is the same. However, libuv removed '.' and '..'
from readdir results, and since Rakudo (and roast) expects them, this
will require a minor change in Rakudo. The JVM also doesn't return those
for its readdir implementation, so there's already code that deals with
that in Rakudo in `#?if jmv`s, which will just need to be adjusted to
be `#?if !js`.
On Windows if you want path parts longer than MAX_PATH then you
need to use DOS paths, i.e. absolute paths prefixed with \\?\.
Additionally, using DOS paths does not work with / so those slashes
need to be normalized when prefixing paths with \\?\. This adds
the aforementioned prefix and normalization on Windows, allowing
long paths to be used on that OS.

Note this is a Proof of Concept and I haven't spent any time thinking
about the design or what I may have otherwise broke.
This is more code that was doing its own \\?\ prefixing, which
does not work as expected now that we are prefixing the paths before
calling the function. This remove the MVM_platform_unlink function
that was used on Windows, such that all platforms use the same
unlink function.
In a previous commit the windows specific code for mkdir_p was
removed such that all platforms would use the same logic. However,
the non-Windows logic had a S_ISDIR check that I temporarily
removed due to it not being available on Windows.

This adds the S_ISDIR macro on Windows, and readds the original
S_ISDIR check to our mkdir logic.
Absolute paths of any length can use the \\?\ prefix, so we
initially implemented things without checking the length. However,
the native nqp and rakudo runners currently use absolute paths that
contain `..` as a path part which is literal (i.e. not the parent
directory) when used with the \\?\ prefix. This checks the path
length to see if it is longer than MAX_PATH so we avoid the issue
with the nqp/rakudo runners in most situations while still getting
some immediate relief to other problems stemming from Windows path
length issues.
@MasterDuke17
Copy link
Contributor Author

MasterDuke17 commented Nov 27, 2023

‘Failed to remove the directory 'C:\Users\VSSADM~1\AppData\Local\Temp\rakudo-precompTXX99GWZUNWDWHT3Z9CPGG99\34310A559C756907FABDDC38457C1B3DCE353841': Failed to rmdir: directory not empty
@ugexe does this look familiar?

@MasterDuke17
Copy link
Contributor Author

Oh, and why doesn't that path start with \\?\?

@ugexe
Copy link
Contributor

ugexe commented Nov 28, 2023

The path doesn't start with \\?\ because of ee63bb3

@ugexe
Copy link
Contributor

ugexe commented Nov 28, 2023

Failed to remove the directory 'C:\Users\VSSADM~1\AppData\Local\Temp\rakudo-precompTXX99GWZUNWDWHT3Z9CPGG99\34310A559C756907FABDDC38457C1B3DCE353841': Failed to rmdir: directory not empty

Hmm I'm not sure what that error is about. My initial guess would be that a file or directory isn't getting deleted (maybe because a handle is left open?) but I'm sure you already suspected that.

@ugexe
Copy link
Contributor

ugexe commented Nov 30, 2023

It seems to be caused by https://github.com/rakudo/rakudo/blob/c3e68caa1b84c825b9ba4aff91602168ab8645be/t/02-rakudo/reproducible-builds.t#L35 (i.e. the code in the test itself, not something in rakudo). I wonder if the test would pass if it was rerun a couple of times... and if that works I wonder if we need to add another 'retry filesystem operation on windows' thing we added elsewhere in the past.

edit: hmmm, but the reloc build has the same error which suggest retrying the test probably wouldn't help...

@MasterDuke17
Copy link
Contributor Author

edit: hmmm, but the reloc build has the same error which suggest retrying the test probably wouldn't help...

Well, maybe it needs a couple retries.... I'll restart those jobs to see if another run works.

@ugexe
Copy link
Contributor

ugexe commented Dec 1, 2023

I went to try and debug this, but unfortunately my windows VMs are making this impossible. I can't build moar-blead due to it using a newish libuv. It fails with uv.lib(util.obj) : error LNK2001: unresolved external symbol _GetSystemTimePreciseAsFileTime, and I believe it is related to whatever Windows 11 ARM uses to translate (like the windows equivalent of macOS Rosetta). I haven't had much luck finding hard evidence of that though. Regardless, the newest version of MoarVM I can build on this ARM machine using Windows is 2023.04.

It has to be something related to the libuv changes in some way. I wanted to see debug output added to https://github.com/rakudo/rakudo/blob/73470580c9286b8257be9175fd8bff57f3b14bb4/src/core.c/CompUnit/PrecompilationStore/FileSystem.rakumod#L230C1-L237 so we could see what is there initially, what actually does get deleted, etc. That would help pinpoint where in MoarVM to look at harder at least...

@MasterDuke17
Copy link
Contributor Author

I wanted to see debug output added to rakudo/rakudo@7347058/src/core.c/CompUnit/PrecompilationStore/FileSystem.rakumod#L230C1-L237 so we could see what is there initially, what actually does get deleted, etc. That would help pinpoint where in MoarVM to look at harder at least...

I think it might be possible to get CI to use a different MoarVM and/or Rakudo repo. So create a Rakudo PR that adds debugging info to that function, and then also changes the Rakudo CI workflow to use my MoarVM repo on this branch (or vice versa, add a Rakudo branch that adds the debugging and then add a change to this PR to point there).

src/io/dirops.c Outdated Show resolved Hide resolved
@MasterDuke17 MasterDuke17 force-pushed the use_more_libuv_dir_functions_with_windows-long-path-support_from_ugexe branch from 3ee5dcf to ba4128d Compare December 3, 2023 22:43
src/io/dirops.c Outdated Show resolved Hide resolved
src/io/dirops.c Show resolved Hide resolved
Windows doesn't always close the filehandle right away, so if we get an
ENOTEMPTY, sleep and try again (ignoring ENOENT the second time, since
that would just mean the delete/close happened during the sleep).
@MasterDuke17 MasterDuke17 force-pushed the use_more_libuv_dir_functions_with_windows-long-path-support_from_ugexe branch from ba4128d to c75d40e Compare December 3, 2023 23:44
@ugexe
Copy link
Contributor

ugexe commented Dec 4, 2023

Hmm, it is definitely confusing that having the sleep in MoarVM seems to not fix the issue that putting the sleep in Rakudo does 🤔

@MasterDuke17
Copy link
Contributor Author

Hmm, it is definitely confusing that having the sleep in MoarVM seems to not fix the issue that putting the sleep in Rakudo does 🤔

However, the first sleep added in rakudo was before the rmdir, not after the first call to it. I'll see if that changes anything.

@ugexe
Copy link
Contributor

ugexe commented Dec 4, 2023

One hail mary I can think of would be updating the deletion code to hopefully ensure that .dir isn't still open when deleting:

    method delete-by-compiler(CompUnit::PrecompilationId:D $compiler-id) {
         my $compiler-dir := self.prefix.add($compiler-id);
         for $compiler-dir.dir.list -> $subdir {
             .unlink for $subdir.dir.list;
             $subdir.rmdir;
         }
         $compiler-dir.rmdir;
    }

which changes .dir to .dir.list

@MasterDuke17
Copy link
Contributor Author

Of course what's crazy is that rakudo/rakudo#5483 is now passing.

@MasterDuke17 MasterDuke17 force-pushed the use_more_libuv_dir_functions_with_windows-long-path-support_from_ugexe branch from d704be1 to 4ae5717 Compare December 4, 2023 22:23
@ugexe
Copy link
Contributor

ugexe commented Dec 7, 2023

fwiw I cut and paste all the changes one by one into my franken-VM on 2023.04 and the reproducible build test passes every time without the need to retry.

@MasterDuke17
Copy link
Contributor Author

MasterDuke17 commented Dec 7, 2023 via email

@ugexe
Copy link
Contributor

ugexe commented Dec 7, 2023

Im not sure. Maybe in the CI test we can cd to libuv, checkout an older version, and do nmake install to test that. Or maybe @coke can try running that test 😬

@ugexe
Copy link
Contributor

ugexe commented Dec 7, 2023

Just to be clear: my VM is running libuv v1.44.1, so just reverting v1.47.0 wouldn't put it at the same version. Looking at the changelog though I'm not sure the libuv version would be affecting this.

@lizmat
Copy link
Contributor

lizmat commented Jul 29, 2024

ping @niner

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants