Skip to content

Conversation

@danielrainer
Copy link

@danielrainer danielrainer commented Dec 1, 2025

Multiple gettext-extraction proc macro instances can run at the same
time due to Rust's compilation model. In the previous implementation,
where every instance appended to the same file, this has resulted in
corruption of the file. This was reported and discussed in
#11928 (comment)
for the equivalent macro for Fluent message ID extraction. The
underlying problem is the same.

The best way we have found to avoid such race condition is to write each
entry to a new file, and concatenate them together before using them.
It's not a beautiful approach, but it should be fairly robust and
portable.

@danielrainer danielrainer force-pushed the gettext_extract_file_per_message branch 4 times, most recently from 1df6406 to 0cb59f2 Compare December 8, 2025 16:39
Multiple gettext-extraction proc macro instances can run at the same
time due to Rust's compilation model. In the previous implementation,
where every instance appended to the same file, this has resulted in
corruption of the file. This was reported and discussed in
fish-shell#11928 (comment)
for the equivalent macro for Fluent message ID extraction. The
underlying problem is the same.

The best way we have found to avoid such race condition is to write each
entry to a new file, and concatenate them together before using them.
It's not a beautiful approach, but it should be fairly robust and
portable.
@danielrainer danielrainer force-pushed the gettext_extract_file_per_message branch from 0cb59f2 to e592a99 Compare December 8, 2025 16:53
set rust_extraction_dir (mktemp -d)
# We need to build to ensure that the proc macro for extracting strings runs.
FISH_GETTEXT_EXTRACTION_FILE=$rust_extraction_file cargo check --features=gettext-extract
FISH_GETTEXT_EXTRACTION_DIR=$rust_extraction_dir cargo check --features=gettext-extract
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.
I have posted to https://users.rust-lang.org/t/generating-custom-output-files-with-cargo/136828
in case anyone has a different solution, but we can go with this for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we could the po files as build inputs as suggested in the fluent-rs issue,
but if that's really a good idea, it can be done later.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we could the po files as build inputs

While your suggestion would work for Fluent (except for losing the ability to check for stale messages and duplicate IDs), with gettext we use the extraction to update PO files, so if we stop extracting the msgids we would have to manually keep the PO files in sync with the Rust code.

Generally, I think keeping the extraction for both and being able to deal with stale and duplicated messages is worth the somewhat inelegant extraction process.

Anyway, thanks for opening the Rust forum discussion. That should be a good way of figuring out whether we can improve on our extraction approach, and if so, how.


# Get rid of duplicates and sort.
msguniq --no-wrap --sort-output $rust_extraction_file
cat $rust_extraction_dir/* | msguniq --no-wrap --sort-output
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe better to do find $rust_extraction_dir | xargs cat,
in case we ever hit ARG_MAX.
It's 130kB here, so we'd probably hit it if $rust_extraction_dir is >130 char long and we have 1000 messages.

Not sure if we'd notice the failure straight away.
Unlike fish, POSIX shells don't necessarily sort glob results AFAIK.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I see that you've already taken care of it when merging.

@krobelus krobelus added this to the fish 4.3 milestone Dec 9, 2025
krobelus pushed a commit to krobelus/fish-shell that referenced this pull request Dec 10, 2025
Multiple gettext-extraction proc macro instances can run at the same
time due to Rust's compilation model. In the previous implementation,
where every instance appended to the same file, this has resulted in
corruption of the file. This was reported and discussed in
fish-shell#11928 (comment)
for the equivalent macro for Fluent message ID extraction. The
underlying problem is the same.

The best way we have found to avoid such race condition is to write each
entry to a new file, and concatenate them together before using them.
It's not a beautiful approach, but it should be fairly robust and
portable.

Closes fish-shell#12125
krobelus pushed a commit to krobelus/fish-shell that referenced this pull request Dec 10, 2025
Multiple gettext-extraction proc macro instances can run at the same
time due to Rust's compilation model. In the previous implementation,
where every instance appended to the same file, this has resulted in
corruption of the file. This was reported and discussed in
fish-shell#11928 (comment)
for the equivalent macro for Fluent message ID extraction. The
underlying problem is the same.

The best way we have found to avoid such race condition is to write each
entry to a new file, and concatenate them together before using them.
It's not a beautiful approach, but it should be fairly robust and
portable.

Closes fish-shell#12125
@krobelus krobelus closed this in 3e23360 Dec 10, 2025
@danielrainer danielrainer deleted the gettext_extract_file_per_message branch December 17, 2025 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants