-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
fluent: add initial implementation #11928
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
f8ca253 to
fb6d7ad
Compare
|
The new commits run |
|
The new commits run `cargo run --package fish-fluent-check` in
Haven't looked into the code but any reason why we can't use "cargo test" ?
(I guess we'd need to declare the *.ftl files as inputs but we already do something similar for po/ in fish-gettext-maps)
I wonder if it's realistic to (long-term) move away from test_driver.py completely and do everything with `cargo test` (not sure if that's idiomatic.. handling subprocesses without a shell can be tricky).
I suppose we could use test_driver easier if we separated the "build" from the "run" step.
BTW we could also write an xtask as a common interface for running tests
cargo test-fish tests/checks/abbr.fish
cargo test-fish tests/pexpects/bind.py
cargo test-fluent
Just an idea, for discoverability.
`check.sh` and CI.
It probably wouldn't hurt to start sharing logic between the two. But of course that's unrelated.
|
fb6d7ad to
4346737
Compare
That's a good idea. I added a test which just runs
I think replacing
I'm generally in favor of replacing (or at least wrapping) the various shell scripts we have with xtasks, to have a unified interface for running everything. Regarding the interface, I'm not sure if it's better to have several different cargo aliases (e.g. |
11d1099 to
2e9b78a
Compare
|
I changed |
|
I changed `unic-langid` to `0.9.5`, since `0.9.6` requires Rust
1.82+.
yeah let's update to Debian Stable's 1.85 (and see if someone complains).
AFAIK this means we still support "macOS 10.12 Sierra (First released 2016)".
|
Sounds good. That version also allows us to migrate to the 2024 Rust edition if we want that. I just checked some of the other distros, and it seems that Fedora keeps up with stable on all releases and Ubuntu has Rust 1.85 as the default in 25.10. Should I open a PR? |
|
> yeah let's update to Debian Stable's 1.85 (and see if someone complains).
Sounds good. That version also allows us to migrate to the 2024 Rust
edition if we want that. I just checked some of the other distros,
and it seems that Fedora keeps up with stable on all releases and
Ubuntu has Rust 1.85 as the default in 25.10. Should I open a PR?
sure, both upgrading to 1.85 and 2024 edition sounds good
This means we can get rid of a bunch of
lints with the `// for old clippy` comment
and things like `// TODO: if-let-chains`.
Here's our poor man's renovatebot: #11960
|
Changing the edition is not that straightforward and might require a fairly large commit, including manual updates, so for now I'll just address the things which have become available in Rust 2021 with the MSRV update. |
2e9b78a to
1f55024
Compare
|
Parts of fluent are licensed apache2-only, which is incompatible with fish's GPLv2, so this is, as far as I can tell, legally unmergeable as-is.
|
|
The code we use is from https://github.com/projectfluent/fluent-rs, which includes both an apache2 and a MIT license file. I'm no expert on the legal situation here, but it seems to me that using the software under the terms of the MIT License is allowed and that this license does not require adding copyright information to our binaries. AFAICT, we don't include copyright info for any of our dependencies, only for software where the fish repo itself contains code derived from that software. |
|
Relevant issue: projectfluent/fluent-rs#31 |
|
It's not about including copyright information, it's that some of the dependencies for fluent-rs are still apache2-only. Apache2 is incompatible with GPLv2 because IIRC it includes a patent grant and the GPLv2 has a "no further restrictions" clause. So the combined product has a license that can't be followed. |
|
If that's the case, why can fluent-rs be MIT-licensed, but we can't use it under that license? |
|
Because the MIT license and the Apache license don't conflict (neither of them is "viral" the way the GPL is). The GPLv2 and the Apache license do, though. We can be using the fluent-rs crate under the MIT, but we can't be using some of its dependencies. Edit: The offending dependencies are: |
|
So MIT-licensed projects can use fluent-rs and its dependencies, including the apache-licensed ones, but we can't because fish is GPLv2 licensed? Should we ask the two apache-licensed projects about making their projects available under a license that allows us to use their software in fish? |
1f55024 to
00de838
Compare
Most of fluent-rs is already dual-licensed. This crate is not, which can make it harder for GPL-2-only projects to use fluent-rs. Fix that by allowing use under MIT lincense. All (or almost all?) nontrivial contributions seem to be from the same author, so this should be easy? Ref: projectfluent/fluent-rs#34 Ref: fish-shell/fish-shell#11928
Most of fluent-rs is already dual-licensed. This crate is not, which can make it harder for GPL-2-only projects to use fluent-rs. Fix that by allowing use under MIT lincense. All (or almost all?) nontrivial contributions seem to be from the same author, so this should be easy? Ref: projectfluent/fluent-rs#34 Ref: fish-shell/fish-shell#11928 Closes projectfluent#30
Most of fluent-rs is already dual-licensed. This crate is not, which can make it harder for GPL-2-only projects to use fluent-rs. Fix that by allowing use under MIT lincense. All (or almost all?) nontrivial contributions seem to be from the same author, so this should be easy? Ref: projectfluent/fluent-rs#34 Ref: fish-shell/fish-shell#11928 Closes projectfluent#30
|
Isn't self_cell Apache OR GPLv2? If so wouldn't it be compatible? |
|
Rebased on latest master and updated to remove the dependency on #12008. I haven't gotten around to addressing all the new review comments yet, so there aren't many interesting updates from the last push. |
ea30f95 to
face3a9
Compare
For paths embedded via `rust-embed`, we only need to rebuild on path changes if the files are actually embedded. For debug builds, data is read from the file system instead of being embedded, so we don't need to rebuild then. This is apparently broken on Cygwin, so we always rebuild there. See 012b507 (Workaround for embed-data debug builds on Cygwin, 2025-09-16) for details. To avoid having to remember and duplicate this logic for all embedded paths, extract it into the build helper. fish-shell#11928 (comment)
For paths embedded via `rust-embed`, we only need to rebuild on path changes if the files are actually embedded. For debug builds, data is read from the file system instead of being embedded, so we don't need to rebuild then. This is apparently broken on Cygwin, so we always rebuild there. See 012b507 (Workaround for embed-data debug builds on Cygwin, 2025-09-16) for details. To avoid having to remember and duplicate this logic for all embedded paths, extract it into the build helper. fish-shell#11928 (comment)
face3a9 to
cd43fd7
Compare
Multiple gettext-extraction proc macro instances can run at the same time due to Rust's compilation model. In the previous implementation, where every instance appended to the same file, this has resulted in corruption of the file. This was reported and discussed in fish-shell#11928 (comment) for the equivalent macro for Fluent message ID extraction. The underlying problem is the same. The best way we have found to avoid such race condition is to write each entry to a new file, and concatenate them together before using them. It's not a beautiful approach, but it should be fairly robust and portable.
Multiple gettext-extraction proc macro instances can run at the same time due to Rust's compilation model. In the previous implementation, where every instance appended to the same file, this has resulted in corruption of the file. This was reported and discussed in fish-shell#11928 (comment) for the equivalent macro for Fluent message ID extraction. The underlying problem is the same. The best way we have found to avoid such race condition is to write each entry to a new file, and concatenate them together before using them. It's not a beautiful approach, but it should be fairly robust and portable.
Multiple gettext-extraction proc macro instances can run at the same time due to Rust's compilation model. In the previous implementation, where every instance appended to the same file, this has resulted in corruption of the file. This was reported and discussed in fish-shell#11928 (comment) for the equivalent macro for Fluent message ID extraction. The underlying problem is the same. The best way we have found to avoid such race condition is to write each entry to a new file, and concatenate them together before using them. It's not a beautiful approach, but it should be fairly robust and portable.
Multiple gettext-extraction proc macro instances can run at the same time due to Rust's compilation model. In the previous implementation, where every instance appended to the same file, this has resulted in corruption of the file. This was reported and discussed in fish-shell#11928 (comment) for the equivalent macro for Fluent message ID extraction. The underlying problem is the same. The best way we have found to avoid such race condition is to write each entry to a new file, and concatenate them together before using them. It's not a beautiful approach, but it should be fairly robust and portable.
Multiple gettext-extraction proc macro instances can run at the same time due to Rust's compilation model. In the previous implementation, where every instance appended to the same file, this has resulted in corruption of the file. This was reported and discussed in fish-shell#11928 (comment) for the equivalent macro for Fluent message ID extraction. The underlying problem is the same. The best way we have found to avoid such race condition is to write each entry to a new file, and concatenate them together before using them. It's not a beautiful approach, but it should be fairly robust and portable.
Multiple gettext-extraction proc macro instances can run at the same time due to Rust's compilation model. In the previous implementation, where every instance appended to the same file, this has resulted in corruption of the file. This was reported and discussed in fish-shell#11928 (comment) for the equivalent macro for Fluent message ID extraction. The underlying problem is the same. The best way we have found to avoid such race condition is to write each entry to a new file, and concatenate them together before using them. It's not a beautiful approach, but it should be fairly robust and portable.
Multiple gettext-extraction proc macro instances can run at the same time due to Rust's compilation model. In the previous implementation, where every instance appended to the same file, this has resulted in corruption of the file. This was reported and discussed in fish-shell#11928 (comment) for the equivalent macro for Fluent message ID extraction. The underlying problem is the same. The best way we have found to avoid such race condition is to write each entry to a new file, and concatenate them together before using them. It's not a beautiful approach, but it should be fairly robust and portable. Closes fish-shell#12125
Multiple gettext-extraction proc macro instances can run at the same time due to Rust's compilation model. In the previous implementation, where every instance appended to the same file, this has resulted in corruption of the file. This was reported and discussed in fish-shell#11928 (comment) for the equivalent macro for Fluent message ID extraction. The underlying problem is the same. The best way we have found to avoid such race condition is to write each entry to a new file, and concatenate them together before using them. It's not a beautiful approach, but it should be fairly robust and portable. Closes fish-shell#12125
Multiple gettext-extraction proc macro instances can run at the same time due to Rust's compilation model. In the previous implementation, where every instance appended to the same file, this has resulted in corruption of the file. This was reported and discussed in #11928 (comment) for the equivalent macro for Fluent message ID extraction. The underlying problem is the same. The best way we have found to avoid such race condition is to write each entry to a new file, and concatenate them together before using them. It's not a beautiful approach, but it should be fairly robust and portable. Closes #12125
cd43fd7 to
0966c7c
Compare
|
Rebased on latest master and reworked a bit. See PR description for outstanding work. Asan does not seem to like the intentional leaks for creating |
|
Maybe the first step is to resurrect |
Extract the language selection code from the gettext crate, and to a lesser extent from `src/localization/mod.rs` and put it into `src/localization/settings.rs`. No functional changes are intended. Aside from better separation of concerns, this refactoring makes it feasible to reuse the language selection logic for Fluent later on. Part of fish-shell#12190
Put the gettext-specific code into `localization/gettext`. Part of fish-shell#12190
This replaces `initialize_gettext`. It is only defined when the `localize-messages` feature is enabled, to avoid giving the impression that it does anything useful when the feature is disabled. With this change, Fluent will be initialized as well once it is added, without requiring any additional code for initialization. Closes fish-shell#12190
The extracted function takes the parts which are used by gettext-extract, as well as the upcoming fluent-extract, and puts it into its own crate. This will allow having simpler proc macros for both localization systems, since it minimizes duplicated code.
Add an implementation allowing to use Fluent for localization in Rust.
The goal of this code is to eventually replace gettext, at least for
localization in Rust, but possibly also in fish scripts.
To make use of the added code, the new `localize!` macro should be used.
It takes a `FluentID` as its first argument. This ID is
used in FTL files to identify messages. In Rust, it can be constructed
using the new `fluent_id!` macro, which takes a string literal, or by
directly passing a string literal to `localize!`.
If arguments should be passed to Fluent, this can be done via additional
arguments to `localize!`. Either, by passing `&FluentArgs` (a type from
the `fluent` crate), or by specifying the arguments as comma-separated
key-value pairs, where the key needs to be a string literal, and the
value needs to be a valid argument to `FluentArgs.set` (also from the
`fluent` crate). The following example demonstrates the syntax:
`localize!("some-id", string_arg = "a string", number = 42)`
The result will be a `String`, formatted according to the rules in the
relevant FTL file. On errors, this macro panics.
For setting the language precedence, the mechanism used for gettext is
employed. One notable difference is that, unlike with gettext, with
Fluent we don't have access to default message values, so instead we
unconditionally fall back to the `en` catalog, which must be complete.
Completeness of this catalog is checked statically using newly
introduced tests. These tests need to extract message IDs from the Rust
sources, which is done by compiling with the `fluent-extract` feature
enabled. To avoid recompilation, `build_tools/check.sh` caches the
extracted IDs. In CI, no caching mechanism exists, so there the test
will invoke cargo.
Translations are specified in the `localization/fluent` directory. For
each language, there is a file in Fluent's FTL format. The filename
indicates the language, and has the suffix `.ftl`, e.g. `en.ftl`.
At the moment, we only have the `en.ftl`, since no translations have
been added yet.
`rust-embed` is used to make these files available to the binary at
runtime. Files will only be parsed if they are specified in the
precedence list. (`en.ftl` is always parsed.)
The `fish-fluent-check` crate contains a binary which can be used to
check our FTL files. It uses the `fish-fluent-extract` crate, to extract
Fluent IDs from the Rust sources.
It panics if any of these properties are violated:
- `en.ftl` does not contain every ID contained in the sources.
- Any FTL file contains an ID not contained in the sources.
- Any FTL file is not sorted by ID.
These checks still need to be integrated into `check.sh` and CI.
These macros first localize a string, and then output it in different ways depending on the macro. Each macro comes with a version which adds a newline at the end.
It is added separately using a message which also exists in a regular test. The reason is that our message ID extraction does not work for doc tests, so our tooling would complain about an unused ID in the FTL file if we added it just for the doc test. By introducing it for a regular test and reusing the ID in the doc test, we avoid this problem.
This migrates the fish version info message from gettext to Fluent. It can be used to see Fluent-based localization in action. Because this commit adds new FTL files, these languages show up in the Fluent language precedence, requiring an update to the corresponding tests.
Reword zh_CN as suggested in fish-shell#11833 (comment) fish -c 'for LC_MESSAGES in fr_FR zh_CN zh_TW argparse h- end'
0966c7c to
398a9c6
Compare
First part of introducing fluent localization. Refer to the commit messages for details.
Stacked on #12190, #12208
TODOs:
ToFluentArgtrait to reduce the need for type conversion when using thelocalize!macro.fluent_syntax::parser::core::Parserpub projectfluent/fluent-rs#394. See Fluent FTL tools #12123localize!(e.g. for printing to stdout/stderr) actually useful?