Skip to content

Conversation

@danielrainer
Copy link

Refactoring to prepare for Fluent. Still needs a bit of cleanup before merging.

@krobelus krobelus added this to the fish 4.3 milestone Dec 19, 2025
type Catalog = &'static phf::Map<&'static str, &'static str>;

pub struct SetLanguageLints<'a> {
pub duplicates: Vec<&'a str>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW I think it would be fine to put various unrelated commits into a PR to avoid trivial conflicts,
if it's easy to merge leading commits.
(Like the wchar prelude the module extraction ones though it's not really a relevant problem yet).

I'll merge the first commit from this one, since it seems ready (?)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I create stacked PRs, it's usually to indicate that merging certain sets of leading commits is reasonable. It also allows merging via GitHub's CI. I usually go for parallel PRs if the changes are sufficiently unrelated, since that allows merging in any order. This approach sometimes leads to merge conflicts, but it prevents PRs from being blocked on unrelated changes, which I think is better than having long stacks of unrelated changes.

Here, merging the first commit is fine. I was considering putting the gettext stuff into a submodule, but that can still be done later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, your approach is superior in general, at least when there is peer-review

krobelus pushed a commit that referenced this pull request Dec 19, 2025
Localization deserves its own module. As a first step, this module is
created here. This will be followed up by significant refactoring in
preparation for supporting Fluent alongside gettext.

`localization/mod.rs` is used instead of `localization.rs` because it is
planned to split this module into submodules.

Part of #12190
danielrainer pushed a commit to danielrainer/fish-shell that referenced this pull request Dec 25, 2025
Extract the language selection code from the gettext crate, and to a
lesser extent from `src/localization/mod.rs` and put it into
`src/localization/settings.rs`. No functional changes are intended.

Aside from better separation of concerns, this refactoring makes it
feasible to reuse the language selection logic for Fluent later on.

Part of fish-shell#12190
danielrainer pushed a commit to danielrainer/fish-shell that referenced this pull request Dec 25, 2025
Put the gettext-specific code into `localization/gettext`.

Part of fish-shell#12190
danielrainer pushed a commit to danielrainer/fish-shell that referenced this pull request Dec 25, 2025
This replaces `initialize_gettext`. It is only defined when the
`localize-messages` feature is enabled, to avoid giving the impression
that it does anything useful when the feature is disabled.

With this change, Fluent will be initialized as well once it is added,
without requiring any additional code for initialization.

Closes fish-shell#12190
@danielrainer
Copy link
Author

The code is now in a state where it is compatible with the latest version of the remaining Fluent code (which still needs to be published).

I might try to extract more code into separate crates to allow localizing messages outside of the main crate. Maybe only for Fluent, to give some incentive to port messages when splitting up the main crate. I'll need to spend some more time with the code to decide on this. If the potential additional code movement should be kept out of master's history, it might be good to wait with merging this. Otherwise, this is ready for review.

@danielrainer danielrainer marked this pull request as ready for review December 25, 2025 02:07
danielrainer pushed a commit to danielrainer/fish-shell that referenced this pull request Dec 25, 2025
Extract the language selection code from the gettext crate, and to a
lesser extent from `src/localization/mod.rs` and put it into
`src/localization/settings.rs`. No functional changes are intended.

Aside from better separation of concerns, this refactoring makes it
feasible to reuse the language selection logic for Fluent later on.

Part of fish-shell#12190
danielrainer pushed a commit to danielrainer/fish-shell that referenced this pull request Dec 25, 2025
Put the gettext-specific code into `localization/gettext`.

Part of fish-shell#12190
danielrainer pushed a commit to danielrainer/fish-shell that referenced this pull request Dec 25, 2025
This replaces `initialize_gettext`. It is only defined when the
`localize-messages` feature is enabled, to avoid giving the impression
that it does anything useful when the feature is disabled.

With this change, Fluent will be initialized as well once it is added,
without requiring any additional code for initialization.

Closes fish-shell#12190
@danielrainer
Copy link
Author

In case you wonder about the function-local macros, these are there to minimize changes when adding Fluent code. I wanted a type-safe interface for setting the languages per localization system, so gettext and Fluent can use different sets of languages and there is no risk of trying to set a language for one system which only exists in the other (or in neither). The original plan was to define an interface which both GettextLocalizationLanguage and the upcoming FluentLocalizationLanguage implement, and then have a collection containing pairs of the getter and setter functions for each localization system. I didn't manage to come up with an implementation that the type checker agreed with, and I don't know if it's even possible, so I went with the uglier macro-based approach (or just repetition in cases where the code is simple enough that the syntactic overhead of a macro is unjustified).


pub struct SetLanguageLints<'a> {
pub duplicates: Vec<&'a str>,
pub non_existing: Vec<&'a str>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll need to spend some more time with the code to decide on
this. If the potential additional code movement should be kept
out of master's history, it might be good to wait with merging
this. Otherwise, this is ready for review.

I don't think anyone else is working on these files, so there's no rush,
and keeping noise out of history is always good. But if it helps
you, we can expedite merging this.

It's not completely trivial to review because there are
changes that are not pure code movement;
but it was actually reasonably easy with --color-moved=zebra

I'm reading that

  • the two struct SetLanguageLints definitions are consolidated,
  • InternalLocalizationState/PublicLocalizationState is removed, replaced by LOCALIZATION_STATE and LANGUAGE_PRECEDENCE
    I guess that's a removal of an abstraction to simplify fluent implementation;
    if the abstraction is not used much, that sounds reasonable.
  • AVAILABLE_LANGUAGES is extracted (used to be computed inline).
    This makes sense because we want to hide gettext-specific things behind functions.

This seems fine, especially given that there are basically no functional changes.

};

fish::localization::initialize_gettext();
#[cfg(feature = "localize-messages")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's an argument to "pushing down" any #[cfg()] directives
as far as possible (for us that means: to the calls to fluent APIs
which are conditionally available), but that might be inconvenient
and I don't super care, especially since this is a default-enabled
and purely additive feature.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The advantage of not pushing this down is that it is obvious when reading the code that nothing will happen if localization is disabled. How valuable this actually is is debatable. There might also be some slight benefits in build times.

If we make initialize_localization unconditionally available, then the same should probably be done for the other functions in src/localization/mod.rs. Right now, the settings submodule assumes that it is only used when localization is active. Changing this would require putting a lot of cfgs into settings which would make the code harder to work with. So in this case, I think leaving the cfgs where they are is the better option.

all_available_langs.insert(lang);
}
};
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and then have a collection containing pairs of the getter and
setter functions

Right. Iterating over a heterogeneous collection (of getters/setters)
would require either a Box<dyn sometrait>,
(which would be overkill here).
Sometimes we can make things homogeneous by converting them to concrete types like
Vec<&str>, but here the best solution is probably like your macro approach,
except using generic functions as macros.
Then we can invoke the generic function once for fluent and for gettext.
Unfortunately lambdas don't support impl Trait (yet?), but
passing the captured items explicitly is a small price to pay for better r-a support.

Something like this (it should work similarly for the other macros):

diff --git a/src/localization/settings.rs b/src/localization/settings.rs
index 3730772bfe..ca0274e79e 100644
--- a/src/localization/settings.rs
+++ b/src/localization/settings.rs
@@ -255,15 +255,26 @@
                 duplicates.push(lang)
             }
         }
+        pub fn fluent_get_available_languages_ids() -> impl Iterator<Item = &'static str> {
+            ["foo", "bar"].iter().copied()
+        }
         let mut all_available_langs = HashSet::new();
-        macro_rules! get_langs {
-            ($backend:tt) => {
-                for &lang in $backend::get_available_languages().keys() {
-                    all_available_langs.insert(lang);
-                }
-            };
+        fn get_langs(
+            all_available_langs: &mut HashSet<&'static str>,
+            avail: impl Iterator<Item = &'static str>,
+        ) {
+            for lang in avail {
+                all_available_langs.insert(lang);
+            }
         }
-        get_langs!(fish_gettext);
+        get_langs(
+            &mut all_available_langs,
+            fish_gettext::get_available_languages().keys().copied(),
+        );
+        get_langs(
+            &mut all_available_langs,
+            fluent_get_available_languages_ids(),
+        );
         let mut non_existing: Vec<&str> = seen.difference(&all_available_langs).copied().collect();
         non_existing.sort();

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iterating over a heterogeneous collection (of getters/setters)
would require either a Box<dyn sometrait>

I tried that, but without success.

For the macros using set_language_precedence, we would have to introduce a new LocalizationLanguage trait, which needs to be available to the gettext and Fluent crates, so it should be in a crate without any local dependencies. I went with this approach when trying to get heterogeneous collections to work, by creating a new crate. Do you prefer this compared to the current macro-based implementation?

}
get_langs!(fish_gettext);
let mut non_existing: Vec<&str> = seen.difference(&all_available_langs).copied().collect();
non_existing.sort();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first commit adds a sneaky sort() here (as demonstrated by the test changes).
I agree that we want to sort the list of available languages,
but for the ones with a user-specified order like non_existing I'd have expected we preserve order.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It simplifies the implementation a bit to use sets here, but of course that means that the original order is lost. I think that for a given input, the output should be consistent, which is why the sort is there. I don't consider it important that the order is the same as the one given by the user, but if you prefer it that way we can change it. Do you think non-existing entries should be de-duplicated in the error message then?

@krobelus krobelus modified the milestones: fish 4.3, fish-future, fish Dec 28, 2025
Daniel Rainer added 3 commits January 2, 2026 01:12
Extract the language selection code from the gettext crate, and to a
lesser extent from `src/localization/mod.rs` and put it into
`src/localization/settings.rs`. No functional changes are intended.

Aside from better separation of concerns, this refactoring makes it
feasible to reuse the language selection logic for Fluent later on.

Part of fish-shell#12190
Put the gettext-specific code into `localization/gettext`.

Part of fish-shell#12190
This replaces `initialize_gettext`. It is only defined when the
`localize-messages` feature is enabled, to avoid giving the impression
that it does anything useful when the feature is disabled.

With this change, Fluent will be initialized as well once it is added,
without requiring any additional code for initialization.

Closes fish-shell#12190
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants