Regroup filetypes by letter #3977

Merged
merged 2 commits into from
Nov 20, 2024

Conversation

techee
Member

@techee techee commented Oct 6, 2024

This patch converts the currently used groups like "Programming languages", "Scripting languages", etc. to groups based on the starting letter of the language only. There are two main reasons for this change:

  1. Some languages are hard to categorize under a semantic group name, and the group names do not really fit. In addition, the currently used group name "Programming languages" isn't very good, as "Scripting languages" are also a subset of programming languages. On the other hand, it's hard to find a good substitute for "Programming languages": mostly these are "Compiled languages", but not always, and some languages can be both interpreted and compiled, which complicates the situation.
  2. The "Programming languages" group is too big, and the menu is so long that it doesn't fit the display on smaller screens, so one has to scroll the menu to get to the right item, which isn't user friendly. Things will only get worse, as there are still many "Programming languages" that Geany does not support yet and that might be added to the editor in the future.

The newly introduced alphabetic groups are:

A-B
C
D-E-F
G-H-I
J-K-L
M-N-O
P-Q
R-S
T-U-V-W
X-Y-Z

These allow a roughly even distribution of existing languages into smaller groups, with enough space for possible future language additions. While it would be possible to make the group names more symmetrical, e.g. by having "R-S-T", "U-V-W", I found that the asymmetry helps quicker navigation, as one remembers that the group with one's favorite language is e.g. "the one before the long group" without thinking about where exactly in the alphabet the letter is.
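As a rough illustration of the grouping described above, here is a minimal sketch that maps a filetype name to one of the ten group labels by its first letter. The labels come from this description; the table `group_labels` and the function `group_for_name` are hypothetical names used only for illustration and are not taken from the patch. Names that do not start with an ASCII letter get `NULL` here, leaving it to the caller to decide where they go.

```c
#include <ctype.h>
#include <stddef.h>

/* Group labels as listed in the PR description. The table and the
 * lookup below are an illustrative sketch, not Geany's actual code. */
static const char *const group_labels[] = {
    "A-B", "C", "D-E-F", "G-H-I", "J-K-L",
    "M-N-O", "P-Q", "R-S", "T-U-V-W", "X-Y-Z"
};

/* Returns the label of the group containing the first letter of `name`,
 * or NULL when the name is empty or does not start with an ASCII letter. */
static const char *group_for_name(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return NULL;

    unsigned char first = (unsigned char) name[0];
    if (first >= 0x80 || !isalpha(first))
        return NULL;

    char upper = (char) toupper(first);
    for (size_t i = 0; i < sizeof group_labels / sizeof group_labels[0]; i++)
    {
        /* A label such as "D-E-F" is scanned character by character;
         * the '-' separators can never match an uppercase letter. */
        for (const char *p = group_labels[i]; *p != '\0'; p++)
        {
            if (*p == upper)
                return group_labels[i];
        }
    }
    return NULL;
}
```

Scanning the label text keeps the table as the single place where the group boundaries are defined, so changing "R-S" to "R-S-T" would need no other code change in this sketch.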

Some notes to the implementation:

  1. It mostly follows the existing implementation trying to do minimal changes and doing things in a "dumb and straightforward way". This means that group names are hard coded (they could also be autogenerated, possibly auto-attempting to distribute languages into evenly sized groups).
  2. Technically this change breaks API as it modifies GeanyFiletypeGroupID which is used for the group member of GeanyFiletype which is accessible to plugins. However, this member isn't documented to plugins and no existing plugin from geany-plugins uses it so probably not a big problem.
  3. Because grouping happens automatically now, the [Groups] section from filetype_extensions.conf can be removed and is not read any more.
  4. Because grouping happens automatically now, the [5] argument from FT_INIT() can be removed.
  5. In addition, this patch also removes the [4] argument from FT_INIT(), which determined the suffix in the filetype menu like "C++ source file" - IMO the "source file", "file", etc. suffix for all the languages in the menu just introduced visual clutter and made legibility worse. In addition, with the removal of [Groups] from filetype_extensions.conf in (3), it would not be possible to determine the right suffix for custom file types.
  6. The newly introduced groups are untranslatable strings - there should be no need to translate those.

For some more context, see #3938 (comment) and below.

A few screenshots with the new grouping:

[Three screenshots of the filetype menu with the new grouping, taken 2024-10-06]

@techee techee mentioned this pull request Oct 6, 2024
9 tasks
@elextr
Member

elextr commented Oct 7, 2024

Will hopefully inspect and maybe even try "soon" but am re-building my development machine after an SSD death, and then will have to catch up the lost time first, but a couple of comments.

As said elsewhere (many times over the years), these days the distinctions of programming vs script vs AOT vs JIT vs interpreter are meaningless for the language; they apply to the implementation, and in some cases there are multiple implementations, so those categories in the menu are meaningless for the filetype.

  1. agree the menu should be hard coded. Some automajik method that totally reorganised the menu just because a user tried a custom filetype would be very annoying.
  2. As the structure member is not individually documented it's not in the API, so any plugin that uses it has only itself to blame, and so long as the new member is the same size it won't change the ABI either, so no problems.
  3. agree
  4. agree
  5. personally I don't care about "source file" etc. I only care about the filetype name. Geany does not edit anything but (UTF-8) text and it won't ever edit anything but a "file", so never e.g. "C++ object file", so why say "C++ source file"? Let's see if anyone else who cares can make a cogent argument for keeping the extra text.
  6. agree

Speculatively, perhaps there should be an "Other" for non-ASCII names, maybe the new language is named "Åland" for example.

@techee
Member Author

techee commented Oct 11, 2024

Speculatively, perhaps there should be an "Other" for non-ASCII names, maybe the new language is named "Åland" for example.

What happens right now is that languages starting with a non-ASCII letter are placed in the top-level menu and not within the A-Z submenus. But, yes, some "Other" would probably be better.

@techee
Member Author

techee commented Oct 16, 2024

There's also some more discussion in #2087 which I just found. I still think alphabetical sorting is the easiest to understand rather than some "Pascal-like" or "Python-like" groups.

@elextr
Member

elextr commented Oct 16, 2024

Agree that alphabetic is the only sensible default menu division. To illustrate categories, look at the Wikipedia categorical section, and look at how many categories each language appears in. It is meaningless. The Geany team should be humble enough to not be programming language wizards and enforce some categorisation, just use alphabetic.

If somebody is soooo convinced that they need non-alphabetical they can make a separate PR built on this that reads a conf file to replace alphabetic and they can do whatever they want.

@techee
Member Author

techee commented Nov 2, 2024

Speculatively, perhaps there should be an "Other" for non-ASCII names, maybe the new language is named "Åland" for example.

Done in the latest commit.

@elextr
Member

elextr commented Nov 2, 2024

LGBI, will try to make time to test in a few days, but if others test it don't wait for me.

@cousteaulecommandant
Contributor

Looks good. I kinda liked the idea of separating languages by categories but honestly that simply wasn't feasible.

Question. How is this handled in different locales? Will the types be in the English placement, or in the localized one?

By removing the "source file" suffix you avoided most of the trouble this could cause (e.g. in Spanish nearly all languages are called "Archivo de fuente <type>" or "Archivo <type>", and they'd all be filed under A), but there are still a few localized names (e.g. "Cascading Stylesheet" = "Hoja de estilo en cascada"). Probably the most reasonable approach is to not translate the names, and to use "short names" (e.g. "CSS" rather than "Cascading Stylesheet") in all locales.

@techee
Member Author

techee commented Nov 17, 2024

Question. How is this handled in different locales? Will the types be in the English placement, or in the localized one?

It will be the localized one.

By removing the "source file" suffix you avoided most of the trouble this could cause (e.g. in Spanish nearly all languages are called "Archivo de fuente <type>" or "Archivo <type>", and they'd all be filed under A)

Even if the prefix/suffix stayed there, it would be grouped by <type>.

but there are still a few localized names (e.g. "Cascading Stylesheet" = "Hoja de estilo en cascada"). Probably the most reasonable approach is to not translate the names, and to use "short names" (e.g. "CSS" rather than "Cascading Stylesheet") in all locales.

Yes, I was thinking about this too. The following language names are translatable now:

  • Shell (possibly keep untranslated)
  • Makefile (possibly keep untranslated)
  • Cascading Stylesheet (could become CSS)
  • Config (could become Ini/conf)
  • Gettext translation (???)

So it means these could appear under different letters depending on the used locale. And as you said, at least for CSS, it would be better to use "CSS" (not sure how common it's to translate Shell or Makefile).

@cousteaulecommandant
Contributor

I would've thought "Assembly" would've been translatable too ("Ensamblador" in Spanish), but at least in Geany it's not translated.
I suppose "Gettext translation" is translatable because of the "translation" part in the name. No idea about "Makefile". (And I don't think "shell" is meant to be translated as "cáscara" in Spanish) :)

Cascading Stylesheet should probably be left as "CSS", and Gettext translation as "Gettext". Just the language. (About "config"… yeah that's hard since I'm not even sure it has a standard name, and there are billions of slightly different implementations.)

Even if the prefix/suffix stayed there, it would be grouped by <type>.

OK, I probably confused "internal name" and "name displayed on the menu".
(Or they're grouped by the untranslated name, even if they're displayed by the translated one, which is what I meant with my first question.)

@techee
Member Author

techee commented Nov 17, 2024

(Or they're grouped by the untranslated name, even if they're displayed by the translated one, which is what I meant with my first question.)

There are 2 types of translatable strings. One denoting the kind of file

  • _("%s source file")
  • _("%s file")
  • _("%s script")
  • _("%s document")

and then the actual translation of the language name, which is inserted into one of the above 4 strings (this applies to the 5 languages I mentioned above).

The placement into the alphabetic groups is based on the translation of the language name, not the translation of the kind of file (which is gone with this patch anyway).
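To make the distinction concrete, here is a small sketch of how a menu label could be composed from a "kind of file" pattern and a language name, while the grouping key is taken from the language name alone. The helpers `make_label` and `grouping_key` are hypothetical and not part of Geany; the gettext `_()` wrapper is left out so the snippet stands alone, and the Spanish pattern is the one quoted earlier in this thread.

```c
#include <stdio.h>

/* Hypothetical helper: builds a label by inserting the language name
 * into a kind-of-file pattern such as "%s source file". The pattern must
 * contain exactly one %s. Returns what snprintf returns. */
static int make_label(char *buf, size_t size,
                      const char *kind_pattern, const char *lang_name)
{
    return snprintf(buf, size, kind_pattern, lang_name);
}

/* Hypothetical helper: the grouping key is the first character of the
 * language name, regardless of how the surrounding label is worded. */
static char grouping_key(const char *lang_name)
{
    return lang_name[0];
}
```

With the pattern "Archivo de fuente %s" and the language name "C", the label starts with "A" but the key is still 'C', which is why a translated prefix or suffix would not have moved every entry into the same group.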

@eht16
Member

eht16 commented Nov 18, 2024

Looks good to me, tested with English and German locale (for German, "Config" is actually translated to "Konfigurationsdatei").

I noticed one issue in the Tools->Configuration Files->Filetype Configuration menu: filetypes.conf is listed under K, which might be unexpected.

So, I'd also vote for making those five filetype names untranslatable and renaming them according to the suggestion in #3977 (comment).

@techee
Member Author

techee commented Nov 18, 2024

I noticed one issue in the Tools->Configuration Files->Filetype Configuration menu: filetypes.conf is listed under K, which might be unexpected.

Yeah, good point, done.

Also, the human-readable name (of those several filetypes that contain it) should start with the same letter as the filetype name for the very same reason. So

  • for "Po" I used "Po (Gettext)"
  • for "Conf" I used "Conf/Ini"
  • and changed "(O)Caml" to "Caml/OCaml" (thanks to which I could remove a special hack used to handle the braces)
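The rule above can be expressed as a simple check. The following is a minimal sketch, assuming a case-insensitive comparison of the first byte is enough; the function name `title_matches_name` is hypothetical and not part of the patch.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical check: returns true when the human-readable title starts
 * with the same letter (ignoring case) as the filetype name, so both end
 * up in the same alphabetic group. */
static bool title_matches_name(const char *name, const char *title)
{
    if (name == NULL || title == NULL || name[0] == '\0' || title[0] == '\0')
        return false;

    return tolower((unsigned char) name[0]) == tolower((unsigned char) title[0]);
}
```

Under this check "Po (Gettext)", "Conf/Ini" and "Caml/OCaml" pass for "Po", "Conf" and "Caml", while the earlier "(O)Caml" or the German "Konfigurationsdatei" would not.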

@techee
Member Author

techee commented Nov 18, 2024

I've also updated the documentation (i.e. removed stuff related to filetype group configuration). I hope I haven't missed anything. There don't seem to be any screenshots showing the groups so no need to update those.

@eht16
Member

eht16 commented Nov 19, 2024

The latest changes look good and especially Conf/Ini works well.

After resolving the merge conflict, I think we are good to merge.

@techee techee force-pushed the filetype_groups branch 2 times, most recently from 7b05c95 to 7e3da8c Compare November 19, 2024 21:53
@techee
Member Author

techee commented Nov 19, 2024

After resolving the merge conflict, I think we are good to merge.

Done. For merging, I'd just squash the commits into one.

This patch converts the currently used groups like "Programming languages",
"Scripting languages", etc. to groups based on the starting letter of
the language only. There are two main reasons for this change:

1. Some languages are hard to categorize by some semantic group name and
the group names are not really fitting. In addition, the currently
used group name "Programming languages" isn't very good as "Scripting
languages" are also a subset of programming languages. On the other
hand it's hard to find a good substitute for "Programming languages" -
mostly these are "Compiled languages" but not always and some languages
can be both interpreted and compiled, which complicates the situation.
2. The "Programming languages" group is too big and the menu is so long
that it doesn't fit the display on smaller screens and one has to scroll
the menu to get to the right item, which isn't user friendly. Things will
only get worse as there are still many "Programming languages" that
Geany does not support yet and that might be added to the editor in the
future.

The newly introduced alphabetic groups are:
A-B
C
D-E-F
G-H-I
J-K-L
M-N-O
P-Q
R-S
T-U-V-W
X-Y-Z

These allow roughly even distribution of existing languages into smaller
groups with enough space for possible future language additions. While
it would be possible to make the group names more symmetrical, e.g. by
having "R-S-T", "U-V-W", I found that the asymmetry helps quicker
navigation as one remembers the group with his favorite language is
e.g. "the one before the long group" without thinking where exactly in
the alphabet the letter is.

Some notes to the implementation:
1. It mostly follows the existing implementation trying to do minimal
changes and doing things in a "dumb and straightforward way". This means
that group names are hard coded (they could also be autogenerated,
possibly auto-attempting to distribute languages into evenly sized
groups).
2. Technically this change breaks API as it modifies GeanyFiletypeGroupID
which is used for the group member of GeanyFiletype which is accessible
to plugins. However, this member isn't documented to plugins and
no existing plugin from geany-plugins uses it so probably not a big
problem.
3. Because grouping happens automatically now, the [Groups] section
from filetype_extensions.conf can be removed and is not read any more.
4. Because grouping happens automatically now, the [5] argument from
FT_INIT() can be removed.
5. In addition, this patch also removes the [4] argument from FT_INIT()
which determined the suffix in the filetype menu like
"C++ source file" - IMO the "source file", "file", etc. suffix for all
the languages in the menu just introduced visual clutter and made
legibility worse. In addition, with the removal of [Groups] from
filetype_extensions.conf in (3), it would not be possible to determine
the right suffix for custom file types.
6. The newly introduced groups are untranslatable strings - there should
be no need to translate those.
7. If users create a non-builtin filetype starting with a non-alphabetic
character, it will be placed into a group called Other. This group won't
be displayed if there is no such filetype.
8. Human-readable names of languages are converted to non-translatable
strings because otherwise their placement into groups may be confusing.
Also, these names should start with the same letter as the language
name itself; otherwise the grouping in Tools->Configuration Files->
Filetype Configuration might not correspond to the language name.
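
Point 7 amounts to showing a group only when it has at least one member. Here is a minimal sketch of that idea, assuming groups are identified by small integer indices with the last one reserved for "Other"; the constants and the function `mark_visible_groups` are hypothetical and do not reflect Geany's actual enum values or menu code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical group indices: 0..9 for the ten alphabetic groups and
 * 10 for "Other". These are not Geany's real enum values. */
enum { GROUP_OTHER = 10, GROUP_COUNT = 11 };

/* Marks every group that has at least one filetype assigned to it and
 * returns the number of groups that would be shown. */
static size_t mark_visible_groups(const int *group_of, size_t n_filetypes,
                                  bool visible[GROUP_COUNT])
{
    size_t shown = 0;

    for (size_t g = 0; g < GROUP_COUNT; g++)
        visible[g] = false;

    for (size_t i = 0; i < n_filetypes; i++)
    {
        int g = group_of[i];
        if (g < 0 || g >= GROUP_COUNT)
            continue; /* ignore out-of-range indices */
        if (!visible[g])
        {
            visible[g] = true;
            shown++;
        }
    }
    return shown;
}
```

When no filetype maps to `GROUP_OTHER`, its flag stays false and a menu builder using this result would simply skip it.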
@techee techee merged commit 4c1191a into geany:master Nov 20, 2024
7 checks passed
@b4n b4n added this to the 2.1 milestone Nov 20, 2024
techee added a commit to techee/geany that referenced this pull request Nov 24, 2024
Accidentally removed in geany#3977.