Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

filetypes.verilog: add Verilog-2005 keywords #4037

Merged
merged 1 commit into from
Nov 20, 2024

Conversation

cousteaulecommandant
Copy link
Contributor

filetypes.verilog only includes ancient Verilog-1995 keywords (plus signed and unsigned for some reason), but is missing plenty of the newer Verilog-2001 and the newest Verilog-2005 keywords, some of them very common, such as generate/endgenerate, localparam, automatic...

I have added all those "new" keywords to the word3= category to distinguish them from the "classic" keywords from the previous century, although I honestly don't know what's the difference between the two categories.

I have also moved all the keywords that used to be in word3= to word=, since I didn't see any reason to keep those keywords there (they seem to be related to "variable declarations" one way or another, but then again, so are many of the keywords listed in word=). This way, word= will be for the "old" keywords, and word3= for the "new" ones that "might not work in a Verilog tool made in the previous century".

Finally, I have added $ to the list of wordchars=, because Verilog is special and considers $ to be an identifier character like _ (so e.g. $finish and gotabout$350 are valid identifiers).

@cousteaulecommandant
Copy link
Contributor Author

(Note that this PR is unrelated to SystemVerilog; all I added here is still plain old Verilog. I still plan to create a commit adding support for SystemVerilog, but that'll be on a different PR.)

@cousteaulecommandant
Copy link
Contributor Author

As a side note, I see that words2= includes a bunch system tasks and functions ($display, $finish, etc.), but the list is not complete (see chapter 17 of the Verilog standard for a complete list; there are 122 in total).

Do you think it would be a good idea to add those as well?

(Personally I've never used most of those; the only few I've ever used were properly highlighted.)

Note that these are not "keywords" per se, just system functions, but I suppose it's OK to handle them as "keywords" for the purpose of highlighting. (They're highlighted in a different color from actual keywords, which is the important thing.)

@techee techee added the filetype label Nov 9, 2024
@techee
Copy link
Member

techee commented Nov 9, 2024

@cousteaulecommandant Thanks. I think nobody from Geany developers uses Verilog so I guess we'll trust your choices :-).

I have added all those "new" keywords to the word3= category to distinguish them from the "classic" keywords from the previous century, although I honestly don't know what's the difference between the two categories.

I think it would be best to put the new keywords to the "word" list among the original keywords.

The only difference is the coloring of the various lists - see the mapping to the theme colors:

word=keyword_1
word2=keyword_2
word3=keyword_3

Finally, I have added $ to the list of wordchars=, because Verilog is special and considers $ to be an identifier character like _ (so e.g. $finish and gotabout$350 are valid identifiers).

Note that Geany currently doesn't respect the wordchars characters for everything and for some things the "old" hard-coded a-zA_Z0-9_ is still used. Should be fixed eventually.

(Note that this PR is unrelated to SystemVerilog; all I added here is still plain old Verilog. I still plan to create a commit adding support for SystemVerilog, but that'll be on a different PR.)

Just a note - the Verilog ctags parser contains also System Verilog parser. We could enable it if you add the SystemVerilog filetype.

As a side note, I see that words2= includes a bunch system tasks and functions ($display, $finish, etc.), but the list is not complete (see chapter 17 of the Verilog standard for a complete list; there are 122 in total).

Do you think it would be a good idea to add those as well?

Note that these are not "keywords" per se, just system functions, but I suppose it's OK to handle them as "keywords" for the purpose of highlighting. (They're highlighted in a different color from actual keywords, which is the important thing.)

Depends whether Verilog users would expect them to be highlighted or not. The other "keyword" lists are used this way for other languages too but I don't know what's the common practice for Verilog.

@techee
Copy link
Member

techee commented Nov 9, 2024

I have also moved all the keywords that used to be in word3= to word=, since I didn't see any reason to keep those keywords there (they seem to be related to "variable declarations" one way or another, but then again, so are many of the keywords listed in word=)

Some of them appear to be builtin types so maybe it makes sense to keep them in word3 - other languages do something like that too. Not sure about reg wire input output inout though.

@techee
Copy link
Member

techee commented Nov 9, 2024

(Note that this PR is unrelated to SystemVerilog; all I added here is still plain old Verilog. I still plan to create a commit adding support for SystemVerilog, but that'll be on a different PR.)

There's also this PR you might want to check #1831 - if it's alright, maybe that one could be merged.

@cousteaulecommandant
Copy link
Contributor Author

The only difference is the coloring of the various lists - see the mapping to the theme colors:

Honestly this is why I was confused. In the default theme, both word and word3 are bold navy blue and therefore indistinguishable - in fact I didn't know Geany treated them differently for Verilog until I saw it in filetypes.verilog - whereas word2 is dark red. It makes sense because this way navy blue = keywords but dark red = "standard functions". But it doesn't make sense that some keywords are considered one type and others are another type; I couldn't figure out which criteria was used to make keywords "word" or "word3" (the latter seem to be "stuff for net/variable declaration", but then again so are many of the keywords declared in "word").

I have also moved all the keywords that used to be in word3= to word=

Some of them appear to be builtin types so maybe it makes sense to keep them in word3 - other languages do something like that too. Not sure about reg wire input output inout though.

All of them are "keywords", according to the standard (Annex B).
I tried to figure out a pattern (are they "types"? "Stuff used to declare variables"?) and had actually started making a list of which things were "variable stuff", moving not only reg and wire but also wand, wor, tri, tri0, tri1, realtype, signed, (unsigned), etc., but eventually I realized that (1) I don't know the standard well enough to see which of these make sense to be in a separate category of keywords; (2) the whole idea of "type" is a bit fuzzy, since everything in Verilog is basically "array of bits"; the stuff in that list is more like "qualifiers" mostly; (3) I have no idea what the original idea of having a separate list of keywords was meant for, since it's not mentioned anywhere, so I can't decided what should go in there and what shouldn't; and (4) overall I see no point in having two lists of keywords (neither C nor C++ seem to have those so I can't think of an analogy). So I thought it would be easiest to just put everything in a single category of keywords.

The only thing about "different" keywords I could find was in section 19.11 of the standard, which states that you can actually "disable" all of the new keywords added in Verilog 2001 and 2005, so if your design is super old and uses variable names that have later become reserved keywords, you can disable those so that you can use them as regular identifiers.
In this scenario, it might be useful to highlight these in a different color, since they're "keywords that can be forced to stop being keywords".
And that's why I thought it might be a good idea to put them in the (now empty) word3= category.

Note that Geany currently doesn't respect the wordchars characters for everything and for some things the "old" hard-coded a-zA_Z0-9_ is still used. Should be fixed eventually.

Honestly I don't quite understand what wordchars does, but it felt right to add the $ there.

Do you think it would be a good idea to add those as well?

Note that these are not "keywords" per se, just system functions, but I suppose it's OK to handle them as "keywords" for the purpose of highlighting. (They're highlighted in a different color from actual keywords, which is the important thing.)

Depends whether Verilog users would expect them to be highlighted or not. The other "keyword" lists are used this way for other languages too but I don't know what's the common practice for Verilog.

Personally I think it's good to see $finish and $display etc highlighted as "this is something important", despite them not being keywords strictly speaking, so the current behavior is good. My question was whether to add the rest of functions starting with $ that are in the standard. I have no idea what most of those functions do, to be honest.

Just a note - the Verilog ctags parser contains also System Verilog parser. We could enable it if you add the SystemVerilog filetype.

That's the plan, yes. The Verilog lexer also seems to support SystemVerilog, so I've got the two hardest parts covered. I already have a commit that adds the language, and a PR almost ready.

I still plan to create a commit adding support for SystemVerilog, but that'll be on a different PR.)

There's also this PR you might want to check #1831 - if it's alright, maybe that one could be merged.

That one only adds keywords though; proper handling of SystemVerilog may require a bit more. (And the parser and the lexer are already made so why not?)

@techee
Copy link
Member

techee commented Nov 11, 2024

I tried to figure out a pattern (are they "types"? "Stuff used to declare variables"?) and had actually started making a list of which things were "variable stuff", moving not only reg and wire but also wand, wor, tri, tri0, tri1, realtype, signed, (unsigned), etc., but eventually I realized that (1) I don't know the standard well enough to see which of these make sense to be in a separate category of keywords; (2) the whole idea of "type" is a bit fuzzy, since everything in Verilog is basically "array of bits"; the stuff in that list is more like "qualifiers" mostly; (3) I have no idea what the original idea of having a separate list of keywords was meant for, since it's not mentioned anywhere, so I can't decided what should go in there and what shouldn't; and (4) overall I see no point in having two lists of keywords (neither C nor C++ seem to have those so I can't think of an analogy). So I thought it would be easiest to just put everything in a single category of keywords.

It's really up to the contributor of filetype support. For instance, java uses

primary=abstract assert break case catch class const continue default do else enum exports extends final finally for goto if implements import instanceof interface module native new non-sealed open opens package permits private protected provides public record requires return sealed static strictfp super switch synchronized this throw throws to transient transitive try uses var volatile when while with yield true false null
secondary=boolean byte char double float int long short void
# documentation keywords for javadoc
doccomment=author deprecated exception param return see serial serialData serialField since throws todo version
typedefs=

Those "secondary" keywords are normal keywords, just defining primitive types and are highlighted differently which kind of makes sense. But if you feel there's no such analogy in Verilog, it's probably best to have them all in one group.

Honestly I don't quite understand what wordchars does, but it felt right to add the $ there.

I have kind of the same feeling ;-). See #4038

Personally I think it's good to see $finish and $display etc highlighted as "this is something important", despite them not being keywords strictly speaking, so the current behavior is good. My question was whether to add the rest of functions starting with $ that are in the standard. I have no idea what most of those functions do, to be honest.

Then I'd say leave it as it is.

@eht16
Copy link
Member

eht16 commented Nov 12, 2024

Regarding the keywords:
as @techee said, the mapping of the keyword meaning in Geany to the one in Scintilla is sometimes kind of arbitrary.

While you are at, Scintilla supports (in the meantime) even more keyword lists: https://github.com/ScintillaOrg/lexilla/blob/master/lexers/LexVerilog.cxx#L1076

Maybe it also helps to have a look at the SciTE configuration:
https://sourceforge.net/p/scintilla/scite/ci/default/tree/src/verilog.properties#l120

@cousteaulecommandant
Copy link
Contributor Author

Those "secondary" keywords are normal keywords, just defining primitive types and are highlighted differently which kind of makes sense. But if you feel there's no such analogy in Verilog, it's probably best to have them all in one group.

There is, but it's complicated. Basically there are a few "data types" proper (think C's char, int, float), and a ton of modifiers/specifiers describing how the signal behaves "physically" (think static, const, volatile, extern, etc). But I think I can make a distinction between "keywords that are used to declare signals/variables/constants" and "keywords that are not used to declare signals/variables/constants". If that's common practice I can give it a try; I think it can be done.

Option 2 is what I have now - separate keywords in "old keywords" and "newer keywords from a newer standard that a very old file might be using for something else", but I don't think that'll be that much useful.

Option 3 is cram everything into a single category, which I think is what C does. In any case, the default theme uses the same color for both so it's hard for me to see the difference.

whether to add the rest of functions starting with $ that are in the standard. I have no idea what most of those functions do, to be honest.

Then I'd say leave it as it is.

On second thought, I think I'll better add those as well. There are a few I use often that aren't listed ($clog2 and $signed are super common, for example), and there's a handy list of all these "system tasks and functions that are considered part of the Verilog HDL" in the standard so this is a no-brainer.


While you are at, Scintilla supports (in the meantime) even more keyword lists: https://github.com/ScintillaOrg/lexilla/blob/master/lexers/LexVerilog.cxx#L1076

Those look pretty much like what we already have though.

  • Primary keywords and identifiers -> word=
  • Secondary keywords and identifiers -> word3=
  • System Tasks -> word2=
  • User defined tasks and identifiers -> guess this is meant to allow users to add their own keywords (or to detect them automatically from the files?)
  • Documentation comment keywords -> I guess this is docComment= (although Geany doesn't seem to support special /// or /** comments in Verilog like it does for C, so no @todo, \param, etc.)
  • Preprocessor definitions -> this is the only one that I'm not sure about. Not sure if this refers to macro definitions (`define A_MACRO 42) or macros per se (x = `A_MACRO;), or "preprocessor stuff in general" (`whatever), or "specifically the macros the user has defined", or "only the standard macros".

Maybe it also helps to have a look at the SciTE configuration: https://sourceforge.net/p/scintilla/scite/ci/default/tree/src/verilog.properties#l120

Apparently they cram all the keywords as keywords1 and leave keywords2 blank (my "Option 3"), keywords3 is the $ stuff (word2=) but more complete, keywords4 is blank as well, and keywords5 is not used for "documentation comments" but for "pragma comments" (like //synopsys translate_off), which is a nonstandard feature used by some implementations.

@cousteaulecommandant
Copy link
Contributor Author

For the sake of "keeping the things the way they were", I've gone with option 1, and tried my best at grouping the keywords into things that are used for declaring signals/constants, and things that are not. I think the result is good.

I have also decided to add all the standard system functions/tasks.

This is how it looks now (with word3 highlighted in a lighter blue). Signal declarations use keywords of one type, other control structures use the other. I think it looks nice.
geany_verilog_keywords

I've pushed the new version to this PR (and rebased it on top of master, while I was at it).

@techee
Copy link
Member

techee commented Nov 16, 2024

Apart from the wordchar comment, this PR looks good to me.

@techee techee mentioned this pull request Nov 16, 2024
12 tasks
- Add Verilog-2001 and Verilog-2005 keywords to filetypes.verilog
  (right now only Verilog-1995 keywords are highlighted).
  These keywords were already included in ctags.
- Add all System tasks and functions as of Verilog-2005 to `word2` list
  (right now only a handful appear; perhaps from an older standard).

Details on added/removed keywords:

Add new keywords from Verilog-2001 and Verilog-2005 standard (Annex B)
(right now only Verilog-1995 keywords are listed):

    automatic generate endgenerate genvar localparam uwire
    config endconfig cell design instance liblist use
    library incdir include
    pulsestyle_ondetect pulsestyle_onevent showcancelled noshowcancelled

Remove invalid Verilog keywords:

    attribute endattribute strength @

Put all "keywords used to declare nets/variables/constants/events"
in `word3` list; leave rest in `word`.

Add all functions/tasks listed under "System tasks and functions"
in the Verilog-2005 standard (chapter 17).
@cousteaulecommandant cousteaulecommandant changed the title filetypes.verilog: add Verilog-2005 keywords and $ character filetypes.verilog: add Verilog-2005 keywords Nov 16, 2024
cousteaulecommandant added a commit to cousteaulecommandant/geany that referenced this pull request Nov 17, 2024
Moved all SystemVerilog keywords that are used when declaring signals
to a separate `word3` category, as it was done for Verilog in PR geany#4037.

The list has been obtained by picking all the keywords marked as
K_(EVENT|REGISTER|LOCALPARAM|PORT|PARAMETER|CONSTANT|SPECPARAM|NET)
in ctags/parsers/verilog.c, extended with additional keywords from
SystemVerilog-2017 section 6.7 "Net declarations" (as it was done for
Verilog in PR geany#4037), and with `signed` and `unsigned`.

The result is a superset of the `word3` category for Verilog,
extended with some keywords such as `logic`, `int`, `ref`, etc.
cousteaulecommandant added a commit to cousteaulecommandant/geany that referenced this pull request Nov 17, 2024
Moved all SystemVerilog keywords that are used when declaring signals
to a separate `word3` category, as it was done for Verilog in PR geany#4037.

The list has been obtained by picking all the keywords marked as
K_(EVENT|REGISTER|LOCALPARAM|PORT|PARAMETER|CONSTANT|SPECPARAM|NET)
in ctags/parsers/verilog.c, extended with additional keywords from
SystemVerilog-2017 section 6.7 "Net declarations" (as it was done for
Verilog in PR geany#4037), and with `signed` and `unsigned`.

The result is a superset of the `word3` category for Verilog,
extended with some keywords such as `logic`, `int`, `ref`, etc.
cousteaulecommandant added a commit to cousteaulecommandant/geany that referenced this pull request Nov 18, 2024
Moved all SystemVerilog keywords that are used when declaring signals
to a separate `word3` category, as it was done for Verilog in PR geany#4037.

The list has been obtained by picking all the keywords marked as
K_(EVENT|REGISTER|LOCALPARAM|PORT|PARAMETER|CONSTANT|SPECPARAM|NET)
in ctags/parsers/verilog.c, extended with additional keywords from
SystemVerilog-2017 section 6.7 "Net declarations" (as it was done for
Verilog in PR geany#4037), and with `signed` and `unsigned`.

The result is a superset of the `word3` category for Verilog,
extended with some keywords such as `logic`, `int`, `ref`, etc.
cousteaulecommandant added a commit to cousteaulecommandant/geany that referenced this pull request Nov 18, 2024
Moved all SystemVerilog keywords that are used when declaring signals
to a separate `word3` category, as it was done for Verilog in PR geany#4037.

The list has been obtained by picking all the keywords marked as
K_(EVENT|REGISTER|LOCALPARAM|PORT|PARAMETER|CONSTANT|SPECPARAM|NET)
in ctags/parsers/verilog.c, extended with additional keywords from
SystemVerilog-2017 section 6.7 "Net declarations" (as it was done for
Verilog in PR geany#4037), and with `signed` and `unsigned`.

The result is a superset of the `word3` category for Verilog,
extended with some keywords such as `logic`, `int`, `ref`, etc.
@cousteaulecommandant
Copy link
Contributor Author

...Sorry; I didn't realize I have to RESOLVE the code reviews after I fix them. 😅

@techee
Copy link
Member

techee commented Nov 20, 2024

...Sorry; I didn't realize I have to RESOLVE the code reviews after I fix them. 😅

Nah, it doesn't matter :-)

Alright, let's merge this one too. Thanks for your contributions!

@techee techee merged commit 382776f into geany:master Nov 20, 2024
4 checks passed
@b4n b4n added this to the 2.1 milestone Nov 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants