Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add a meta-language for formal definitions #69

Merged
merged 12 commits into from
Jan 27, 2025

Conversation

zerbina
Copy link
Collaborator

@zerbina zerbina commented Nov 12, 2024

Summary

Implement a meta-language for describing languages' abstract syntax and
semantics, using a macro DSL. The source language's textual definition
is replaced with a definition using the new meta-language.

Details

The current formal definition has a few problems:

  • it's not accessible to the computer, preventing mechanization and
    thus testing/validation
  • the lack of any form of testing makes it very easy to introduce
    mistakes (both syntactic and semantic)
  • the lack of typesetting makes it hard to read for a human reader

Using a dedicated meta-language is meant to address the problems listed
above. A macro DSL is used instead of a fully-custom language, which
has the following benefits:

  • interop with NimSkull is easy, if needed
  • the syntax can be built on that of NimSkull without needing a
    dedicated parser
  • the output of the macro is readily available at compile-time, making
    macro-based code generation possible
  • the build process stays simple

The language macro only constructs a Language instance, which is
meant to be passed to further processing (e.g., queries, code
generation, rendering, etc.).

The textual definition is replaced with a meta-language equivalent.
Some small adjustments/clarification are made where needed:

  • the l and ch non-terminals; they need a concrete definition
  • the E-builtin-readFile rule uses a placeholder definition, as
    indeterminate values cannot be described with the meta-language at
    the moment

Substituting identifiers with values (previously denoted by e[x/v])
is now fully defined using the substitute meta-function.


To-Do

  • start over with the meta-language; the current one is too limited
  • replace the source language definition with one using the meta-language
  • implement a MarkDown (or RST) renderer for languages (postponed)
  • integrate building/rendering the language definition into koch/CI
  • write a proper commit message

Notes For Reviewers

  • please take a look at the old and new definition side-by-side to make sure there really are no translation errors (I already read over it a few times, but it's possible that I missed some differences)
  • I think the meta-language is solid enough for the foreseeable future, but if you have some suggestions for improvement and/or simplifications (both syntactic and semantic), I'd be happy to hear them
  • the macro doesn't yet reject everything that's disallowed in the meta-language. Having a query engine will bring more clarity into what's allowed and what's not
  • pre-existing definition "bugs" (that is, places where the definition doesn't describe the intended semantics) will be fixed separately

The implementation of the used macro DSL is still missing.
It only performs basic parsing and validation.
@zerbina zerbina added documentation Improvements or additions to documentation enhancement New feature or request labels Nov 12, 2024
@zerbina
Copy link
Collaborator Author

zerbina commented Nov 12, 2024

It's important to note that due to the - not yet implemented - compile-time evaluation and meta programming facilities, a strict separation between static semantics (i.e., types, symbols, etc.) and dynamic semantics (i.e., how a program behaves at run-time) is not going to be possible.

Not all rules mentioned in the documentation are enforced at this time.
@zerbina
Copy link
Collaborator Author

zerbina commented Jan 14, 2025

The current meta-language is a bit too limited at the moment, so I'm going to start over with it. I've made some experiments with PLT Redex, which a gave me a lot of ideas on how to improve the meta-language.

@zerbina zerbina changed the title add a basic formal definition of the source language add a meta-language for formal definitions Jan 14, 2025
The new meta-language is more powerful and also more generally
applicable, not enforcing a clear separation (or any separation at all)
between static and dynamic semantics. Not everything is implemented and
the meta-language is unlikely to be final, but it's a good start.

The source language definition using the old meta-language is removed.
The definition uses the new meta-language and tries to stay as close to
the previous textual definition as possible, even where the textual
definition is "wrong".
Except for the special `grammar` section - which is still needed by the
`passtool` -, the textual definition is fully superseded by the
meta-language definition.
@zerbina
Copy link
Collaborator Author

zerbina commented Jan 22, 2025

To keep the meta-language pragmatic, I developed a simple-but-working query engine for the meta-language during its development. Beyond making sure that the meta-language can actually be used to express the semantics of a somewhat realistic language, this also discovered a lot of issues with the current source language definition. I've fixed most of them already, but opted to keep the mostly literal translation of the language definition, so as to keep the PR smaller in scope.

The query / pattern matching engine is not included in this PR because it no longer works (patterns requiring matching to "look ahead" were disallowed in the beginning, but no longer are, and the engine hasn't caught up yet).

In general, the new meta-language is quite similar to Redex (the programming language). A major focus is on pattern matching and inductively-defined relations. The meta-language is certainly not perfect, but it's nonetheless a significant improvement over the hand-wavy notation used by the textual definition. Once there's some experience gathered regarding its shortcomings, some adjustments will most likely be in order / necessary.

I've removed the goal of writing a language-to-RST renderer from this PR, because the language definition endeavor is progressing at a far too slow pace and I need to get back to working on the language and implementation itself. The query engine is far more important at the moment, because it's what will drive the language definition tests (and the lack thereof currently blocks further progress on the source language).

@zerbina zerbina requested a review from saem January 22, 2025 23:23
@zerbina zerbina marked this pull request as ready for review January 22, 2025 23:58
Copy link
Contributor

@saem saem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, this is really cool, it can already represent so much!

spec/types.nim Outdated Show resolved Hide resolved
spec/types.nim Outdated Show resolved Hide resolved
for i in 1..<n[1].len:
case n[1][i].kind
of nnkIdent:
# cannot use 'in', because that's a special token only usable as an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dang, that would have been sick

spec/langdefs.nim Show resolved Hide resolved
spec/langdefs.nim Outdated Show resolved Hide resolved
languages/source.nim Outdated Show resolved Hide resolved
zerbina and others added 3 commits January 27, 2025 22:09
Only the `inp` identifier is allowed as a relation parameter; `out`
parses as `nkMutableTy` and is handled separately.
@zerbina zerbina merged commit 9883b33 into nim-works:main Jan 27, 2025
6 checks passed
@zerbina zerbina deleted the formal-definition branch January 27, 2025 22:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants