add the `skully` program #70

zerbina · 2024-11-16T21:35:40Z

Summary

Skully (a portmanteau of "Skull" and "Phy") is a backend for NimSkull
that targets the L25 language. It's combined with the NimSkull
frontend into a standalone program that takes a project module as input
and writes the generated L25 code to a specified file.

Most NimSkull core language features are supported. The idea with
skully is to get access to large amounts of L25 code resulting from
real-world code, for the purpose of testing and measuring the lowering
passes and VM.

Details

The implementation is split into two parts:

* the actual code generator (backend) implementing the MIR to L25
  translation
* the program (skully) combining the NimSkull front- and mid-end with
  the Skully code generator

The L25 IL is chosen as the target language because it's the language
most similar to the MIR.

Except for C interop, L25 is able to support the same language
features as NimSkull's C backend, and therefore sem is configured as
if it were run prior to the C backend. A non-trivial configuration is
used to make compilation possible:

* the target platform is hardcoded to be any, disabling most
  platform-specific code
* various internal conditional symbols are enabled to disable as much
  C-reliant code as possible
* a hook mechanism is used to replace importc'ed procedures with
  pure-NimSkull implementations or redirect them to dedicated VM host
  procedures
* platform-specific modules are replaced with skully-provided modules
  (only os at the moment, but the facility is generic and can
  redirect any module)

The VM host procedures specific to Skully are implemented in phy, so
that the phy program can run skully-produced code.

Code Generator

The code generator performs a translation from MIR code to L25 code,
attempting to replicate all lowerings NimSkull's C code generator
performs, meaning that the behaviour of the generated code should be
identical.

It's hooked into the compilation process like the native backends are:
via the backends.process iterator.

RTTIv1 generation is complicated and only needed for `deepCopy` and
the `typeinfo` module; therefore, it's not implemented by Skully. Only
RTTIv2 is, as it's necessary for ORC and `of` support.

Platform Support

Only the OS procedures necessary for compiling the phy program are
implemented at the moment.

Tests

To make sure Skully works at least somewhat, a CI job is added that
compiles phy to L25 code and then executes said L25 code to build
and run a more complex test from the expr tests.

To-Do

The overrides are implemented in a module that's included in the compilation via an implicit import. Upon discovering an importc'ed procedure, the corresponding override is looked up, and on success, the AST of the original procedure is replaced with that of the override. Overrides only need to be provided for actually used (read, part of the live procedure graph) procedures.
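
The override lookup described above can be modelled roughly as follows. This is a minimal Python sketch, not the actual skully code: `Proc`, `apply_overrides`, and the string stand-ins for procedure ASTs are all hypothetical names chosen for illustration. It only shows the idea that procedures are discovered by walking the live procedure graph, and that an override is required only for importc'ed procedures that are actually reached.

```python
from dataclasses import dataclass, field


@dataclass
class Proc:
    name: str
    importc: bool
    body: str                                   # stand-in for the procedure's AST
    calls: list = field(default_factory=list)   # names of called procedures


def apply_overrides(procs, overrides, roots):
    """Walk the live procedure graph starting at `roots` and replace the
    body of every reached importc'ed procedure with its override.
    Returns the names of reached importc'ed procedures without an override."""
    missing, seen, work = [], set(), list(roots)
    while work:
        name = work.pop()
        if name in seen:
            continue
        seen.add(name)
        p = procs[name]
        if p.importc:
            if name in overrides:
                p.body = overrides[name]   # swap in the override's "AST"
                p.importc = False
            else:
                missing.append(name)
        work.extend(p.calls)
    return missing


procs = {
    "main": Proc("main", False, "calls strlen", ["strlen"]),
    "strlen": Proc("strlen", True, ""),
    "unused": Proc("unused", True, ""),   # never reached, needs no override
}
missing = apply_overrides(procs, {"strlen": "pure implementation"}, ["main"])
```

In this model, `unused` stays untouched because it is not part of the live graph, which mirrors why overrides are only needed for used procedures.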

Immutable parameters that use pass-by-reference aren't detectable as such via just the parameter type, nor do the arguments passed to them use `mnkName` -- the parameters' flags have to be inspected. Beyond improving the efficiency of the generated code, handling the `pfByRef` parameter flag also fixes the improper translation of `lent` parameters (i.e., lent parameters are now properly of pointer type in the IL).

`getSize` always ignores `openArray` types, setting their size to "unknown", which is a problem for the IL processing, which expects all records to have a valid size. Fortunately, the size of openArray types is trivial to compute.

zerbina · 2024-12-30T17:44:58Z

Okay, skully is now able to compile the phy program into L25 code, which can then be run via the native phy executable (refer to the test I added to build_and_test.yml).

This wraps up the main part of the work. Implementing some small leftovers and refactoring the code a bit should be all that there's left to do.

Appending the fragments directly to the body of the partial procedure - like the upstream backends do - is quite tricky to do with skully. Therefore, each fragment is wrapped in its own procedure instead. While a lot simpler, it does invalidate the 1-to-1 mapping between MIR and IL procedure IDs, and thus a lookup table is required now.
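
To illustrate why the lookup table becomes necessary, here is a small Python model. It is only a sketch under assumptions: `ProcTable`, `add_proc`, and `add_fragment` are hypothetical names, and IL procedure IDs are assumed to be handed out sequentially. Once fragments get their own IL procedures, the IL ID of a later MIR procedure no longer equals its MIR ID.

```python
class ProcTable:
    """Tracks IL procedures and maps MIR procedure IDs to IL procedure IDs."""

    def __init__(self):
        self.il_procs = []     # IL procedure bodies, indexed by IL ID
        self.mir_to_il = {}    # lookup table: MIR ID -> IL ID

    def add_proc(self, mir_id, body):
        """Register the IL procedure generated for a MIR procedure."""
        self.mir_to_il[mir_id] = len(self.il_procs)
        self.il_procs.append(body)
        return self.mir_to_il[mir_id]

    def add_fragment(self, body):
        """Wrap a fragment in its own IL procedure; it has no MIR ID."""
        self.il_procs.append(body)
        return len(self.il_procs) - 1


table = ProcTable()
table.add_proc(0, "partial procedure")
frag_id = table.add_fragment("fragment 1")   # takes IL ID 1
table.add_proc(1, "regular procedure")       # MIR ID 1 now maps to IL ID 2
```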

zerbina · 2025-01-06T23:28:37Z

I've cleaned up the code a little, shuffling some code around and rewording some comments. There's probably a lot more that could be done, but given its standalone nature (i.e., skully not being part of the core), I personally think the code in its current shape is fine.

skully is - more or less - a fully-fledged compiler backend on the level of the NimSkull-native C code generator, with all the complexity that currently entails: the MIR code is not adequately lowered, thus leaving a lot of work to the downstream processing (i.e., the code generator).

zerbina · 2025-01-16T00:21:20Z

Hm, it looks like some upstream changes uncovered a code generator bug. I suspect that the fix itself is going to be relatively simple and low in impact, so I'll mark the PR as ready-for-review already.

@saem: For the review, I think it'll be easiest to review the final diff rather than going commit by commit, as there was a lot of code churn. I'd recommend the following order of looking over the changes:

1. skully.nim
2. backend.nim
3. host_impl.nim changes
4. patch directory and the CI changes

There are most likely some latent bugs in there, and if you come across something that looks off to you, please bring it to my attention. Nonetheless, I consider skully to be stable enough for its intended purpose.

zerbina · 2025-01-18T00:09:34Z

As expected, the fix was very simple. The moving-around of some code had messed up indentation, resulting in trees not adhering to the L25 grammar.

implement the first draft of the Skully program

0d18db7

zerbina added the enhancement New feature or request label Nov 16, 2024

zerbina changed the title ~~add the Skully program~~ add the skully program Nov 16, 2024

zerbina mentioned this pull request Nov 17, 2024

pass0: fix Conv for float-to-float conversions #71

Merged

zerbina added 18 commits November 21, 2024 00:34

skully: more proper type translation

77f3b6c

* procedure address types are now properly treated as pointers
* all IL types are now cached, significantly reducing the amount of produced IL types

skully: correctly ignore compile-time-only parameters

dc3d75e

skully: start with implementing partial procedures

f9b3382

Actually generating / translating the body (by merging the fragments) is still missing -- a procedure with a body that does nothing beyond returning is used for now.

skully: implement more string magics

c7298af

* implement various missing string magics
* manually implement `mNewString` and `mNewStringOfCap`. Both are importc'ed procedures (of compilerprocs), which skully doesn't support

skully: use non-FFI version of procedures when possible

75d72fe

FFI procedures are not supported by the VM (at the moment). Some FFI procedures have a NimSkull implementation, which is enabled via the `noImported` backend option.

skully: disable C and platform specific code

5e25bc2

The "any" target is now used, which should prevent a good amount of problematic code (i.e., FFI procedures) from being included in the build.

skully: implement a module patching system

2b0aa43

Neither `patchFile` nor the underlying module override system exists in the NimSkull compiler anymore, and thus a third-party solution is needed. While not pretty, replacing the `TFileInfo` entry for a file works well enough.

skully: override the osalloc system module

3007a04

Use a simple bump-pointer allocator for providing the OS chunks. The maximum memory is configured during program startup (in theory; the actual implementation is still missing).
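
For readers unfamiliar with the technique, a bump-pointer allocator can be sketched as below. This is an illustrative Python model, not the code from the patched `osalloc` module: the class name, the 4096-byte alignment, and the address values are assumptions. The allocator only ever moves a cursor forward and never reuses released chunks.

```python
class BumpAllocator:
    """Hands out chunks from a fixed region by advancing a cursor."""

    def __init__(self, base, max_size, align=4096):
        self.limit = base + max_size   # first address past the region
        self.next = base               # cursor: start of the unused part
        self.align = align

    def alloc(self, size):
        """Return the start address of a new chunk, or None when the
        configured maximum memory would be exceeded."""
        # round the cursor up to the next multiple of the alignment
        start = (self.next + self.align - 1) // self.align * self.align
        end = start + size
        if end > self.limit:
            return None
        self.next = end
        return start

    def dealloc(self, addr, size):
        """Releasing is a no-op: a bump allocator never reuses chunks."""


heap = BumpAllocator(base=0x1000, max_size=3 * 4096)
first = heap.alloc(100)
second = heap.alloc(4096)
third = heap.alloc(4096)
exhausted = heap.alloc(1)
```

The trade-off is simplicity over memory reuse: allocation is a couple of additions, but memory returned to the "OS" is lost until the program exits.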

skully: track the expression's type in Expr

d12f8b4

This removes the need to pass around the type and nodes separately.

skully: implement proper inheritance handling

582b0f5

A base field is now added to all records that have a base type, and `genField` emits the necessary field access chain.
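
The field access chain can be pictured with the following Python sketch. It is a simplified model, assuming that the base record is embedded as the first field (position 0) of the derived record; `RecordType` and `field_path` are hypothetical names and do not correspond to the real `genField` signature.

```python
class RecordType:
    def __init__(self, name, fields, base=None):
        self.name = name
        self.fields = fields   # names of the fields declared by this type
        self.base = base       # parent RecordType, or None


def field_path(typ, field_name):
    """Return the chain of field positions leading from an object of type
    `typ` to `field_name`, stepping through embedded base records."""
    path = []
    while typ is not None:
        # when a base exists, it occupies position 0; own fields follow
        offset = 1 if typ.base is not None else 0
        if field_name in typ.fields:
            return path + [offset + typ.fields.index(field_name)]
        path.append(0)         # descend into the embedded base record
        typ = typ.base
    raise KeyError(field_name)


root = RecordType("Root", ["id"])
mid = RecordType("Mid", ["x"], base=root)
leaf = RecordType("Leaf", ["y", "z"], base=mid)
```

With these types, accessing `id` on a `Leaf` needs three field accesses (two through base fields, one for `id` itself), while `z` needs only one.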

skully: correctly implement case object translation

a9e9614

Case objects are laid out the same way NimSkull's `cgen` lays them out. `genField` takes care of adding the additional `Field` nodes necessary for navigating the field layout.

skully: fix genType

fd03419

`result` was left uninitialized when the type had not been cached already.

skully: fix genField crash

d17358c

The `TFrame` type is marked as imported, which results in it being translated to a `tkImported`, subsequently leading to `genField` failing. Since the type is complete, it can be patched into a non-imported type without issue.

skully: enable nimPreviewFloatRoundtrip

ecc4c2b

This enables the pure-NimSkull floating point formatting implementation, removing the problematic use of `sprintf`.

skully: fix mnkExcept translation

8794aca

skully: patch the os module

fb41559

This allows projects importing the `os` module to compile. An empty implementation of `getEnv` is provided, so that the `terminal` module compiles.

skully: implement the mEcho magic

7c50882

zerbina mentioned this pull request Nov 29, 2024

lang0: implement foreign procedure support #78

Merged

zerbina added 7 commits November 30, 2024 15:06

Merge branch 'main' into skully

09ab1e8

skully: adjust to the upstream changes

dff0983

* numeric type descriptions are leaf nodes now
* the memcopy `Copy` operation is named `Blit` now

skully: fix TFrame having an unknown size

8a30578

Removing the importc flag is not enough; the type's size also has to be forcefully recomputed afterwards, as the IL currently requires all record types to have a known size.

skully: add missing closure environment parameter

14819e4

Closure procedure types were missing the environment parameter (a pointer), resulting in lowering pass failures.

skully: correctly use seq/string payload types

80487f0

zerbina mentioned this pull request Dec 28, 2024

fix bug in linker's import resolution #102

Merged

zerbina added 5 commits December 30, 2024 17:10

host_impl: add an API for querying program parameters

a03e10e

It's a simple API, meant to be wrapped by `os.paramStr` and `os.paramCount`.

implement os module's program parameter querying API

a8ef9b5

The `paramCount`, `paramStr`, `commandLineParams`, and `getExecArgs` procedures are all available to skully-compiled programs now.

phy: implement execution argument support

b8dd875

The `phy` arguments following a `--` are now passed on to the evaluated program through the host API.
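
The splitting rule can be sketched as follows. This is a hedged Python illustration only: `split_args` is a hypothetical helper, the example command line is made up, and it assumes that only the first `--` acts as the separator.

```python
def split_args(argv):
    """Split a command line at the first `--`: everything before it is for
    the host program, everything after it is passed to the evaluated program."""
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    return list(argv), []


# hypothetical invocation; the real phy command-line syntax may differ
own, forwarded = split_args(["run", "program.txt", "--", "input.txt", "--", "-v"])
```

Under this assumption, a second `--` is forwarded unchanged, so the evaluated program can use the same convention itself.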

Merge branch 'main' into skully

0cc5eec

ci: bump the required NimSkull version

fa8bd4e

There are some upstream fixes that skully needs.

zerbina added 10 commits January 5, 2025 18:43

skully: implement mGetTypeInfoV2; enable ORC

65875a2

The magic was the last remaining piece missing for ORC support.

Merge branch 'main' into skully

28a724c

skully: report an error when using dynlib procedures

544104e

skully: fix mnkEndStruct translation

c959327

Control-flow is only guaranteed to reach the statement following a `mnkEndStruct` when it closes an `mnkIf`, not when it closes an `mnkExcept`.

skully: small refactorings; touch-up some comments

eab0d0f

update the patch directory's readme

2a8b7ae

move the code generation logic into its own module

2f0633a

koch: add skully as a program

8aff066

Since compiling skully does take quite a long time, it's not automatically built when doing `koch all`.

ci: use koch for building skully

61fadeb

zerbina mentioned this pull request Jan 8, 2025

The Road To Version 0.5 #106

Closed

10 tasks

zerbina added 3 commits January 15, 2025 22:44

Merge branch 'main' into skully

c944a5b

update the outdated spec imports

e384d81

The module is now named `syntax`.

host_impl: fix the toHost template

b39408e

zerbina marked this pull request as ready for review January 16, 2025 00:21

zerbina requested a review from saem January 16, 2025 00:21

backend: fix toOpenArray for cstring and ptr operands

264efc5

The indentation of the index operand translation was wrong, leading to the index operand being part of the `Addr` operation, not the `At`.

saem approved these changes Jan 18, 2025

saem merged commit 9eab5f5 into nim-works:main Jan 18, 2025
6 checks passed

zerbina deleted the skully branch April 2, 2025 16:29
