Pre-SchemeRecent PostsThu, 10 Oct 2024 17:19:39 +1000https://prescheme.org/First report on the Pre-Scheme Restoration[email protected] (Andrew Whatson)Thu, 10 Oct 2024 05:52:00 +1000https://prescheme.org/posts/first-report-on-the-pre-scheme-restoration.htmlhttps://prescheme.org/posts/first-report-on-the-pre-scheme-restoration.html<p>It's been over 3 months since kicking off the Pre-Scheme Restoration project, so it's well and truly time for a progress update! I'm pleased to report that the bulk of the port to R7RS has been completed, with approximately 75% of the codebase successfully loading in 3 different R7RS-compatible Scheme implementations (Chibi, Sagittarius, and Guile) and 100% of the codebase running via a new R7RS compatibility layer for Scheme 48. The libraries which haven't yet been ported all directly interface with the Scheme expander front-end, and replacing that with a portable expander is the next major focus of the project. In the rest of this article I'll discuss the work that's been done to get to this point, and briefly outline the upcoming work.</p><h2>Retaining compatibility with Scheme 48</h2><p>One challenge with porting software is ensuring that errors aren't introduced during the porting process. The original Pre-Scheme compiler doesn't have an established test suite, so the only real test (aside from just loading the code) is to check that it continues to translate the Scheme 48 VM to identical C code. Being an end-to-end test, this requires the entire compiler to remain functioning throughout the porting process to verify that errors haven't been introduced. Any change breaking compatibility with the original platform would prevent the test from being run, and leave us in the painful situation of only being able to detect and debug errors after everything has been ported.</p><p>To avoid this situation, I've developed <em>&quot;The Incomplete Scheme48 R7RS Compatibility Library&quot;</em> (<a href="https://codeberg.org/prescheme/s48-r7rs">s48-r7rs</a>). While not a fully compliant R7RS implementation, it's compatible enough to allow Scheme 48 to load R7RS library definitions from the filesystem, and use <code>cond-expand</code> to paper over implementation differences. This has allowed me to keep the <a href="https://codeberg.org/prescheme/ps-workbench/src/branch/main/scripts/s48-compile-vm.scm">end-to-end test</a> passing throughout the porting process, and debug any errors as they're introduced. The compatibility layer also includes an implementation of <a href="https://srfi.schemers.org/srfi-64/srfi-64.html">SRFI 64</a>, forming the basis of a portable test suite which will continue to be expanded as the project progresses.</p><h2>Using Scsh as a tooling platform</h2><p>While Scheme 48 includes a lot of functionality, it lacks some conveniences for &quot;scripting&quot; work, so I adopted <a href="https://scsh.net/">Scsh</a> as the platform for the tooling used during the port (<a href="https://codeberg.org/prescheme/ps-workbench">ps-workbench</a>). Scsh is built on top of Scheme 48, so is capable of loading the original Pre-Scheme compiler and performing introspection, while also providing better support for dealing with the filesystem and external processes. The R7RS compatibility layer also works with Scsh, and might be useful for any future efforts porting Scsh functionality to other Scheme implementations. I don't intend for Scsh (or Scheme 48) to be a hard dependency of Pre-Scheme, so much of this tooling is temporary, but might still be of some interest.</p><h2>Implementing R7RS-small for Scheme 48</h2><p>Scheme 48 is a complete R5RS implementation, so already includes the majority of functionality needed for R7RS-small. Implementing the initial compatibility layer was mainly a matter of implementing a <a href="https://codeberg.org/prescheme/s48-r7rs/src/branch/main/scheme/r7rs/libraries.scm">loader</a> for R7RS library definitions, and defining the <a href="https://codeberg.org/prescheme/s48-r7rs/src/branch/main/scheme/r7rs/packages.scm">core libraries</a> as Scheme 48 modules which re-export these existing procedures. The <a href="/papers/s48-modules.pdf">Scheme 48 module system</a> was in fact an inspiration for the <a href="https://github.com/johnwcowan/r7rs-work/blob/master/ModulesShinn.md">design of R7RS libraries</a>; the systems are similar enough that R7RS libraries can be implemented as a syntactic layer over the procedural module interface. To aid in generating the core library definitions I implemented a <a href="https://codeberg.org/prescheme/ps-workbench/src/branch/main/scsh-lib/scm-index.scm">parser</a> for the <a href="https://index.scheme.org/">Scheme Index</a> data-set, allowing the lists of exported identifiers to be generated at an Scsh REPL.</p><h2>Porting the Pre-Scheme compiler to s48-r7rs</h2><p>The bulk of the Pre-Scheme compiler depends on Scheme 48's <em>big-scheme</em> structure, which provides an extended Scheme environment with useful functionality not covered by the standard, such as hash-tables, list-queues, and a format routine. Porting to R7RS was a matter of replacing <em>big-scheme</em> imports with <em>r7rs-base</em> (aka. <em>(scheme base)</em>), and adding missing dependencies as indicated by compiler diagnostics and test failures. Non-standard dependencies have been factored out as separate libraries in the <em>(ps-compiler util)</em> namespace where they can be implemented using whatever equivalents are available in the target Scheme implementations. For example, the <a href="https://codeberg.org/prescheme/prescheme/src/branch/main/ps-compiler/util/queues.sld">(ps-compiler util queues)</a> library exports the Scheme 48 <em>queues</em> interface, using <a href="https://www.gnu.org/software/guile/manual/html_node/Queues.html">(ice-9 q)</a> on Guile and <a href="https://srfi.schemers.org/srfi-117/srfi-117.html">SRFI 117</a> on Chibi and Sagittarius. These utility libraries separate the core of the Pre-Scheme compiler from the differences of the target Scheme implementations, and provide a convenient point for test coverage.</p><h2>Porting Pre-Scheme compiler macros</h2><p>Scheme 48 provides an &quot;explicit renaming&quot; procedural macro system with the ability to break hygiene, and the Pre-Scheme compiler makes use of this for some of its internal macros. This presents a portability issue because R7RS-small (as with R5RS) only standardizes hygienic <code>syntax-rules</code> macros. In practice, most R7RS implementations offer procedural macros via either <code>er-macro-transformer</code> (ie. explicit renaming macros) or <code>syntax-case</code> (standardized in R6RS). My solution has been to re-implement all of these internal macros with <code>syntax-case</code>, and ship both versions along with <a href="https://srfi.schemers.org/srfi-211/srfi-211.html">SRFI 211</a> stubs which can be used to select the appropriate implementation for the target Scheme. An example of this is the <a href="https://codeberg.org/prescheme/prescheme/src/branch/main/ps-compiler/util/enums.sld">(ps-compiler util enums)</a> library, which provides a <code>define-enumeration</code> macro used internally in the compiler, and also exposed as part of the Pre-Scheme language.</p><h2>Porting Pre-Scheme library definitions</h2><p>Scheme 48 structures and R7RS libraries are similar enough that we can <a href="https://codeberg.org/prescheme/ps-workbench/src/branch/main/scripts/ps-library-definitions.scm">generate</a> the equivalent library definition for a structure using the procedural module interface. This is a matter of loading the Pre-Scheme codebase into Scsh, iterating through the loaded modules to identify the Pre-Scheme modules, using introspection to build export/import/include lists, and pretty-printing a library definition file. A subtle difference between Scheme 48 structures and R7RS libraries is that a structure is a view into a package (an environment), and multiple structures can be backed by the same package. I've simulated this architecture by making &quot;view&quot; libraries which simply re-export a subset of an underlying &quot;impl&quot; library, as can be seen with the <a href="https://codeberg.org/prescheme/prescheme/src/branch/main/ps-compiler/parameters.sld">parameters</a> and <a href="https://codeberg.org/prescheme/prescheme/src/branch/main/ps-compiler/set-parameters.sld">set-parameters</a> interfaces to the <a href="https://codeberg.org/prescheme/prescheme/src/branch/main/ps-compiler/parameters-impl.sld">parameters-impl</a> library, replicating the structure definitions <a href="https://codeberg.org/prescheme/prescheme/src/commit/701e61e922c7dd63232d268d600e339d706b0212/ps-compiler/package-defs.scm#L121">here</a>.</p><h2>Initial target implementations</h2><p>The current target implementations for this port (aside from s48-r7rs) are <a href="https://github.com/ashinn/chibi-scheme">Chibi Scheme</a>, <a href="https://bitbucket.org/ktakashi/sagittarius-scheme/wiki/Home">Sagittarius Scheme</a>, and <a href="https://www.gnu.org/software/guile/">Guile</a>. These implementations all support the de-facto standard filesystem layout for R7RS libraries, with directories matching namespaces and library definitions in <code>.sld</code> files, which makes them easy to support from the same source tree. They also offer a mix of <code>er-macro-transformer</code> (Chibi &amp; Sagittarius) and <code>syntax-case</code> (Guile &amp; Sagittarius) macro systems, and some variety in the set of supported SRFIs and implementation-specific libraries. I believe this selection gives &quot;just enough&quot; difference to ensure that the project architecture is flexible enough to support a variety of implementations, without getting overwhelmed by platform-specific concerns. Support for more implementations will be added as the project matures and a test-suite coalesces.</p><h2>Next steps: Portable expander, tests, and documentation</h2><p>With the bulk of the libraries converted, my immediate focus is now on integrating <a href="https://www.unsyntax.org/">Unsyntax</a> as the portable expander for Pre-Scheme, replacing the current integration with Scheme 48's expander as the front-end for the Pre-Scheme compiler. Unsyntax is particularly interesting as it's a modern and compliant Scheme implementation which is itself implemented in Scheme, and designed to run on top of other Scheme implementations. It works by expanding/compiling R7RS Scheme with <code>syntax-case</code> macros to a simpler Scheme subset that can run on a less sophisticated host implementation. This is exactly what's required for the front-end of the Pre-Scheme compiler, which will take that subset, translate it into the compiler's AST, perform type inference and static type checking, and ultimately compile the resulting program to C.</p><p>Unsyntax currently only supports Chibi Scheme as a host implementation, however an initial analysis suggests that it's only reliant on a handful of SRFIs (boxes, hash-tables, comparators, and generators) and a single non-portable feature (Chibi's type-printers, known as &quot;disclosers&quot; in Scheme 48). This should be fairly easy to port to s48-r7rs, where I can experiment with how best to wrap up its expander for use with the Pre-Scheme compiler. There is community interest in using Unsyntax with other Scheme implementations outside of this project, so I'll be documenting my efforts and contributing upstream as appropriate.</p><p>Aside from the expander, I'll continue to expand the Pre-Scheme compiler's test coverage and begin work on documenting its internal architecture. Having a decent test-suite and documentation in-place will be essential for the later stages of this project involving modifications to the type system and extensions to the core language.</p><p>If you are interested in following this project, you can <a href="https://functional.cafe/@flatwhatson">follow me</a> or <a href="https://fosstodon.org/tags/prescheme">#prescheme</a> on the fediverse, subscribe to the <a href="/atom.xml">Atom feed</a> or <a href="/rss.xml">RSS feed</a>, or join us in the <a href="https://web.libera.chat/#guile-steel">#guile-steel</a> channel on IRC. Repositories for the port, this website, and related projects can be found <a href="https://codeberg.org/prescheme/">on Codeberg</a>. Please feel free to get in touch via any of these channels with any questions about this project or related projects, I'll be happy to help.</p>Announcing the Pre-Scheme Restoration[email protected] (Andrew Whatson)Thu, 20 Jun 2024 02:45:00 +1000https://prescheme.org/posts/announcing-the-pre-scheme-restoration.htmlhttps://prescheme.org/posts/announcing-the-pre-scheme-restoration.html<p><img src="/images/prescheme-restoration-r7rs.png" alt="Pre-Scheme Restoration (R7RS)" /></p><p>I'm thrilled to announce that the Pre-Scheme Restoration project is now underway, thanks to a generous grant from the <a href="https://nlnet.nl/">NLnet</a> foundation under the <a href="https://nlnet.nl/core/">NGI Zero Core</a> program. This project is primarily an exercise in software archaeology, bringing an obscure yet important compiler to a wider audience, and using it as the basis for a modern, statically-typed, low-level functional programming language. In this article I'll provide some background on Scheme and Pre-Scheme, and discuss the objectives of this project over the coming months.</p><h2>What is Scheme?</h2><p><a href="https://www.scheme.org/">Scheme</a> is a programming language in the Lisp family, born out of the research efforts of Gerald Jay Sussman and Guy L. Steele Jr. at MIT in the mid-to-late 1970s. Their findings were published in a series of AI memos collectively known as the <a href="https://en.wikisource.org/wiki/Lambda_Papers">&quot;Lambda Papers&quot;</a>, which have influenced the design and implementation of programming languages for nearly 50 years. During that time, Scheme has been <a href="https://standards.scheme.org/">revised and standardized</a>, inspired many <a href="https://books.scheme.org/">famous books</a>, served as a foundation for a huge <a href="https://conservatory.scheme.org/readscheme/">body of research</a>, and spawned many <a href="https://get.scheme.org/">excellent implementations</a>. Perhaps most importantly, Scheme has been used to introduce computer science principles to multiple generations of students and programmers around the world.</p><p>Outside of academia, there are a number of active <a href="https://community.scheme.org/">Scheme communities</a>, mostly organized around particular implementations. A fun way that Scheme and Lisp communities come together is through the regular <a href="https://itch.io/jam/spring-lisp-game-jam-2024/results">Lisp Game Jam</a>, which saw 48 complete entries in the Spring 2024 competition, half of which were implemented in Scheme. The team at the <a href="https://spritely.institute/">Spritely Institute</a> built an impressive entry called <a href="https://spritely.institute/news/cirkoban-sokoban-meets-cellular-automata-written-in-scheme.html">Cirkoban</a>, which serves as a tech demo for their work in porting the <a href="https://spritely.institute/goblins/">Goblins</a> distributed programming environment to <a href="https://spritely.institute/hoot/">Hoot</a>, a Scheme to WebAssembly compiler. On another front, the <a href="https://guix.gnu.org/">Guix project</a> is a major force bringing new users to Scheme, providing an unparalleled foundation for free and reproducible computing. In 2023 they achieved a historic <a href="https://guix.gnu.org/blog/2023/the-full-source-bootstrap-building-from-source-all-the-way-down/">full-source bootstrap</a>, and in 2024 are making excellent progress towards the same for RISC-V. For newcomers to the language, the <a href="https://spritely.institute/static/papers/scheme-primer.html">Scheme Primer</a> offers an accessible hands-on introduction.</p><h2>What is Pre-Scheme?</h2><p>One influential Scheme implementation is <a href="https://www.s48.org/">Scheme 48</a>, written by Richard Kelsey and Jonathan Rees in the late 80s. Frustrated with the complexity of existing Lisp implementations at the time, they set out to write <a href="http://mumble.net/~jar/s48/">something simpler</a>. To bootstrap the virtual machine and garbage collector, they wrote in a restricted subset of Scheme which could be easily ported between host environments. This subset was dubbed Pre-Scheme, and a few years later Richard Kelsey developed a compiler for Pre-Scheme based on the techniques described in <a href="/papers/kelsey-diss-2012.pdf">his dissertation</a> and earlier work on a compiler for the <a href="https://paulgraham.com/thist.html">T Project</a>.</p><p>As described in the <a href="/papers/prescheme.pdf">Pre-Scheme paper</a>, the language offers a unique combination of features:</p><ul><li>Scheme syntax, with full support for macros, and a compatibility library to run Pre-Scheme code in a Scheme interpreter. The compiler is also implemented in Scheme, enabling both interactive development and compile-time evaluation.</li><li>A static type system based on Hindley/Milner type reconstruction, as used in the ML family of languages (eg. Standard ML, OCaml, Haskell). Pre-Scheme supports parametric polymorphism, and has nascent support for algebraic data types and pattern matching, which are recently gaining popularity in mainstream languages.</li><li>An optimizing compiler targeting C, allowing for efficient native code generation and portable low-level machine access. C remains the common interface language for operating system facilities, and compatibility at this level is essential for modern systems languages.</li></ul><p>Due to the restrictions of static typing and the C runtime model, Pre-Scheme does not (currently) support many of Scheme's high-level features, such as garbage collection, universal tail-call optimization, heap-allocated runtime closures, first-class continuations, runtime type checks, heterogenous lists, and the full numeric tower. Even with these limitations, Pre-Scheme enables a programming style that is familiar to Scheme programmers and more expressive than writing directly in C.</p><p>Pre-Scheme remains the bootstrapping mechanism for Scheme 48, but didn't see much adoption outside of this use-case. One reason for this is that the Pre-Scheme compiler wasn't exposed as part of an installed Scheme 48 distribution, it needs to be loaded from the source tree, making it an awkward dependency for other projects. Another reason is that the language and compiler interface weren't fully documented until the mid 2000s, when Taylor Campbell wrote <a href="/papers/s48-refman.pdf">&quot;The Nearly Complete Scheme48 Reference Manual&quot;</a> with a detailed description of Pre-Scheme tucked away in Chapter 9. In the time since, most development in Scheme has been focused on other implementations, and the innovations of Scheme 48 are mostly viewed as artifacts of history.</p><h2>Pre-Scheme Restoration</h2><p>Given the rise of modern low-level systems languages like Rust and Zig, increased recognition of statically-typed functional programming techniques thanks to the influence of Haskell, and the steady growth of a Scheme community interested in systems-level development driven by the Guix project, it seems clear that there is latent demand for a language like Pre-Scheme. Scheme offers an expressive and flexible language with good performance for general-purpose programming tasks, but there remain use-cases where dropping down to a lower-level language is beneficial, such as when interfacing with external libraries or writing performance-critical components. There are also applications where Scheme is not generally considered suitable, such as in writing virtual machines, garbage collectors, operating systems, and embedded systems. While there <a href="/papers/scheme-mobile-robots.pdf">is precedent</a> for using Scheme at this level, the use of C or other low-level languages remains common practice.</p><p>The primary objective of the Pre-Scheme Restoration project is to make Pre-Scheme available as a practical alternative to C for the wider Scheme community. This means bringing the Pre-Scheme compiler to as many Scheme implementations as possible, improving tooling so that it integrates smoothly with existing development workflows, revising the Pre-Scheme language to improve compatibility with both Scheme and C, and investing in documentation and examples so that it's easy to adopt. Due to the age and lack of exposure of the existing implementation, there are also open questions about compiler performance and correctness which will need to be addressed.</p><p>Thanks to the grant from NLnet, resources are now available to make significant progress towards meeting these objectives. A detailed plan can be found on <a href="/roadmap.html">the roadmap</a>, but the main points are as follows:</p><ol><li>Port the Pre-Scheme compiler to a selection of R7RS-compatible Scheme implementations, replacing the Scheme 48 dependency with a portable reader &amp; macro expander.</li><li>Document the compiler architecture and develop an initial test suite to aid in further porting and regression testing.</li><li>Extend the language and type system to support the set of sized numeric types commonly found in systems languages.</li><li>Complete support for sum types (tagged unions), and implement syntax for algebraic data-types and pattern matching.</li><li>Replace the default string type with a length-prefixed UTF-8 representation.</li><li>Revise other parts of the core language to improve R7RS compatibility, and implement a standard library covering as much of R7RS and relevant SRFIs as possible.</li><li>Implement a conventional command-line compiler interface, and an Emacs plugin to support interactive development.</li><li>Document the language, write tutorials, and develop some example projects.</li></ol><p>If you are interested in following this project, you can <a href="https://functional.cafe/@flatwhatson">follow me</a> or <a href="https://fosstodon.org/tags/prescheme">#prescheme</a> on the fediverse, subscribe to the <a href="/atom.xml">Atom feed</a> or <a href="/rss.xml">RSS feed</a>, or join us in the <a href="https://web.libera.chat/#guile-steel">#guile-steel</a> channel on IRC. Repositories for the port, this website, and related projects can be found <a href="https://codeberg.org/prescheme/">on Codeberg</a>.</p><h2>Acknowledgements</h2><p>This project is the continuation of my earlier efforts in porting the Pre-Scheme compiler to Guile, which were inspired by Christine Lemmer-Webber's post <a href="https://dustycloud.org/blog/guile-steel-proposal/">&quot;Guile Steel: a proposal for a systems lisp&quot;</a>. Christine has been a driving force behind this project from the beginning, cheering on those efforts, encouraging me to present at FOSDEM, and pushing me to apply for an NLnet grant. Her role in the current Scheme renaissance cannot be overstated. I'm grateful to everyone in the Scheme community who has expressed interest and support for this project, it means a lot, and I'm looking forward to future conversations and projects together. I would also like to thank Luis Filipe for the beautiful artwork, which is made available under CC BY-SA 4.0, and can be found in the <a href="https://codeberg.org/prescheme/prescheme-dot-org/">website repository</a>.</p>