graydon2: (Default)
This is a small note about a delightful function. Not cryptography advice or serious commentary. Just amusement.

A couple years back I had occasion to read in slightly more detail than I had before about the state of the art in cryptographically secure PRNGs (CSPRNGs). These are PRNGs we trust to have additional properties beyond the speed and randomness requirements of normal ones -- mainly that an attacker can't recover their internal state -- so you can use them to generate secrets.

If you look, you'll find a lot of people recommending something based on one of Dan Bernstein's algorithms: Salsa20 or ChaCha (or even more obscurely "Snuffle"). All the algorithms we're discussing here are very similar in design, and vary only in minor details of interest only to cryptographers.

If you follow that link though, you'll notice it's a description of a (symmetric) stream cipher. Not a CSPRNG at all!

But that's ok! Because it turns out that people have long known an interesting trick -- actually more of a construction device? -- which is that a CSPRNG "is" a stream cipher. Or rather, if you hold it the other way, you might even say a stream cipher "is" just a CSPRNG. Many stream ciphers are built by deriving an unpredictable "key stream" off the key material and then just XOR'ing it with the plaintext. So long as the "key stream" is unpredictable / has unrecoverable state, this is sufficient; but it's the same condition we want out of the stream of numbers coming out of a CSPRNG, just with "seed" standing in for "key". They're fundamentally the same object.
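
To make that shape concrete, here's a minimal sketch in Rust of the "keystream XOR data" idea, with a deliberately toy (decidedly non-cryptographic) PRNG standing in for the keystream generator -- everything here is made up for illustration, the point is only the structure:

// A stream cipher is just "PRNG keystream XOR data", with the key as the seed.
// The PRNG below is a trivial xorshift -- NOT secure, purely illustrative.
struct ToyPrng { state: u64 }

impl ToyPrng {
    fn new(seed: u64) -> Self { ToyPrng { state: seed | 1 } } // keep state nonzero
    fn next_byte(&mut self) -> u8 {
        self.state ^= self.state << 13;
        self.state ^= self.state >> 7;
        self.state ^= self.state << 17;
        (self.state & 0xff) as u8
    }
}

// Encryption and decryption are the same operation: XOR with the keystream.
fn apply_keystream(key: u64, data: &mut [u8]) {
    let mut prng = ToyPrng::new(key);
    for b in data.iter_mut() {
        *b ^= prng.next_byte();
    }
}

fn main() {
    let mut msg = *b"attack at dawn";
    apply_keystream(42, &mut msg); // "encrypt"
    apply_keystream(42, &mut msg); // "decrypt" is the identical call
    assert_eq!(&msg, b"attack at dawn");
}

Note that "encrypt" and "decrypt" are literally the same call; all the cryptographic weight rests on how unpredictable the keystream is.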

I knew all this before, so people naming a CSPRNG and a stream cipher the same did not come as any surprise to me. But I went and looked a little further into ChaCha in particular (and its ancestor Salsa and, earlier still, Snuffle) because they have one additional cool and weird property.

They are seekable.

This means that you can, with O(1) effort, "reposition" the Snuffle/Salsa/ChaCha "key stream" / CSPRNG number stream to anywhere in its future. You want the pseudorandom bytes for block 20,000,000? No problem, just "set the position" to 20,000,000 and it will output those bytes. This is not how all CSPRNGs or stream ciphers work. But some do. ChaCha does! Which is very nice. It makes it useful for all sorts of stuff, especially things like partially decrypting randomly-read single blocks in the middle of large files.

I got to wondering about this, so I went back and read through design docs on it, and I discovered something surprising (to me): it's not just a floor wax and a dessert topping, a CSPRNG and a stream cipher. ChaCha is also a cryptographic hash function (CHF)! Because a CHF is also something you can build a CSPRNG out of, and therefore also build a stream cipher out of. They're all the same object.

How does the construction work? Embarrassingly easily. You put the key material and a counter (and enough fixed nonzero bits to make the CHF happy) in an array and hash it. That's it. The hash output is your block of data. For the next block, you increment the counter and hash again. Want block 20,000,000? Set the counter to 20,000,000. The CHF's one-way-function-ness implies the non-recoverability of the key material, and its mixing properties ensure that bumping the counter is enough to flip lots of bits. The end.
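
Here's a minimal sketch of that construction in Rust, just to pin down the shape. It assumes the sha2 crate as a stand-in CHF, and the constant, input layout and function names are illustrative -- ChaCha's real core function and block layout are different:

// hash(constant || key || counter) -> the counter-th block of the keystream.
// Seeking is O(1): want block 20,000,000? Pass counter = 20_000_000.
use sha2::{Digest, Sha256};

fn csprng_block(key: &[u8; 32], counter: u64) -> [u8; 32] {
    let mut input = Vec::with_capacity(16 + 32 + 8);
    input.extend_from_slice(b"expand 32-byte k");   // fixed nonzero bits to keep the CHF happy
    input.extend_from_slice(key);                    // the seed, a.k.a. the key
    input.extend_from_slice(&counter.to_le_bytes()); // bump this for the next block, or seek anywhere
    let mut block = [0u8; 32];
    block.copy_from_slice(Sha256::digest(&input).as_slice()); // one-way: output reveals nothing about the key
    block
}

fn main() {
    let key = [7u8; 32];
    // Identical call, arbitrary position: that's the whole seeking trick.
    let block = csprng_block(&key, 20_000_000);
    println!("{:02x?}", &block[..8]); // first few bytes of block 20,000,000
}

Seeking falls out for free here: nothing about block N depends on block N-1.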

Amazing!

But then I got curious and dug a bit into the origins of ChaCha and .. stumbled on something hilarious. In the earliest design doc I could find (Salsa20 Design which still refers to it as "Snuffle 2005") the introduction starts with this:

Fifteen years ago, the United States government was trying to stop publication
of new cryptographic ideas—but it had made an exception for cryptographic
hash functions, such as Ralph Merkle’s new Snefru.

This struck me as silly. I introduced Snuffle to point out that one can easily
use a strong cryptographic hash function to efficiently encrypt data.
Snuffle 2005, formally designated the “Salsa20 encryption function,” is the
latest expression of my thoughts along these lines. It uses a strong cryptographic
hash function, namely the “Salsa20 hash function,” to efficiently encrypt data.

This approach raises two obvious questions. First, why did I choose this
particular hash function? Second, now that the United States government seems
to have abandoned its asinine policies, why am I continuing to use a hash function
to encrypt data?


In other words: the cool seekability wasn't a design goal. Snuffle/Salsa/ChaCha was intended as a tangible demonstration of a political argument: that it's stupid to regulate some of the 3 objects (CHF, CSPRNG and stream cipher) but not others, since you can build them all out of the CHF. (And, I guess, "obviously you should be allowed to export CHFs" though I wouldn't bet on anything being obvious to the people who make such decisions).

And then I googled more and realized that when I was a teenager I had completely missed all the drama / failed to connect the dots. Snuffle was the subject of Bernstein v. United States, the case that overturned US export restrictions on cryptography altogether! And as this page points out "the subject of the case, Snuffle, was itself an attempt to bypass the regulations".

Anyway, I thought this was both wonderful and funny: partly the CHF-to-CSPRNG construction (which I'd never understood/seen before), but also the fact that Snuffle/Salsa/ChaCha is like the ultimate case of winning big in cryptography. Not only does ChaCha now transport like 99% [EDIT: "double-digit percentages"] of the world's internet traffic (it's become the standard we all use because it's fast and secure), but it was also pivotal in the evolution of the legal landscape -- and it all arises from a sort of neener-neener assessment that the law at the time was internally inconsistent / contained a loophole for CHFs that made the whole thing "asinine".

consensus

Jul. 27th, 2025 11:29 pm
graydon2: (Default)
I've been working in and around consensus systems for a long time now and I still find a lot of the writing on the subject very vexing. The protocols are often very complex and very subtle and it is often very hard to see the forest for the trees.

I decided this weekend to try to write down what I am fairly sure I actually know about consensus algorithms, in a form I could fit on one page. Here it is!

Consensus



We want some computers to agree on a thing, but there are problems:

  1. The speed of light
  2. Computers can stop running
  3. Computers can keep running but be disconnected
  4. Computers can keep running but have bugs and lie


We have some tricks that we have found help, that most consensus algorithms are some combination of:

  1. Promises
  2. Voting
  3. Preemption
  4. Epistemic levels (I know that you know that I know...)
  5. Digital signatures


In particular:

  1. If you promise me you'll do something in the future, and put that in an envelope, then even if I don't get the message for a while because of the speed of light I will know at least a little about your current instantaneous state when I get the message: that you're doing, or about to do, or have done the thing you promised (unless you crashed or lied). It narrows the space of possible futures: if you're an honest node, you won't be doing some other thing than what you promised. Faster-than-light communication!

  2. If we agree not to do a thing unless there's a majority passing a vote then:

    1. We can pick a fault probability and then size the number of nodes to ensure that vote-majorities are achievable even with the expected worst case number of simultaneous faults, which handles most cases of nodes stopping.

    2. We know there can't be two different things decided by a majority because there can only be one majority, which helps with some bugs (though not lies) and races due to the speed of light.

    3. We know that if there's a network partition then only one side can possibly get to a majority, and the other can only get a minority and never observe the majority, which transforms the nodes in the minority into a special case of "stopping".


  3. If we agree on a mechanism where a vote initiated by one node can be preempted by another node initiating a new vote, then we can handle any case where the initiating node itself fails or is isolated in a partition or is slow or whatever.

  4. If you tell me not just what you know about yourself but also what you know about other nodes, or even what those nodes told you about what they know about other nodes (ideally conveyed with digital signatures so I know you aren't lying about what others said to you), then:

    1. I can ask you to tell me both your vote and, later, what you saw as the outcome of everyone else's voting. This further limits the set of possible futures of the system: once I know a majority say that a majority voted for X then the decision for X is no longer subject to being lost or reversed due to even massive node stoppage or network failure. Any one node in that majority has enough evidence to convince anyone else that X is the majority decision. A node can only decide not-X after that point if someone was buggy or lying.

    2. I (or other correct-and-honest nodes) can compare those observations and catch any bugs-or-liars, at least up to some given scale of collusion, because some buggy-or-lying node L would by definition be making statements inconsistent with the protocol (eg. "I vote X" as part of a majority, but later "I saw the majority including myself vote Y"). And this "say X and also Y" inconsistency will be caught. If L sends this pair of inconsistent messages directly to some correct-and-honest node M, then M immediately knows L is buggy or lying. If L sends the X message to M and the Y message to some other correct-and-honest node N, it still gets caught when M and N exchange their statements-received-from-L. And this covers both intentional lies and bugs -- though an intentional liar knowing this part of the protocol exists will not bother to lie because they know it will be caught, so it only leaves bugs.
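
Here's a tiny sketch of that catching mechanism in Rust. The data shapes and names are mine, and the digital-signature part is elided entirely (in a real protocol each statement would be signed, so relayed copies can't be forged or repudiated):

use std::collections::HashMap;

// One statement a node claims to have made about one voting round.
#[derive(Clone, Debug)]
struct Statement {
    author: String, // who (claims to have) said it
    round: u64,     // which vote it refers to
    value: String,  // what they said ("X", "Y", ...)
}

// Collect statements relayed from peers; if the same author is ever seen saying
// two different things about the same round, we hold proof they're buggy or lying.
#[derive(Default)]
struct EquivocationDetector {
    seen: HashMap<(String, u64), Statement>,
}

impl EquivocationDetector {
    // Returns Some((earlier, later)) as evidence if `s` contradicts a prior statement.
    fn observe(&mut self, s: Statement) -> Option<(Statement, Statement)> {
        let key = (s.author.clone(), s.round);
        if let Some(prev) = self.seen.get(&key) {
            if prev.value != s.value {
                return Some((prev.clone(), s)); // caught: "say X and also Y"
            }
            return None; // consistent repeat, nothing to report
        }
        self.seen.insert(key, s);
        None
    }
}

fn main() {
    let mut d = EquivocationDetector::default();
    let x = Statement { author: "L".into(), round: 7, value: "X".into() };
    let y = Statement { author: "L".into(), round: 7, value: "Y".into() };
    assert!(d.observe(x).is_none()); // first statement from L: fine so far
    assert!(d.observe(y).is_some()); // relayed contradictory statement: L is caught
}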




I think that's it! There's also some more stuff about livelocks, timeouts, leadership, fast paths, censorship resistance, fairness, pipelining and information dependencies that all seem like .. eh .. details? Fine tuning? Perhaps just "easier to digest in small pieces if you can keep the big picture in mind". Also much more variable between different protocol families.

Anyway I hope this is useful to someone besides myself. At least I can print it out and stick it on my wall when I am next trying to decide what makes a given protocol safe. Also if you think I've overlooked one of the main big-picture elements, or gotten one of them wrong, please let me know!
graydon2: (Default)
A long time ago I wrote on twitter (now erased): "surprising how much computer stuff makes sense viewed as tragic deprivation of sum types".

Sum types (a.k.a. disjoint unions a.k.a. tagged unions a.k.a. safe variant types) are of course wonderful and nice and everyone should have them. ML and Haskell have them, and now Rust has them, as does Swift, and Scala with case classes, and C++ kinda with std::variant, and .. lots of modern languages are growing them. Great, hurrah.

But there is just a little bit of subtlety to doing them right. The subtlety is that you have to build the language to couple together two pieces of data -- a tag (or "discriminant") and an unsafe union -- with mandatory syntactic constructs (switch or match expressions) so that you only get to access the unsafe union elements when you've already checked the tag, and the type system and name resolution systems know you're in a given case-block, and so only give you access to the union-fields (properly typed) corresponding to the tag case. It's not a hugely complex thing to get right, but you have to get it right.
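
As a minimal illustration of that coupling, here's roughly what it looks like in Rust (the type and the names are just made up for the example):

// The tag and the union are one construct; the only way to reach a case's
// payload is through a match arm that has already established which case it is.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

fn area(s: &Shape) -> f64 {
    match s {
        // Inside each arm, only that case's fields are in scope, correctly typed.
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { width, height } => width * height,
        // Omit a case and the compiler rejects the match as non-exhaustive.
    }
}

fn main() {
    let c = Shape::Circle { radius: 1.0 };
    println!("{}", area(&c)); // prints pi
}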

There are three main degenerate ways you can fail to get it right.

  1. You can give users syntactically unguarded access to union members, say by using container.field syntax, in which case all you can do if the tag doesn't match that field at runtime is to raise a runtime error, which you can at least do systematically, but the ergonomics are lousy: it's inefficient (you wind up checking twice) and it doesn't help the user avoid the runtime error by statically forcing cases to be handled.

  2. You can do #1 but then also fail to even raise a runtime error when the tag is wrong. At which point the tag is basically "advisory", so then...

  3. You can even go a step further than #2 and not even require users to declare a tag. Just let the user "know the right case" using some unspecified method, an invariant they have to track themselves. They can use a tag if they like, or some bits hidden somewhere else, who knows.
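
For contrast, Rust also still contains degenerate case #3, in the form of untagged union (kept around mostly for C interop), and it makes a decent illustration of what the coupling above buys you: nothing tracks which field is live, so every read is unsafe and the invariant is entirely yours to maintain. A tiny sketch:

// Degenerate case #3: an untagged union. No tag, no checking; the compiler can
// only trust you to know which field is live, hence the `unsafe`.
union Untagged {
    int: i32,
    float: f32,
}

fn main() {
    let u = Untagged { float: 1.5 };
    // Reading the "wrong" field isn't caught by anything -- you just get the
    // bits of 1.5f32 reinterpreted as an i32.
    let bits = unsafe { u.int };
    println!("{bits:#x}");
}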


Ok so .. Casey Muratori gave a great recent talk on the origin of, well, certain OOP-y habits of encapsulation, which is mostly about Entity-Component-System organization, but .. also a bit about sum types. He spent a lot of time digging in the PL literature and reconstructing the history of idea transmission. I just watched it, and it's a great talk and you should go watch it, it's here:



One of the things he discusses in here is that safe and correctly-designed disjoint unions aren't just an ML thing, they were around in the early 60s at least. Wikipedia thinks Algol 68. Muratori places their origin in Doug Ross and/or Tony Hoare talking about amendments to Algol 60, linking to this Tony Hoare paper but of course Hoare references PL/I there and I'm honestly not sure about the exact origin, it's somewhere around there. Point being it predates ML by probably a decade.

But another thing Muratori points out is that Dahl and Nygaard copied the feature in safe working form into Simula, and Stroustrup knew about it and intentionally dropped it from C++, thinking it inferior to the encapsulation you get from inheritance. This is funny! Because of course C already had case #3 above -- completely unchecked/unsafe unions, they only showed up in 1976 C, goodness knows why they decided on that -- and the safe(ish) std::variant type has taken forever to regrow in C++.

I was happy to hear this, because it mirrors to some extent another funny story I have reconstructed from my own digging in the literature. Namely about the language Mesa, a Butler Lampson project from PARC, very far ahead of its time too. There's far too much to discuss about Mesa to get into here -- separate compilation, interface/implementation splits, etc. -- but it did also have safe variant types (along with a degenerate form called "computed" which is unsafe). Presumably it picked them up from Algol 68, or Simula 67, or one of probably dozens of languages in the late 60s / early 70s it emerged from. No big deal.

Where that relatively unremarkable fact turns into a funny story is during a "technology transfer" event that happened during Mesa's life: Niklaus Wirth took a sabbatical to PARC, and became quite enamoured with Mesa, and went back home to work on what became Modula and eventually Modula 2. But Modula 2 had only degenerate variant types! He copied them not from Mesa, but in the same busted form they existed in Pascal, completely missing that Mesa did them correctly. Modula 2 variants are degenerate case #1 above (if you turn on a special checking compilation mode) otherwise case #2: tags declared but no checking at all! He even writes it up in the report on Modula 2 and Oberon, criticizing its lack of safety while simultaneously citing all the ways Mesa influenced his design. Evidently not enough.

Anyway, all this is to say: language features are easily broken, mis-copied, forgotten or intentionally omitted due to the designer's pet beliefs. Progress is very circuitous, if it exists at all!
graydon2: (Default)
somewhat contrary to the previous post: another pet peeve is people (some classic libertarians, others paleoconservatives) saying that "there is no free lunch".

there are certainly zero-sum, no-win situations that occur in life sometimes. and the phrase is often also used in some fatalistic, eschatological, broad heat-death-of-the-universe sense. sure sure, second law.

but as the great MC hawking put it: the earth's not a closed system, it's powered by the sun. there's an effectively unlimited massive fusion reactor in space we literally all live off of, and for all practical purposes we always have and always will. it is very much a free lunch! concretely: every lunch you have ever eaten and will ever eat is a free lunch given to earth by the sun.

also like .. any technological improvement that increases efficiency of some work is a metaphorical free lunch. if there were no free lunches to be had from R&D we may as well be banging rocks together as using any later developments.

also any positive-sum games or interactions, social relationships, political organization, economic cooperation .. the list of free lunches goes on and on.

enjoy your free lunches!
graydon2: (Default)
reading some technology reporting today and they're using the term flywheel, a pet peeve of mine. I have written about this before on some ephemeral social media, but I am moved now to repeat my objection in a place of greater posterity.

the word "flywheel" seems to have entered the business-writing lexicon with a book about amazon called "good to great". I have not read this and have no intention of reading it. perhaps the metaphor was used sensibly there. it is no longer used sensibly. here are a couple representative samples I just found via google of the way it is used nowadays (emphasis mine):

By definition, a flywheel is a heavy revolving wheel that is used in a machine to increase momentum and therefore provide greater stability to the machine. Given its weight, the flywheel is difficult to push from a standstill, but once it starts moving it gradually builds momentum, which eventually enables the wheel to turn by itself and create even more of its own momentum through a self-reinforcing loop.

or:

A flywheel is a massive metal disk, or wheel, that often weighs over 2,000 kgs. It takes a lot of effort to get it started, but once it starts to turn there are counterweights around the outside of the wheel that start to take effect and it starts to build momentum almost by itself. From that point, the same effort can be placed on the flywheel and it will start to turn faster and faster.

this is characteristic of the way people use the term now. they talk about "getting the flywheel going" on their business, because once you're over some kind of threshold the flywheel will somehow magically start spinning faster and faster on its own.

that is not what a flywheel is or what it does at all.

a flywheel is a kinetic battery. you put angular momentum into it when you have a surplus and you can take some back out when you have a deficit (assuming friction hasn't lost it all yet). other metaphors that have a similar effect are account balances, or warehouse inventory, or queues. or taking an average of something noisy over time. take your pick.

a wheel that somehow went faster of its own volition would be a perpetual motion machine. a fantastical solution to all the world's energy needs. but also fairly prohibited by .. physics. there is no such thing.

people using the metaphor this way seem to be getting it confused with positive feedback phenomena. which do exist! even in business! here is a classic one: sales volume up => unit production cost down => sale price down => customer demand up => sales volume further up. a.k.a. "economies of scale". great stuff, bravo capitalism. it has some other positive feedback loops that are not so great like "overproduction crisis" or "market panic" but we need not dwell on those.

there are also lots of other non-capitalism examples of positive feedback phenomena. population growth, cytokine storms, even the digital flip-flop circuits storing this post are positive feedback systems.

but: a flywheel is not a positive feedback system. not at all. please, I beg you: for pity's sake stop using it as a muddled metaphor for one.
graydon2: (Default)
I just noticed that this post, from around when I first learned the term "substructural type system", is (almost) 20 years old. That sure was a while ago.

(I knew the space of ideas already but was working my way through more legit treatments of it. amusingly none of the research links in that post are live anymore -- back then arxiv.org was still "xxx.lanl.gov" haha)
graydon2: (Default)
Elsewhere I've been asked about the task of replaying the bootstrap process for rust. I figured it would be fairly straightforward, if slow. But as we got into it, there were just enough tricky / non-obvious bits in the process that it's worth making some notes here for posterity.

UPDATE: someone has also scripted many of the subsequent snapshot builds covering many years of rust's post-bootstrap development. Consider the rest of this post just a verbose primer for interpreting their work.

context


Rust started its life as a compiler written in ocaml, called rustboot. This compiler did not use LLVM, it just emitted 32-bit i386 machine code in 3 object file formats (Linux ELF, macOS Mach-O, and Windows PE).

We then wrote a second compiler in Rust called rustc that did use LLVM as its backend (and which, yes, is the genesis of today's rustc) and ran rustboot on rustc to produce a so-called "stage0 rustc". Then stage0 rustc was fed the sources of rustc again, producing a stage1 rustc. Successfully executing this stage0 -> stage1 step (rather than just crashing mid-compilation) is what we're going to call "bootstrapping". There's also a third step: running stage1 rustc on rustc's sources again to get a stage2 rustc and checking that it is bit-identical to the stage1 rustc. Successfully doing that we're going to call "fixpoint".

Shortly after we reached the fixpoint we discarded rustboot. We stored stage1 rustc binaries as snapshots on a shared download server and all subsequent rust builds were based on downloading and running that. Any time there was an incompatible language change made, we'd add support and re-snapshot the resulting stage1, gradually growing a long list of snapshots marking the progress of rust over time.

time travel and bit rot


Each snapshot can typically only compile rust code in the rust repository written between its birth and the next snapshot. This makes replaying the entire history awkward (see above). We're not going to do that here. This post is just about replaying the initial bootstrap and fixpoint, which happened back in April 2011, 14 years ago.

Unfortunately all the tools involved -- from the host OS and system libraries involved to compilers and compiler-components -- were and are moving targets. Everything bitrots. Some examples discovered along the way:

  • Modern clang and gcc won't compile the LLVM used back then (C++ has changed too much -- and I tried several CXXFLAGS=-std=c++NN variants!)
  • Modern gcc won't even compile the gcc used back then (apparently C as well!)
  • Modern ocaml won't compile rustboot (ditto)
  • 14-year-old git won't even connect to modern github (ssh and ssl have changed too much)


debian


We're in a certain amount of luck though:

  • Debian has maintained both EOL'ed docker images and still-functioning fetchable package archives at the same URLs as 14 years ago. So we can time-travel using that. A VM image would also do, and if you have old install media you could presumably build one up again if you are patient.
  • It is easier to use i386 since that's all rustboot emitted. There's some indication in the Makefile of support for multilib-based builds from x86-64 (I honestly don't remember if my desktop was 64 bit at the time) but 32bit is much more straightforward.
  • So: docker pull --platform linux/386 debian/eol:squeeze gets you an environment that works.
  • You'll need to install rust's prerequisites also: g++, make, ocaml, ocaml-native-compilers, python.


rust


The next problem is figuring out the code to build. Not totally trivial but not too hard. The best resource for tracking this period of time in rust's history is actually the rust-dev mailing list archive. There's a copy online at mail-archive.com (and Brian keeps a public backup of the mbox file in case that goes away). Here's the announcement that we hit a fixpoint in April 2011. You kinda have to just know that's what to look for. So that's the rust commit to use: 6daf440037cb10baab332fde2b471712a3a42c76. This commit still exists in the rust-lang/rust repo, no problem getting it (besides having to copy it into the container since the container can't contact github, haha).

LLVM


Unfortunately we only started pinning LLVM to specific versions, using submodules, after bootstrap, closer to the initial "0.1 release". So we have to guess at the LLVM version to use. To add some difficulty: LLVM at the time was developed on subversion, and we were developing rust against a fork of a git mirror of their SVN. Fishing around in that repo at least finds a version that builds -- 45e1a53efd40a594fa8bb59aee75bb0984770d29, which is "the commit that exposed LLVMAddEarlyCSEPass", a symbol used in the rustc LLVM interface. I bootstrapped with that (brson/llvm) commit but subversion also numbers all commits, and they were preserved in the conversion to the modern LLVM repo, so you can see the same svn id 129087 as e4e4e3758097d7967fa6edf4ff878ba430f84f6e over in the official LLVM git repo, in case brson/llvm goes away in the future.

Configuring LLVM for this build is also a little bit subtle. The best bet is to actually read the rust 0.1 configure script -- when it was managing the LLVM build itself -- and work out what it would have done. But I have done that and can now save you the effort: ./configure --enable-targets=x86 --build=i686-unknown-linux-gnu --host=i686-unknown-linux-gnu --target=i686-unknown-linux-gnu --disable-docs --disable-jit --enable-bindings=none --disable-threads --disable-pthreads --enable-optimized

So: configure and build that, stick the resulting bin dir in your path, and configure and make rust, and you're good to go!
root@65b73ba6edcc:/src/rust# sha1sum stage*/rustc
639f3ab8351d839ede644b090dae90ec2245dfff  stage0/rustc
81e8f14fcf155e1946f4b7bf88cefc20dba32bb9  stage1/rustc
81e8f14fcf155e1946f4b7bf88cefc20dba32bb9  stage2/rustc


Observations


On my machine I get: 1m50s to build stage0, 3m40s to build stage1, 2m2s to build stage2. Also stage0/rustc is a 4.4mb binary whereas stage1/rustc and stage2/rustc are (identical) 13mb binaries.

While this is somewhat congruent with my recollections -- rustboot produced code faster, but its code ran slower -- the effect size is actually much less than I remember. I'd convinced myself retroactively that rustboot produced abysmally worse code than rustc-with-LLVM. But out of the gate, LLVM only boosted performance by 2x (at a cost of 3x the code size)! Of course I also have a faster machine now. At the time bootstrap cycles took about a half hour each (according to this: 15 minutes for the 2nd stage).

Of course you can still see this as a condemnation of the entire "super slow dynamic polymorphism" model of rust-at-the-time, either way. It may seem funny that this version of rustc bootstraps faster than today's rustc, but this "can barely bootstrap" version was a mere 25kloc. Today's rustc is 600kloc. It's really comparing apples to oranges.
graydon2: (Default)
Today I was asked -- as I am frequently asked! -- about the origins of something in Rust. In this case the origin of the "borrow" terminology. Not the idea, just the word. Who first used that word?

The short answer to this is "as far as I know, John Boyland, in his 1999 paper Alias killing: Unique variables without destructive reads" (later expanded-on in a more widely cited 2000 paper where he renamed it to the slightly less dramatic "alias burying").

John's paper was read by and influenced the Cyclone folks somewhere between 2002 and 2007. And both of these works were read by and influenced Niko Matsakis when he proposed upgrading Rust's existing and somewhat half-baked alias checker into something resembling today's region/lifetime/borrowing system (at the time this was motivated by soundness issues emerging from interactions with Rust's at-the-time half-baked mutability rules, creating the so-called "Dan's Bug").

This was not impossible to find -- I know a few extra places to look mind you -- but digging it up produced a couple surprises to me:

  • I thought -- and nearly answered the person asking me -- that one of us in the Rust project had coined the term "borrow" as an ergonomic adaptation, because "region" and "alias" are clunky and nonintuitive. This was a retcon on my part! Easy to imagine, but wrong.

  • Even more interesting to me is how many other works there are in the Related Works section of John's paper! Not just things I've seen before[1] (eg. Cyclone, Tofte & Talpin's ML Kit, Ada, Euclid[2]) but also a spectrum of delightfully weird others: Eiffel*, Islands, Balloons, FAP, ESC with its "virgin, pivot and plenary" references, etc. This reference-set pulls at threads happening in multiple teams/projects/institutions, all through the 90s, 80s, even back into the 1970s, eg. here is Reynolds grappling with it in 1978.


Now, I don't want to minimize Niko's work here at all -- he's responsible for doing the lion's share of detailed design and implementation work on Rust's borrowing system in particular, along with many many other aspects of the language -- but I do think this is an illustrative example of something we should all keep in mind: little-if-anything intellectual happens in a vacuum, few-if-any ideas spring forth fully formed. And we're all curiously willing to retcon the origins of such things. Even things we ourselves participated in!

(I am reminded of similar -- much weightier -- arguments that I have read about whether Darwin invented the concept of Evolution, or Einstein of Relativity, or Planck of Quantum Mechanics. On close inspection the person who gets cited is always part of a lengthy and ongoing research program spanning decades, never just "inventing stuff in isolation". But we are all very inclined to imagine it as so!)




[1]: I have elsewhere written about precedents-I-knew-about when working on Rust on my own, and designing the original somewhat half-baked move-semantics and mutable-alias-control systems:

[2]: A genuinely odd coincidence is that the earliest reference I can find to anyone grappling with mutable aliasing control in a PL design -- not just a complaint about how it breaks Hoare[3] logic -- is Euclid, which was a multi-site / multi-collaborator project including the one and only Butler Lampson, but was fairly substantially focused in Toronto (Horning, Holt, Cordy) and published in 1977. I was born in 1977 and grew up in Toronto, so somehow this all seems like, I don't know, something that was in the air?


[3]: No relation. I know, right? How can this be?

PERQ

Sep. 7th, 2024 01:12 am
graydon2: (Default)
A note on the PERQ computer.

technical history muttering )
graydon2: (Default)
I spent another couple days working with talon (and its many extensions) for voice-coding and found a good set of additional extensions and modes: a multiple-named-clipboards mode, a dense grid mouse navigation mode, a window-moving mode, and several sub-modes of the standard community library for multiple cursors, find-and-replace, page and tab nav, named abbreviations and websites, and so forth.

I've collected a cheatsheet here if anyone's curious.

As it happens I also spent a few days going carefully through different muscle groups along my arms and back and found a gigantic knot under my right shoulder which I think accounts for most of the wrist trouble. I spent a bunch of time with a lacrosse ball (hint: these are the best muscle-release massage equipment in the entire world and are also very cheap and convenient) and it seems to be mostly fixed now? So my actual motivation to keep working with talon has gone from relatively-pressing to moderate-residual-curiosity and I'm mostly just typing again now. I'll see if this keeps up or if I go back to it!
graydon2: (Default)
My wrist has been hurting a little bit recently, and a friend of mine has been telling me about how he's using Talon (and its extensions Cursorless and Rango) as an interface to programming entirely with his voice, so I thought I'd give it a try. I'm now about three days into using it, and I have it working well enough that I just composed my first real patch that makes a not completely trivial change to a program, and while it took me almost an entire workday to complete a 165-line patch, I can definitely feel myself accelerating and gaining fluency in the way the system works.

(This blogpost is also being mostly composed using Talon)

notes )
graydon2: (Default)
Recently Boats wrote a blog post about Rust, mutable aliasing, and the sad story of local reasoning over many decades of computer science. I recommend that post and agree with its main points! Go read it! But I also thought I'd add a little more detail to an area it's less acutely focused on: formal methods / formal verification.

TL;DR: support for local reasoning is a big factor in the ability to do automated reasoning about programs. Formal verification involves such reasoning. Rust supports it better than many other imperative systems languages -- even some impure functional ones! -- and formal verification people are excited and building tools presently. This is not purely by accident, and is worth understanding as part of what makes Rust valuable beyond "it doesn't need a GC".

The rest of this post is just unpacking and giving details of one or more of the above assertions, which I'll try to address in order of plausible interestingness to the present, but I will also throw in some history because I kinda can't help myself.

long post for the PL nerds in the audience )

reading

Jan. 14th, 2024 01:31 pm
graydon2: (Default)
I'm hoping to do a little more reading in 2024 than I managed in 2023 (at least in part due to work calming down a bit). This week I finished off two books from last year.
blah blah )
graydon2: (Default)
In a recent podcast about Rust leadership, the BDFL question came up again and Jeremy Soller said (in the understatement of the century) that "I believe Graydon would have said no to some things we all like now". And this echoes a different conversation on reddit where I was reminded that I meant to write down at some point how "I would have done it all differently" (and that this would probably have been extremely unsatisfying to everyone involved, and it never would have gone anywhere).

Boy Howdy would I ever. This is maybe not clear enough, and it might make the question of whether the project "really should have had a BDFL" a little sharper to know this: the Rust We Got is many, many miles away from The Rust I Wanted. I mean, don't get me wrong: I like the result. It's great. I'm thrilled to have a viable C++ alternative, especially one people are starting to consider a norm, a reasonable choice for day-to-day use. I use it and am very happy to use it in preference to C++. But!

There are so, so many differences from what I would have done, if I'd been "in charge" the whole time.

differences! )
graydon2: (Default)
Over on the socials, someone asked "Do you ever wish you had made yourself BDFL of the Rust project? Might there be less drama now if the project had been set up that way?"

This is a tricky question, not because of what it's superficially asking -- the answer there is both "no" and "no" -- but because of what I think it's indirectly asking. I think it's asking: is there some sort of original sin or fatal flaw in the Rust Project's governance -- perhaps its lack of a BDFL -- that's to blame for its frequent internal sociopolitical crises?

To that I can speak, hopefully briefly, or at least concretely. I think there are a few underlying causes, and one big pattern.
governance )
graydon2: (Default)
A short note about corporate free / open source software (FOSS), and corporate-employed maintainers. Or specifically "corporate-employed maintainers .. with bad incentives". I tried to come up with a pithy name, but that's the best I could do: CEMBIs.

I've worked professionally in FOSS for a long time. I've seen a lot of corporate approaches to FOSS participation over that time, and seen the FOSS community develop a fairly nuanced understanding of what sorts of corporate strategies are welcome or unwelcome, healthy or harmful to a project. We have articulated a lot of thoughts about (say) open core and dual licensing business models, or (say) which forms of corporate "embrace" represent a step on an "extend/extinguish" path, or (say) which forms of telemetry are appropriate and which put users at risk of surveillance.

What I haven't seen a lot of discussion of, and wish I did see, is the structure and content of relationships that exist between corporate-employed FOSS maintainers and their employers. And I think this matters because the people doing corporate FOSS aren't soul-less automata executing corporate strategy. They are people with their own motivations, incentives, a certain amount of autonomy, but (most relevant to my concerns) a set of performance-evaluation criteria they have to satisfy to remain employed and/or get promoted.

Companies incentivize lots of things, but I worry (and in many cases I either know first hand or have heard second hand) that companies often have an incentive structure that rewards novelty, especially in the form of features if not entire products. There's a good reason for this when the company is "making products to sell": this year's new-and-improved is always sellable over last year's tired old model. But it's also just a reflection of the "growth orientation" of capitalism in general: whatever activity the company accomplished last year, a common measure of health and vitality isn't consistent execution but growth of the business. Failure to grow is treated as stagnation, which is treated as equivalent to death. This orientation can be embedded so deep in a company's DNA that it's the incentive structure given to everyone who works there, regardless of whether they're making products, auditing the accounts, or maintaining corporate infrastructure.

And to a large measure, that's what FOSS is. Not always, but usually: infrastructure. It's stuff that's supposed to work the same way from one day to the next. Stuff that's not supposed to be noticed because it just works. Reliably, efficiently, silently. Stuff that has a massive installed base of users relying on it, massive social and institutional inertia, and thereby massive (and sensible) built-in resistance to novelty. You don't actually want novelty in the electrical grid, the drinking water system, sewers, roads, bridges, rail lines, telecoms .. you want this stuff to be absolutely rock solid and not novel in the least. That's what maybe 90 or 95% of FOSS is like, certainly the stuff that needs reliable corporate maintainers.

For corporate-employed FOSS maintainers working at a firm with these "growth and novelty" incentives -- CEMBIs -- this leads to a real quandary. They're in a position where their job performance is very likely to be evaluated in terms of visible growth and novelty (it might be dressed up in more abstract terms like "real-world impact" or "visibility" but it still means the same thing) even though that is exactly the wrong thing for the health of the project they're maintaining. The incentives are bad. If they do the best thing for the project as infrastructure -- triage and fix bugs from the backlog, optimize performance, increase security and reliability, pay down tech debt, simplify and automate ongoing maintenance -- the bias of their organization is likely to view them as "doing nothing", or at best doing "low-value work" that only counts as "reducing fixed costs", not leading the way towards new growth. To be seen in a positive light by their employer, the CEMBI winds up having to do essentially anti-maintenance work: make the program "do something new and exciting" that it didn't do yesterday. Ignore maintenance of "what is", focus on "what's next".

Seeing this over and over in FOSS makes me a bit sad. I guess I've maybe been watching too many re-runs of The West Wing lately, but while I've been thinking about this subject, I keep thinking of the "Let Bartlet Be Bartlet" episode near the end of the first season, where the characters come around to the idea that maybe it'd be best to just do what they were elected to do. I keep thinking it'd be nice if employers could incentivize their employees to do what they were hired to do. To "Let Maintainers Be Maintainers". Maintaining FOSS isn't low-value work. It's essential work, stuff that you ignore at your medium and long-term peril. As a new homeowner I will make more salient analogies: it's testing the wiring for faults, testing the walls for mould, replacing the roof before it leaks. It's work that has to be done in order for FOSS to keep functioning over the long term. Software actually does "break down and wear out": requirements change; upstream and downstream platforms and libraries change incompatibly; patterns of usage change; hardware changes; new subsystems become performance bottlenecks; new bugs are discovered, some of which will be high severity security vulnerabilities, others will merely grow in importance as they're encountered more often. Software maintenance is real and important work, and if a company hires a maintainer of a project "in order to support the project", it's what that company should be incentivizing and evaluating those maintainers in terms of.
graydon2: (Default)
On a particularly stressful day this holiday I decided to distract myself by looking into the state of amateur radio (and adjacent issues). I don't remember what specifically set it off, but I fell into the rabbit hole for the next 48h or so and opened a million tabs and read a bunch and learned enough that I figured I'd make some notes here for future reference to myself / picking up if I'm ever interested again.

Reader beware: amateur radio is the natural home of some of the deepest turbo-nerds on earth, and it gets .. a bit heavy.

buckle up )
graydon2: (Default)
I was asked over on mastodon whether I had any recommendations for studying either the period of computer history post-WW2 (thus "post invention of stored-program digital computers") or even pre-WW2 (thus "all the random stuff we used before then").

I do! I've put together a short list over here which, conveniently, includes a lot of links to immediately "borrow" digital copies from archive.org. Also if you follow outbound links from those, or load them on a book-recommender site, you'll probably find a lot of good related titles.

Three caveats:

  1. That list is very US-centric and there are big, important lists I haven't made that cover computing history as it developed elsewhere. Big chunks of it happened in Europe and Asia, and I am sadly very ignorant of those histories still.

  2. I haven't read all of them -- some are on an overlapping to-read list or glaring at me from the bookshelf -- but they're all well enough reviewed that I think they're worth including.

  3. They're very scattershot, following threads I was interested in at one time or another, not a systematic attempt to "learn subject X" completely or faithfully.


Given those caveats, I will also distil a few observations from what I have read, that might help frame what I pursued or stimulate the potential reader's curiosity:

  • There is a huge institutional aspect to early computing. Governments and the military foot the bill for almost everything, and giant financial institutions (banks and insurance companies) do the rest. So it is important to understand what the institutions are interested in.

  • The birthplace of a lot of it is WW2. There's no getting around it. There are vivid and lasting developments coming out of often-separate groups concerned with ballistics, intelligence and logistics, code-breaking, radar, and atomic weapons simulation. We are all basically working today with derivatives of the IAS machine -- any time someone describes a current machine as having a "Von Neumann Architecture" you can just mentally apply the footnote "atomic weapons simulation machine".

  • There's a lot of tech-transfer of military stuff. Post-war, a lot of labs and R&D groups that got war funding went into business commercializing their stuff.

  • There are two very distinct historical threads -- accounting uses and scientific uses -- that literally build different machines for a long time! Business machines and science machines are separate product lines. COBOL and FORTRAN. The IBM "360" project (along with PL/I) is, among other things, a merger of the two product lines into a single one. We still see echoes of this division in (say) the presence of separate binary floating point (science) and integer-and-BCD (business) instruction sets and datatypes.

  • In the US there's a big east-coast / west-coast split based on trajectories of specific people and research groups. And the east coast is huge for most of the history! Route 128 is where the action is at for a very long time. Stanford's postwar commercialization of microwave research on the west coast and Shockley's subsequent move to set up shop in Mountain View set in motion a long shift in the center of gravity, but even today MIT is still MIT for a reason.

  • Besides the east-west split there's also a fascinating and very significant US-midwest (Minnesota specifically) trajectory to follow around a decommissioned US naval codebreaking group called ERA that is probably the most vivid "huge in the past, forgotten by today's youth" contrast story. This is the group that hired Seymour Cray, worked on contracts for the office of naval research for a while, bounced around ownership by some east coast companies for a bit but eventually settled into being "Control Data" for a while before forking off again as an impatient and independent Cray Research.

  • There are two very large and very long-lived companies that dominate the history, that again we tend not to think too much about in recent years: IBM and AT&T / "The Bell System". It's really hard to overstate how dominant these firms were. For much of the 20th century -- and these really are century-old companies, founded in the late 1800s! -- "computer" was really synonymous with "IBM" and "telecommunication" was synonymous with "AT&T/Bell". Nobody could come anywhere near them, everything else was just a rounding error in terms of scale. The fact that both of them lost control of their natural territory around the birth of microcomputers is part of what gives that era such a different flavor (along with, obviously, the temporary transition to a personal / home / hobbyist stance -- much of which has unravelled in the post-2010 modern Big Tech era)

  • Older computers were physically huge and were constantly struggling with physical problems: heat like today, but also power, weight, faulty parts, assembly and maintenance costs. And they were often changing technology as new types of memory and logic gates gained the advantage: electromechanical machines, vacuum-tube machines, discrete transistorized machines. The consolidation on integrated-circuit microchips (where the whole computer is one tiny slab of silicon) took a very long time to occur.

  • What we nowadays think of as "online culture" or "cyberculture" did incubate on hobbyist BBSs and usenet and such in the late-80s but it actually has its origins in much earlier student use in the 70s and even 60s of institutional time-sharing systems intentionally repurposed for education -- PLATO and DTSS. It wasn't all stuffy science-and-accounting! People were sharing recipes and jokes, teaching each other crude programming languages, arguing politics, playing games and finding love online using mainframes and remote terminals decades before they were doing so with microcomputers and modems.

  • As I've mentioned before, any history-reading is necessarily partial, incomplete, biased, overlapping and messy. You will only get part of any picture, and you can follow endlessly-earlier to find more and more material, of sketchier and sketchier certainty and more and more author-imbued bias. Studying early computing can easily drag you back past 19th century mechanical tabulators and telegraph-operator culture all the way to the deep history of logic and philosophy, where people have struggled with "formalizing thinking" (for various reasons) for centuries. Embrace the depth of it! But also don't imagine you ever know it all; anything new you learn that's worthwhile will expand rather than contract the set of things you have left to learn.
graydon2: (Default)

But to see mechanization and automation purely as a problem in comparative cost is greatly to minimize their role -- and to pay further for the error of confining economic goals, and economic calculation, to profit maximization. The technostructure, as noted, seeks technical progressiveness for its own sake when this is not in conflict with other goals. More important, it seeks certainty in the supply and price of all the prime requisites for production. Labor is a prime requisite. And a large blue-collar labor force, especially if subject to the external authority of a union, introduces an element of uncertainty and danger. Its cost is not under the control of the technostructure, although in the planning system there is, of course, the power to offset labor cost changes with price changes. There remains the risk and consequences of a strike.

In contrast, mechanization adds to certainty. Machines do not yet go on strike. The prices are subject to the considerable certainty which, we have seen, is inherent in the contractual relationships between large firms. The capital by which the machinery is provided comes in large proportion from the internal savings of the firm. More white-collar workers and more members of the technostructure will be required with mechanization. But white-collar workers tend to identify themselves with the goals of the technostructure with which they are fused. Such is the result of replacing twenty blue-collar workers with two men or women knowledgable in computers.