Threads for icefox

    1. 1

      I’ve done this, it’s fun. I have a VPS that I use as a terminal server so that I can always access Whatever I Happen To Be Working On, so a lot of the stuff has already been set up over the years and my infrastructure is already built around it. Web is indeed the killer; however, for me a close second is reading PDFs. I semi-regularly end up wanting to read papers, and I just cannot find a good solution for doing that through SSH.

    2. 3

      A little while back, I wrote that we shouldn’t say “auth” but should use other terms instead.

      And in the linked article they propose “permissions” and “login”. Holy shit, yes please. I have a new favorite author.

    3. 7

      It can be annoying at times, but sometimes it can be really productive to have someone in the room that’s willing to be a stubborn pain in the ass about making people explain/define unfamiliar terms, clarify ambiguous ones, and use them consistently. Especially when people are speaking from a number of very different contexts.


      A case of lower-tech terminological chaos that always amuses (and occasionally exasperates) me is how slippery the notion of what a/the product is can be when people in the room span a lot of different contexts like content production, development, marketing, managing content platforms and web stores, accounting, and strategic decision-making.

      1. 4

        Agreed. And besides being a stubborn pain in the ass, it’s hard as hell to not give in to “I shouldn’t ask about X or else people will think I’m stupid.” One of the hardest things I ever did in grad school was learn to swallow my pride and ask, repeatedly, what a term meant when someone said something I didn’t understand, or when they said something I thought I understood but which gave me the “…uh oh, they’re using this word in a way I don’t get” feeling. Standing up in front of a bunch of random professors and giving a conference talk? Baby stuff in comparison. Interrupting the one professor in the world who needs to think you’re smart if you’re gonna graduate, and asking “sorry, what do you mean when you use the term $FOO here?” Most social-anxiety-inducing thing in existence.

      2. 2

        a stubborn pain in the ass about making people explain/define unfamiliar terms, clarify ambiguous ones, and use them consistently.

        Hey, have we met? How did you know my middle name? XD

    4. 1

      Packing more stuff. Moving date in three weeks, aaaaaaaaaa.

      Also I have been possessed by the Dark Hadou again, and started writing a review article of next-gen-ish systems programming languages. Zig, Odin, Hare, that sort of thing. It’s fun!

    5. 4

      It’s almost like continually adding new world-changing features without updating or removing old ones isn’t sustainable.

      “In the long run, every program becomes rococo – then rubble.”

    6. 1

      My setup for avoiding RSI, which I suffered from in the past, follows the advice a coder friend gave me long ago:

      • Keep your forearms level and 90 degrees to your upper arms, not reaching out or scrunched in, and not tilted up or down
      • Keep your back straight, head level, not looking up or down
      • Keep your wrists straight, not arched up or bowed down, and not canted inward or outward.

      Everyone’s body is different, but as long as I manage to keep something like this posture more often than not, my wrists, forearms and fingers are comfortable and injury-free. The best setup is one that lets you do this comfortably, and also lets you move around, walk, stretch etc. several times a day. I alternate between a standing desk with a monitor stand I built to my size, and a sitting desk with a decent office chair.

    7. 17

      MULTICS has a hard-to-interpret but descriptive error message for the extremely unlikely situation where the system ends up with two root volumes.

      I’m not sure if it could be made actually easier to interpret.

      1. 4

        First, I don’t work for Google, so I have not had my sense of humor removed.

        Still, a much better message would be “E3141 - HODIE NATUS EST RADICI FRATER”. That is, it has a clear, searchable link to where there might be random detailed information, or even ramblings. It is a mistake to think that each detectable, obscure, arcane error needs a professionally produced tome. The tome can be accumulated if the error ever occurs.

        1. 3

          I really would like to add numbered errors to my programs to make them easy to look up. I still wonder what can make it a relatively low-maintenance task, though.

          Someone like Oracle can just throw more human-hours at the problem, I can’t.

          1. 7

            It doesn’t have to be hard. I just keep a text file that lists all the errors (and their numbers). I picked up the idea from a previous job where we had over 1,000 different numbered errors. Not hard to keep it up once started.

          2. 3

            You could represent all errors in your program as variants of a given type, and then implement an error_code() method on that type that forces each error variant to have a unique code. In Rust this might look like:

            enum Error {
               IoError(std::io::Error),
               UserError,
               SomeoneDidSomethingWrong { code: u8 }
            }
            
            impl Error {
                fn error_code(&self) -> u16 {
                    match self {
                        Error::IoError(_) => 0,
                        Error::UserError => 1,
                        Error::SomeoneDidSomethingWrong { .. } => 2,
                    }
                }
            }
            

            And then you’d simply use this Error type pervasively throughout your code to return errors.

            1. 2

              The usual issue is growing that list. Protobuf uses that approach, and it works for something that has to remain stable and thus should be carefully designed before you release it to other people, but error conditions are naturally dynamic.

              Suppose I started from something like this:

              impl Error {
                  fn error_code(&self) -> u16 {
                      match self {
                          Error::IoError(_) => 0, // Generic I/O error
                          Error::FileNotFound => 1,
                          Error::UserError => 2,
                          Error::SomeoneDidSomethingWrong { .. } => 3,
                      }
                  }
              }

              Then I realize that there can also be permissions issues and now I need to add PermissionDenied. If I add it between FileNotFound and UserError, I’ll have to change all error codes after it, so the error message formatting and the manual will have to change as well. If I add it at the end, it’s no longer grouped with other IO errors.

              One possible approach is to leave gaps and make it like:

              impl Error {
                  fn error_code(&self) -> u16 {
                      match self {
                          Error::IoError(_) => 100, // Generic I/O error
                          Error::FileNotFound => 101,
                          Error::UserError => 200,
                          Error::SomeoneDidSomethingWrong { .. } => 300,
                      }
                  }
              }

              Then I can add PermissionDenied as error code 102. Not a zero maintenance approach but might be serviceable enough to try…
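
              A sketch of how that would look, reusing the hypothetical variants from above (and assuming a PermissionDenied variant is added to the enum as well):

              impl Error {
                  fn error_code(&self) -> u16 {
                      match self {
                          Error::IoError(_) => 100, // Generic I/O error
                          Error::FileNotFound => 101,
                          Error::PermissionDenied => 102, // new, still grouped with the other I/O errors
                          Error::UserError => 200,
                          Error::SomeoneDidSomethingWrong { .. } => 300,
                      }
                  }
              }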

              1. 4

                If I add it at the end, it’s no longer grouped with other IO errors.

                This is only a problem if your error codes are hierarchical identifiers whose first digits are categorical and whose latter digits are the specific error. But it’s not clear that that’s a desirable property in the first place: errors can fall into multiple categories, and with enough errors you’re likely to end up with a “misc” category anyway.

                The point of the error code is to be a direct index to the error. If you want to generate documentation where error codes are grouped by topic, you can do that. You could even list the same error code in multiple sections if it’s relevant to multiple. The code number itself is arbitrary.

          3. 2

            I really would like to add numbered errors to my programs to make them easy to look up. I still wonder what can make it a relatively low-maintenance task, though.

            I ended up with codes, not numbers: “system-something”. For example: “contact-notfound”. Easy to grep for, and having the code tells me the error already.
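
            A minimal sketch of that scheme in Rust, with made-up variant names (not the actual codebase):

            enum Error {
                ContactNotFound,
                ContactDuplicate,
            }

            impl Error {
                fn code(&self) -> &'static str {
                    match self {
                        Error::ContactNotFound => "contact-notfound",
                        Error::ContactDuplicate => "contact-duplicate",
                    }
                }
            }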

      2. 4

        I’m so tired of this kind of humor. When an error message comes up, I am already in a stressful situation. The last thing I want in that case is another riddle on top.


        The programmer who put a Latin error message in that place apparently cared more about their own fun than about the other people who have to deal with it.

        Then again, maybe (probably) I’m complaining unfairly. First of all, I have no idea how often things like Latin error messages were put in the code. And the linked page at least does not indicate that the afflicted technicians appreciated this humor; so maybe everyone else back then was just as tired of it as I am today.

        Second, that XKCD comic (and many others) shows again that nerds can be humble and self-reflective; and given XKCD’s popularity it probably even educates people toward this nicer behavior. That’s a nice progression from the seventies to today.

      3. 2

        I dunno, seems easy enough to say something like “SSTN recovery code had a brother pointer at the bottom of the tree when it was finished, this should never happen”. That’s a longer message to shove into the code, sure, but you could always just give it an error code and print it in the manual.

        1. 6

          A good error message for the user should give at least some hints about possible further steps. There’s certainly a “spectrum of helpfulness” depending on how much the program knows about its environment and how many external factors are involved:

          • No space left on device — free up some space on the device.
          • Unrecognized option --forbnicate, did you mean --frobnicate — correct the typo.
          • Unrecognized option --frobnicate — check the docs for option syntax.
          • File not found — check if you point the program at the right path, or check why the file is not created where you expect it.
          • ICMP host administratively prohibited — this is a network problem, caused by a policy, you should check the configs or contact the network admin.
          • Connection reset by peer — uhm, try to reconnect maybe, but at least you can be certain it’s not a problem on your side.

          Beyond that, there’s basically the “check the source code or contact the developer” territory. Say, assertion error: the system has a negative number of logged-in users. The only way it can happen is if there’s a flaw in the program logic.

          And then there are problems like the duplicate root pointer. From that article, my impression is that no one understood how and why that condition could occur or what exactly a possible fix could be, so a message that didn’t need translation from Latin wouldn’t bring the user any closer to a solution. In that situation, the only thing people can do is either contact the developer or grep the source code for the error, and for that purpose, any unique message is as good as any other.

          1. 4

            Yes, the HODIE NATUS EST RADICI FRATER error message was an “impossible situation” from Multics’ point of view. As it turns out, the problem was in the hardware; specifically, the system had been mis-configured in such a way that the root disk appeared twice as distinct physical devices. Thus the “brother” is a second root disk, a situation which cannot arise as part of the Multics software configuration; and the lack of debugging guidance from Multics is quite understandable.

            1. 3

              Obviously here the error can’t say exactly how to fix the issue but the fact that 2 teams had to grasp at straws should be all the proof you need that it should state the “impossible” situation clearly. That would’ve narrowed what to look at for them, just like the phone call did.
              And nothing prevents you from having the joke and then a real message.

    8. 10

      That Dijkstra quote is great.

      We should do our utmost to shorten the conceptual gap between the static program and the dynamic process.

      It feels like a good way to explain and justify a bunch of approaches that feel correct but would otherwise be purely preferential. One I’m particularly fond of is arranging code in natural reading order from top to bottom rather than the opposite (C style). There the static program and dynamic process seem in sync.

      1. 13

        What’s the natural reading order, though? If you read the top routine first, you don’t know what the subroutines do yet, so you can’t fully understand the function until much later, when you actually read the subroutines. And if you read the subroutines first, you don’t know what they are used for, and why you’re even bothering.

        What happens in practice is that we start at the top level, and then constantly look up the API of each subroutine it uses (assuming it is properly documented, my condolences if you need to dig up the actual source of the subroutines).

        That ultimately is the “natural reading order”. Not top down, not bottom up, but jumping around as we gather the information we need. And I don’t know of any way to write code that linearises that. The best we can do is have our IDE pop up a tool tip when we hover the subroutine name.

        1. 6

          I think the main thing that’s missing from this article is a discussion of the value of guarantees. If I know the code doesn’t use goto, then it usually makes it much easier to read, because I don’t have to constantly be asking “but what if the flow control does something unexpected?” which is really distracting. You give up a certain amount of flexibility (the ability to jump to an arbitrary point of code) and in return you gain something much more valuable (code becomes easier to read and understand). Languages without early return offer similar advantages.

          Reading order has the potential to offer a similar (but admittedly less valuable) guarantee. If you have to define something before you use it, then the answer to the question of “where is this thing I’m using defined?” is always “up”, which I have found to be very helpful. After getting used to this, I feel somewhat disoriented when reading code where it isn’t true.

          1. 4

            Huh, I thought the consensus was that single-exit restrictions turned out to be a bad idea, because early exits allow you to prune the state space that needs to be considered in the following code. If you don’t have early exit, you have to introduce a variable to keep the state.

            The main argument for single-exit was to make it easier to work with Hoare logic, but I think later formal systems were better at modelling more complicated flow control.

            1. 1

              I haven’t heard that myself, but I should specify that I’m talking about entire languages designed around not having early returns (Scheme, OCaml, Erlang, etc.; basically languages without statements) rather than using a language designed around having early returns and banning their use.

              1. 3

                Yeah, the problems happen with the combination of the single-exit restriction and a statement-oriented language. Expression-oriented functional languages allow you to return a value from a nested if without having to unwind the nesting first.

                1. 3

                  I think the core problem is join points.

                  In an expression oriented language, your conditionals branch out into a tree, and every branch returns a value, so there are no join points. If you need a join point, you use a (possibly local) helper function.

                  In a statement oriented language, you can do the same thing, where each branch of your conditionals ends in a return. If you don’t, if you let the conditionals fall through to a common statement, now you can’t “look up” and there’s no “coordinate” (loop variable) that helps you orient yourself. Doing single-return pretty much forces you to have join points.
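
                  A made-up Rust-ish illustration of the difference, just to show where the join point appears:

                  // Every branch returns: no join point, nothing to reconstruct afterwards.
                  fn sign_label(n: i32) -> &'static str {
                      if n < 0 {
                          return "negative";
                      }
                      if n == 0 {
                          return "zero";
                      }
                      "positive"
                  }

                  // Single return: the branches fall through to a common join point,
                  // and the reader has to work out what `label` can hold there.
                  fn sign_label_single_return(n: i32) -> &'static str {
                      let label;
                      if n < 0 {
                          label = "negative";
                      } else if n == 0 {
                          label = "zero";
                      } else {
                          label = "positive";
                      }
                      label
                  }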

                  While loops also create join points, but the loop condition helps orient you.

                  I think that when break and continue are confusing, it’s more about the join point (I don’t know what’s true after the jump) than the jump itself.

            2. 1

              Huh, I thought the consensus was that single-exit restrictions turned out to be a bad idea

              You can still return early without goto. As for the error handling goto enables, that’s mostly just C missing a defer statement. If you stick to the usual patterns, it’s just a variation on early returns.

              1. 1

                I was referring to languages with loops but without things like break or continue, and when return can only appear at the end of the function.

                1. 1

                  To be honest your comment felt like a non sequitur: the comment you were replying to was making no mention of single-exit functions, either as a coding rule or as a language restriction, so I wasn’t quite sure why you brought it up.

                  Edit: reading Technomancy’s comment for the fifth time, I finally caught that little detail: “Languages without early return offer similar advantages”. Oops, my bad.

        2. 3

          The full quote in Go To Statement Considered Harmful is:

          My second remark is that our intellectual powers are rather geared to master static relations and that our powers to visualize processes evolving in time are relatively poorly developed. For that reason we should do (as wise programmers aware of our limitations) our utmost to shorten the conceptual gap between the static program and the dynamic process, to make the correspondence between the program (spread out in text space) and the process (spread out in time) as trivial as possible.

          For me, when I talk about natural reading order, I’m referencing the general idea that we read left to right, top to bottom (modulo non-LTR languages). In programming, we use abstractions all the time (modules, classes, methods, variables, etc.).

          What’s the natural reading order, though? If you read the top routine first, you don’t know what the subroutines do yet, so you can’t fully understand the function until much later, when you actually read the subroutines. And if you read the subroutines first, you don’t know what they are used for, and why you’re even bothering.

          I find that with reasonable intention revealing names, the first problem you list (that you cannot fully understand the function without reading every subroutine in it) is rarely a problem, because that’s the entire purpose of programming (to abstract unnecessary details in a way which makes it easy to understand large systems).

          What the Dijkstra quote highlights succinctly to me is the idea that spatial and temporal equivalences have value in understanding software. Aligning code which is defined spatially before with code which happens temporally before is a natural and reasonable approach. The opposite forces you to read generally downwards when tracing the temporal nature of code, but upwards whenever it calls some sub-part of the code. That to me is unnatural.

          That ultimately is the “natural reading order”. Not top down, not bottom up, but jumping around as we gather the information we need. And I don’t know of any way to write code that linearises that. The best we can do is have our IDE pop up a tool tip when we hover the subroutine name.

          It’s that jumping around that’s the problem. Whether the spatial direction of the jump is aligned with the temporal order is what matters. You can make the same argument about this as an argument against inheritance in OOP. It’s effectively the same problem, temporal effects after are defined spatially before, which is inherently more difficult to visualize.

          I’ve been tempted to write a clippy lint for Rust that checks that the methods in a file are topologically sorted (either top-down or bottom-up, depending on the user’s preference). There’s nothing too difficult about the algorithm there: if A calls / references B, then the definition of A belongs before/after (per preference) B in any source file where they’re defined together.
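
          Not the lint itself, but a sketch of the underlying check, assuming you already have each function’s position in the file and the set of same-file functions it calls (all names here are made up):

          use std::collections::{HashMap, HashSet};

          // Given the order in which functions appear in a file, and which
          // same-file functions each one calls, check the "callers above
          // callees" (top-down) ordering; flip `c < d` for bottom-up.
          fn is_top_down(order: &[&str], calls: &HashMap<&str, HashSet<&str>>) -> bool {
              let pos: HashMap<&str, usize> = order
                  .iter()
                  .enumerate()
                  .map(|(i, &name)| (name, i))
                  .collect();
              order.iter().all(|caller| {
                  calls.get(caller).into_iter().flatten().all(|callee| {
                      match (pos.get(caller), pos.get(callee)) {
                          // Recursive calls and callees defined elsewhere don't
                          // constrain the ordering.
                          (Some(&c), Some(&d)) if caller != callee => c < d,
                          _ => true,
                      }
                  })
              })
          }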

          All that is to say, I recognize that there are elements and rationales for the bottom-up (particularly long held practices, consistency, and familiarity), but where those rationales don’t exist, top-down should be preferred (in my subjective opinion - mostly based on reading / writing code for 30+ years).

          1. 1

            For me, when I talk about natural reading order, I’m referencing the general idea that we read left to right, top to bottom (modulo non-LTR languages)

            I figured that much. I just got past that, and to the problem of writing programs that match this order.

            I find that with reasonable intention revealing names, the first problem you list (that you cannot fully understand the function without reading every subroutine in it) is rarely a problem

            Let’s ignore my first paragraph, which I dampened in my second: “What happens in practice is that we start at the top level, and then constantly look up the API of each subroutine it uses”.

            I can believe that you rarely have to look up the actual source code of the functions you use (if your code base has a good enough documentation, which in my experience has been far from systematic). But I don’t believe for a second you don’t routinely look up the documentation of the functions you use. It has been my overwhelming experience that function names are not enough to understand their semantics. We can guess the intent, but in practice I need more certainty than that: either I know the semantics of the function, or I have to look them up.

            Even if all your functions are exquisitely documented, that doesn’t prevent you from having to jump around to their declaration — either manually, or by having your IDE pop up a tooltip.

            because that’s the entire purpose of programming

            X is true because that’s the purpose of Y is rarely a good argument. First you have to convince me that Y fulfils its purpose to begin with. As for programming specifically… I’ve seen things, and I don’t have much faith.

            What the Dijkstra quote highlights succinctly to me is the idea that spatial and temporal equivalences have value in understanding software.

            So far I agree.

            Aligning code which is defined spatially before with code which happens temporally before is a natural and reasonable approach.

            It is above all a flat-out impossible approach. Not just because of loops, but because whenever you call a function, the documentation of that function is either above the current function or below it. And that’s if there’s any documentation at all: many internal subroutines (including a good portion of mine) are utterly devoid of it.

            Unless the name of the function is enough to reliably guess its purpose, which is often not the case at all, you have to look it up. Or down. And then you need to jump back. So unless you can write your subroutine right there where it’s called (something I actually encourage when the subroutine is called only once), you can’t avoid breaking the natural reading order. Because program flow is fundamentally non linear.

            You can make the same argument about this as an argument against inheritance in OOP. It’s effectively the same problem, temporal effects after are defined spatially before, which is inherently more difficult to visualize.

            Aahh, you may be conflating things a little bit. The real problem with inheritance, and many bad practices for that matter, is less the reading order than the lack of locality. I have an entire article on locality, and inheritance is fairly high on the list of things that break it.

            Now sure, having a way to organise your program that makes it more likely that you find the code you need to read, even if it doesn’t decrease the physical distance between two related pieces of code, does decrease the mental effort you have to spend, which is the whole point of locality to begin with. I still believe, though, that keeping code that is read together written together is more important than choosing which piece should be above or below — as long as it’s consistent.

        3. 3

          What’s the natural reading order, though?

          Exactly. It’s different for different people, and also different depending on what you’re actually looking for in the program you’re reading.

      2. 4

        As a mathematician, I find it much more natural to define things before using them.

    9. 5

      It seems weird to me to describe traits as not good enough because gluing together disparate libraries requires newtype wrappers, while proposing a replacement with grotesquely ambiguous semantics based on what could be generously described as academic navel-gazing.

      This claim in particular:

      I’m sure it won’t take much to convince you; [newtype wrappers are] unsatisfying. It’s straightforward in our contrived example. In real world code, it is not always so straightforward to wrap a type. Even if it is, are we supposed to wrap every type for every trait implementation we might need?

      In fact it will take a lot to convince me that an extra couple lines of mechanical glue code is worse! Yes, even if you end up wrapping multiple traits!

      The lack of orphan instances in Rust can be frustrating at times when a single library contains multiple crates, especially when I’m doing so to avoid a dependency on alloc (the lack of orphans means I can’t use Cow), but I’ve never wished for making method resolution more implicit. I’ve especially never wished to redefine it such that the programmer has to wade through hundreds of pages of type-theoretical gobbledygook to figure out why cmp() is returning the wrong value.

      1. 5

        Yes, that’s… why it’s a local maximum? It works really quite well, so it’s hard to find something better without first making something worse? But if you’re gonna be exploring design space that hasn’t actually been explored much, then understanding the tradeoffs involved is kinda important. Is it worth it? Probably not yet, but people keep reinventing modules, and traits have some very concrete downsides. Enforcing coherence needs whole-program type information, as they demonstrate coherence can still have holes, and so on. So, let’s get off our high horse as if typeclasses don’t originate from a couple decades of academic navel-gazing and hundreds of pages of type-theoretic gobbledygook, and poke around to see if we can make something useful out of this interesting variation.

        If you knew that it was gonna work already then it wouldn’t be science, now would it.

        1. 1

          Yes, that’s… why it’s a local maximum? It works really quite well, so it’s hard to find something better without first making something worse?

          To be a local maximum is to know that a better solution exists elsewhere – it doesn’t necessarily have to have been fully described, but it must exist. The author’s claim that traits are a local maximum is equivalent to claiming that they have found a non-trait mechanism for type->method scoping that is better than traits in every way. The rest of the post fails to support that claim.

          In particular, this part:

          We can imagine we have a global scope of traits, and we only ever want one implementation per type in that scope. I’m going to call enforcing coherence in this way: global coherence.

          […] Our issues with traits all orbit around requiring global coherence. Ironically, global coherence is what prevents traits from being a global maxima.

          It’s obvious from here, if global coherence is a local maxima, local coherence is a global maxima. Now all we have to do is figure out if, and what, local coherence is.

          is nonsense. The author assumes axiomatically that a globally-consistent mapping of types to methods is undesirable and proposes something called “local coherence” (based on … grammatical negation??), then goes off on a hunt for whatever that might be. They haven’t tried to figure out why they think globally-consistent traits are undesirable, they haven’t tried to figure out something better and then named it; they started with a name and then tried to go backwards to identify a concept.

          The problem with their approach in this case, of course, is that in a nominative type system you do want a globally consistent association of types and traits, because otherwise the same code with the same types in different modules might have different behavior.

          people keep reinventing modules, and traits have some very concrete downsides.

          It’s possible that traits have downsides (compared to … what?), but the author hasn’t identified any of them. The best they do is gesture vaguely in the direction of requiring boilerplate when combining libraries, which isn’t convincing in Rust (a language that is full of boilerplate).

          If the author wants to explore actual downsides of traits, then good starting points might be:

          • Traits form a parallel “is-a” hierarchy separate from “contains-a” value types, which makes it difficult to wrap libraries designed for OOP languages where those aren’t clearly distinguished.
            • For example a UI framework might say that a ToggleButton extends Button and implements Widget, such that toggle_button.click() is (toggle_button as Button).click(), but this layout is difficult (or impossible) to represent in Rust because Button can’t be both a trait and a struct.
          • The question of “sealed” traits, where implementations of a public trait can only be defined within the library that defines the trait. Sealed traits are useful because they act like a locally-extensible implicit tagged union.
            • Haskell and Rust both require a sort of scope hack (the public trait depends on a non-exported parent), which can interfere with type inference and cause accidental un-sealing if the internal trait ends up in a public module. (A sketch of the Rust idiom follows this list.)
          • In Rust, adding optional methods to a trait can be a backwards-incompatible change if the method name clashes with an inherent method of a type the trait is implemented for.
            • This doesn’t affect Haskell because it doesn’t have value-scoped function resolution, so it’s more a problem of Rust’s syntax rather than traits themselves, but it could be solved by requiring the trait methods to be brought into scope (or otherwise unambiguously referenced).
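
          For reference, a minimal sketch of that scope hack in Rust (module, trait, and type names made up):

          mod private {
              // Public but unreachable from outside the crate, so downstream
              // code can't name it...
              pub trait Sealed {}
          }

          // ...and therefore can't implement Frobnicate either, because doing so
          // would require implementing the Sealed supertrait.
          pub trait Frobnicate: private::Sealed {
              fn frobnicate(&self);
          }

          pub struct Widget;

          impl private::Sealed for Widget {}
          impl Frobnicate for Widget {
              fn frobnicate(&self) {}
          }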

          Note that none of these are related to the author’s wish for local scoping of trait implementations.

          Enforcing coherence needs whole-program type information, as they demonstrate coherence can still have holes, and so on.

          They demonstrate no such thing. They link to a GHC bug in which Haskell’s poor design leads to unexpected behavior, but that’s a problem with Haskell allowing orphan instances in -X Safe code, not with the concept of traits in and of themselves.

          So, let’s get off our high horse as if typeclasses don’t originate from a couple decades of academic navel-gazing and hundreds of pages of type-theoretic gobbledygook, and poke around to see if we can make something useful out of this interesting variation.

          The origin of an idea is unimportant.

          I don’t need to read any papers on type theory to understand the behavior of Haskell’s class or Rust’s trait, and the concept has been successfully implemented in multiple languages.

          In contrast, the author’s proposal of local implicit bindings of type-parameterized methods seems to exist only in the form of 84 pages of prose, which is within epsilon of being scrawled in crayon during an LSD trip.

          If you knew that it was gonna work already then it wouldn’t be science, now would it.

          I don’t see any science happening in this blog post, and I reject the idea that using word games to craft unanswerable questions is science in any sense.

    10. 1

      Working on packing and planning for moving. @_@

      1. 1

        Good luck on your new place!

    11. 9

      One thing that Rust can’t optimize nicely is this pattern:

      a[0] + a[1] + a[2] + a[3]
      

      because each read has its own bounds check, and each check reports the exact location of the failure, so they are all unique jumps that can’t be merged. You get 4 branches, 4× code bloat for error message setup, and miss out on autovectorization.

      I don’t know how an optimizer could improve this. Usually it’s not safe to reorder code around safety checks. If there are any writes happening at the same time, the partially-written data may be observable and make it even semantically impossible to optimize out the redundant checks. Currently this needs coalescing the checks manually with code like:

      let a = &a[..4];
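
      In context, that might look something like this (made-up function, element type picked arbitrarily):

      fn sum4(a: &[u32]) -> u32 {
          // One bounds check up front: re-slicing panics once if a.len() < 4,
          // and the four reads below then need no further checks.
          let a = &a[..4];
          a[0] + a[1] + a[2] + a[3]
      }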
      
      1. 7

        Oh, that’s an interesting approach to doing the coalescing. I’d always just seen it done with assert!(a.len() >= 4). Looks like they generate basically identical code, which is nice~

        I wonder if it could be improved by relaxing the “report the exact location of the failure” constraint? So it wouldn’t report “out of bounds access in a[3]”, but might report “out of bounds access in a[0] + a[1] + a[2] + a[3]” or such…

        1. 5

          I came here to post the assert!(a.len() >= 4) version, haha. That’s the way I’ve seen this done.

      2. 3

        I guess that release builds could have less informative error reporting so that the error paths can be coalesced?

        1. 7

          Even if I replace the panic with a direct call to abort() or even loop {}, LLVM still chooses to have four branches:

          https://rust.godbolt.org/z/9ehfEbhao

          1. 5

            That looks weird. LLVM should be able to handle that. Might be worth extracting a test case of LLVM IR and looking at why it isn’t getting fixed. Just looking at the LLVM IR view in compiler explorer, I would expect LLVM to hoist and merge the checks easily.

          2. 3

            It seems like it’s assuming that the branches would be predicted not-taken by the CPU, so while they would have a cost, it would be relatively small. It’d be nice if it managed to figure out that it could compare the length against 4 and, if it’s less, use a LUT to figure out what to pass to panic_bounds_check.

          3. 3

            How disappointing! I expected better than that. Thanks for making that example.

      3. 3

        The OCaml compiler intentionally generates a vague error message on array access that does not depend on the failing index, to make it easy to share all failure cases in a single non-returning instruction, to emit fast code. On the other hand, it does not try factoring the bound checks together, so we still have four checks where one would suffice. (But does it matter in practice, if those checks are accurately predicted?)

        1. 8

          does it matter in practice, if those checks are accurately predicted?

          It may not matter when it’s reading big objects or through an indirection. However, in numeric code it matters quite a lot, because the primary cost is not in the branch itself, but in lost optimizations. The branches with side effects prevent loading data in larger chunks, and kill autovectorization and most other loop refactorings.

          1. 2

            It looks like gcc and clang still emit multiple jumps here? It’s been a while since I’ve done asm though so it’s entirely possible I’m misinterpreting this? https://godbolt.org/z/WxYd5KhMd

      4. 2

        What’s the difficulty here? Maybe I don’t understand how the bounds checking / error reporting works, but could you not coalesce them like this:

        if a.len() < 4 {
            // do detailed error reporting here, based on len and which index would fail
        } else {
            // do the unchecked sum here
        }
        
        1. 6

          I can, but I’d like the optimizer to be smart enough to do it automatically. I suspect it’s really hard to explain to the optimizer that the bound check is both a side effect so big that it can never be ignored, and also one so insignificant that it can be reordered and coalesced.

          1. 2

            Ok, it sounded a lot more like a fundamental thing in your first comment. This seems much more like an optimizer issue: maybe with branch hints (marking the error paths cold) the checks could somehow be hoisted so that the cold path keeps the detailed reporting and the hot path has only the coalesced check? That seems doable, even at the LLVM level. (This is assuming you want the same guarantees of exact error reporting.)

            (I checked and get the same problem with Zig btw: https://godbolt.org/z/PW8sYPe5d)

    12. 5

      Recovering from covid, hopefully.

      Also working on packing and planning to move, if we have energy. Partner and I have decided on a house to rent in Massachusetts, so it’s pretty settled. I wish the house were 20% larger, but I think it’s still a net win. We can settle down there and get comfy near friends and family, then work on being ready to flee the country when it becomes necessary.

    13. 16

      Blowing off steam. I received a great offer from a company this past week. Went to visit and discovered they had pulled a bait and switch to make me back out of another interview pipeline. They wanted me to do rounds of technicals after they had already extended the offer. Now I find myself unenthused with this company and have likely upset the other company whose interview I backed out of. I’m not very happy about the whole situation. I’m ready to find a new job, but this market is killing me slowly.

      Besides seeing friends, I’ll probably take this weekend off from working on my game to play a game and focus on having some fun instead.

      1. 13

        Agreed, fuck that. I’d go to the other company, tell them exactly what happened, and ask if they still want to interview you. How easy it is for them to say “yes” depends on their own bureaucracy, but saying “sorry I got a great offer” followed by “this offer turned out to be lies” generally isn’t the sort of thing that will burn bridges with a hiring manager. Every single one worth anything knows they’re not the only job you’re looking at, and that chaos is part of the process.

      2. 8

        Man fuck all that. I would not take a gig where they’re straight up deceiving you in the first handful of interactions.

        Any chance you can clue us in to where we ought not apply?

        1. 4

          It’s a relatively small company in the bay that you probably won’t run into. After talking on the phone with them earlier, I genuinely believe it was just a poor mistake from inexperience on their part. I was their first proper external recruit, and I think my being a dev and CPA excited them for a role where that would be a strong fit. When they realized I was already about to sign with another company, they panicked and made the wrong move trying to win me over. They could have handled it much, much better in my opinion, but I’ll give them the benefit of the doubt that they’ll grow from this experience. I don’t believe it reflects on their character, so I’m not going to name and shame.

          Nonetheless, I now find myself having gone from two competitive offers to zero, which sucks. Turning down my biggest salary offer ever was a difficult decision to make, especially while the job market is so tough. But I didn’t feel comfortable proceeding with them after that experience. Hopefully a better opportunity pops up!

          1. 5

            Proud of you. You’re worth that check and also respect.

    14. 27

      Yeah, haven’t had this issue at all, or if I did it was a really long time ago. I wonder if maybe some people are over-using lifetimes or async or something. I worked on a multi-year O(10k) LOC rust codebase across a team with varied experience and things like this just weren’t issues more than once or twice and never required a re-architecture, at most it was like an Arc or Clone or something that we’d “todo” for later.

      If you end up not needing it, you need to go through and delete every instance of this lifetime, which can sometimes be 30 or more generic statements that end up needing to be modified.

      This is true of generics as well, and it’s annoying, but honestly it’s a tooling issue and I wouldn’t be surprised if a tool like cargo-fix could do it today, or at least has the capability to do it.

      . These constant refactors have been a major detractor for the language for years

      So I think we can maybe agree that Rust lets this happen, but what I’m wondering is if maybe you’re not learning from the experience. Again, I just haven’t run into this. Is it a different domain? I write web services, CLIs, etc.

      This means, to be a highly productive Rust programmer, you basically have to memorize the borrow checker rules, so you get it right the first time.

      To be a highly productive anything you need to memorize the internals. I think the more interesting question is probably what it takes to be minimally productive. Rust probably has a bit of memorizing to do to be minimally productive but, again, I’ve consistently found that just using Arc and Clone upfront solves it.

      The very fact that ‘Rc’ was mentioned here makes me think that there’s something backwards going on. I almost never use Rc unless I’m trying to write highly specialized and optimized code where I know the exact context of how it will be used already (allowing me to drop the Send flexibility), and I honestly forego lifetimes often as well until the code has stabilized a ton or unless the lifetimes are trivial (bound to a single function).

      This sounds to me like someone is trying to write highly type-oriented, optimized code and then finding that when their constraints change they have to back out those optimizations. I would recommend switching that around. Use Clone and Arc liberally and, once your product works, optimize where needed. This is what was done at Dropbox: the attitude was always to just Clone and use Arc, and Dropbox had async/await macros wayyyyy before the language had all of the niceties around that, and they were building pretty optimized, correctness-oriented code. Rust is extremely fast relative to other languages even when you’re lazy about optimizing, so I’ve really not found this to be a bad way to go.

      These things can and should get easier, to be clear. Rust is not perfect. I just wonder, in lieu of concrete information here, how much of this can be avoided by not optimizing your code until it’s stable.

      1. 12

        I wonder if maybe some people are over-using lifetimes or async or something.

        I had this problem a lot in the beginning and it was partly due to over-using lifetimes. A lot of the introductory information for Rust, at least at that time (~2019), was very focused on doing everything possible at compile time. Rust has fantastic runtime borrow checking too, but very few learning resources explain how normal it should feel to rely on it. I now think “never use explicit lifetimes in your applications” is a reasonable guideline for most software, but that is almost heretical to many Rust projects.

        1. 8

          Same. The stdlib is also a kinda horrible teacher for how to do things in practice, it puts a lot of work into letting you have lifetimes everywhere ’cause it sorta has to be able to.

          I think that never using explicit lifetimes is going a bit far, but if you need one it’s definitely a sign to stop and ask yourself if what you’re doing is really worth it. They are startlingly sticky and make refactoring a lot harder, so you’d better be pretty confident in your overall design.

        2. 4

          Interesting. I learned Rust in 2015 and I have no idea what the community view was, but I’m assuming it was “just clone it” since that’s what I learned. But I never read the book (it came out later) etc.

      2. 5

        10k is pretty small, I maintain multiple hobby Rust projects which are larger than that. I’d say a project of around 500k to 1 million LOC would be a good one to draw from.

        1. 2

          I don’t really think it matters. It was close to 100KLOC, maybe larger at various points but I don’t recall its maximum size and it’s a multi-language project - I put it on the order of 10K because that’s typically where it fluctuated iirc. I’d say it matters far more that there were multiple people of varying skill and experience working on it and that it crossed various domains of low vs high level code.

          Do you suppose that the author is working on 500k-1M LOC code in the article? Maybe at 5x or 10x or 100x the size Rust starts to change radically but I sort of doubt it; although I’m sure that many things change radically I don’t think the issues they’re running into are it.

          1M seems really high to consider, honestly. Chromium and the Linux kernel are over 10MLOC and those are extremely large projects, so do we really suppose the author is within an order of magnitude? Ripgrep is 50kloc. Sled is 30kloc. Is the author doing something vastly more complex?

          1. 4

            I do think things like having to add lifetime params everywhere get significantly worse as the codebase gets bigger.

            1 million is big but there are plenty of Rust codebases which are that size or bigger. At Oxide, our largest repository is around 400k loc of Rust (including tests but excluding comments and blank lines), and our overall product that ships is likely around 1 million loc of Rust that we wrote, plus another few million lines of dependencies. Rust excels at large-scale development so it’s not a surprise.

            Ripgrep and sled are great tools/libraries, but they’re not large services. Nextest is around 34k loc of Rust as well, and that isn’t a large service either.

            1. 2

              I do think things like having to add lifetime params everywhere get significantly worse as the codebase gets bigger.

              That makes sense… I just don’t see how else one would resolve the tension between “you’re using a non-owning data type, with non-owning defined as ‘not in control of the data’s lifetime’” and the design decision that “the function/struct/enum is the unit of API stability”.

              You chose a type whose semantics inherently become part of your external API and then you’re saying that bubbling API breakages through the call stack gets worse as the codebase gets bigger.

              The only difference in C or C++ is that the compiler won’t force you to update the Doxygen comments on your function/type definitions.

              Outside of making rust-analyzer’s refactoring support smarter or not using borrows, I don’t see what else is possible in a language that gives you the power to choose such a data type.

              1. 2

                I think the answer here is to make automated rewriting better, yeah. But I agree in general that the cost has to be borne some way or another.

      3. 3

        cargo clippy --fix can do a lot of these mundane things by the way. I think removing unnecessary lifetimes is one of them.

    15. 6

      The Quest For Nice Tree Transforms seems to be a common one among compiler nerds. Now, most gang-of-four design patterns strike me as “things you do when you don’t have a sufficiently powerful language”, but it turns out that at least for Rust, the visitor pattern, when implemented a certain way, solves this very nicely. A little tedious to get all the boilerplate written the first time, but after that it pretty much Just Works.

      The trick is to have a visitor with visit_thing() methods that you can easily overload to implement what you want, and then have walk_thing() methods which do one step of recursion and call back into the visit_*() methods. Upon inspection it looks like their fmap() approach does the same thing, which is pretty metal. When I (and others) tried a functor/recursion scheme style pattern I had a map() function that did all the walking and each transformation just returned a new node without having to call fmap(), but that ends up horribly rife with special cases.
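
      A rough sketch of that shape, with a made-up two-variant AST rather than anything from the article:

      // Tiny hypothetical AST, just to show the visit/walk split.
      enum Expr {
          Lit(i64),
          Add(Box<Expr>, Box<Expr>),
      }

      trait Visitor {
          // visit_* methods are the hooks you override...
          fn visit_expr(&mut self, e: &Expr) {
              self.walk_expr(e);
          }

          // ...while walk_* does one step of recursion and calls back into visit_*.
          fn walk_expr(&mut self, e: &Expr) {
              match e {
                  Expr::Lit(_) => {}
                  Expr::Add(lhs, rhs) => {
                      self.visit_expr(lhs);
                      self.visit_expr(rhs);
                  }
              }
          }
      }

      // A visitor that only cares about literals overrides visit_expr and
      // still gets the default traversal from walk_expr.
      struct CountLits(usize);

      impl Visitor for CountLits {
          fn visit_expr(&mut self, e: &Expr) {
              if let Expr::Lit(_) = e {
                  self.0 += 1;
              }
              self.walk_expr(e);
          }
      }

      For transforms rather than analyses, the visit_* methods would presumably return rebuilt nodes instead of (), but the visit/walk split stays the same.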

      1. 7

        It’s been a little while since I read the “gang of four” book, but IIRC the key concept of the design patterns isn’t to explicitly create a class called “Iterator” or a “Visitor” class with a visit() method, but to use the pattern name to describe the design. Admittedly, it’s difficult to get that message from the original book because it’s steeped in OOP dogma.

        In Lisp, if I do this:

        (defun process-item (some-thing)
            ;; Do other stuff
            (print some-thing))
        (map nil #'process-item (fetch-some-list-of-items ...))
        

        I’m conceptually using the Visitor pattern, since the code is visiting each item and performing some action on it.

        It is possible to create a clunky “Visitor” base class (or mixin) and then require sub-classes that override a “visit_thing()” method, but that’s an implementation detail, and not required to use the pattern.

        I like the idea of Design Patterns, but I think they got a bad rap being so closely associated with OOP.

          1. 1

            It’s not a spellbook, it’s a monster manual.

            Absolutely delightful, thank you!

    16. 2

      The company hired Jepsen to bang on version 0.1.0 of their product? Brave! Smart, too.

    17. 4

      Nice summary. However I think it’s worth suggesting that our ability to write about programming languages has improved enormously over the last 50 years: some of the early reference manuals were appalling.

      1. 3

        The readability of manuals and other written works about programming is an interesting thing, historically speaking. It’s definitely not a continuous upward trend, but especially in that time period the results varied widely. Algol and its like were pretty much defined with the idea of a separate printed form, so you could end up with proportional fonts and different font weights (e.g. bold for keywords) in a manual, and then not even have the luxury of lower case in your printouts.

        And not every book had the luxury of being properly typeset, quite often you had typewritten reproductions here, too (this was way before TeX or even the Selectric Composer).

        For a French document, there even was a specially designed typeface by famous (I’d argue unsurpassed) typographer Adrian Frutiger!

        Short blurb about the font in German: https://www.degruyter.com/document/doi/10.1515/9783034609890.160

        Recreation as modern truetype font: https://ctan.org/pkg/algolrevived

        For comparison, a typically printed report: https://dl.acm.org/doi/10.5555/1061669.1061677

        1. 2

          Yeah I wonder sometimes about how much of the reputation for over-complexity of Algol 68 was due to poor communication or baroque syntax definition. Someday I kinda wanna make an Algol 24 or something that is just a cleaned up subset of it, maybe prettified a little for modern sensibilities, but that would require me actually understanding it well.

          1. 1

            Well, I think almost any variant of that was done back then. You had the cleaned up definition in 1975 Revised Algol 68 Report, and you had a plethora of simpler follow-up languages, starting with Algol-W and then the rest of the Wirth family, or other branches of the Algol tree like PL/1, Mesa or Ada. (The latter being also quite complex, but with a few years of experience of both writing and reading PL grammars)

            I think a lot of why people considered Algol 68 to have “failed” was that Algol 60 was going well enough in their particular fields and they thought that this could increase even further, one language to rule them all. From our perspective right now, it’s in pretty much the same ballpark as almost any other language from back then. Decent amount of teaching use, some commercial/military use.

            An “Algol 68” Lite might be interesting, but you’d really have to pick some features that aren’t already in every other Wirthian language out there. And none of those features should be call by name ;)

            1. 2

              Remember that Algol 60 and Algol 68 are almost completely different languages, both syntactically and semantically. All they really have in common is their committee.

              Algol 60 itself was too small and weird a language to be practically useful, but it had some great ideas which is why it had so many successors so soon. About the only Algol 60 variant that continued to be called Algol past the 1960s was Burroughs Algol, I think.

              Algol W was a precursor to Algol 68. It started as one of the Algol X proposals for a successor to Algol 60 but it was rejected for being too unambitious: the committee preferred van Wijngaarden’s proposal instead. https://dl.acm.org/doi/10.1145/365696.365702

              PL/I was also earlier than Algol 68, its main development being roughly 1964-1967.

              I don’t know of any languages that were much inspired by Algol 68 specifically, other than the Bourne shell and maybe BETA.

              Algol 68 doesn’t have call by name. Probably the only Algol 60 successor that tried to preserve that mistake was CPL.

              1. 1

                I meannnn, Pascal was definitely inspired by Algol 68, though in a “let’s not do what Algol 68 did” sort of way.

        2. 1

          ALGOL-60 was, of course, firmly in the six-bit-character era, and that had implications when a manufacturer based at least its internal documentation (for staff and captive customers) on line printer output.

          However, leaving aside the typesetting, when it comes to describing the semantics of the language… consider this beauty from the mid-60s:

          The arithmetic expressions following the delimiters THEN and ELSE may also be conditional arithmetic expressions. As a result, a conditional arithmetic expression could contain a series of IF clauses in the expression following either or both of the delimiters.

          In the case of a conditional arithmetic expression following the delimiter THEN, the Boolean expression(s) in the IF clause(s) are evaluated from left to right as long as they yield a logical value of TRUE. If they all yield a logical value of TRUE, the expression following the last delimiter THEN is executed, thus completing the evaluation of the whole expression. If any of the Boolean expressions yields a logical value of FALSE, the expression following the corresponding delimiter ELSE is executed.

          In the case of the conditional arithmetic expression following the delimiter ELSE, the respective Boolean expressions in the IF clauses are evaluated from left to right until a logical value of TRUE is found. Then the value of the succeeding arithmetic expression is the value of the entire arithmetic expression. If no TRUE value is found, the value of the whole expression is that of the expression following the last ELSE.

          In nested IF clauses, the first THEN corresponds to the last ELSE, and the innermost THEN to the following (i.e., the innermost) ELSE. The delimiters THEN and ELSE between these extremes follow the logical pattern established, i.e., the next outermost THEN corresponds to the next outermost ELSE, and so on until the innermost THEN-ELSE pair has been matched.

          Appropriate positioning of parentheses may serve to establish a different order of execution of operations within an expression.

          Now that’s something which is, in retrospect, strictly correct. But considering that it was aimed at the first generation of programmers, and in particular since it lacked any form of example which used indentation and markup to make its intention clearer… well, suffice it to say that we’ve learnt a lot over the last 60 years and I’m very glad to say that I’ve not seen anything written like that for some while: for which we can credit people like Kathleen Jensen.
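
          For contrast, the whole passage is describing something that an indented example would have made obvious at a glance, e.g. in modern-ish syntax (not Algol):

          // Nested conditional expressions: the THEN and ELSE branches are
          // themselves conditionals, and indentation shows which ELSE pairs
          // with which THEN.
          fn nested(a: bool, b: bool, c: bool) -> i32 {
              if a {
                  if b { 1 } else { 2 }
              } else {
                  if c { 3 } else { 4 }
              }
          }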

    18. 3

      I found this write-up which is more detailed. https://accu.org/journals/overload/26/148/james_2586/

      A highlight:

      A declaration associates an identifier with some value, and uses the ‘equals’ symbol = . The following introduces a new identifier called ultimate answer that represents the integer value 42:

      INT ultimate answer = 42

      Note that white space is not significant within an identifier, so this could equally well be written ultimateanswer or ulti mate answ er

      1. 2

        imo that link is good enough to deserve its own submission.

      2. 1

        That was also the case in Algol 60.

      3. 1

        Nice link.

        Algol copied this white space convention from Fortran. Algol and Fortran predated ASCII, and the convention of using the ASCII _ character for separating words in an identifier hadn’t been invented yet. The _ usually didn’t appear in early character codes. CamelCase was also not a thing since you usually only had uppercase characters available, no lowercase. In that context, using space to separate words in an identifier made sense.

    19. 5

      Traveling to sign a lease on a house!

      Possibly embarking on writing Yet Another typechecker for Garnet, as is tradition. More likely, playing with my baby niece and hanging out with her parents. And their enormous white land-cloud of a doggo.

    20. 5

      Wasn’t it already a few years ago?

      1. 18

        I think it was prior to the RIM acquisition, then RIM (who is now BlackBerry) pulled that rug. It might even have been source available that time. It isn’t, so far, this time. (That might have been the second rug pull on that stack, if memory serves…)

        TBH, I think I’d need FOSS to accept something free from BlackBerry. Dev tools and support were enough of a challenge in the BB9/BB10 days, even when working on behalf of a client who was paying them quite a lot, and their agreements were thorny enough, that I don’t think I’d trust the org without a license specifically outlining and protecting my rights.

        And what they’re offering right now is pretty far from that. The license itself practically telegraphs another rug pull:

        7.2 TERMINATION. This Agreement and licenses granted hereunder may be terminated by either Party upon written notice to the other Party.

        Given the history and the very limited nature of the license, it’s hard to imagine spending a lot of effort developing expertise on this platform based on the very limited promise of free non-commercial use.

        1. 4

          They do seem very dedicated to squeezing blood from this stone without doing anything that would make it actually attractive for people to use. “Embedded UI device” like car consoles or smart TV’s seem like they would be a good market… But afaict those already all run Linux. Their opportunity was 10-15 years ago, when embedded hardware strong enough to run Linux was still a little bit of a luxury and there weren’t already vendors who would make it work for you.

          RIP QNX, maybe my kids will play with a public domain version of you someday.

          1. 5

            “Embedded UI device” like car consoles or smart TV’s seem like they would be a good market… But afaict those already all run Linux.

            Yeah, it’s unfortunate for sure. From what I can tell from a little bit of reverse engineering, the factory stereo in my truck (2016 Toyota Tacoma) is running QNX. It’s not as flashy as the CarPlay stereos in newer models but it is just absolutely rock solid. In the 3 or 4 years I’ve had the truck I don’t think I’ve ever had to fight with re-pairing my phone or any other weird issues that I routinely run into with rental vehicles. It just reliably does what I want it to do every day, over and over.

          2. 3

            But afaict those already all run Linux

            Not quite all yet; especially for the more safety-critical parts like the instrument cluster, QNX is still a thing. But yes, the trend has been strongly going that way, not helped by the fact that many car makers have shifted to Android-based offerings. QNX is trying to keep a foothold by making hybrid setups easier: https://blackberry.qnx.com/en/products/foundation-software/qnx-hypervisor

            It seems like they have been seeing the writing on the wall for quite some time, and are mostly trying to make sure to fully ride out the long tail of deployments and people sticking to existing investments.

      2. 17

        a good summary of the open-closed-open-closed saga is on the orange site https://news.ycombinator.com/item?id=42079976

      3. 14

        Then they rugpulled, then ‘open’ under some other form, then rugpull again. “fool me once” comes to mind.