Threads for jfb

    1. 5

      Homebrew updated it for me, and I never even noticed. That’s a good sign!

    2. 4

      Very nice overview. I went the macOS->NixOS direction in 2019 after some 30 years only on macOS (/ MacOS / OS X / macOS X).

      My use case and priorities clearly differ (author & I don’t appear to fully agree on how tool co-adaptation works), so I’m very happy with the choice, but it’s super useful to see insights from the reverse choice.

On the topic of searchable menus and system-level per-app keybinds, AppleScript is well worth mentioning. While third-party apps, like the mentioned Slack, often do not export much, well-integrated apps and of course Apple’s own apps are just as nicely scriptable as in the Mac OS days.

Combined with the CLI runner, you end up with very tight integration of high-level app functionality.
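As a hedged sketch of what that looks like (assuming macOS, with Safari as the well-integrated app; the variable name is made up), you can query app state from a script and feed it to whatever runner you like:

```applescript
-- Sketch, macOS only: ask a scriptable app for its state.
-- Run from a terminal with: osascript fetch_tabs.scpt
tell application "Safari"
    set tabNames to name of every tab of front window
end tell
return tabNames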

      1. 3

        AppleScript is one more reason I detest non-native applications on my Mac.

    3. 1

      Lots and lots and lots but primarily, a web browser that is entirely built on top of a system inspired by Emacs – a small number of orthogonal primitives written to go fast and then a mutable, plastic environment on top of it that allows those primitives to be arbitrarily composed, thus serving as a user agent. You could I think sort of get to this with nyxt, I guess.

      1. 1

One issue with user agents nowadays is that the Web is (possibly maliciously) overcomplicated, so it’s very expensive to build a compatible browser engine out of reasonable primitives, and quite complicated to make an entire-engine-as-primitive give enough user control.

    4. 17

      I have no use for this project and absolutely 100% love reading every single thing coming out of it. SerenityOS, Haiku, and Dolphin seem to reliably produce some of the most interesting posts and docs. I don’t know what it is, but something about this type of (pseudo)-retro computing just brings out some wonderful geeking out that I hope doesn’t ever go away.

      1. 13

I think it’s because these are real hackers, in the noble sense. When I read a story about Rust or Go, or some backend project, you can’t shake off the feeling that people are trying to promote themselves, or trying to sell something, either directly to users or to a cloud company. I choose Rust and Go as examples because these are usually my favorite topics on lobste.rs and the orange website, but JavaScript and the others are the same.

On the other side, these people (SerenityOS, Haiku, Dolphin, …) do not give a damn about selling crap. They’re just having fun in their spare time and often on their own dime, the same as other people who might knit or customize their motor vehicle in their spare time. It feels genuine, and it doesn’t feel like somebody is trying to sell you something or promote themselves.

        1. 3

          This resonates with me, yes. These projects have momentum in a way that “reimplement Linux in Rust” can’t, because they’re organically weird and nerdy.

      2. 1

I understand like 50% of what’s going on in articles like these, but I can’t stop reading them once I start.

    5. 1

I work from home basically all the time; a coworker local to Toronto and I do sometimes get a shared space, but the overwhelming majority of my time is spent at my desk. I have a Varidesk sit/stand desk that I usually leave up when I’m done in the evening and then use standing in the mornings; a Herman Miller Aeron chair; and a good set of peripherals (MoErgo Glove80, Apple trackpad, 2x decent LG 4K monitors). I try hard not to skimp on peripherals, and we have 8 Gb/s FTTP (it is SO CHOICE), so I’m constantly disappointed by the connectivity whenever I’m out of the house.

Importantly, I have an office that is separate from the rest of the house (it’s a small bedroom upstairs), so I can draw a physical distinction between where I live and where I work. My younger daughter is starting to agitate for her own bedroom, so next year I think we’ll build a serviced shed in the back yard and I’ll move my setup out there.

      The most important things for me are the peripherals – do NOT skimp! – and the separate space. I’ve worked at a desk in my bedroom or in the family room in the basement and it stinks. Maintaining the separation is very important, psychically, to being able to properly Down Tools at the end of a work day.

    6. 15

      Herman Miller chairs are worth it. Mine is over 10 years old, and still good. Don’t waste your back’s health on some cheapo chair that will quickly fall apart anyway.

      External keyboard + mouse + monitor is a must. Laptops will destroy your neck and wrists.

You’ll need a dock for this, and there’s a fun problem: docks using USB 3 can cause electromagnetic interference that may be super annoying if you use a wireless mouse or keyboard: https://www.usb.org/sites/default/files/327216.pdf I’ve had to splurge on Thunderbolt docks to get a reliable setup.

WFH now equals having video calls. Check that your audio is decent quality. You don’t need a pro microphone, but don’t use your laptop’s mic if it’s poor quality or too far from you. My room had echo/reverb from the windows and hard floor, which made my audio worse; I needed a carpet + acoustic padding to fix it.

For video it’s important to have good lighting. There are lots of tutorials on how to set up the classic side + top + diffuse combination. I’ve used Nanoleaf panels, which allow me to automatically turn them on and adjust color, but they’re kinda pricey and are still limited to 2.4 GHz Wi-Fi, which makes them flaky.

All webcams are bad. Even the ones that claim to be high-end still have relatively small sensors with tiny, usually plastic, lenses. I’ve got a DSLR + HDMI capture card (the $15 ones on eBay are surprisingly good and low-latency) + mount to get professional-looking video. Many old DSLRs will work great; you just need to look up which ones have clean HDMI output, and sometimes buy a USB-to-battery adapter to run them off mains power.

      1. 4

Alternatively, phones now have great cameras on the back, and the Camo app from Reincubate turns a phone into a highly adjustable webcam. I got a MagSafe-compatible phone holder that sticks onto the back of the monitor. I had an old iPhone that I leave attached to my USB dock, but you could also just plop a phone in the holder when needed. macOS got Continuity Camera after I set this up, which works too, but isn’t as adjustable.

      2. 2

> All webcams are bad. Even the ones that claim to be high-end still have relatively small sensors with tiny, usually plastic, lenses. I’ve got a DSLR + HDMI capture card ($15 ones on eBay are surprisingly good and low-latency) + mount to have professional-looking video.

I take a lot of calls on my phone (with AirPods in). The camera is already quite good, but what makes it a nice experience is having the phone on a small tripod. That gives a much better perspective, and I don’t have to keep worrying about my phone falling over.

I think that generalizes to webcams: it’s nice to have them on a boom, or to be able to adjust their positioning some other way.

        1. 3

I use my phone and my video looks noticeably better than everyone else’s in meetings. If you have already bought into the Apple ecosystem (Mac + iPhone), then Continuity Camera works great for running Zoom or whatever on your real computer while using the camera on your phone, in case you need to share windows during meetings.

          1. 1

> I use my phone and my video looks noticeably better than everyone else’s in meetings. If you have already bought into the apple ecosystem (mac + iphone) then “continuity camera” works great for running Zoom or whatever on your real computer while using the camera on your phone, in case you need to share windows during meetings.

            You can also use Camo if you have a Windows computer, to use a phone as your camera. Phone cameras are so much better than any webcam.

          2. 1

            I have this thing where my AirPods won’t connect to my work laptop, so I just accept it and dial into meetings via my phone.

            1. 3

              There is no technology more frustrating on a daily basis for me than Bluetooth.

              1. 1

The Bluetooth is fine, but I think it’d make me log in to my work laptop with my personal iCloud, and I refuse to do that (even if IT would let me).

                (This is seriously the best Bluetooth device I’ve ever used.)

      3. 2

I’ll second the recommendation of a good chair. I’ve had a Herman Miller chair for seven years now; it’s still as good as new and has five more years of warranty.

It is a night-and-day difference from low-tier office chairs. That said, I echo what others have said about incorporating walks, moving around, and doing strength training.

    7. 1

      RSI-focused:

      • Nice mechanical keyboard you love to use.
      • Lightweight mouse.
      • One big screen at the right height, distance, brightness. Two can be fine too, though.
      • I’ve had a standing desk for the last three years, but I didn’t use it standing very much. At some point I (unrelatedly) developed circulation issues in my legs, so I have no future plans to use it standing either.
      • Chair you can configure so bad posture is uncomfortable.
      1. 3

        Mechanical keyswitches don’t have anything to do with RSI prevention.

        1. 2

          Yeah, I don’t think switch type usually has much effect on RSI compared to other things?

For RSI prevention, you may want a split keyboard, and it’s very important to have it at a good height. For me, frequently switching between mouse and keyboard, or between the home row and arrow keys, was bad. Learn what works for you, and if you start to have problems, fix your setup EARLY, because it gets worse a lot faster than it gets better.

        2. 2

          I disagree. Having a keyswitch that provides tactile feedback before it reaches the end of its travel can be extremely helpful for preventing RSI.

        3. 1

No, they’re more for giving you a nice tactile feel, which I think the recent ‘creamy’ switches definitely do.

      2. 1

        What I found most helpful when I developed RSI was, in descending order of importance:

        • setting my desk and chair to the correct height – arms at 90°!
        • a split keyboard (formerly a Kinesis Advantage, now a MoErgo Glove80)
        • switching mousing hands and technologies

        Mostly, I found that varying my routine was super important.

    8. 3

I do something very similar in Python, using pytest-postgresql; I stand up a brand-new instance and then build the structures I need in it. I haven’t put this into CI/CD yet, because time, but I expect it to work fine there too.
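For anyone curious, a minimal sketch of that setup (assuming pytest-postgresql is installed and a local PostgreSQL binary is available; the table is made up). The default `postgresql` fixture spins up a throwaway instance per test and hands you a live connection:

```python
# Sketch: each test gets a fresh, empty PostgreSQL instance,
# so there's no cross-test state to clean up.
def test_fresh_instance(postgresql):
    cur = postgresql.cursor()
    cur.execute("CREATE TABLE widgets (id serial PRIMARY KEY, name text NOT NULL)")
    cur.execute("INSERT INTO widgets (name) VALUES ('gear')")
    cur.execute("SELECT count(*) FROM widgets")
    assert cur.fetchone()[0] == 1
```

This needs a Postgres server binary on the machine running the tests, which is why CI/CD setup takes a bit of extra work.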

      1. 1

I am thinking of setting the test runner up to try to use a templated version of our database running locally first, rather than load the whole thing into the new DB instance at runtime, just to save time. Using the -T flag to createdb might be a better approach.

    9. 5

Jujutsu is, and will remain, a completely unusable project for me until it has support for pre-commit hooks, unfortunately. I enjoyed what I saw when I demoed it a bit to learn it, but every repo I’ve worked with in my 12-year career has had at least one pre-commit hook (including personal projects, which generally have 2-3!), and running them manually completely defeats the purpose, especially in a professional setting.

      I keep checking in on the issue and the tracker for it, but still no luck.

I think the fact that the original issue starts with

> I don’t think I had seen https://pre-commit.com/ until @chandlerc linked to it

shows the completely different universe the jj devs live in, and it leads me to believe that this feature just won’t get much priority, because they obviously never use it, and projects like this are (rightfully, usually) focused on fixing problems for the people making them.

I still have my hopes, but until then… back to git.

      1. 15

I’m curious why people like pre-commit hooks. I run my makefile far more frequently than I commit, and it does the basic testing and linting; the heavyweight checks are done on the server after pushing. So there doesn’t seem to be much point in adding friction to a commit when the code has already been checked, and will be checked and reviewed again.
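For context, the makefile-driven loop described here is just a couple of phony targets run locally far more often than commits happen (the target and tool names are illustrative, not from the thread):

```make
# Illustrative Makefile: `make check` is the habitual local gate;
# heavyweight checks stay on the CI server after pushing.
.PHONY: check lint test

check: lint test

lint:
	ruff check .

test:
	pytest -q
```

The point is that the same single command runs constantly during development, so by commit time the code has already passed it many times.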

        1. 8

          To take an example from my jj-style patch-manipulation-heavy workflows:

• I often abort (“stash”) work and switch to a new speculative approach, which means committing it before it’s been checked. Indeed, at this stage it’s not expected to pass checks, so triggering a pre-commit hook would be a waste of time.
          • I often significantly modify the graph structure (combine and split various commits, discard some, etc.), and then need to re-run checks on the whole patch stack/tree. Therefore, I rely primarily on post-hoc checks that run across all the commits in the stack/tree, rather than pre-commit checks.
          • I often want to re-run checks for each commit in my stack after rebasing on top of the latest main commit (in a trunk-based development workflow).

          One should definitely distinguish between “checks that should run on each commit” and “pre-commit checks”.

          • Most pre-commit checks represent checks that should run on each commit, but it doesn’t mean that before committing is the optimal time to run the check. That depends on one’s specific local workflows and how they use commits.
          • In your case, you are already running your checks before committing, and presumably you don’t modify the commits afterwards, so an actual pre-commit hook seems unlikely to help your workflows.
        2. 5

I use pre-commit hooks extensively, to ensure that the code I’m pushing meets all kinds of project requirements. I use formatters and linters, and check everything that can be checked. For one thing, it does away with the endless battles over meaningless nonsense, like where commas belong in SQL statements or how many spaces to use for indenting. For another, it reduces the load on the CI/CD systems, trading a small fraction of my time locally for expensive time we pay for by the cycle.

          I’ll never go without them again.

          ETA: but, based on sibling comments, it seems that the jj folks are On It, and yeah, it won’t be a “pre-commit” hook, the same way, but as long as it can be automagically run … ok, I’m in.

          1. 4

            As others here have stated, I think the fundamental issue is that commits in jj are essentially automatic every time you save. There are a few consequences to this such as:

            1. no predefined “significant event” to sensibly attach a pre-hook to (I’d make them pre-push hooks, maybe?)
            2. inability to save any secrets to disk as those immediately get committed, meaning they’re in the history, meaning you have to rewrite history locally to remove them before pushing, or Lord help you
          2. 3

            I care about these things too, but they’re tested in CI, once I am ready to integrate them, rather than locally.

            1. 2

              Horses for courses. I’d rather use my compute than The Cloud but I am notoriously a cranky greybeard.

              1. 3

                For sure, I’m just saying it is possible to care about those things and not use hooks. You should do what’s best for you, though.

          3. 2

Aren’t those all already available in the IDE? I get my red squiggles as I write, instead of waiting for either the pre-commit hook or the CI.

            1. 2

              Not everyone uses an IDE, or the same IDE, or the same settings in the IDE. I think that computers should automatically do what they can, and using pre-commit hooks (or whatever the jj equivalent will be) is a way to guarantee invariants.

        3. 4

          Pre-commit hooks are really easy to enforce across a large team, while any sort of IDE settings are not. Some I’ve used before:

          • Remove all graphical output from Jupyter notebooks, so that any files added to git remain small
• Similarly, check that there are no large files being added, unless someone uses an option
          • Run the formatter on any changed code

You can do all of these in other ways, but pre-commit makes it easy to do exactly the same thing across the entire team.

          1. 8

I totally agree with you that all that stuff is super important to run before changes make it to the repo (or even a PR). The problem is that pre-commit hooks (with a lowercase “p”) fundamentally don’t mesh with Jujutsu’s “everything is committed all the time” model. There’s no index or staging area; as soon as you save a file, it’s already committed as far as Jujutsu is concerned. That means there’s no opportunity for a tool to insert itself and say “no, this commit shouldn’t go through”.

The good news is that anything you can check in a pre-commit hook works just as well in a pre-push hook, and that will work once the issues 3digitdev linked are fixed. In the meantime, I’ve made myself a shell alias that runs pre-commit run -a && jj git push, and that works well enough for me. shrug

            1. 2

              Not everything runs that easily. Eg https://github.com/cycodehq/cycode-cli has a pre_commit command that is designed specifically to run the security scan before commit. It doesn’t work the same before push because at that point the index doesn’t contain the stuff you need to scan.

              1. 1

Hm, I’m not sure cycode-cli would work with Jujutsu in that case. I’m sure if there’s enough demand, someone will figure out a way to get it working. Even now, I’ve seen people MacGyvering their own pre-commit hooks by abusing their $EDITOR variable, e.g. export EDITOR="cycode-cli && vim" ><

                1. 1

                  Many people never use an editor to write a commit message, they use the -m flag directly on the command line 🙂

                  But yeah, at that point we could mandate that we have to use a shell script wrapper for git commit that does any required checks.

        4. 2

          In my years I’ve had exactly one repo that used makefiles.

The issue has nothing to do with makefiles as such – the issue without pre-commit is that everything depends on deliberate action.

If I’m on a team of 12 developers, and we have a linter and a formatter which must be run so we can maintain code style/correctness, I am NOT going to hope that all 12 developers remembered to run the linter before they made their PR. Why? Because I FORGET to do it all the time too. Nothing irritates me more or wastes more money than pushing to a repo, making a PR, and having it fail over and over until I remember to lint/format. Why bother with all of that? Set up pre-commit hooks once, and then nobody ever has to think about it again. Commit, push, PR, add new developers, add new things to run – it’s all just handled.

Can you solve this with a giant makefile that gives you just one command to run before a PR? Yes. But that’s still a point of failure – a point of failure that could be avoided with existing tools that are genuinely trivial to set up (most linters etc. have pre-commit support and hooks pre-built for you!). Why let that point of failure stand?

Edit: Also, keep in mind the two aren’t mutually exclusive. You like your makefile? Fine, keep it. Run it however many times you want before committing. But if everyone must follow this step at least once before they commit… why wait for them to make a mistake? Just do it for them.

          1. 16

            Generally, I believe jj developers and users are in favor of the idea of defining and standardizing “checks” for each commit to a project, but the jj model doesn’t naturally lend itself to running them specifically at pre-commit time. The main problems with running hooks at literally pre-commit time:

            • Tends to pessimize and preclude the patch-manipulation workflows that jj is designed to facilitate.
              • In particular, an arbitrary commit is likely not to be in a working state, so the pre-commit hook would fail.
              • Running pre-commit hooks for each commit serially is often slower. There are ways to parallelize many formatting and linting tasks (for multiple simultaneous commits), but you can only do that once you’ve actually written all the commits that need to be validated.
            • Unclear how it would work with all the jj commands that do in-memory operations.
              • It’s not necessarily possible to run pre-commit hooks on those commits since they’ve never been materialized in a working copy.
              • For example, if you rebase commits from one place to another, those commits may no longer pass pre-commit checks. A lot of the technical and UX advantages of jj rely on “commits” always succeeding (including rebases and operations that may introduce merge conflicts); adding a way to make committing fail due to pre-commit hooks would sacrifice a lot of the benefits.
              • The current jj fix command can run on in-memory commits, I believe, but this also means that it’s limited in capability and doesn’t support arbitrary commands.

            The hook situation for jj is not fully decided, but I believe the most popular proposal is:

            • Support “pre-push” hooks that run immediately before pushing.
              • These can run the checks on each commit about to be pushed, and either fix them or reject the push as appropriate.
            • Rather than support “pre-commit” hooks, delegate the checks to commands like the following, which run post-commit, even on commits that aren’t the ones currently checked-out. Then the user can run these commands manually if they want, and the pre-push hook can invoke them automatically as well.
              • (present) jj fix: primarily for single-file formatters and linters
              • (future) jj run: can run arbitrary commands by provisioning a full working copy

            For the workflows you’ve specified, I think the above design would still work. You can standardize that your developers run certain checks and fixes before submitting code for review, but in a way that works in the jj model better, and might also have better throughput and latency considerations for critical workflows like making local commits.

              1. 4

@arxanas is speaking from experience, and you can see this in action today if you want: it’s built into git-branchless (which he built) as git test: https://github.com/arxanas/git-branchless/wiki/Command:-git-test

git-branchless is a lovely, git-compatible UI like jj, inspired by Mercurial but with lots of cool things done even better (like git test).

          2. 2

            Setup pre-commit hooks once, and then nobody ever has to think about it ever again.

            Once on every clone, across the entire team if a hook changes or is added?

            How do you manage hooks? Just document them in the top level readme?

            1. 5

              We had at least one conversation at a Mercurial developer event discussing how to make hg configs (settings, extensions, hooks, etc) distributed by the server so clones would get reasonable, opinionated behavior by default. (It’s something companies care about.)

              We could never figure out how to solve the trust/security issues. This feature is effectively a reverse RCE vulnerability. We thought we could allowlist certain settings. But the real value was in installing extensions. Without that, there was little interest. Plus if you care this much about controlling the client endpoint, you are likely a company already running software to manage client machines. So “just” hook into that.

            2. 3

I’m not entirely sure I follow what you’re asking. My guess is that you’re unfamiliar with pre-commit, so I’ll answer from that perspective. Sorry if I’m assuming wrongly.

Pre-commit hooks aren’t some individualized, separate thing. They are managed/installed inside the repo and defined by a YAML file at its root. If you add a new one, it will get installed (once), then run the next time you run git commit.

              As long as you have pre-commit installed on your system, you can wipe away the repo folder and re-clone all you want, and nothing has to change on your end.

              If a new dev joins, all they have to do is clone, install pre-commit (single terminal command, run once), and then it just…works.

If a pre-commit hook changes (spoiler: they basically never do…) or is added/removed, you just modify the YAML, make a PR to the repo, and it’s merged. Everyone pulls latest master and boom, they have the hook.

There is no ‘management’. No real need even to document them (although you can). They should be silent and hidden and just run on every commit, so nobody ever has to worry about them unless their code breaks the checks (linters, etc.).

              1. 1

                Thank you.

                You’re talking about?:

                https://pre-commit.com/

                Not?:

                https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks

I take it the former runs code on pre-commit for any cloned git repo (trusted or not) when installed locally, while the latter needs setup after initially cloning to work?

So it changes git behavior to run code pre-commit on every repo – but doesn’t directly execute code, rather parses a YAML file?

                1. 2

Yes, I am talking about the former. To be clear, I don’t know the exact internals of pre-commit, but I don’t THINK it modifies git behavior directly. Instead it installs a hook that runs prior to the actual git commit command executing, and it DOES execute code, but does so in an isolated way, I believe. The YAML file is just a config file that acts sorta like a Helm chart does, providing configuration values and the command to run – it’s different for each hook.

                  If you’re curious, you can see an example in one of my personal repos where I am defining a pre-commit hook to run the “Ruff” tool that does linting/formatting on my project.

Also, a note: pre-commit hooks only execute against the files inside the commit! So they tend to be quite performant. My Ruff checks add maybe 100 ms to the runtime of the commit command. This can obviously vary – I’ve had some that take a few seconds – but in any case they never feel like they’re “blocking” you.
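For readers who haven’t seen one, a .pre-commit-config.yaml of the kind described is short. This is a hypothetical sketch wiring in Ruff, not the actual file from the repo mentioned above (the rev pin is made up – pin whatever release you audit):

```yaml
# Hypothetical .pre-commit-config.yaml: lint + format Python files on commit
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.4        # made-up pin; use a real tagged release
    hooks:
      - id: ruff        # linter
      - id: ruff-format # formatter
```

After a one-time pre-commit install, both hooks run automatically against the files in each commit.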

                  1. 1

                    FWIW I’d consider 100ms to be quite bad given that jj commits tend to take < 10ms.

I agree with folks in general that in jj, almost no commits made by developers will pass pre-commit checks, so a rethinking/different model is required. (For most of my professional life I’ve just used a combination of IDE/LSP feedback and CI for this, along with a local script or two to run. I also locally commit, and even push, a lot of broken/WIP code.)

I’m also curious how the developer experience is maintained with pre-commit hooks. I was on the source control team at Meta for many years, and there was an intense, data-driven focus on performance and success metrics for blocking operations like commit. Do most authors of pre-commit hooks bring a similar level of care to their work?

        5. 2

          There has been some interesting discussion in response to my question about pre-commit hooks.

          The thing I’m (still) curious about is how to choose the point in the workflow where the checks happen. From the replies, I get the impression that pre-commit hooks are popular in situations where there isn’t a build step, so there isn’t a hook point before developers run the code. But do programmers in these situations run the tests before commit? If so, why not run the lints as part of the tests? If the lints are part of the standard tests, then you can run the same tests in CI, which suggests to me that the pre-commit hook configuration is redundant.

          I’m asking these questions because pre-commit hooks imply something about the development workflow that I must be missing. How can you get to the point of pushing a branch that hasn’t gone through a build and test script? Why does it make sense to block a commit that no-one else will see, instead of making the developer’s local tests fail, and blocking the merge request because of the lint failure?

          For auto-formatting in particular, I don’t trust it to produce a good result. I would not like a pre-commit hook that reformats code and commits a modified version that I had not reviewed first. (The likelihood of weird reformatting depends somewhat on the formatter: clang-format and rustfmt sometimes make a mess and need to be persuaded to produce an acceptable layout with a few subtle rephrasings.)

          1. 3

For auto-formatting you can choose either to have it silently fix things or to block the commit and error out. But yes, generally you can never rely on pre-commit hooks to enforce checks, because there’s no way to guarantee all developers use them. You need to rely on CI for that either way.

The benefit of running the checks before pushing to a pull request, for instance, is that it reduces notifications, ensures reviewers don’t waste time, shortens the feedback/fix loop, etc. Generally, it just increases development velocity.

But all those benefits can also be achieved with pre-push. So I see the argument for running the checks prior to pushing to the remote; I fail to see the argument for running them prior to committing. Someone elsewhere in the thread pointed out a checker that apparently inherently needs to run pre-commit, so I guess there’s that? I don’t know the specifics of that checker, but it seems poorly designed to me.

      2. 7

> I don’t think I had seen https://pre-commit.com/ until @chandlerc linked to it

We always have one dev who introduces Husky to the repo, and then everybody else adds --no-verify to their commits, because that’s easier than getting into an argument with the kind of dev who would introduce Husky to the repo.

Most devs underestimate (as in, have absolutely no idea of the amount of rigor necessary) what goes into creating pre-commit checkers that are actually usable and improve the project without making the developer experience miserable.

      3. 4

I mentioned this in another comment, but pre-commit (the hook) will never be feasible, for the simple reason that “committing” isn’t something you explicitly do in Jujutsu. It’s quite a mind shift for sure, but it’s one of those things where, once you get used to it, you wonder how you ever did things the old way.

        The good news is that pre-push is totally possible, and as you noted, there is work underway to make the integration points there nicer. AIUI the issues you linked are getting closer to completion (just don’t expect them being fixed to mean that you’ll have the ability to run pre-commit hooks).

      4. 1

        Find a Rust-capable dev, start hacking on it, and engage with the Jujutsu crowd on IRC (#jujutsu on Libera) or Discord. The developers are very approachable and can use additional PRs. That way everyone can eventually benefit from a completed hooks implementation.

    10. 9

      less is a decent pager available anywhere, but pspg is an excellent pager worth taking the time to install!

      1. 3

        Also pgcli for some niceties like autocomplete on join conditions.

      2. 1

        Whoa this is fantastic! Thanks!

    11. 5

      Normalize your data unless you have a good reason not to

      I think this is actually bad advice, normalization means migrations, painful binding, etc. It’s not a code-native way to operate.

      Postgres has JSON support and computed columns, so you can make indexes on JSON data as needed.

      I think you can have a different table for different items, but I like using a jsonb data column, and having the ID hierarchy as columns. Anything else gets handled by binding the JSON in one step, and I can put defaults as I need.

      MUCH more friendly to work with, and to extend, esp in languages that have actually good JSON support (Go, Rust) with tags for extensibility (JS is too primitive, zod fixes a lot of it though)

      1. 56

        So you are treating PG as a document store? You should look up the downsides of document stores before you keep going down this road.

        JSONB has its place in PG and it’s very useful for irregular data, like extra user defined fields, etc. But for core functionality of your data you are 90% of the time better off using normal PG data types like text, int, etc.

        One of the best things about normal datatypes outside of JSONB, is you get data correctness. You can use constraints like CHECK that verify the data before you store it. This is extremely useful for data correctness outside of application code.

        Doing that in JSON is miserable. Doing that in the application means you will have lots of bad data eventually, as bugs in application code inevitably happen.
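        To make that concrete, here’s a self-contained sketch of a CHECK constraint doing its job. SQLite is used only so the example runs anywhere; a Postgres CHECK behaves the same way, and the table and column names are made up for illustration:

```python
import sqlite3

# The database rejects bad data no matter which code path tried to write it.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts VALUES (1, 100)")  # fine

try:
    conn.execute("INSERT INTO accounts VALUES (2, -5)")  # violates the CHECK
except sqlite3.IntegrityError as e:
    print("rejected by the database:", e)
```

        The constraint is declared once; every writer, present and future, gets the same guarantee without repeating the validation in code.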

        In fact for the OP, the biggest thing I’d add to their list is constraints. Make the DB keep your data reasonable. If this data relates to that data, set up a REFERENCES constraint, etc.

        Sometimes document stores are the right tool for the job, but it’s rare in my experience.

        1. 6

          Call me a heretic, but:

          • FKs are inefficient for the DB to manage, code can enforce this if you truly need it, no need to make the DB slower
          • If you need constraints, check them in code, don’t make the DB do it. It’s not miserable in code, it’s clear in code because you don’t have to refer to a schema to see what you’re allowed to do.

          I’m not treating it as a document store, I’m just managing the schema in code, not in the DB.

          1. 23

            Another counterpoint that wasn’t mentioned: while not universally true, databases often outlive projects built on top, at which point you would be much happier with uniform, structured data, than a random data dump.

            1. 2

              You end up massively annoyed either way because the data model in your new application is not going to line up with the data model in the database no matter what you do.

              1. 2

                That’s why you design your data model for the first application as well, and not just organically grow it, ideally.

          2. 19
            • FKs are inefficient for the DB to manage, code can enforce this if you truly need it, no need to make the DB slower

            They should be fast, there ought to be an index. Also, how is your code going to enforce that there are no dangling foreign keys when you do a DELETE on the table being pointed to?
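            A few lines demonstrate the DELETE point; SQLite is used here purely so the sketch is runnable (note that SQLite makes foreign-key enforcement opt-in via a pragma, while Postgres enforces REFERENCES by default). Table names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-only opt-in
conn.execute("CREATE TABLE clients (id INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE orders (
        id     INTEGER PRIMARY KEY,
        client INTEGER NOT NULL REFERENCES clients(id)
    )
""")
conn.execute("INSERT INTO clients VALUES (1)")
conn.execute("INSERT INTO orders VALUES (10, 1)")

try:
    # Would leave orders.client dangling, so the database refuses.
    conn.execute("DELETE FROM clients WHERE id = 1")
except sqlite3.IntegrityError as e:
    print("delete blocked:", e)
```

            Application code that forgets to check the child table simply gets an error instead of silently corrupting the data.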

            • If you need constraints, check them in code, don’t make the DB do it. It’s not miserable in code, it’s clear in code because you don’t have to refer to a schema to see what you’re allowed to do.

            It’s not miserable in the database either. In fact, you have to add a CHECK constraint or unique index only once and the db will do everything to guarantee it is enforced. In code you have to do it everywhere you write to the table. For bulk updates in code this is extra painful to ensure, especially if it’s a multi-column constraint and the bulk update affects only one of the columns involved in a CHECK constraint.

            C’mon man, you’re setting yourself up for failure by working against the grain of the RDBMS. As others have pointed out, you’re probably better off with a document store.

            1. 1

              Also, how is your code going to enforce that there are no dangling foreign keys when you do a DELETE on the table being pointed to

              Transactions, and transactional functions (the way foundationdb encourages).

              Instead of slinging sql all over my codebase, I just make reusable functions. Analogous to components in react. You can also use workflow-style systems for async deletion of large relations.

              1. 6

                Transactions, and transactional functions (the way foundationdb encourages).

                Transactions don’t, by themselves, ensure that there will be no dangling foreign keys. It is only a tool you can use to make sure records that belong together are deleted atomically, when viewed from the outside. The code you run in your transactions would have to be flawless.

                I can’t count the number of times where I thought some code was correct, but then a FK constraint fired, letting me know I had forgotten to delete (or clear the FK in) some relation or other. Even moreso with CHECK constraints across multiple columns. You give up all of that by treating your RDBMS as a dumb document store.

                Also, bulk updates and deletes are something your database can do very quickly, which means you won’t be running that meticulously crafted code which ensures the constraints are upheld. Constraints will be upheld by the database, though. Deleting or updating records one-at-a-time is of course possible but not exactly efficient.

                And finally, over time you might want to add other clients (maybe as part of a rewrite? or perhaps you have to run a one-off query by hand) that speak to the same database. They will have to have their code as meticulously crafted and agreeing 100% with your original code. If you have the constraints in the database, that’s not as big a risk.

                1. 2

                  We seem to be on different grounds: I’m a dev first, you seem to be a DBA first. I don’t think there’s any way I can convince someone under another “religion” for lack of a better term.

                  1. 8

                    I’m a dev and not a DBA, but I know that the database can enforce these rules better than I can (and the query optimizer get better results because of that). It’s the same way I trust the compiler to make safety decisions better than I can. I can forget a free the same way I can forget to maintain referential integrity myself.

                  2. 6

                    Hm, that’s an odd thing to say. I’m a developer by interest and trade. I actually learned to use databases “on the job”, as I barely avoided flunking my university database course because my grasp of SQL was too weak. In the intervening years I was forced to learn SQL, because it’s such an important part of most web stacks, to the point where I’m typically the go-to expert on Postgres in whatever company I work for.

                    My early experiences have been with MySQL with all its problems of wanton data destruction (back when it defaulted to MyISAM) and Ruby on Rails which didn’t/couldn’t even declare foreign keys or constraints back in the day. I’ve felt the pain of the “enforce everything in code” approach firsthand when that was the accepted wisdom in the Rails community (circa 2007). I’ve seen systems grind down with the EAV approach and also experienced how difficult it can be to work with data that’s unnecessarily put in JSON.

                    All of this to say, I have the battle scars to show for my hard-won wisdom; I’m not just repeating what I hear here and elsewhere. I’ve learned to trust the database to enforce my data invariants and constraints, as I know we will get it wrong in code at some point. It’s an additional and important safety net.

                    1. 4

                      This is almost exactly where I’m at. I’ve used big non relational datastores before, and they can do really cool stuff. But I’m really quite a Fred Brooks acolyte at the end of the day, and I always try to build systems such that “show me your tables” is more important than “show me your flowcharts”. I find the exercise of really thinking about the entities I need to manipulate up front pays off hundredfold.

          3. 11

            Anyone trying to understand your database can’t without also reading your code, since there is nothing in the DB to document how relationships exist and/or are constrained in the data.

            I like how your assumption is that you can write faster code than a DB that has been at it for likely decades. I’m not saying it’s impossible, but it’s improbable unless you spend lots of time optimizing that in the app layer. Why would you do this to yourself?

            As far as constraints in the code, you want them too(for UX), but:

            • Data needs to be trusted. A database with proper constraints is the lowest common denominator.
            • Data documentation. You hand a new person read access on the DB, they can print out ER diagrams and easily see the relationships around the data. Why re-invent the wheel where none of these tools exist?
            • External to your code data manipulation is now misery for everyone including you.
            1. 16

              I like how your assumption is that you can write faster code than a DB that has been at it for likely decades.

              As somebody who’s DBA’d for devs with similar mindsets, in my experience this specific antipattern is due to selection bias.

              That is, the database may be running thousands of different queries and query plans each day, but the only query plans that come to a dev’s attention are the ones that seem slow. So those get examined closely, and sometimes, something is “obviously” wrong with the plan–if only the DB would join here and loop there, this would be faster!

              After a few of those, the dev in question starts to wonder why all the plans they’re seeing are bad. Surely, they can do better than that, so why do we have so much written into the DB, where it generates all of these bad plans. Once they’ve hit that point, it’s not far to the idea that all query plans should be implemented in the app code, so they can be reviewed, optimized, etc. A few months later they’re installing MongoDB and reinventing joins, because they didn’t consider the other thousands of queries in the first place.

              1. 2

                Well said. One should strive to understand the tools one uses.

            2. 3

              I understand your perspective, and I appreciate you actually asking questions instead of abusing my thoughts like others in this thread (which is shockingly exhausting).

              I think there are 2 schools of thought:

              1. Every service talks to the same data store (tables, schema, etc.)
              2. Every service has their own data store (only can access user info via the user service)

              #1 is much more traditional (and probably represents the majority of the people fleecing me in this thread)

              #2 is much closer to home for me: when I know only my code touches the DB, I can make a LOT of assumptions about data safety and integrity. Additionally, when you’re here, you find it’s a lot more scalable to let your stateless services do the CPU work than your DB. I’ve saturated m5.24xl RDS instances before. Nightmare scaling event.
              1. 3

                In general what I have found is that #2 is only true for some period of time, assuming it’s data people actually care about. They inevitably want access to it here, there and everywhere.

                If #2 is your wheelhouse and nobody else will ever care about the data: why are you storing to PG? Why not store to files in the filesystem, or Berkeley DB, or something else that doesn’t have the overhead that PG has?

                I can make a LOT of assumptions about data safety and integrity.

                Assuming those assumptions are correct. Many people screw these up, all the time. I know I screw up assumptions all the time.

                I’ve saturated m5.24xl RDS instances before.

                I had to go look up what that was. Our test hardware is larger than that. Cloud DB’s are ridiculously over-priced and perform badly in my experience. You might consider going to dedicated hardware. You can always pay an MSP to babysit it for you. Even after their high markup, it should still be significantly cheaper than RDS.

                1. 3

                  Why not store to files in the filesystem or berkely db or something else that doesn’t have the overhead that PG has.

                  You know why: The tooling, the support, available offerings for a managed solution with monitoring and backup.

                  Cloud DBs are overpriced from a hardware/perf cost perspective, but I’d argue not from TCO, except at either extreme scale or for niche workloads.

            3. 1

              Actually, it’s the filesystem or rather the kernel that is the lowest common denominator.

              Now you might say “sure, but you don’t interact with the kernel, you interact with the database”. My response would be: “that is only by convention and it is exactly the same for consumers of the data - they should talk to my app(s) instead of talking to the database directly”.

              Now, if you have multiple applications with different owners then that doesn’t work. But let’s not pretend that using a single database in that style would necessarily be the best option. There is a reason why microservices came up. And I’m against microservices (instead I’m for just “services”). But the point is that this style was invented because of the pain of sharing the same data. That’s how the shared-nothing architecture came into being.

              And if you think about it, this is how many many businesses operate, be it in technical or organizational terms.

              The gist is: normalization also has disadvantages, so I’m on the side of danthegoodman. This quote:

              Normalize your data unless you have a good reason not to

              Is similar to saying “your architecture should be a monolith unless” or “your architecture should be microservices unless”. Instead, it should be changed into:

              Pick the optimal level of normalization for your data

              And of course now it’s difficult to say how. But that’s the point: it’s difficult and people here seem to be of the opinion that normalization should be “the standard”. And it shouldn’t.

              1. 1

                I’m obviously of the opinion that you should default to normalizing the data and only de-normalize when it makes sense to do so.

                1. 1

                  Right. And I’m challenging this opinion.

                  1. 2

                    Let’s think about it another way.

                    Your perspective is (paraphrasing, and please correct me if I’m wrong) that data integrity should be done in code, because you trust your code more and it’s easier for you.

                    I’d argue, ideally we should do it in both places, so that if there is a bug in one, the other will (hopefully) catch the problem. That is, if we care about getting the data correct and keeping it correct. Much like in security land, where we talk about defense in depth. This is data integrity in depth.

                    I’ve seen terrible code and terrible databases. I’ve even seen them both terrible in the same project. When that happens I run for the hills and want nothing to do with the project, as inevitably disaster strikes, unless everyone is on board with fixing both, with a clear, strong, path forward.

                    Normalization at the database layer helps ensure data integrity/correctness.

          4. 9

            FKs are inefficient for the DB to manage, code can enforce this if you truly need it, no need to make the DB slower

            I would be very surprised if typical application code handled this properly and wasn’t racy.

            And how would the application do it, in a faster way than the database?

            1. 3

              Indeed. Not entirely on point, but reading through Jepsen.io’s testing shows that getting data stores wrong is very common. Even PG didn’t come out completely unscathed last time it was tested: https://jepsen.io/analyses/postgresql-12.3

            2. 2

              I actually don’t find this hard, and I’ve done this for financial systems doing 3k txn/s. The faster way is based on data modeling: relieve the DB of the work, and allow your code to aid transactional guarantees. That’s pretty much the standard way to build systems that scale beyond a single postgres instance.

              1. 2

                Can you be more specific? How do you make sure something you reference with an FK does not disappear while you insert?

                1. 1

                  I normally structure data so this is not an issue, but you can use serializable transactions and select the row.
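                  Presumably the idea looks something like this (hypothetical schema; both sides must read what the other writes for serializable isolation to protect them):

```sql
-- Inserting side: verify the parent exists inside the transaction.
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT id FROM clients WHERE id = 42;           -- must return a row
INSERT INTO orders (client, data) VALUES (42, '{}');
COMMIT;

-- Deleting side: verify there are no children inside the transaction.
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM orders WHERE client = 42;  -- must be 0
DELETE FROM clients WHERE id = 42;
COMMIT;
```

                  If the two run concurrently, Postgres detects the read/write dependency cycle and aborts one with a serialization failure (SQLSTATE 40001), which the application then retries. Forgetting either SELECT silently reintroduces the dangling-reference race that a declared FK would have caught.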

          5. 5

            What part of managing foreign keys do you think the DB is inefficient at? How would you do this more efficiently in application code?

            1. 2

              By relieving the DB of the computation. For example, if you delete a row that has 1M relations to it, your DB is going to have a really bad time.

        2. 5

          You probably know this, but for what it’s worth:

          • you can create indexes and CHECK and EXCLUDE constraints on JSONB data in postgres and they work fine
          • you can’t define UNIQUE, NOT NULL or REFERENCES constraints directly, but you can simulate them: EXCLUDE for UNIQUE, CHECK for NOT NULL, and REFERENCES with generated columns (if each object has only some small number of references to each foreign table) or with triggers (if there are many foreign keys for a single foreign table)

          I think unique and not null are simple enough, but if you find yourself needing to write TRIGGERs then maybe it would be easier to explode out that schema.

          Some demos…

          Create a table that requires inserted json to have a unique numeric id:

          db> create temp table t (
              j jsonb check (
                  (j->'id' is not null and jsonb_typeof(j->'id') = 'number')
              ),
              exclude ((j->'id') with =)
          );
          CREATE TABLE
          

          Demonstrating that the constraints work correctly:

          db> insert into t values ('{"id": 1}'), ('{"id": "1"}')
          new row for relation "t" violates check constraint "t_j_check"
          DETAIL:  Failing row contains ({"id": "1"}).
          
          db> insert into t values ('{"id": 1}'), ('{"id": 1}')
          conflicting key value violates exclusion constraint "t_expr_excl"
          DETAIL:  Key ((j -> 'id'::text))=(1) conflicts with existing key ((j -> 'id'::text))=(1).
          
          db> insert into t values ('{"id": 1}'), ('{"id": 2}')
          INSERT 0 2
          

          Demonstrating that queries use an index:

          db> explain select * from t where j->'id' = '1'::jsonb;
          +--------------------------------------------------------------------------+
          | QUERY PLAN                                                               |
          |--------------------------------------------------------------------------|
          | Bitmap Heap Scan on t  (cost=4.21..14.37 rows=7 width=32)                |
          |   Recheck Cond: ((j -> 'id'::text) = '1'::jsonb)                         |
          |   ->  Bitmap Index Scan on t_expr_excl  (cost=0.00..4.21 rows=7 width=0) |
          |         Index Cond: ((j -> 'id'::text) = '1'::jsonb)                     |
          +--------------------------------------------------------------------------+
          EXPLAIN
          

          There’s also a default GIN operator class for jsonb that supports queries with the key-exists operators ?, ?| and ?&, the containment operator @>, and the jsonpath match operators @? and @@.

          create index t_idx_j_gin on t using GIN (j);
          explain select * from t where j @> '{"name": "Alice"}'::jsonb;
          

          Use a generated column to enforce a REFERENCES NOT NULL constraint without having to add triggers:

          create temp table clients (id serial primary key);
          create temp table orders (
              data jsonb,
              client integer generated always as (
                  (data->'client')::integer
              ) stored references clients not null
          );
          
          1. 2

            Yes, I agree it’s mostly possible now, but it’s not fun.

        3. 1

          Disagree. If only your application uses the database then just create a dedicated type that describes the format/shape of the json. To introduce a bug, the type would have to be changed in the wrong way (but then that could also happen with a change to the table schema).

          If, instead, you don’t have a dedicated type that defines the json shape or, worse, use a PL without statically enforced types, then yeah, use the database types instead. But that’s on you, not on the concept.
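          Presumably the “dedicated type” approach looks something like this (illustrative names): one type is the single definition of the JSON shape, and every read goes through its constructor, so a shape change happens in exactly one place:

```python
import json
from dataclasses import dataclass

@dataclass
class Order:
    id: int
    client: int
    items: list

def load_order(raw: str) -> Order:
    d = json.loads(raw)
    # The only place that knows the shape. Change it the wrong way and every
    # caller breaks at once, much like a bad table migration would.
    return Order(id=int(d["id"]), client=int(d["client"]), items=list(d["items"]))

order = load_order('{"id": 1, "client": 7, "items": ["widget"]}')
```

          Whether that one constructor is as trustworthy as a database schema is exactly the disagreement in this thread.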

          1. 25

            Database normalization isn’t about type safety though; it’s about preventing data contradictions.

            1. 1

              To do that it’s sufficient to make those parts into their own columns. That’s not a general argument against JSON. Please mind that OP specifically gave CHECK as an example.

              1. 4

                Of course it’s not sufficient, and of course it is an argument against serialized blobs as a database. If you browse a database for the first time, it is self-explanatory which data is mandatory, its types, which relates to which, cardinality, length, etc. These things are clearly stated and the intent is unambiguous.

                How do you do that if you arbitrarily just serialize your data and store it somewhere? You have to check the code. But ‘the code’ is a diffuse concept. What part of the code? What if one part contradicts another (which is what happens all the time, and why a well-designed data model is important)?

                I will repeat myself from another post: data normalization is neither a trend nor a preference. People do it because they need the value it provides.

                1. 1

                  If you browse a database for the first time, it is self-explanatory which data is mandatory, its types, which relates to which, cardinality, length, etc. These things are clearly stated and the intent is unambiguous.

                  Or you end up with lots of columns that are mostly null because of how you try to emulate sum-types (tagged unions). Then just having a look at the json is much easier.

                  How you do that if you arbitrarily just serialized your data and store it somewhere? You have to check the code. But ‘the code’ is a diffuse concept.

                  I partly agree with that. If you wanna go the full way, then you can also do a CHECK on the json schema. Then you don’t need to look at the code.

                  But try to see it from this perspective: databases and their technology evolves very slowly (which is not necessarily bad, but that’s how it is). Programming languages can evolve much faster. Therefore code can be much more concise and succinct compared to database schemas.

                  Data normalization is not a trend not a preference. People do it because they need the value it provides.

                  Yes, and the same can be said about data denormalization as well.

                  1. 3

                    Or you end up with lots of columns that are mostly null

                    That is what happens when you do not normalize your database. Normalization addresses and solves this problem specifically. This was one of many reasons why relational theory was developed 5 decades ago.

                    databases and their technology evolves very slowly

                    That is a blanket and abstract statement based on assumptions that haven’t been asserted or checked. Developments in the postgres ecosystem have been explosive in the last 5-10 years. And then there are new kids in town such as DuckDB or ClickHouse, just to name a couple. You would be surprised if you dig a bit deeper.

                    But my main point about that statement isn’t even that. Theory doesn’t evolve because it’s axiomatic/factual. The gains of removing redundancy, and the algorithmic complexity of lookup operations on known data structures, have been known for decades and do work as expected. If you transform a linear lookup into a logarithmic one, that is always going to be a useful tool. It doesn’t need to evolve because it does what’s expected and what people need.

                    Yes, and the same can be said about data denormalization as well.

                    But you are not talking about denormalization. You’re talking about completely giving up on the idea of using a relational database to enforce, or at least aid in, data integrity, and rolling your own integrity checks manually in your program. Not that it can’t be done for applications with small domains; it was the norm in desktop applications and worked well. But as soon as you start working with user accounts, I can’t even reason about where and how I would manage the data without an RDBMS.

          2. 20

            Every single database I’ve worked with in my 30+ year career ends up being queried and manipulated by multiple readers and writers. Throwing away database consistency guarantees because “this time it won’t” would be, in my opinion, a crazy move.

            1. 1

              As long as all those readers and writers are under the control of the same “owner” that is not a problem for all practical purposes.

              But in any case, if those readers/writer are really totally separate (and not just instances of the same service for example) then I think reading/writing to the same table is nowadays often considered an antipattern.

              1. 8

                They’re very likely to be in different languages in different environments. If you’re not using the DB as a central source of truth I suppose you can get away with it but I would be deeply wary of any such systems design.

                1. 2

                  More and more systems don’t have “the” db as a central source of truth. Not saying if that’s good or bad but that’s the reality.

                  1. 4

                    This just means those systems get to re-learn all the lessons we already learned the hard way. Sad for them. Hopefully it won’t take them quite as long as it took the first time around to learn all those lessons. I hope they at least read through Jepsen’s tests before going down that road: https://jepsen.io/analyses

                    Maybe something good will come of it?

                    SQL databases are not the perfect solution to the problem, and certainly improvements can and do continue to happen, but unless you are specifically tackling data problems as part of your product, it’s likely to end in misery for everyone involved.

                    Maybe whatever comes out of these experiments will provide good things. They got us JSONB in PG for instance, which I think is a great example of what the document store path gave us, that’s very useful.

                    Now AI LLM storage is the hot new thing. I expect most of this data storage could end up in a PG database with great results eventually, once the problem is well understood (provided LLMs stick around a long time as useful things).

                    1. 1

                      I rather think that requirements have changed. It was totally normal to do “site maintenance” once a month for hours or even a day. Same for having downtimes sometimes.

                      Nowadays, a downtime in the credit card system means (or can mean) you can’t pay. Might not have been a problem in the past, where magnetic stripes were used or people just paid with cash, but nowadays it is. So systems have evolved to embrace eventual consistency and patterns like CQRS.

                      I feel like people here have never really seriously used JSON in postgres. Or, they solely look through it with the eyes of a DBA. Sure, then JSON is less convenient. But it’s not only the DBA that matters. Speed in development matters a lot. JSON can heavily increase that speed. And I believe that it’s not only perfectly fine but actually the best choice in most cases to go with a combination of JSON and “regular” columns.

                      For instance in my projects I often have a table like that: id, tenant, type, jsonbdata, last_modified, created_at, created_by, deleted_at, …

                      If I feel the need to always access specific parts of my json in my queries then I’ll factor it out. But it actually rarely happens, except for few “core” parts of the application. But most of the time the approach above works perfectly fine.

                      My impression is hence that people are really religious and forgot that software engineering is often about trade-offs.

                      1. 2

                        I agree in the broadest of senses. It’s often about trade-offs, and developer speed is one of those trade-offs. What I don’t get is how using JSONB is “faster”.

                        I use JSONB columns, but I keep it to things I know I don’t care about, like extra user defined fields. I don’t find developer speed faster by doing it. It just makes it easier for me and the end user. If they want to keep track of 50 different things, I don’t have to generate 50 user_field_x columns for them, and they can come labelled in the UI without any extra work. Inevitably they have typos in their json fields, and request automated cleanup. It’s annoying to have to go clean that up for them.

                        How exactly is it faster, from your perspective?

                        It sounds like maybe you are hinting that with JSONB columns you can keep your downtime down? Except I don’t see how? So you don’t have to run the ALTER TABLE statement? It’s not like that has to cause downtime anymore.

                        JSON can heavily increase that speed.

                        I guess this is the crux of my question: how does it do that?

                        1. 1

                          I agree in the broadest of senses. It’s often about trade-offs, and developer speed is one of those trade-offs. What I don’t get is how using JSONB is “faster”. (…) How exactly is it faster, from your perspective?

                          For a very simple reason: I can leave the database as is and just make a code change. And I can test all of that with very simple unit tests.

                          If I change the schema, things get much more problematic.

                          • I have to coordinate schema changes / migrations with other people.
                          • I need to ensure that both old and new schema work with both new and old application version at the same time (I have to do the same with json, but that’s a matter of in-language data conversion from json v1 to json v2).
                          • I need to be mindful about how the migration might impact performance (table locks etc)
                          • In particular when using sumtypes, I can make simple code changes and don’t have to touch the database at all
                          • Data validation is simplified. I parse my json into my format-type and from there into my domain type. The former is super easy. If I use postgres features, I now have to usually do some kind of mapping myself. E.g. how do I even store a map of, say, text -> timestamp in a column? hstore can’t. So back to json we are. And no, I don’t want to normalize it, let’s assume that this is 100% clear to never be normalized. I just want to store it and the frontend does something with it. No need for anything else.
                          • I could go on…

                          You can say “it’s all not a problem to me” and then we just have different experiences. But I know both worlds (from when postgres had no json types at all) and I prefer the json world by far.

                          1. 4

                            I have to coordinate schema changes / migrations with other people.

                            If others are accessing the db and relying on your implicit schema, you’d still have to at least tell them “there’s now a v2 available which has these properties”.

                            I need to ensure that both old and new schema work with both new and old application version at the same time (I have to do the same with json, but that’s a matter of in-language data conversion from json v1 to json v2).

                            How does that work when you add a new key to your json? You just keep the old json documents without that key? Doesn’t this make for very awkward querying where you have to code fallbacks for all the older versions of the json?

                            1. 2

                              If others are accessing the db and relying on your implicit schema, you’d still have to at least tell them “there’s now a v2 available which has these properties”.

                              I’m talking about a single application and team. We usually manage schemas via migrations (e.g. flyway or similar) and now it depends on when someone checks in their changes. Using types in my code, I’d get a merge conflict instead.

                              This might not be the case if I used some kind of “auto-migration” tool where I describe what I want to have instead of what I want to do, but that brings other problems, like not knowing how the tool will update the table, etc. It’s much more risky.

                              How does that work when you add a new key to your json? You just keep the old json documents without that key? Doesn’t this make for very awkward querying where you have to code fallbacks for all the older versions of the json?

                              What’s awkward is having to update/change every row every time something like that happens.

                              I prefer the natural approach that humans use all the time: understand that there are multiple versions and then interpret them accordingly.

                              pseudocode:

                              enum DatabaseFormat:
                                 type DatabaseFormatV1 = {..., version: v1}
                                 type DatabaseFormatV2 = {..., version: v2}
                                 type DatabaseFormatV3 = {..., version: v3}
                              
                              type Foo = {...}
                              
                              function readFromDbFormat:
                                input: format of DatabaseFormat
                                output: Foo or Error
                                definition: format.version match {
                                   if v1 then ...
                                   if v2 then ...
                                   if v3 then ...
                                   else: error, unknown version
                                }
                              

                              In a good language, the boilerplate for that is minimal and the typesafety high. I much much much prefer this approach over anything I’ve used over the years.

                              Benefits:

                              • It is very clear from the code which formats there are and what they look like
                              • It is clear from the code how the data of each format is interpreted
                              • Git can be used to see when formats got created, why and so on
                              • Very easy to test conversion logic as unit test
                              • Very easy to roll back application code OR data OR even both at the same time
                              • No conversion / database maintenance needed
                              • Migrating data (from older to new versions) can happen over a longer time period which reduces load
                              • When two people attempt to make a change at the same time, they’ll get a merge conflict

                              Disadvantages:

                              • have to store an additional version column (and small performance overhead reading it)
                              • old versions can stay for a long time if they never get changed at some point. This means that manual SQL (e.g. for maintenance) now has to work with multiple versions as well. (though it is possible to just read all “old” formats and update them once in a while via code if necessary)
                              1. 2

                                In a good language, the boilerplate for that is minimal and the typesafety high

                                Sure, but that only deals with reading record-at-a-time. I was referring to queries. Like for example, constructing a WHERE condition to retrieve the records in the first place (or do bulk updates/deletes). Most applications have tons of situations where you need to query the database, not just fetching a single record by id.

                                1. 1

                                  Sure, but that only deals with reading record-at-a-time. I was referring to queries. Like for example, constructing a WHERE condition to retrieve the records in the first place (or do bulk updates/deletes). Most applications have tons of situations where you need to query the database, not just fetching a single record by id.

                                  Usually the WHERE condition aims at things like the “type”, the “tenant” or other normalized columns, not the json data. An occasional WHERE against the json is fine too (of course all versions have to be considered then, which is one of the drawbacks I mentioned). If the data is queried in very complex ways all the time, I would not use jsonb.

                                  Let me give a concrete example: user settings. For example, does the user want to receive marketing mails. We can model that as a boolean column or a boolean in json.

                                  Usually the application will just deal with the settings of a specific user. Maybe once in a while I want to go over all settings or the settings of a specific group of users, e.g. for a batch job to do something. Well, then I can still just load them all at once and filter those that have the flag set to false in my code.

                                  Is that less efficient than having the database filter it and return only the ids of the users that I care about? Sure. Does it really matter? For many tables it will never matter.
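                                  As a sketch of that trade-off (the table and field names here are made up for illustration):

```python
import json

# Rows as they might come back from e.g.
#   SELECT user_id, settings FROM user_settings
rows = [
    (1, json.dumps({"marketing_mails": True})),
    (2, json.dumps({"marketing_mails": False})),
    (3, json.dumps({"marketing_mails": True})),
]

# Filter in application code, as described above
opted_in = [uid for uid, raw in rows if json.loads(raw)["marketing_mails"]]

# The database-side alternative would be roughly:
#   SELECT user_id FROM user_settings
#   WHERE (settings ->> 'marketing_mails')::boolean
```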

                                  Let’s get one thing clear: I’m not arguing to ALWAYS use json. I never did that. It is you (and others) arguing to NEVER use json. Well, basically never.

                                  I started using json with postgres from when jsonb was introduced, that is more than a decade of experience. And I can usually tell quite well, if a table benefits from being totally normalized from the beginning or not. If I’m wrong, I have to convert it once, no big deal. In the other cases I save a lot of time for the reasons I explained in detail.

                                  Maybe we just deal with very different applications if ALL your tables ALWAYS need to be normalized (from the beginning). That’s not the case of the applications I work on.

                                  1. 3

                                     It is you (and others) arguing to NEVER use json. Well, basically never.

                                    Not at all. I’m actually a proponent of using the rich types Postgres offers. I have been known to store e.g. arrays of enums and sometimes jsonb columns as well. Both of these do quite well in cases like you mentioned where you store a grab bag of properties, or for storing additional “origin data” for auditing or debugging purposes.

                                    But I am extremely sceptical of using jsonb for nearly everything, which some people here seem to argue in favor of.

                          2. 2

                            I guess we can say: It’s not been a problem for me.

                            If I fully own the DB, then I can do whatever I want with it (basically negating the first 3 you mentioned). If I don’t fully own the DB, I have to coordinate with other people either way, as they will want access to the JSON data too. Saying “well, it’s just JSON” isn’t very helpful. They still need to know that this value needs to be in this format, or is a currency from this list, or whatever.

                            Around data validation: If you don’t fully own the DB, i.e. other people also access the data, then it’s all the more reason to make the DB handle all of that for you, as then the DB can say hey, no way you can’t put that there! Otherwise your application has to deal with it after the fact, which is very, very annoying. If you erase the data as it doesn’t fit your model, they get mad. If you re-format it, they get mad, etc.

                            I agree sometimes the mapping abilities between DB columns and internal language types can be annoying. Depends on the language and library you use. If you are stuck with crappy tools, I can see this as being true.

                            Around simple unit tests, we do simple unit tests on the DB itself too.

                  2. 2

                    This is a) very true and b) deeply unfortunate.

      2. 25

        What you’re saying is extremely uncommon and goes against almost all conventional RDBMS wisdom. Pretty much no one uses Postgres in the way you’re describing.

        1. 3

          I’ve heard it before, albeit from software developers/architects, not from a DB centric view. While it’s not the “proper” way to do things, it might be a better architectural choice when you have many small services and want to iterate quickly. Once the data model is clear, I would prefer a well-structured database, but until then, I consider using JSON a valid tradeoff.

          1. 20

            Iterating quickly with data will bite you later. Data is the foundation of most applications. Basically, the lower in the application you allow invalid data, the more layers above need to handle it.

            It may take a few years to notice, while the secondary effects of broken systems and processes start eating at profitability. You permanently lose customers because you can’t fix their accounts. You can’t add new features. You can’t scale because the cost per customer is too high.

            This stuff piles on and I’ve seen it strangle companies. But if you just move on in 2 years, you never have to pay the price of your own mistakes.

            1. 22

              Move fast and get a new job before anyone realizes the trail of catastrophe you’ve left behind.

              1. 1

                Just out of curiosity, would you prefer an entity-attribute-value style? What is to stop people from stuffing json text in the attribute value?

                https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model

                1. 4

                  Nothing other than their own good judgement.

            2. 2

              Yeah, at some point you have to transition to something more structured, but finding the right point to transition is difficult.

          2. 7

            My beef isn’t with JSON at all; it’s with the idea that you can just code your way out of bad data modelling decisions. I mean, sure, software is infinitely plastic, but it is deeply unpleasant to have to try and understand the foundational abstractions you’re building on when there’s no assistance to the developers from the tooling. I have JSONB columns on several tables I own, and they’re great, and Postgres’ affordances for dealing with JSON data are fantastic. But the data exists in a model and that model is at least partly defined by the constraints and restrictions that the database provides.

            1. 3

              that you can just code your way out of bad data modelling decisions

              I mean you can do this if we’re talking about hundreds of records. At scale though, you have no chance whatsoever.

        2. 1

          Yeah but you can say the same thing about Figma, Amazon, Notion, etc. and I’d hardly call them out for how they manually shard postgres instances. At some point you’ve got to stop using it the way it was intended, and start using it the way you need to get the job done.

          1. 3

            I’m using @zie ’s phrasing below.

            The more “conservative” approach others talk about should probably be the default for 95% of all use cases. But in the rare case that you actually “are specifically tackling data problems as part of your product”, you are right that sometimes a traditional DB setup may not work. And it really only comes up in the truly Big Data category; what most people consider “big” data is nothing for a single PG instance, and hardware (vertical) scaling will probably be all most projects ever need, if anything.

            So I do think you are right, but it might still be bad advice for almost everyone.

            1. 3

              You put it fair and well (but I would probably disagree on where postgres breaks down for big data if we discussed further).

              Sometimes I forget that I’m probably a lot farther in my DB journey and can safely make some more questionable decisions in favor of speed and extensibility than others would be comfortable with.

              1. 3

                Please do share these, in general I think this forum is very good at letting differing voices all be heard! Just maybe add a note to make it clearer that it may not be intended for that CRUD backend webshop that gets 2 orders a week :D

                1. 3

                  my favorite recent one is manipulating KV ordering to follow typical directory structure (I call it “hierarchical ordering”):

                  Keys in hierarchical order:
                    "\x01dir\x00"
                    "\x01dir2\x00"
                    "\x01\xffdir\x00\x01a\x00"
                    "\x01\xffdir\x00\x01b\x00"
                    "\x01\xffdir\x00\x01🚨\x00"
                    "\x01\xffdir\x00\x01\xffa\x00\x011\x00"
                  

                  Maps to

                  /dir/
                  /dir2/
                  /dir/a
                  /dir/b
                  /dir/🚨 <- me testing utf8
                  /dir/a/1
                  

                  for a kv backing a distributed file system :) You can see the pattern a bit, but I plan on writing about it soon
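                  My guess at an encoding that reproduces the listing above (a sketch, the actual scheme may differ): tag every ancestor segment with \x01\xff and the final segment with plain \x01, so a directory’s own key sorts before its children, and plain byte order on the keys yields the hierarchical listing:

```python
def encode_path(segments: list[str]) -> bytes:
    """Encode path segments so byte-wise key order matches a directory walk.

    Ancestor segments get the \\x01\\xff prefix, which sorts after every
    plain \\x01-prefixed sibling entry; the final segment gets plain \\x01.
    """
    out = b""
    for seg in segments[:-1]:
        out += b"\x01\xff" + seg.encode() + b"\x00"
    return out + b"\x01" + segments[-1].encode() + b"\x00"

keys = sorted(encode_path(p) for p in [
    ["dir"], ["dir2"], ["dir", "a"], ["dir", "b"],
    ["dir", "🚨"], ["dir", "a", "1"],
])
# Sorted order: /dir/, /dir2/, /dir/a, /dir/b, /dir/🚨, /dir/a/1
```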

      3. 14

        I am maybe one of the few people that actually has production experience with a database schema like this built on postgres.

        Do not do this. Set aside the consistency and correctness issues and just assume the programmer can maintain those guarantees themselves (they almost certainly cannot, but other commenters have argued that point better). Even ignoring those issues, you’ve just kneecapped your performance. Computed indices/columns are not good enough. Every update needs to hit every computed index. It may not update the index, but it still has to check.

        You might think that that won’t be an issue. Maybe you’re right—but if you are then nobody is using your software at scale so you can do whatever you want. At the job where i had to deal with this kind of schema, the first large customer we got immediately ran into major query performance issues that were not fixable without huge refactoring of both the codebase and the database schema.

        One thing you’re glossing over with your “code native” talk is that this approach ties your in-memory structures to at-rest representation. The moment that you need to break that (for example: to make queries over at-rest data faster), you better hope your codebase is typed well enough to refactor. Our mid-sized clojure codebase certainly wasn’t.

        Do not do this. This is bad advice. Don’t delete your comment. Let it stand as a warning to others.

        1. 1

          I’ve yet to need a computed column. Contrary to what this thread might suggest, I think I’m pretty good at modeling data.

      4. 13

        I think you’re just trolling

        1. 3

          I think I get punished every time I share an opinion or fact contrary to what database fanatics believe

      5. 11

        I don’t want to be condescending, but this sets all my alarm bells glowing red from overheating. I guess I might kindly ask you to cut me some slack for being relatively old and grumpy. Having worked on dozens of software projects professionally, I can say without blinking that that mindset is literally the #1 reason why software products and even companies fail. I can’t think of any other reason that comes close. I estimate that if people knew relational theory to a professional level and applied it in their daily work, the effort spent solving bugs and other problems would drop by 80% or so.

        Hear me out, I don’t intend to make a debate about opinions. All I am saying is don’t knock it till you try it. Have you read the original paper on relational systems by Codd? It’s a very easy read, and its pragmatism is pretty indisputable. Have you read a book about relational theory? Again, don’t knock it till you try it.

        Normalization is not a trend. It is a set of rules theoretically derived as solutions to well-identified problems. I haven’t seen a single instance of people disregarding these rules and not falling into the old traps that have been known for 50 years. Often it doesn’t even take them a day. Every single time. It isn’t even funny anymore.

        I could be here all night giving examples, but frankly speaking, anyone that understands why normalization is a godsend, already knows them all. Heck, I am sure if we look deep enough into Usenet, we will find some post saying the exact same things I am writing now.

        What I find bizarre is the present-day odd obsession with type systems as the magical elixir (no pun intended) to fix any and all programming languages. You said in another post: “the client language can do that”. That sounds like planning how to deal with bomb explosions. Surely you’d prefer to figure out a way to keep said bombs from going off? My point being that you don’t need to rely on brittle, obscure code that someone wrote for data consistency, because your database is built to provide exactly that. It does so consistently and more efficiently than anything that needs to rely on copying data around across processes.

        Rant over, I guess. I just hope people don’t go this apparently easier route. It turns nightmarishly difficult really quickly. Hope people can interpret this beyond my arguably negative tone.

        1. 2

          I can say without blinking that that mindset is literally the #1 reason why software products and even companies fail

          I really can’t imagine a company failing because of their data structure choices, or at least not modern companies. I’d love to see some examples of companies that objectively failed because someone chose the wrong DB structure.

          I personally have used SQL for a long time, and it always felt like it was getting in the way. It was designed for ad-hoc querying by analysts, which code is not.

          I have no problem with relational systems, I’d LOVE a flatbuffer-based relational DB. SQL specifically seems to be the problem for me, but I also don’t want to blow another dog whistle here so I won’t talk any more about that XD

          I’m not against normalization by definition, I’m against it in the way that SQL encourages it ATM. Flatbuffers really solves a lot in my eyes because it has defaults that are handled by the code, and the knowledge of “null” vs “does not exist yet, add a default”.

          It was probably my poor choice of words to communicate my thoughts that dropped an opinion bomb. Maybe I should have lead with the above, but oh well.

          But I don’t know why this would set off alarm bells; DynamoDB has been doing a pretty good job at Amazon, Segment, Tinder, and many more hyperscale companies, even powering financial transactions.

      6. 7

        Strongly disagree. First, for the reasons everyone else says. But second, I disagree about the “painful binding” and “not code-native” bits. A normalized database maps very well to a sensible entity model in OO languages, and honestly is probably even easier with functional languages. Yes, you have to deal with migrations, but probably not often, unless you don’t design your entities up-front. Migrations put the cost of maintaining data integrity over code changes in one place, too, rather than forcing you to do ad-hoc cleanup after the fact.

        1. 2

          rather than forcing you to do ad-hoc cleanup after the fact

          But you don’t have to actively delete data if you don’t need to, and it will get filtered out the next time you update and re-serialize to JSON (or something else like flatbuffer or protobuf)

        2. 1

          I wish you were right. But databases like postgres don’t even have the most basic support for sumtypes. (having lots of null values does not count)

          1. 4

            PG has enums, like @jamesnvc mentioned, and it has arrays, etc. It doesn’t have every data type around obviously, but you are also encouraged to create your own datatypes specifically for your intended use case.

            So if you can’t use a built-in datatype, you use a third-party extension or roll your own. All 3 options are well supported in PG.

            1. 1

              Enums are not sumtypes. Arrays aren’t either.

              You can’t create your own datatype as a sumtype. I have the impression that you might not be familiar with the concept of sumtypes.

              1. 2

                “Sumtypes” has lots of definitions, and you are not clear on which one you mean. Based on this comment, I’ll assume you mean tagged unions? If you mean something else, please be specific.

                You can emulate tagged unions in SQL: https://stackoverflow.com/questions/1730665/how-to-emulate-tagged-union-in-a-database

                You can also do table inheritance, which is a PG only thing: https://www.postgresql.org/docs/current/ddl-inherit.html It’s not exactly the same thing, but it might meet your needs.
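                The linked emulation boils down to a tag column plus CHECK constraints, so only the fields belonging to the tagged variant may be set. A sketch (using sqlite3 so it runs anywhere; the same constraints work in Postgres, and the table/column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- A payment is EITHER a card payment OR a bank transfer:
-- the tag column plus CHECK constraints emulate a tagged union.
CREATE TABLE payment (
    id      INTEGER PRIMARY KEY,
    kind    TEXT NOT NULL CHECK (kind IN ('card', 'transfer')),
    card_no TEXT,
    iban    TEXT,
    CHECK (kind <> 'card'     OR (card_no IS NOT NULL AND iban IS NULL)),
    CHECK (kind <> 'transfer' OR (iban IS NOT NULL AND card_no IS NULL))
);
""")
conn.execute("INSERT INTO payment (kind, card_no) VALUES ('card', '4111')")
try:
    # Mixing fields from both variants violates the CHECK constraint
    conn.execute(
        "INSERT INTO payment (kind, card_no, iban) VALUES ('card', '4111', 'DE00')"
    )
except sqlite3.IntegrityError:
    pass  # rejected, as intended
```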

                1. 1

                  Yes, that’s what I meant. Thank you for clarifying.

                  You can emulate tagged unions in SQL: https://stackoverflow.com/questions/1730665/how-to-emulate-tagged-union-in-a-database

                  Exactly, so as I said: there is not even basic support for them. So they have to be emulated.

                  Now, if I have to emulate them anyways, then I’ll choose the best way of emulation. And in 95% of the cases (in my experience) that is emulation through JSON.

                  1. 3

                    I think this really boils down to, who owns the DB.

                    If you own it fully(i.e. you have control over it) it doesn’t matter. Do whatever you want and makes you feel happy. If that’s a bunch of JSON, have fun.

                    If you don’t fully own the data, then you have to deal with other people. In that use-case, you should make the DB do as much as possible, including SQL tagged unions as needed.

                    I normally find myself in the 2nd camp, I rarely own the data outright. Other people and processes and teams also need the data. Making the DB do as much as possible forces them to not abuse the data too much.

                    In my experience, when #1 is true, that I fully own it, as the data grows and time marches on, it inevitably moves to the second camp anyway(or it really wasn’t important data anyway).

                    1. 1

                      Yeah, I agree with that. I have seen too many problems to want to “share” a json column with other people (maybe with a few exceptions).

            1. 2

              I suppose those are the most basic sum types, but valenterry is probably wanting tagged unions (Rust-style enums, not C-style enums).

              1. 3

                (Off topic grumpiness about Rust naming their sum types as enums.. I really don’t get why they did that)

                1. 1

                  (It does seem unnecessarily confusing)

    12. 48

      I’m in the Mac ecosystem (MacBook Pro, iPhone) so I use and love NetNewsWire - it syncs read state perfectly across my devices via iCloud and just works.

      1. 7

        I’m also all in on Apple so I use Reeder, and can recommend it unreservedly.

        1. 3

          I was a huge Reeder fan until the author launched a new version that dropped most of the RSS sync options due to a new focus on subscribing to non-RSS sources. The old app is still around as “Reeder Classic,” but it’s effectively in maintenance mode.

          NetNewsWire is slightly clunkier than Reeder Classic, but the author seems more invested in the types of functionality I want to have in an RSS reader.

          1. 1

            Yeah that was super annoying but whichever sync service I used was still supported so it missed me. I should try NNW again.

      2. 4

        Note that iCloud is only one of NetNewsWire’s subscription/unreadness sync options: You can also sync with BazQux, Feedbin, Feedly, Inoreader, NewsBlur, The Old Reader, or self-hosted FreshRSS, or just keep the data locally. And like any good RSS app it can export and import subscriptions for a quick tryout.

      3. 3

        I’m also on the Apple/Mac ecosystem, and I do love NetNewsWire (using iCloud to handle subscriptions and read state across my devices). I also do love that I can leave a Quick Note on my iPad using the Apple Pencil on articles (swiping from the right bottom corner up), I use this feature in addition to the starred articles whenever I want to add some personal thoughts to what I’m reading.

        1. 1

          Interesting, never knew that. Is there an equivalent for iPhone? I currently star articles and then batch-export entire articles to Obsidian to take notes.

          1. 1

            Looks like there isn’t a quick note gesture on iOS. It’s accessed from the share sheet or a control center icon.

            https://support.apple.com/guide/iphone/use-quick-notes-iph5084c0387/ios

      4. 1

        Same, used it way back when and used it ever since they remade or rereleased it.

      5. 1

        I’ve been using it Mac only for years, I never realized there was a iOS app too. Thanks!

    13. 2

      I would like 8k in a 42” panel, so I could run it at 4k@2x. That would be ideal for my purposes. Needless to say, as my former grandboss at Apple used to say to me, “we’d go broke if we made products for you”. But a man can dream.

    14. 1

      I’m interested, especially if it is in fact a native app on the Mac. The real, unquestioned benefits of eg kitty et al are totally overwhelmed for me by the fact that they’re not native apps. I use and pay for Prompt, because it is first and foremost a Mac (and iOS) app.

      1. 1

        What are the specific macOS integrations a terminal emulator should have (Asking as a new macOS and WezTerm user who doesn’t understand what is missing)?

        1. 3

          The first and most important thing is how do you configure it. If you have to edit a text file, I’m sorry, that’s straight into the bin for me.

          1. 15

            I want a text file because I can commit it to a repo. :)

            1. 1

              This is totally reasonable! It’s not a concern for me, but my needs and preferences are, first of all, mine, not everyone’s.

          2. 3

            ghostty is configured via a text file

          3. 1

            I am interested in all of them, I wondering how I can improve my workflow!

            1. 1

              It’s fair! Everybody has their own local maximum!

        2. 3

          Mostly for me is native widgets, and its performance compared to other terminal apps that use electron, qt, etc… text rendering, and resource usage is a big one for me as well.

    15. 5

      The GPL always oozed this uncanny sense of entitlement to me; it feels so antisocial. I’ve always been wary of people employing pre-emptive mechanical systems to “force good behavior” because they simply do not trust their fellow human being. Ironically, it’s created a situation where the open source movement is massive, it did take over - using permissive licenses like MIT and BSD - and not due to “forcing everyone to play nice” with the GPL. People, even companies, simply gave back without being legally required to. But this isn’t good enough for the GPL crowd; they’re still upset that people are sharing code “without the protection of the GPL”, which has thoroughly proved its uselessness. I busted a gut when Stallman decried that using LLDB from Emacs was “an attack on freedom”. I am not shy to say the FSF is a literal cult that believes software itself deserves more rights than the humans who use it; they are utterly oblivious to how our actual human society functions and only see bad people everywhere.

      1. 20

        If the FSF really believed in freedom for end users, they would have spent more time working on end-user programming systems (no, sorry, Guile is not a good end-user programming environment, Scheme is not the language for the masses). The economic benefits to companies of having a second source are clear, but for end users a big blob of GNU C code is no more open than a Microsoft product because it’s simply too hard to make changes.

        1. 11

          I’ve read Free as in Freedom cover-to-cover at least twice. I’m thoroughly convinced the entire FSF movement is simply Stallman trying to re-create the old MIT lab. Not the spirit of it; after all, it’s hard to capture the spirit of a small tight-knit community with a global movement. But because Stallman is genuinely nostalgic for the days when he hacked on a PDP-11 with his colleagues, and this is the only solution he could come up with to try and relive it. I feel like that’s why he’s so prone to tantrums over seemingly nothing, because he’s trying to re-create his old home out of a political movement and the cognitive dissonance gets to him. But I feel like this explains the FSF’s total disregard for modern technology and methodology, they romanticize these “good old MIT days” as told by St. IGNUtius, and try to force it into existence with the iconography of the technology of the day, like a cargo cult. Honestly, I’m shocked something like Mastodon exists, even if that also feels like it misses the point.

          I don’t know if I’m just overly skeptical and making up strawmen here. I’m sure I am to some extent. But the FSF really does give me this cargo cult feeling; it feels so disconnected and backwards. We’ve had similar discussions on Lobste.rs before about how older projects are becoming more anti-social by failing to adapt to newer communication standards for their teams, and this feels like a continuation of that.

          1. 14

            I don’t think it’s off base to assume a lot of the FSF’s problems come from conflating users and developers - they were one and the same in the AI Lab, after all.

            1. 11

              It decays from there to conflating the set of users with the set of people who either are developers or can afford to pay developers. The latter means that, in most cases, the rights given by the GPL are reserved for big companies and rich people.

            2. 6

              People routinely criticise the free software movement’s ideology as being irrelevant to users who are not developers. This seems to be missing the point. I don’t know how to work on a mobile phone OS, but I still benefit from the freedom to tinker indirectly because other people in the same boat as me do know how to work on a mobile phone OS. Of course I only benefit because I’ve forked out for a phone that lets me install another OS; but if Linux were GPL3 everyone with an Android phone would be able to do that. This specific thing bothers me a lot, because there are functioning phones in landfill that would not be there if their owners had been able to flash an OS that is still updated.

              1. 6

                I think it more likely that if Linux were GPL3 then Android just wouldn’t use it.

            3. 1

              I think that it’s absolutely important for users to be developers.

              Users who do not develop are at the mercy of developers.

              1. 2

                That’s a bit like saying that homeowners that don’t learn how to rewire their entire house are at the mercy of electricians. It’s trivially true but we have incentives in place to ensure that electricians do a safe job at a reasonable price without maliciously electrocuting people.

          2. 6

            Mastodon isn’t a GNU project or related to the FSF in any way, so not sure why it surprises you

            1. 4

              I had an itch when you said Mastodon wasn’t a GNU project, so I did a bit of a history dig. You are correct. GNU Social used a different federation protocol called OStatus. OStatus was, apparently, quite frustrating to work with, so a group of people came up with ActivityPub and took it through the W3C process. Then GNU Social adopted ActivityPub alongside OStatus. Weeeeee…

              1. 8

                Actually, exactly the same people came up with OStatus and ActivityPub (and OMB before either of those). GNU Social never really existed. It’s just that after the maintainers got bored of maintaining StatusNet (née Laconica) and moved on to pump.io and ActivityPub, some of the users renamed it as GNU Social and then made a little space in GNU to keep it on life support, but the project was effectively dead by then.

                OStatus was actually much more efficient and standard for the use cases that are popular on Mastodon. ActivityPump/ActivityPub were created in order to add support for private sharing and other things which are used but not quite so much as public microblogging still is.

        2. 5

          I strongly believe that the myopic focus on IP trickery has led to an actual decrease in the prevalence of the FSF’s nominal ideal end state. We are not any freer from arbitrary nonsense today than we were in the 80s, and computer systems are much more difficult to reason about.

        3. 4

          An irony to your remark about Guile is that if Emacs were written as a small amount of C + Guile instead of Elisp, it would be closer to the Lisp machine environment it is inspired by, which was reputationally a very good end-user programming environment for the era. Or maybe Climacs, if it weren’t defunct. In any event, that’s not what we have. Historical reasons, and whatnot.

      2. 12

        I’m not too worried about my fellow human being, but the big corporations have shown time and time again that they are perfectly happy to take a bunch of open source code, build it into a product, then use that product to violate the rights of the end user, depriving them even of basic freedoms to repair or reinstall software on the device.

      3. 9

        I’ve always been wary of people employing pre-emptive mechanical systems to “force good behavior” because they simply do not trust their fellow human being.

        so, like, uh, laws? people consistently have to be forced into compliance with good behaviour. the only reason our society barely works is millennia of constant enforcement of this under various more-or-less enjoyable frameworks.

        1. 9

          This is not how laws work, and most societies that try to make laws work like this fail. Making murder illegal does not prevent murders; most people do not commit murder because they think it is a bad idea. Murder is illegal so that there is a framework to deal with the outliers who disagree. Laws are based on a society’s consensus of acceptable behaviour. When laws diverge from this, you see widespread illegal behaviour and eventually a collapse of the rule of law.

          1. 4

            Laws definitely do change norms of behavior though. They just don’t always do so; and it’s clear that the effectiveness of enforcement is a part of how much norms of behavior are changed.

          2. 2

            ?!?!?! What about taxes? People pay them not because they think this is a good idea, but because they are required to do so

            1. 10

              There are lots of ways to avoid taxes. It’s quite easy to move to somewhere that has no taxes, though typically the lack of any of the infrastructure that taxes pay for makes people disregard this as a choice.

              In democratic societies, most candidates have their tax and spending plans in manifestos, you can always vote for the candidate who promises to cut all taxes and reduce public spending to zero. Most people don’t, the consensus view is that taxes are a good idea.

          I don’t know where you live, but polling before the last election here in the UK said that around 70% of the population would rather see tax increases than reductions in public spending.

          3. 0

            A minor quibble - in both English and Scots law, murder is not illegal.

              1. 1

                Note that it says “unlawfully kills”, not “illegally kills”.

                The former means in violation of (Common) Law, the latter in violation of a Statute. Hence the “minor quibble”.

                1. 2

                  What is your source for believing in this distinction? They are synonymous and not used as terms of art to make that distinction in legal writing.

            1. 3

              I’d like you to expand on that, because a cursory googling suggests that it in fact is.

              1. 2

                Murder is unlawful, not illegal. i.e. it is contrary to Law, not to Statute. What Statute sets are the various penalties for murder and manslaughter.

                It is a technical difference, with no overall effect - one is still imprisoned when convicted of murder.

                Hence the “minor quibble”.

                1. 1

                  Ah, I suspected it was a common law/statutory law quibble.

            2. 2

              In regular English, illegal and unlawful are synonyms. Since David wasn’t speaking in British legalese, he didn’t say anything incorrect about English and Scots law.

              1. 2

                It is not legalese, but it is in a similar class to the common usage of referring to, say, apes as opposed to people, despite the fact that man is an ape.

                For most purposes illegal vs unlawful is a distinction without a difference, but “technically correct”.

                It is something one occasionally runs in to when talking to / taking advice from a solicitor, barrister or advocate, when they’re conversing with one in English, rather than “legalese”.

                It is useful to know for pub quizzes. Again, hence “minor quibble”.

                Anyway enough of that, it was simply intended as an amusing interjection.

        2. 2

          I’m fairly disappointed in this comment; it comes off as impulsive. Yes, laws exist. Licenses exist, too. No, not just the GPL; many lawfully-enforced licenses exist. That’s not my point. I’m not asking people not to actively harm each other; that’s what laws are for. I’m saying the GPL is over-reaching to the point of actively distrusting people before they can even prove their willingness to act in a way that’s beneficial for society. It is, in its very conduct, an anti-social statement, declaring to the world that one believes others must be forced to publish all modifications to one’s project, because the author is inherently distrusting. It’s rather unsettling. Laws exist, and so do governments that make too many of them.

          1. 10

            they must be forced to publish all modifications to their project,

            No they won’t? You just need to provide the source code (or offer to provide it) if you distribute the binary to someone. This is such a reasonable requirement that I can scarcely believe that someone can argue against it in good faith.

            Here’s an example of the kind of problem it prevents: Apple distributes a modified curl binary that is secretly more lax in checking TLS cert validity. But there’s no way to check exactly what they did because they aren’t obligated to, and don’t, provide the modified sources. This is actively user- and security-hostile. https://daniel.haxx.se/blog/2024/03/08/the-apple-curl-security-incident-12604/

            1. 1

              I won’t claim to be too familiar with the requirements of the GPL, but I’m fairly certain it includes not only the availability of the source code, but also instructions and means to roll your own update to whatever you have purchased that includes that modified curl binary.

              So if Apple includes that modified curl in the base system of iOS, for instance, they would need to provide you with instructions and means to change the version being used by your devices iOS installation.

              I won’t go into whether that’s reasonable or not, but it’s easy to imagine that that’s a lot more complicated to facilitate than just providing the modified source, dumped on some CDN. Especially when you need to ensure that the system’s security is not compromised by any unauthorized modifications.

              1. 3

                curl isn’t GPL-licensed so this is kind of hypothetical.

                I won’t claim to be too familiar with the requirements of GPL, but I’m fairly certain, it not only includes the availability of the source code, but also instructions and means to roll your own update, to what ever you have purchased that includes that modified curl binary.

                Sort of – the way it works is:

                The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable.

                In my experience, these requirements aren’t too hard to circumvent for practical values of “making modifications” though. Probably the most hilarious example I’ve seen (I don’t know if it ever stood in court, but there was some confidence that it would) was simply not including any “scripts to control the installation of the program” in the first place. Instead, if you purchased the “enhanced support” package or whatever, among other things, you got a long PDF file with a table that said what had to be copied and where.

                The argument was that the program simply required some manual set-up, like a lot of programs do. No part of the source code was being withheld. Pile up enough of these obfuscation measures and someone’s going to pay up eventually.

      4. 6

        Well, the FSF sure has cultish vibes, but the GPL is a license that is still used a ton (Linux), and it certainly contributed to a movement and a discussion around what digital rights individuals should have over what they own. BSD and MIT licenses have contributed too, but saying it’s only them is disingenuous.

        Personally I’m quite a fan of the EUPL, which seems to be a mixture of AGPL and LGPL that remains more enforceable and grounded in EU law (at least one would hope so).

        1. 1

          GPL in Linux’s case reflects the GPL’s real-world use case: it’s used to protect a business. Let’s not forget that software and licensing is just as much a social phenomenon as it is a technical one, perhaps even more so. Linux holds an especially poignant position as an extremely valuable technology used as the foundation for most business; it needs something to legally protect it, whether that’s the GPL, an education-only license, or being outright proprietary - as long as nobody can take it, make private changes to it, and then try to lock people in with their unique version, à la Oracle. In that sense, you can say the GPL is doing what it’s supposed to, but not because Linux is the rule; it’s the exception. Most software, most GPL’d software at that, is niche and unprofitable: either it’s a simple application or a library that nobody would even know is being used by a business, and it has no business being GPL in the first place.

          1. 8

            I see where you’re coming from but… Linux is about as unprofitable as it gets, too. That’s why its development model has settled into this “corporate commons” model, where it’s primarily being developed by companies which have a mutual interest in sharing what is, to them, a highly unproductive software infrastructure cost.

            It’s services (either software, as in what we call SaaS, or “corporateware”, for lack of a better term, i.e. support, consulting etc.) or hardware running Linux that are profitable. I’m using hardware pretty broadly, too – I don’t mean just e.g. Intel or Samsung, who are selling hardware that’s literally running Linux, but also e.g. Analog Devices, who sell hardware that goes into devices running Linux. Both the hardware and the software are usually proprietary (or really unprofitable :-) ).

            There was a time when e.g. NetBSD would’ve been a reasonably safe bet, too. There weren’t too many vendors trying to lock their customers in with their NetBSD-based Windows CE alternatives.

            I do remember a bunch of BSD-running hardware that ran unpublished code. That sucked, but the GPL isn’t magical upstream dust, either. Most of the Linux hardware I’ve touched or worked on might as well have been closed-source. 90% of the time you get a frankenkernel and source packages for whatever goes on the rootfs. Even if you can turn that into a viable development environment, it’s not like you can load your code onto the device and run it. In practical terms it’s about as open as if it were running QNX.

            1. 3

              Even if you can turn that into a viable development environment, it’s not like you can load your code onto the device and run it.

                Which, to be fair, would not be a problem if Linus had adopted the GPLv3, which was designed for this exact purpose.

                I have a 3D printer (Creality K1 Max) that uses Klipper as firmware. Klipper is GPLv3 licensed, so Creality were forced (unwillingly!) to actually allow people root access so they can load modified firmware. And I am very happy that they were forced against their will to do so; it’s really raised the value of the printer for me to know that the community can fix things even when the company has moved on to the next greatest model. Which makes sense - the “device driver” type of software is the core defining purpose of the GPL, after all.

              1. 6

                Which to be fair, would not be a problem if Linus had adopted the GPLv3, which was designed for this exact purpose.

                Oh, it certainly wouldn’t be a problem, because if Linus had done so, only devices in the hobbyist market would be running Linux nowadays :-).

                Back when, for my sins, I had been cast into the seventh circle of hell that is Linux BSP consulting, “no GPLv3 code” was a requirement on the same level as “the code compiles”. I thought this was bollocks back in the day, too, but not all of it is under technical control. In some regulated industries you’re literally not allowed to sell a device that (easily) allows unrestricted firmware sideloading. It sucks, but it’s the kind of thing that can’t be settled between the FSF and its software’s users. Given a choice between taking it up with the FDA or the DRM alliance and just using an operating system where that’s not a problem, any commercial user will go with the latter.

                Ultimately, in order to make their software usable for people who write software to pay their bills, even the FSF had to relent on some things – GNU libc isn’t GPLv3-licensed and the GCC runtime library has a runtime exception, for example. The unfortunate reality, at least for the tivoization clause, is that most of the software that ended up licensed under GPLv3 is software where tivoization isn’t a problem in the first place.

                Edit: I don’t like this but it’s just how things are, at least for the time being. I’d love to be able to securely load my own firmware onto my devices and wherever that’s a concern of practical relevance for me, I try to buy devices where that’s a thing. But the vast majority of the consumer and industrial market – which is where the vast majority of embedded Linux consulting shops operate – is, sadly, not in that boat.

                1. 1

                  Many consumer devices are not regulated in this way. And if a device doesn’t let me root or run custom firmware, and just hosts a locked-down frontend, then there’s no benefit to me in having Linux run on it - so why should I care that the corporation has to put more work in to write their own OS?

                  So I think it’s simply empirically not true that only hobby devices would run Linux: plenty of vendors don’t want to allow root access, but also don’t greatly mind being forced to do so. They do the cost-benefit tradeoff, and it often still comes out well positive.

                  The point to me of GPLv3 is this: those devices that can be legally user-modified, should be able to be user-modified. The GPLv3 incentivizes this by giving the vendor code for free, which certainly nobody was required to grant them: it’s a donation with strings attached. I don’t see anything problematic with that as a user or a developer. It sucks for the vendor, of course, but … well, given my experience with what they tend to use locked-down platforms for, namely hacky custom apps that mostly present ads for the vendor’s family of subscription services and send eminently local data over dubiously secured remote servers in order to gather marketing data, I have very little sympathy for their plight. (“Sent from my LineageOS phone…”)

                  1. 2

                    I’m not arguing for the current status quo here. If you ask me, sure, all devices, including regulated devices, should be legally user-modifiable, with reasonable protection in place for all parties, including the vendors (i.e. if you flash your own firmware on your insulin pump and you die, that’s on you and your relatives can’t sue whoever made the pump; but by all means, you bought it, you own it, you should be able to flash your own firmware).

                    But that’s not the world we live in. GPLv3 just means more strings attached to the same thing – “donated” code – so there’s no wonder vendors, who have to foot the legal bill, weren’t exactly enthusiastic about it. Does it suck? Absolutely. But Linux is an operating system, it’s going to thrive only insofar as there are devices that run it and that people care about. Fewer vendors using Linux in this space is worse, not better.

                    Plus, only half facetiously: if GPLv2 is good enough for GNU Hurd, maybe it’s not entirely bad for Linux, either ;-).

            2. 1

              90% of the time you get a frankenkernel and source packages for whatever gets on the rootfs.

              Even that’s optimistic for SoCs.

      5. 3

        I have almost the opposite sense from you. Once money is involved, every system becomes “give them an inch, and they’ll take a mile”, at least for the worst actors, and there’s always an incentive to get worse. So pre-emptive enforcement sounds good to me. The problem for me is that the GPL never really got at the heart of what needed to be enforced (and which probably can’t actually be enforced by a copyright license alone).

    16. 19

      I just want a goddamned user agent that acts on my behalf.

    17. 14
      1. 2

        I was coming here to post exactly this. It’s going to make some stuff we do a lot less weird.

    18. 4

      I should revisit devenv. Our current project has a lot of out-of-band requirements and we’re currently making do with a pile of shell and homebrew and such, but a more principled approach here would be super welcome.

    19. 5

      I really miss this form factor of PC, little netbooks were my first introduction to Linux and programming. My childhood bedroom wardrobe is full of them!

      1. 4

        There’s the MNT Reform and Pocket Reform…

      2. 4

        I am forever disappointed that Apple doesn’t make any MacBooks this size with an M1 chip. Microsoft makes the 10 inch Surface Go, but only with some of the absolute worst Intel CPUs available (poor performance AND bad battery life, impressive).

        1. 5

          I am forever disappointed that Apple doesn’t make any MacBooks this size with an M1 chip. Microsoft makes the 10 inch Surface Go, but only with some of the absolute worst Intel CPUs available (poor performance AND bad battery life, impressive).

          I loved my 11” Air. I wish they’d do another one of those!

          1. 2

            The 13” Framework is about the same size, being only around 1.25” deeper - but with a much larger display.

          2. 1

            Mine still gets some use, especially for travel. Even got a recent OS patch, though I think those are drying up.

            1. 1

              My kids destroyed ours, sad to say.

        2. 2

          Me too!

          It’s funny that back when hardware was way worse, manufacturers took way bigger risks with form factors.

          Now we have the fancy chips to actually pull it off and they aren’t interested 😭

          1. 2

            chips are way better (modulo forced background code) but chassis hardware is way worse. the industrial system is no longer set up to produce high quality chassis components so the “risk” of making a new form factor work well amounts to massively multiplying the manufacturing cost, when the success of most products depends on price-sensitive consumers.

      3. 1

        They’re still around - huge in the education space. Almost every major OEM sells some variant of an 11-inch Alder Lake (N100, etc.) or later laptop for under $300.

        They’re not great but they’re usable.