GitHub Actions could be so much better

39

GitHub Actions could be so much better programming rant blog.yossarian.net
authored by yossarian 1 year ago | caches
| 48 comments


1. 50
  
  carlana 1 year ago | link
  
  In a large part because, at GitHub’s size, I worry much less about private equity enshittifying it.
  
  I would argue that the little ad for Copilot that now appears on GitHub file pages is the first step of the enshittification cycle. “We need to maximize engagement with Copilot. How do we do it? Oh, I know, we can add a CTA to the content pages…” is the first step on the road to ruin. Hell, the whole homepage is just another social media feed at this point.
  1. 18
    
    yossarian 1 year ago | link
    
    Hell, the whole homepage is just another social media feed at this point.
    
    Yeah, this has driven me nuts – I was a regular user of the homepage to track relevant activity (people I follow, people doing things to my own repos), and the new feed design has more or less taken that away.
    1. 1
      
      mk12 1 year ago | link
      
      I hated the new homepage too but I realized if I change the filter (in the top-right, no need to dig in settings) to only include Sponsors, Stars, Repositories, and Follows, it becomes much more like the old homepage.
  2. 12
    
    hwayne 1 year ago | link
    
    Reminds me of how the windows widgets have a news feed you can’t disable, and they recently updated it to show fewer widgets and more of the news feed.
  3. 11
    
    toastal 1 year ago | link
    
    Self-plug: I’m maintaining a filter list to hide Microsoft GitHub’s UI garbage if you’d like it to be less shitty.
    
    https://sr.ht/~toastal/github-less-social/
  4. 5
    
    msfjarvis 1 year ago | link
    
    I don’t know how or why this exists, but a slightly wonky variant of the old linear feed can be viewed at https://github.com/dashboard-feed
    1. 3
      
      carlana 1 year ago | link
      
      I assume the old page was pulling that in with AJAX and it will go away once they are confident all the old opened dashboards are closed.
  5. 1
    
    brendan edited 1 year ago | link
    
    Hell, the whole homepage is just another social media feed at this point.
    
    If you’re frustrated by the home feed changes (like I am), please give feedback on their discussions board, upvote relevant threads, and encourage others to do the same. In the meantime there are userscripts and extensions you can use to revert it locally.
2. 20
  
  masklinn 1 year ago | link
  
  These half-assed programming languages in YAML are tremendously frustrating. I keep wondering how much better and simpler things would be if instead of that it were an actual programming language, with a declarative bent. The thing is too weak to be fully useful on its own and has to embed shell scripts anyway, so at that point why not have an Elm-like language, except that instead of integrating an event stream into a web application it simply maps an Event into a Job, the entire thing is completely typed, you can have a “standard library” of utilities which you can actually look up, and you can define your own common utilities and variables?
  
  And you can run the compiler locally so you at least know what you wrote is not complete nonsense, even if your shell scripts may be broken.
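
A minimal sketch of the Event-to-Job idea above, written in Python with type hints rather than an Elm-like language. Every name here (`Push`, `PullRequest`, `Job`, `on_event`) is hypothetical and none of it is a GitHub Actions API; the point is only that a type checker such as mypy could be run locally over a definition like this before anything is pushed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


# Hypothetical event types; these are illustrations, not GitHub's event payloads.
@dataclass(frozen=True)
class Push:
    branch: str


@dataclass(frozen=True)
class PullRequest:
    number: int


@dataclass(frozen=True)
class Job:
    name: str
    commands: Tuple[str, ...]


Event = Union[Push, PullRequest]


def on_event(event: Event) -> Optional[Job]:
    """Map an incoming event to the job that should run, or None to skip it."""
    if isinstance(event, Push):
        if event.branch == "main":
            return Job("release-build", ("make", "make test", "make package"))
        return None  # pushes to other branches do not trigger anything
    # The only remaining case is a pull request: build and test, no packaging.
    return Job(f"pr-{event.number}", ("make", "make test"))
```

The shell commands inside each `Job` are still opaque strings, so they can still be broken, but the mapping around them is checked.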
  1. 12
    
    david_chisnall 1 year ago | link
    
    It’s worse than that. Because the YAML-based language is so anaemic, the extension language (the thing that you use to define new functions) is JavaScript. So you already have a dependency on V8 and Node.js to be able to actually run any actions. At that point, I don’t know why they don’t just expose the actions as modules and ask you to write TypeScript to drive the CI. I’d much rather use TypeScript with a well-defined API than YAML.
    1. 2
      
      algernon 1 year ago | link
      
      Because not all actions are in JS. There are plenty in shell, and iirc, go actions are also supported.
      1. 5
        
        david_chisnall 1 year ago | link
        
        Not all are JavaScript, but a bunch of the core ones are. You can run github-act-runner without Node.js, but you have to provide your own checkout command because that’s JavaScript. This means that you basically have an unavoidable Node.js dependency. If you’re going to do that, why not make TypeScript the scripting layer? If people want to provide actions in other languages, they can do so as Node native plugins.
        
        3
        
        algernon 1 year ago | link
        
        It’s not just about having a NodeJS dependency. It’s also about maintaining one’s own actions - and having other options than JS there is a good thing. I do not want to write any kind of JS for the actions I maintain. Not even glue code. I can do that right now, and that’s fine. I’m ok with having a NodeJS dependency, if I don’t need to touch any JS code.
        
        If anything, I’d love to see more languages supported, perhaps even a generic way to provide pre-compiled actions, so I wouldn’t need to resort to publishing my precompiled actions as a docker container, and then use that from within action.yml (which limits the platforms I can run it on, among other things).
        
        3
        
        david_chisnall 1 year ago | link
        
        It’s not just about having a NodeJS dependency. It’s also about maintaining one’s own actions - and having other options than JS there is a good thing
        
        Totally agreed.
        
        I do not want to write any kind of JS for the actions I maintain.
        
        Okay…
        
        Not even glue code. I can do that right now, and that’s fine. I’m ok with having a NodeJS dependency, if I don’t need to touch any JS code.
        
        Instead, you need to touch an absolutely awful DSL in YAML to use your action. There’s nothing preventing the actions glue code from allowing you to export some API description and generating the TypeScript exports from that so you don’t need to write TypeScript to write a new action, but if they ditched the YAML and exposed the JavaScript VM (that must be there anyway) for scripting then the interface would be a lot nicer.
        
        2
        
        algernon 1 year ago | link
        
        Instead, you need to touch an absolutely awful DSL in YAML to use your action.
        
        I don’t find the DSL awful. Mind you, I use very little of it, because 99% of the logic I need for my builds, are in actions. So my workflows are like:
        
        steps:
          - name: foo
            uses: my-actions/something@version
            with:
              param1: bar
        
        Repeat something similar a few more times, done. There’s pretty much no shell steps, nor anything complicated, just a pretty flat and stupid yaml. Not sure how JS would make this better?
        
        const core = require('@actions/core');
        const something = require('@my-actions/something');

        core.tryStep({
          name: "foo",
          action: () => {
            try {
              something.run();
            } catch (error) {
              core.setFailed(error.message);
            }
          },
        });
        
        Something like that, I guess? I’ll stick with the yaml, thanks. I prefer my workflows to be declarative, rather than code. YAML isn’t perfect, but for declarative things, I think it’s a better fit than most programming languages.
        
        Now, if you want the full power of a proper language, and don’t care if your workflows aren’t declarative, then yeah, yaml is horrible. I’m not in that camp, so yaml is fine for my use cases.
        
        if they ditched the YAML and exposed the JavaScript VM (that must be there anyway) for scripting then the interface would be a lot nicer.
        
        From what I can tell after browsing the runner source, it just shells out to nodejs like for any non-js action.
        
        1
        
        david_chisnall 1 year ago | link
        
        I think you are probably in a minority in being able to express your builds in purely declarative YAML. The YAML language supported by actions supports a lot of imperative constructs for conditional execution. All of these are handled by consuming a string in YAML and evaluating yet another DSL.
        
        1
        
        algernon 1 year ago | link
        
        Indeed, I probably am. Nevertheless, a CI can provide multiple ways to specify a workflow, so that those of us who can express them in a purely declarative manner can do so, while those who need a proper programming language can do just that.
        
        I wouldn’t hold my breath for GitHub Actions to ever get there.
        
        1
        
        david_chisnall 1 year ago | link
        
        The problem I have is that you’re not really expressing a declarative CI pipeline, you’re expressing an imperative CI pipeline in custom actions and then using their weird language to provide configuration for that pipeline. With a JavaScript / TypeScript scripting language, there’s nothing stopping you from exposing a single entry point to your custom actions that consume a JSON document and keeping the declarative structure in the JavaScript. In both cases, you’re using a declarative subset of an imperative language to drive an imperative language.
  2. 5
    
    orib edited 1 year ago | link
    
    They’re best used by ignoring them as much as possible, calling out to a shell script or makefile or whatever your tool of choice is. Treat them as an entry point into something acceptable.
    
    And then you can run that acceptable thing locally, or in some other CI, or anywhere else you want.
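
A rough sketch of such an entry point, assuming the tool of choice is a small Python script instead of a shell script or makefile. The step list is a placeholder (it only prints), and `run_steps` is a hypothetical helper, not part of any CI system.

```python
import subprocess
import sys

# Placeholder steps; a real project would list its own build and test commands.
STEPS = [
    [sys.executable, "-c", "print('build ok')"],
    [sys.executable, "-c", "print('tests ok')"],
]


def run_steps(steps):
    """Run each command in order; stop at the first failure and return its exit code."""
    for cmd in steps:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0


# A real script would finish with: sys.exit(run_steps(STEPS))
```

The workflow file then shrinks to a single step that invokes the script, and the same script runs unchanged on a laptop or in another CI.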
  3. 4
    
    wiktor 1 year ago | link
    
    Thanks for writing this up. I’ve been thinking the same thing recently: we’ve got such great tooling for programming, but the DevOps and CI world is riddled with ugly, untyped YAML. It’s just sad.
    1. 5
      
      toastal 1 year ago | link
      
      Nickel & Dhall can save the day. Why these fully-typed languages aren’t the preferred native formats, with YAML read in only for users who opt for the worse experience, is beyond me.
      1. 3
        
        asymmetric 1 year ago | link
        
        Nickel literally just reached 1.0, Dhall’s type system is non-gradual. Just a couple of reasons why they might not be mainstream yet.
        
        2
        
        toastal 1 year ago | link
        
        Yet still a better experience than YAML 🙂
  4. 2
    
    N64N64 edited 1 year ago | link
    
    I’ve been using Lua for a while, and I am progressively moving more and more into Lua. At this rate I don’t think I will ever go to another format as long as I live. It’s too OP
  5. 1
    
    danb 1 year ago | link
    
    There’s a good number of vendors attempting to create their own languages for specific niches. What’s more interesting to me is novel application of existing languages. E.g. what if GitHub actions just invoked an entry point you defined, then your program (written in the same popular language you’re already familiar with) calls back to an API to define the next tasks and jobs.
3. 16
  
  carlana 1 year ago | link
  
  Debugging like I’m 15 again
  
  Lol, perfect description of how every new workflow file gets created.
4. 12
  
  Corbin 1 year ago | link
  
  I want to mention act, a terribly-named tool for running GH Actions locally prior to push. It does not always work, but sometimes it can save time when debugging Actions.
  1. 4
    
    teymour 1 year ago | link
    
    There’s also gha-runner which Pernosco (the rr debugger developers) have developed and use to run GA workflows on AWS.
5. 11
  andyc edited 1 year ago | link
  By the way, you can just write plain-ass shell scripts and run them in your CI. They run locally, giving you PHP-like productivity where you iterate 50 times a minute. Not one iteration every 5 or 50 minutes :)
  
  Our github config just lists 11 or so jobs:
  
  $ wc -l .github/workflows/*
  371 .github/workflows/all-builds.yml
  
  Same with sourcehut:
  
  $ wc -l .builds/*
   15 .builds/dummy_orig.yml_disabled
   61 .builds/worker1.yml
   68 .builds/worker2.yml
   49 .builds/worker3.yml
   49 .builds/worker4.yml
  242 total
  
  https://github.com/oilshell/oil/tree/master/.builds
  
  Then we have tens of thousands of lines of tests and benchmarks, that run in both Github and sourcehut, in containers:
  
  ~/git/oilshell/oil$ wc -l test/*.sh benchmarks/*.sh
  ...
    128 test/spec-version.sh
    232 test/stateful.sh
    411 test/syscall.sh
  ...
    345 benchmarks/uftrace.sh
    176 benchmarks/vm-baseline.sh
  18295 total
  
  Our subdomain name is funny – http://travis-ci.oilshell.org/github-jobs/ – and proves that the generality was not speculative! We moved off Travis CI when it rotted after getting acquired. It was literally zero disruption since we can always run in multiple CIs. (Obviously we should rename that domain.)
  
  Now I understand that some people look at 20,000 lines of shell and get scared.
  
  (1) Well first, this is a huge amount of functionality, which not every project has. All these tests, benchmarks, metrics, docs, etc. running at every commit:
  
  https://www.oilshell.org/release/0.18.0/quality.html
  
  e.g. Python and C++ linting, Python/C++/shell unit tests, Address sanitizer, UBSAN, Clang coverage, comparing GCC/Clang, cachegrind, uftrace, bloaty, etc. – basically every kind of code quality you can think of
  
  And HTML results to browse
  
  (2) The point of https://www.oilshell.org is to make 20,000 lines of shell less scary – more like a real programming language, which is necessary for CI
  
  We even have Hay Ain’t YAML so you can interleave shell and declarative JSON-like data seamlessly
  
  It’s all parsed statically so you get your code syntax errors and your data syntax errors all at once.
  
  I showed that in the recent thread on YAML – there is no “else-me-no-care” language anti-pattern. And again no textual pasting of languages together.
6. 7
  david_chisnall 1 year ago | link
  There are a lot of things I don’t like in GitHub Actions. Most other CI systems I’ve used have a client-server architecture where the runners are dumb and just run commands. GitHub Actions is rebadged Azure Pipelines and it has a model of a fat agent on the machine that runs the entire pipeline. This is good for a cloud provider because each runner is just a pristine VM but that has a bunch of knock-on effects, most notably:
  
  It’s hard to bring up a new platform because you need to port the runner (which depends on both .NET and Node.js).
  
  It’s inherently stateless, so caching layers need to be built on top.
  
  The nicest CI system I’ve seen used ZFS snapshots and clones for each build. When you wanted a new build, it would find the newest snapshot in your git history, clone that, and then pull your changes on top and do an incremental build. You could then do periodic checks for clean builds (and use clean builds for releases) but if you had a small bug fix on a previous commit then it ran that almost instantly. You can build this kind of thing out of containers on GitHub Actions, but it requires building all of that infrastructure yourself, in a way that likely breaks a load of assumptions of other Actions.
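
The snapshot-selection step described above can be sketched as follows. This is an illustration under stated assumptions, not that CI system's code: `history` is assumed to be a list of commit IDs ordered newest first (for example the output of `git rev-list HEAD`), `snapshots` is assumed to be the set of commits that already have a build snapshot, and both function names are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple


def newest_snapshotted_ancestor(history: Iterable[str], snapshots: Set[str]) -> Optional[str]:
    """Return the first commit in `history` (newest first) that has a snapshot."""
    for commit in history:
        if commit in snapshots:
            return commit
    return None


def plan_build(history: Iterable[str], snapshots: Set[str]) -> Tuple[str, Optional[str]]:
    """Choose an incremental build on top of a snapshot, or a clean build if none exists."""
    base = newest_snapshotted_ancestor(history, snapshots)
    if base is None:
        return ("clean", None)
    return ("incremental", base)
```

Cloning the chosen snapshot and applying the newer commits on top is left out; that part depends on the storage layer (ZFS in the system described).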
7. 6
  
  knl 1 year ago | link
  
  Pretty great writeup, thanks!
  
  I maintain one github action, and testing it is a nightmare for the reasons outlined in “debugging like I’m 15 again”. I can’t run it locally. My action queries github to get state of various repos, and I can’t mock that either, so testing is pretty much manual with eyeballing what could be wrong. The whole ecosystem could be made so much better…
8. 6
  
  pl 1 year ago | link
  
  Honestly I still don’t understand how GitHub managed to build something like GitHub Actions after seeing what GitLab-CI does. It’s not perfect by any means, and it never will be, but the non-trivialities of such a CI system are almost infinite, and building something secure, performant and widely usable requires a pretty involved thought process, little of which went into GitHub Actions.
  1. 5
    
    david_chisnall 1 year ago | link
    
    Honestly I still don’t understand how GitHub managed to build something like GitHub Actions
    
    They didn’t. Azure DevOps built Azure Pipelines. GitHub rebranded it and made small tweaks but the GitHub Actions Runner is a fork of the Azure Pipelines Agent. Azure Pipelines was designed to be tightly integrated with Azure and easy to add your own Azure Scale Sets instead of the hosted versions (but pay MS either way).
    1. 8
      
      vtbassmatt edited 1 year ago | link
      
      Azure DevOps built Azure Pipelines. GitHub rebranded it and made small tweaks but the GitHub Actions Runner is a fork of the Azure Pipelines Agent.
      
      This is true.
      
      Azure Pipelines was designed to be tightly integrated with Azure and easy to add your own Azure Scale Sets instead of the hosted versions (but pay MS either way).
      
      This is false. All the Azure-attach stuff was added much later, and was not part of the core design.
      
      Azure Pipelines, née “Build vNext”, was designed as a reaction against the “bet the farm on Windows-only tech” in the previous TFS build system(s). The explicit goal was to build a platform-agnostic, fairly-dumb agent orchestrated by a cloud service. The first version of the vNext agent was written in Node to help provide that separation from the “MS ecosystem”. (We did take a bet on .Net Core when we productized the system, leading to the downsides you mentioned in another thread.)
      
      I’m not really objecting to your implicit critique of the result, but I did want to correct the record of facts. Source: I was the PM to own the platform pieces after the original PM who helped design it got promoted to, effectively, architect. I think you and I even corresponded when we were trying to get a BSD fork of the agent going.
      1. 2
        
        david_chisnall 1 year ago | link
        
        Thanks. I’m a bit confused by this:
        
        The explicit goal was to build a platform-agnostic, fairly-dumb agent orchestrated by a cloud service.
        
        I don’t see how you get from that as a goal to the code that shipped. It took a load of dependencies on not-very-portable things, but it was also not written in a portable style. For example, there was no platform abstraction layer to centralise the things that differed between platforms. Things like ‘is the filesystem case sensitive’ are done on explicit ‘is the OS Linux’ checks scattered throughout the codebase so porting to a new platform required a load of cleanup. This is the kind of thing I’d expect in code written for one platform followed by a quick-and-dirty port to something else, not something designed to be platform agnostic.
        
        I’m also not sure how you build something ‘fairly dumb’ that’s so large. Competing systems just expose a simple command runner (which can just be SSH on *NIX systems) and run the scriptable bits centrally. The decision to run all of the plugins on the agent on the target made things a lot more complex.
        
        I think Azure Functions launched after this, which is a shame because it would have been nice to build a richer runner that runs as a FaaS task and just talks via fairly dumb command pipe to the clients.
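
For readers unfamiliar with the term, a platform abstraction layer of the kind mentioned above might look roughly like this. It is a sketch in Python for brevity (the runner itself is .NET), every name is hypothetical, and the per-platform values are illustrative defaults only, since case sensitivity is really a property of the filesystem and can be configured either way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    name: str
    case_sensitive_fs: bool
    path_list_separator: str


# One central table of platform differences, keyed by sys.platform prefix.
_PLATFORMS = {
    "linux": Platform("linux", True, ":"),
    "darwin": Platform("darwin", False, ":"),
    "win32": Platform("win32", False, ";"),
    "freebsd": Platform("freebsd", True, ":"),
}


def platform_for(sys_platform: str) -> Platform:
    """Look up the platform description for a sys.platform-style string."""
    for prefix, platform in _PLATFORMS.items():
        if sys_platform.startswith(prefix):
            return platform
    raise ValueError(f"unsupported platform: {sys_platform}")


def same_path(a: str, b: str, platform: Platform) -> bool:
    """Compare two paths using the platform's case rule instead of an OS check."""
    if platform.case_sensitive_fs:
        return a == b
    return a.lower() == b.lower()
```

Callers ask `same_path(a, b, platform_for(sys.platform))` and never test for a specific OS, so a port to a new platform means adding one table entry.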
        
        1
        
        vtbassmatt 1 year ago | link
        
        It took a load of dependencies on not-very-portable things
        
        That’s partly a function of taking a bet on .Net Core when it was in its infancy. I wasn’t there for that part, but was told that it wasn’t a technical decision, if you catch my drift.
        
        but it was also not written in a portable style. For example, there was no platform abstraction layer to centralise the things that differed between platforms. Things like ‘is the filesystem case sensitive’ are done on explicit ‘is the OS Linux’ checks scattered throughout the codebase so porting to a new platform required a load of cleanup.
        
        No disagreement from me on this. I’ve always chalked it up to a misunderstanding, or perhaps internal disagreement, on what “cross platform” was supposed to mean. To be fair, though, I’m nearly certain that “portable” was not a design goal. I probably shouldn’t have said platform-agnostic, when the truth is closer to “runs on the three platforms customers asked for at the time”.
        
        I’m also not sure how you build something ‘fairly dumb’ that’s so large. Competing systems just expose a simple command runner (which can just be SSH on *NIX systems) and run the scriptable bits centrally.
        
        I agree the result as-implemented is complex, but disagree that the design is large. The idea was to put all the brains in the (user-selectable, marketplace-delivered) tasks. The agent only does a few things: download task packages from the orchestrator, execute them in the order told, and ship some info (both streamed and batched) back up to the orchestrator.
        
        Compounding the early places where the implementation is complex, the design also got more complex as it accreted features. I guess that’s inevitable in a product that has both competitors to compete with and the aforementioned non-technical pressures.
9. 5
  
  kornel 1 year ago | link
  
  I feel the same about all hosted CI. Programming in YAML, with minutes-long latency, is awful.
10. 4
  
  Tenzer 1 year ago | link
  
  I often wonder how much nicer workflows would have been to write if they were written in HCL, as they first set out to be before they switched to YAML for the GA version.
  
  I was too late to adopt GHA to try it out :/
11. 3
  
  ubernostrum 1 year ago | link
  
  It’s infuriating when I git push a workflow that silently fails because of invalid YAML; especially when I then merge that workflow’s branch under the mistaken impression that the workflow is passing, rather than not running at all.
  
  I think GitHub should provide a way to check this on their end, but one thing that will also help on the user’s end of things, and that I generally recommend as a best practice anyway, is to set up pre-commit and turn on some of the built-in hooks. There are checkers in there which can reject syntactically-invalid files of several formats: JSON, TOML, XML, YAML.
12. 3
  
  madhadron 1 year ago | link
  
  One of my favorite prototypes I built was to take Temporal and build a CI system in it: have it clone a git repo, run a build command, run tests, parse the results, and post them to GitHub as annotations. The most annoying part was that Python (which I was doing the prototype for) doesn’t expose its unit test runners as a library.
  
  But it took two days to prototype from scratch and get a real program. I’ve spent that long futzing with YAML to try to get it working. I’m not convinced that dedicated CI systems even make sense anymore after that.
  1. 1
    
    bb010g 1 year ago | link
    
    Is https://www.temporal.io/ the Temporal you’re talking about?
    1. 2
      
      madhadron 1 year ago | link
      
      Yes. It’s a remarkably useful system.
13. 3
  
  enterprisey 1 year ago | link
  
  I find https://github.com/rhysd/actionlint, which catches a heck of a lot of common errors, invaluable.
14. 2
  
  l0b0 edited 1 year ago | link
  
  Give us an interactive debugging shell
  
  Use GitLab, where there’s a lot less magic involved in the CI setup. For a lot of cases you can simply run the commands in the specified Docker container.
  
  Give us a repository setting to reject commits with obviously invalid workflows
  
  Try the check-jsonschema pre-commit hook.
  
  Because GitHub fails to distinguish between fork and non-fork SHA references, forks can bypass security settings on GitHub Actions that would otherwise restrict actions to only “trusted” sources
  
  IIUC this is typically solved with if: github.repository == 'my-user/my-repo-name'.
  1. 1
    
    algernon 1 year ago | link
    
    Forks can remove the if, so that doesn’t help much.
    1. 1
      
      l0b0 1 year ago | link
      
      That’s … a very good point. I’ll have to ask my colleagues why this pattern was introduced; maybe it’s for some other reason.
      1. 1
        
        algernon 1 year ago | link
        
        It’s useful when you can trust your forks, to skip running things there that would fail on a fork. It’s useless as a security measure.
15. 2
  
  honeyryderchuck 1 year ago | link
  
  Use gitlab CI