Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(github): GitHub App installation support #169

Merged
merged 3 commits into from
Dec 10, 2024

Conversation

vit-zikmund
Copy link
Collaborator

Hello dear @rufuspollock & @athornton, long time no see :) I was missing you so much that I eventually ended up doing this beefy github auth upgrade! It adds support for GitHub App installation tokens, which is something I didn't have any knowledge about (lucky me) until I wanted to use a CICD automation (that uses a GitHub App as the governing identity) to act upon a LFS-enabled repo. Pretty nice rabbit hole I got stuck in for a several days.

Here's the feature on a rather nice silver plate. Let me know if you find any sharp corners on it, but it's proven to work in our premises. Below you'll find the the main commit description, slightly unraveling the core changes. See the updated docs for the details about this funny github token flavor.

There is this special kind of GitHub access tokens which belong to the GitHub App installation (GitHub App installed in your GitHub organization, doing stuff on it's own behalf, like various bots). Compared to the more usual Personal Access Tokens, these tokens have entirely different way of managing their permissions and getting them from the GitHub API.

This set of changes introduces support for those tokens, which in turn led to a couple of new design decisions and some refactoring, splitting the original GithubIdentity into two subclasses, each dealing with their own authentication/authorization scheme, while still leveraging the common caching capabilities, now factored out to the parent class.

Each of the subclasses has an 'authenticate' class method, which creates an instance of the particular class, when properly authenticated against the GitHub API. This identity is then paired with its access token in a cache for later reuse.

In turn, the instance has a private '_authorize' method which provides means for obtaining the targeted org/repo permissons. These permissions are supposed to be stored in a per-identity cache via the call to '_set_permissions'. Due to the nature of the permission discovery for the new GitHub App installations, multiple repo permissions (of a single org) can be cached in a single repo. Also, when this identity has access to all organization repos, a special org/None permission is used to cover them all as a single entry.

There's also a slightly unrelated addition, which is a configurable request timeout for the GitHub API calls, starting with some rather benevolent defaults. This should bring more stability to the module, as potentially hanging requests to the API won't hog the threads forever.

Here are some more important milestones in this squash:

  • feat(github): extract username from the Basic authorization header

To be eventually used to identify an application id with an "app installation" github token.

  • chore(github): move request session management under CallContext

This is needed for further design improvements where GithubIdentity objects will be able to call their specific GitHub API calls.

  • chore(github): move CallContext to the module scope to hold a requests session

  • feat(github): support GitHub App installation tokens

  • chore(github): provide unit tests for GithubAppIdentity

  • feat(github): use a timeout for GitHub API calls

  • docs(github): GitHub App installation tokens

  • feat(github): casually cache GitHub App repo permissions

Also: Somehow reflect on user caching preferences for the auth proxy cache, providing a hardcoded minimum for proxied entries.

GithubIdentity._set_permissions got a new flag 'casual' for writing to the main auth cache without any guarantee this will be retrieved later. This is required so casually cached permissions for GitHub Apps won't get stuck in the proxy cache.

vit-zikmund and others added 2 commits November 19, 2024 11:56
There is this special kind of GitHub access tokens which belong to the
GitHub App installation (GitHub App installed in your GitHub
organization, doing stuff on it's own behalf, like various bots).
Compared to the more usual Personal Access Tokens, these tokens have
entirely different way of managing their permissions and getting them
from the GitHub API.

This set of changes introduces support for those tokens, which in turn
led to a couple of new design decisions and some refactoring, splitting
the original GithubIdentity into two subclasses, each dealing with their
own authentication/authorization scheme, while still leveraging the
common caching capabilities, now factored out to the parent class.

Each of the subclasses has an 'authenticate' class method, which creates
an instance of the particular class, when properly authenticated against
the GitHub API. This identity is then paired with its access token in a
cache for later reuse.

In turn, the instance has a private '_authorize' method which provides
means for obtaining the targeted org/repo permissons. These permissions
are supposed to be stored in a per-identity cache via the call to
'_set_permissions'. Due to the nature of the permission discovery for
the new GitHub App installations, multiple repo permissions (of a single
org) can be cached in a single repo. Also, when this identity has access
to all organization repos, a special org/None permission is used to
cover them all as a single entry.

There's also a slightly unrelated addition, which is a configurable
request timeout for the GitHub API calls, starting with some rather
benevolent defaults. This should bring more stability to the module, as
potentially hanging requests to the API won't hog the threads forever.

Here are some more important milestones in this squash:
* feat(github): extract username from the Basic authorization header

To be eventually used to identify an application id with an "app installation" github token.

* chore(github): move request session management under CallContext

This is needed for further design improvements where GithubIdentity objects will be able to call their specific GitHub API calls.

* chore(github): move CallContext to the module scope to hold a requests session

* feat(github): support GitHub App installation tokens

* chore(github): provide unit tests for GithubAppIdentity

* feat(github): use a timeout for GitHub API calls

* docs(github): GitHub App installation tokens

* feat(github): casually cache GitHub App repo permissions

Also: Somehow reflect on user caching preferences for the auth proxy cache, providing a hardcoded minimum for proxied entries.

GithubIdentity._set_permissions got a new flag 'casual' for writing to the main auth cache without any guarantee this will be retrieved later. This is required so casually cached permissions for GitHub Apps won't get stuck in the proxy cache.
I don't get it, it was all ok the other day. Welp.
@rufuspollock
Copy link
Member

@vit-zikmund wow this is amazing. Thanks for all the hard work.

@athornton would you be up for giving this the once over and then we can merge if you think it looks good?

@athornton
Copy link
Collaborator

Yeah, hopefully I'll have some time tomorrow.

Copy link
Collaborator

@athornton athornton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor documentation suggestions, and one request for a comment in a place I found the action confusing, but on the whole this is fantastic.

docs/source/auth-providers.md Outdated Show resolved Hide resolved
docs/source/auth-providers.md Outdated Show resolved Hide resolved
docs/source/auth-providers.md Show resolved Hide resolved
giftless/auth/github.py Show resolved Hide resolved
giftless/auth/github.py Outdated Show resolved Hide resolved
giftless/auth/github.py Show resolved Hide resolved
giftless/auth/github.py Outdated Show resolved Hide resolved
giftless/auth/github.py Show resolved Hide resolved
tests/auth/test_github.py Outdated Show resolved Hide resolved
@rufuspollock
Copy link
Member

@vit-zikmund we'd love to merge this. wdyt of @athornton comments.

Big 🙏 to @athornton for the excellent review and 👏👏 to @vit-zikmund for the PR 🙌

@vit-zikmund
Copy link
Collaborator Author

Hi @rufuspollock I want to address them all, after all they are either sensible small improvements or nice endorsements 😊
I was under the water a bit last week, so couldn't get to it, but I'll promise to do it soon! Thanks for your patience 😇

@vit-zikmund
Copy link
Collaborator Author

Here are the last polishments as you requested. Thanks for the nice review!

@athornton
Copy link
Collaborator

@vit-zikmund I don't remember if you have merge permissions, but feel free, and if you don't, let me know and I will.

@vit-zikmund
Copy link
Collaborator Author

Nope, can't merge this myself, heh :)

@athornton athornton merged commit fc81f36 into datopian:main Dec 10, 2024
7 checks passed
@athornton
Copy link
Collaborator

Merged!

@vit-zikmund
Copy link
Collaborator Author

wheee! 😄

@vit-zikmund vit-zikmund deleted the github-app-inst branch December 11, 2024 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants