-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix github API requests after asset upload #771
Conversation
Travis tests have failedHey @mfonville, 1st Buildpython setup.py test
4th Buildpython setup.py test
|
Travis tests have failedHey @mfonville, 1st Buildpython setup.py test
4th Buildpython setup.py test
|
c6a7acb
to
f7cd6a2
Compare
Travis tests have failedHey @mfonville, 1st Buildpython setup.py test
4th Buildpython setup.py test
|
@sfdye it's now done, including the tests and should be ready to merge :-) |
It seems the tests still failed? I am on mobile, I may be wrong |
Travis tests have failedHey @mfonville, 1st Buildpython setup.py test
4th Buildpython setup.py test
|
My last commit, a small subsequent improvement, made it fail for the workaround that was introduced for issue #80, I am now trying to include a fix for that too. |
Should now be good to go :-) |
Travis tests have failedHey @mfonville, 1st Buildpython setup.py test
4th Buildpython setup.py test
|
github/Requester.py
Outdated
@@ -251,23 +251,34 @@ def __init__(self, login_or_token, password, base_url, timeout, client_id, clien | |||
self.__userAgent = user_agent | |||
self.__apiPreview = api_preview | |||
|
|||
def requestJsonAndCheck(self, verb, url, parameters=None, headers=None, input=None, cnx=None): | |||
return self.__check(*self.requestJson(verb, url, parameters, headers, input, cnx)) | |||
def requestJsonAndCheck(self, verb, url, parameters=None, headers=None, input=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are still some usage of requestJsonAndCheck()
that uses the cnx
argument, you might wanna refactor those as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked the source code but I can't find them anywhere else that I didn't refactor yet (and that was the status.github.com
stuff)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right on this
This PR is rather large, will it affect Github enterprise? |
Afaik there should be no impact on Github Enterprise, though I don't have any experience or testing with GH enterprise myself. The only major change in functionality is the actual important fix (that |
(and to be honest, if Github enterprise would start to offer a status-api and an uploads-api, this refactor will be a great first step to be able to implement that too) |
Could this PR be included in any upcoming release, so that we can use the stable branch without being confronted by the upload asset bug? |
@mfonville I am cool with that. Once it's merged I will cut a new release, together with the requests change. But please give me some time to review it, got busy with work recently. |
fyi, I am now using this code ( including #787 ) in production to upload more than 200 assets per day to Github |
I rebased the PR on the latest master (to accommodate for the Content-Length patch) |
Pity to see that it didn't make it into today's alpha release; what could I do to help to get this important fix included? Because asset uploading is broken with all current stable and alpha releases... (and I will rebase again on master to fix the merge conflict) |
Because this PR is touching some important underlying files (e.g. |
Ok, I can guarantee you that it is backward compatible, and I am using it successfully in production. The change is indeed a bit invasive, but that was based on the feedback for #768 . I will write some comments within the PR to highlight why I made the subsequent change, and why it won't break anything compared to the old situation. |
@@ -72,6 +72,7 @@ | |||
atLeastPython3 = sys.hexversion >= 0x03000000 | |||
|
|||
DEFAULT_BASE_URL = "https://api.github.com" | |||
DEFAULT_STATUS_URL = "https://status.github.com" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because requests for the Github API are send to a different domain than api.github.com
, we introduce here respective value for the status-api-url, instead of having it hardcoded in the other files (which it was previous to this commit)
@@ -625,8 +626,7 @@ def get_api_status(self): | |||
""" | |||
headers, attributes = self.__requester.requestJsonAndCheck( | |||
"GET", | |||
"/api/status.json", | |||
cnx="status" | |||
DEFAULT_STATUS_URL + "/api/status.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing here the full URL to get the status, instead of relying on passing the magic value status
as cnx
-value, which was really a bogus way of dealing with it. (The program should expect a Connection or None in cnx
and not a magic string).
The Requester-class will have the ability to deal with all urls, even the ones that do not go to the api.github.com
hostname
@@ -639,8 +639,7 @@ def get_last_api_status_message(self): | |||
""" | |||
headers, attributes = self.__requester.requestJsonAndCheck( | |||
"GET", | |||
"/api/last-message.json", | |||
cnx="status" | |||
DEFAULT_STATUS_URL + "/api/last-message.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing here the full URL, see comment above
@@ -653,8 +652,7 @@ def get_api_status_messages(self): | |||
""" | |||
headers, data = self.__requester.requestJsonAndCheck( | |||
"GET", | |||
"/api/messages.json", | |||
cnx="status" | |||
DEFAULT_STATUS_URL + "/api/messages.json" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing here the full URL, see comment above
def requestJsonAndCheck(self, verb, url, parameters=None, headers=None, input=None, cnx=None): | ||
return self.__check(*self.requestJson(verb, url, parameters, headers, input, cnx)) | ||
def requestJsonAndCheck(self, verb, url, parameters=None, headers=None, input=None): | ||
return self.__check(*self.requestJson(verb, url, parameters, headers, input, self.__customConnection(url))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we remove support for passing a magic-string as cnx
variable for the outer method.
But for the inner method, all requests can now pass a custom connection as a parameter;
the method __customConnection()
will parse the value of url
and check if the request is meant for https://api.github.com
or respective enterprise URL, which would mean it can re-use the existing API-connection and return None
;
but if the URL is to another hostname or port than the existing connection, it will initiate new custom connection to the host and pass it as an argument for further processing.
o = urlparse.urlparse(url) | ||
if o.hostname != self.__hostname or \ | ||
(o.port and o.port != self.__port) or \ | ||
(o.scheme != self.__scheme and not (o.scheme == "https" and self.__scheme == "http")): # issue80 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if either the port, the hostname or the http-scheme is different; but ignoring silent upgrades from http to https (that is issue 80)
@@ -303,7 +314,7 @@ def encode(input): | |||
|
|||
return self.__requestEncode(cnx, verb, url, parameters, headers, input, encode) | |||
|
|||
def requestMultipart(self, verb, url, parameters=None, headers=None, input=None): | |||
def requestMultipart(self, verb, url, parameters=None, headers=None, input=None, cnx=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allow passing a custom connection
@@ -372,9 +383,6 @@ def __requestRaw(self, cnx, verb, url, requestHeaders, input): | |||
original_cnx = cnx | |||
if cnx is None: | |||
cnx = self.__createConnection() | |||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have to check for the magic string value anymore, because we already get the proper custom connection to status.github.com
passed if we are connecting to status API
@@ -413,8 +421,8 @@ def __makeAbsoluteUrl(self, url): | |||
url = self.__prefix + url | |||
else: | |||
o = urlparse.urlparse(url) | |||
assert o.hostname in [self.__hostname, "uploads.github.com"], o.hostname | |||
assert o.path.startswith((self.__prefix, "/api/uploads")) | |||
assert o.hostname in [self.__hostname, "uploads.github.com", "status.github.com"], o.hostname |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert that the hostname can either be the existing API hostname, or to uploads or status API.
I'd prefer to not have this as hardcoded as it is here, but I don't want to overhaul even more; and the assertion at least helps in spotting coding errors.
assert o.hostname in [self.__hostname, "uploads.github.com"], o.hostname | ||
assert o.path.startswith((self.__prefix, "/api/uploads")) | ||
assert o.hostname in [self.__hostname, "uploads.github.com", "status.github.com"], o.hostname | ||
assert o.path.startswith((self.__prefix, "/api/")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We change this to /api/
so that the path can match the patterns of both the upload and status API
As you can see from my comments, my code makes in general compared to the old situation 2 real changes:
|
Thanks for all the comments. Preparing PyCon workshop this weekend, I will try to get this reviewed the hopefully merged soon by next week. |
In the old code the self.__hostname would be overwritten with uploads.github.com but it could not be correctly re-set to api.github.com after completing the upload Create a separate connection if hostname or port differ in requestblobandcheck in the end this became quite a large overhaul, to also make this change generic for e.g. connecting to status.github.com and all methods
take care of issue 80 (allow upgrade http to https for enterprise)
Uh, sorry for off-topic, but I just saw your project pushing automatic daily releases (daily Travis Cronjob?) that total around 10-20GiB each, moreover you have around 800 of these releases across 4 repositories, that's like 8TiB of build artifacts GitHub is storing, that's crazy! I'm surprised GitHub allows you to do this. Have you considered keeping only N latest releases (where N is <10), instead of keeping 200 latest daily releases dating back to September 2017? |
@nurupo pssst don't tell them ;-) I do the builds and uploads daily on a dedicated machine. The uploading used to be via a shellscript + GotHub(aka Github-Release) but I switched to PyGithub recently, because of issues with GitHub's servers and anti-DDoS measures. PyGithub proved to be a much more suitable and reliable solution :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@mfonville Merged, finally. Thanks for the awesome work! |
Great, good work everyone! |
Thanks! |
In the old code the self.__hostname would be overwritten with uploads.github.com
but it could not be correctly re-set to api.github.com after completing the upload
Create a separate connection if hostname or port differ in requestBlobAndCheck
in the end this became quite a large overhaul, to also make this change generic
for e.g. connecting to status.github.com and similar methods
not sure if tests need (more) updating, if so I will update the PR accordingly