-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] S3 fixes and improvements #101
Conversation
git-lfs as of v2.5.0 [1] detects the file Content-Type and provides it during PUT requests to the storage when using the basic transfer. If we don't include the same `Content-Type` value in the S3 signature generation, then S3 rejects it during upload. Unfortunately git-lfs doesn't pass the `Content-Type` value to the Batch request, so it can't be included in S3 signature generation. And while setting `lfs.contenttype=false` in the client config forces sending `application/octet-stream` [2], S3 isn't happy with that either. If the Batch response upload action includes a `Content-Type` header then git-lfs will always use that for the PUT request, so always return `application/octet-stream`, and also use it for signature generation. [1] git-lfs/git-lfs#3137 [2] git-lfs/git-lfs#3163
- move dev-requirements to "latest" - upgrade reequirements sensibly - fix minor azure typing issues - azure test fixes wrt URL changes (/bucket/prefix%2Frepo%2Forg%2Fsha -> /bucket/prefix/repo/org/sha)
Use the new S3 content integrity support [1] to pass the object sha256 to S3 when uploading objects, so S3 can confirm integrity. [1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html
@rcoup did it work for you for s3_external? I've tried with 'stock' giftless and with your patches applied, I'm still getting rejected by S3. Would you be interested in looking at it? |
@mtb-xt yes, is working well for me. What errors are you seeing? Using something like this should show all the requests & most of the responses that are happening, for both git itself and git-lfs' interaction with the git-lfs server & any storage endpoints:
FWIW, I'm predominantly using EC2 instance roles with IAM roles/policies for bucket access; and git-lfs v2.13.3 |
@rcoup hey, thanks for helping out! I'm getting 400 Bad Requests, which lead me to believe some of the headers are wrong -_- Here are some sanitized logs -> https://gist.github.com/mtb-xt/7c8641535aa615e6f4eeea24fc4c2285 |
Aha, this weird issue was that my bucket had KMS encryption key managed by AWS, and that requires using SigV4 signatures in presigned urls. |
Argh, their docs say that SigV4 is now the default, but apparently not |
@rufuspollock @shevron are you guys going to accept this PR? There are some more fixes needed for S3, but I'd like to start off from changes in this branch |
@rcoup first a big thank-you for the contribution 👍 . Really welcome. This looked good and as per @mtb-xt i have merged this. I did want to check that: a) the PR was in good state to merge (from review and discussion seemed so but was not sure from your last comments). I do notice build tests are failing for what seems like a dependency issue https://github.com/datopian/giftless/runs/6138620405?check_suite_focus=true |
@rufuspollock |
@mtb-xt CC me in your PR please? |
Dependency fixes from #101
A series of S3 improvements. Happy to break them up if you want, but more after feedback at this point :-)
s3_external: force the upload Content-Type
git-lfs as of v2.5.0 detects the file Content-Type and provides it during PUT requests to the storage when using the basic transfer. If we don't include the same
Content-Type
value in the S3 signature generation, then S3 rejects the upload.Unfortunately git-lfs doesn't pass the
Content-Type
value to the Batch request, so we can't include it in the S3 signature generation. And while settinglfs.contenttype=false
in the client config forces sendingapplication/octet-stream
, S3 isn't happy with that either (IIUC it defaults tobinary/octet-stream
or expectsContent-Type
to be omitted).But if the Batch response upload action includes a
Content-Type
header then git-lfs will always use that for the PUT request. So change Giftless to returnapplication/octet-stream
, and also use it for signature generation.s3_external: use sha256 checksums when uploading
Use the new S3 content integrity support to pass the object sha256 to S3 when uploading objects, so S3 can confirm integrity.
This requires a newer boto3 version.
tests: s3: force us-east-1 region to match urls
If you run the tests with a different default region set in your local S3 config, you get failures. Fix it by forcing to
us-east-1
.If you're reasonably happy I can either t