Add "efs-dir" provisioning mode #1497
8002947
to
8e6f341
Compare
/test pull-aws-efs-csi-driver-unit |
Please rebase |
8e6f341
to
025940c
Compare
Update: last force-push rebased PR (only). |
025940c
to
175644b
Compare
Update: last force-push fixed e2e tests for "efs-dir". |
/test pull-aws-efs-csi-driver-unit |
@mpatlasov Thanks for the PR, we are considering supporting this feature. If you could address these comments and rebase, that would be great.
docs/README.md
Outdated
| directoryPerms | | | false | Directory permissions for [Access Point root directory](https://docs.aws.amazon.com/efs/latest/ug/efs-access-points.html#enforce-root-directory-access-point) creation. |
| uid | | | true | POSIX user Id to be applied for [Access Point root directory](https://docs.aws.amazon.com/efs/latest/ug/efs-access-points.html#enforce-root-directory-access-point) creation. |
| gid | | | true | POSIX group Id to be applied for [Access Point root directory](https://docs.aws.amazon.com/efs/latest/ug/efs-access-points.html#enforce-root-directory-access-point) creation. |
| gidRangeStart | | 50000 | true | Start range of the POSIX group Id to be applied for [Access Point root directory](https://docs.aws.amazon.com/efs/latest/ug/efs-access-points.html#enforce-root-directory-access-point) creation. Not used if uid/gid is set. |
Can we update the documentation for the storage class parameters that are not supported in efs-dir mode, so it states more clearly that they apply only to access point provisioning?
pkg/driver/provisioner_dir.go
Outdated
klog.V(5).Infof("Provisioning directory with permissions %s", perms)

provisionedDirectory := path.Join(target, provisionedPath)
err = d.osClient.MkDirAllWithPermsNoOwnership(provisionedDirectory, perms)
Can we add support for using a static uid/gid from the storage class?
I don't think dynamically selecting from a range, as we do for access point provisioning, makes sense here. But a static uid/gid for the provisioned directory gives an option to users who do not want permissive directory perms and do not want to run their pods as root.
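A rough sketch of what a static owner option could look like. This is not the driver's code: the `staticOwner` type, both helper functions, and the rule that `uid` and `gid` must be set together are assumptions made for illustration. The example chowns to the current process's ids so it runs without root on Linux.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// staticOwner is a hypothetical holder for an optional fixed directory owner.
type staticOwner struct {
	UID, GID int
	Set      bool
}

// parseStaticOwner reads "uid" and "gid" from storage class parameters.
// Assumption: both must be given together; neither means "leave ownership alone".
func parseStaticOwner(params map[string]string) (staticOwner, error) {
	uidStr, hasUID := params["uid"]
	gidStr, hasGID := params["gid"]
	if !hasUID && !hasGID {
		return staticOwner{}, nil
	}
	if hasUID != hasGID {
		return staticOwner{}, fmt.Errorf("uid and gid must be set together")
	}
	uid, err := strconv.Atoi(uidStr)
	if err != nil || uid < 0 {
		return staticOwner{}, fmt.Errorf("invalid uid %q", uidStr)
	}
	gid, err := strconv.Atoi(gidStr)
	if err != nil || gid < 0 {
		return staticOwner{}, fmt.Errorf("invalid gid %q", gidStr)
	}
	return staticOwner{UID: uid, GID: gid, Set: true}, nil
}

// mkdirWithOwner creates dir and, when an owner is set, chowns it.
func mkdirWithOwner(dir string, perms os.FileMode, owner staticOwner) error {
	if err := os.MkdirAll(dir, perms); err != nil {
		return err
	}
	if !owner.Set {
		return nil
	}
	return os.Chown(dir, owner.UID, owner.GID)
}

func main() {
	// Use the current ids so the chown succeeds without extra privileges.
	params := map[string]string{
		"uid": strconv.Itoa(os.Getuid()),
		"gid": strconv.Itoa(os.Getgid()),
	}
	owner, err := parseStaticOwner(params)
	if err != nil {
		panic(err)
	}
	base, err := os.MkdirTemp("", "efs-dir-owner-")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(base) // cleanup for this example only

	if err := mkdirWithOwner(filepath.Join(base, "pvc-example"), 0o750, owner); err != nil {
		panic(err)
	}
	fmt.Printf("created directory owned by %d:%d\n", owner.UID, owner.GID)
}
```

With a fixed owner, the directory permissions can stay restrictive (for example 0750) while pods run as that non-root uid.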
pkg/driver/provisioner_dir.go
Outdated
if value, ok := volumeParams[AzName]; ok {
	azName = value
}
mountTarget, err := localCloud.DescribeMountTargets(ctx, fileSystemId, azName)
Could we reduce the provision function to call `DescribeMountTargets` only once instead of (max) 2?
Part of the advantage of using directory provisioning would be reducing API calls / chance of throttling, so we should try to limit these calls as much as possible.
We should be able to return the result from `getMountOptions` and reuse this for `mountOptions` and `volContext`.
pkg/driver/provisioner_dir.go
Outdated
} else {
	// If it is nil then it's safe to try and delete the directory as it should now be empty
	klog.V(5).Infof("Deleting temporary directory at '%s'", target)
	if err := d.osClient.RemoveAll(target); err != nil {
Can we change this to `Remove` instead of `RemoveAll`, as an extra safety precaution to ensure we are not deleting any unintended data?
This is unlikely as a UUID is appended to `target`, but it will ensure there is no accidental deletion if there are any other mounts in `TempMountPathPrefix`.
Hi @mpatlasov, I'm interested to hear more about one of your concerns with the current access point provisioning:
Is this referring to issues you have faced deleting access point root dirs outside of k8 or when deleting volumes with |
175644b
to
ad2a406
Compare
Hi @dankova22, thank you for reviewing this PR, highly appreciated! I've just rebased it, but have not addressed your comments yet. Will do it soon.
This phrase in the PR description was copied from Jonathan Rainer's PR description. I thought it referred to issue #517:
I have to remove "user permissions around deleting provisioned directories" from the description, sorry for the confusion.
/hold |
9cf3968
to
ad2a406
Compare
/test pull-aws-efs-csi-driver-e2e |
040821c
to
1f9e756
Compare
1f9e756
to
ff3ba10
Compare
/unhold |
Thanks again! Here is a patch for the documentation update: |
Thanks Dan for [the patch](kubernetes-sigs#1497 (comment)).
Hi @dankova22, thanks for the doc patch, I've applied it now (sorry for the delay, I was AFK last week).
/lgtm |
Use `buildDriver()` instead of `&Driver{...}`
Controller's CreateVolume and DeleteVolume now call methods of the Provisioner interface. All heavy-lifting create/delete logic specific to access-point provisioning is now hidden in the AccessPointProvisioner struct.
Implement `DirectoryProvisioner` struct and its `Provision/Delete` methods. Add `delete-provisioned-dir` command-line option to control DeleteVolume behavior in efs-dir mode.
The patch adds new unit-tests for `provisioner_ap.go` and `provisioner_dir.go`, and also adds a few unit-tests for `controller.go`.
Two issues fixed:
* If Unmount() failed, return error message with unmountErr
* If Delete() failed before defer func, and defer func failed, return error message with both errors
Thanks Dan for [the patch](kubernetes-sigs#1497 (comment)).
c4cea00
to
36cd25a
Compare
New changes are detected. LGTM label has been removed. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mpatlasov The full list of commands accepted by this bot can be found here.
Update: last force-push rebased PR (only). |
@mpatlasov: The following tests failed, say
This is not relevant to PR. |
This PR was inspired by Jonathan Rainer's PR, and Fabio Bertinatto's PR.
Is this a bug fix or adding new feature?
New feature, requested in #538, #517, and possibly other issues. This comes up a lot as a feature people would like.
What is this PR about? / Why do we need it?
Since its creation, the driver has supported Access Point provisioning as its main means of dynamic provisioning. However, this is problematic for a few reasons: it can cause issues with deleting access points (as reported here), and it requires various AWS IAM permissions that can be complicated to sort out and manage. Customers also want a way to set their own permissions from the top-level directory, but this mode does not allow that.
This PR adds a new provisioning mode called efs-dir, in which directories are created instead of EFS access points. This is achieved by introducing a new interface called Provisioner, implemented by an AccessPointProvisioner (the original method) and a DirectoryProvisioner (the new method). This also leaves room for other kinds of provisioning in future, maybe FileSystemProvisioning for example.
What testing is done?
New code is covered by unit tests; they all pass when I run "go test" locally. New e2e tests are added covering create/delete operations in "efs-dir" mode. They also pass when I run them against an OpenShift (OCP) cluster with the new driver installed.
fixes #538