-
Notifications
You must be signed in to change notification settings - Fork 39.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes issue where PVCs using standard
StorageClass create PDs in disks in wrong zone in multi-zone GKE clusters
#52322
Fixes issue where PVCs using standard
StorageClass create PDs in disks in wrong zone in multi-zone GKE clusters
#52322
Conversation
/release-note-none |
2ff6383
to
861ca18
Compare
|
||
for _, node := range nodeList.Items { | ||
labels := node.ObjectMeta.Labels | ||
zone := labels[kubeletapis.LabelZoneFailureDomain] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check for label doesn't exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
test/e2e/ubernetes_lite.go
Outdated
var extraZone string | ||
for _, zone := range allZonesInRegion { | ||
if !expectedZones.Has(zone.Name) { | ||
extraZone = zone.Name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can break after this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
test/e2e/ubernetes_lite.go
Outdated
|
||
// If no zones left to create an extra instance we screwed and we stop the test right here | ||
Expect(extraZone).NotTo(Equal(""), fmt.Sprintf("No extra zones available in region %s", region)) | ||
framework.Logf("We are going to start a compute instance in unused zone: %v\n", extraZone) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use By() here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
test/e2e/ubernetes_lite.go
Outdated
gotZones, err := gceCloud.GetAllCurrentZones() | ||
Expect(err).NotTo(HaveOccurred()) | ||
|
||
Expect(gotZones.Equal(expectedZones)).To(BeTrue(), fmt.Sprintf("Expected zones: %v, Got Zones: %v", expectedZones, gotZones)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need to sort the zones before comparing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is set equality. It does a carnality comparison then checks whether one is a superset of the other.
test/e2e/ubernetes_lite.go
Outdated
func OnlyAllowNodeZones(f *framework.Framework, zoneCount int, image string) { | ||
gceCloud, err := framework.GetGCECloud() | ||
Expect(err).NotTo(HaveOccurred()) | ||
gceCloud.SetClientSet(f.ClientSet) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to set this earlier in e2e.go when we initialize the cloud provider?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not without pretty significant code changes I think. I looked at the way the framework is setting it up for f.ClientSet and its pretty involved and has some pretty complex dependancies. Also f.ClientSet isn't set up until the start of the specific test so we can't use it in e2e.go.
lmk if you want to talk about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think i've actually solved this. Testing now.
861ca18
to
3524496
Compare
/assign |
test/e2e/e2e.go
Outdated
if cloudConfig.Zone == "" && framework.TestContext.CloudConfig.MultiZone { | ||
zones, err := gceCloud.GetAllZones() | ||
zones, err := gceCloud.GetAllCurrentZones() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this going to work? This is done early in e2e initialization, where client is not set yet?
zones := sets.NewString() | ||
nodeList, err := gce.client.Core().Nodes().List(metav1.ListOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check for client nil
test/e2e/ubernetes_lite.go
Outdated
@@ -61,8 +62,87 @@ var _ = framework.KubeDescribe("Multi-AZ Clusters", func() { | |||
It("should schedule pods in the same zones as statically provisioned PVs", func() { | |||
PodsUseStaticPVsOrFail(f, (2*zoneCount)+1, image) | |||
}) | |||
|
|||
It("should only be allowed to provision PDs in zones where nodes exist", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also control the zones that volumes are provisioned in, if we name them correctly. So maybe a test case we can add is to create PVCs with specific names, and make sure they get spread across the appropriate zones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this test case be added in this same PR or should I create a new PR/issue for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put it as part of this test, maybe instead of (or in addition to) comparing the GetAllZones call. Because from a user's standpoint, they would see issues in the volume provisioning. The GetAllZones call is just an implementation detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it in addition to. So the GetAllZones check still exists, the PVC checking follows.
3524496
to
547f0ec
Compare
addressed comments |
/retest |
} | ||
nodeList, err := client.Core().Nodes().List(metav1.ListOptions{}) | ||
if err != nil { | ||
log.Fatalf("Failed to list nodes: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can let the caller log the error if they want to.
zones.Insert(zone) | ||
} else { | ||
return nil, fmt.Errorf("Could not find zone for node %v", node.ObjectMeta.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should log and skip. I don't want to fail the whole provisioning operation just because one of the nodes is in a bad state.
return zones, nil | ||
} | ||
|
||
// GetAllManagedZones gets all the zones managed by k8s | ||
func (gce *GCECloud) GetAllManagedZones() (sets.String, error) { | ||
if gce.managedZones != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When is this field set? Is it during initialization and never changes after that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it set from the config and doesn't ever change after. I feel like the zones returned by this method could be different from GetAllCurrentZones
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how useful "zones when initialized" is considering you can add/remove zones during runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - that was my point when I suggested removing it. It seems @davidz627 agreed for removing this function completely in another comment I added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something that confuses me. There are a few places that iterate through managedZones, but if managedZones never changed after initialization, how does it get new zones that are added later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like all other uses of managedZones makes sense to me. Just this one seems to be related to the issue this PR is trying to solve. Can anyone confirm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's checking that the zone passed in is in the managedZones list, but maybe it makes more sense to compare against GetAllCurrentZones. The actual selection of the zone is done by ChooseZoneForVolumes, which calls GetAllCurrentZones already. But it's also possible for the user to explicitly set an invalid zone as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saad-ali we were wondering if we should restrict both CreateDisk and CreateRegionalDisk to only being able to create disks in zones with nodes in them (a la GetAllCurrentZones). Or if they should be left in the current state (to allow users to create disks in any zone in the region).
In its current state we believe that a user trying to create a Disk in a zone without a node would be most likely a mistake (since the PD would be unusable) and should trigger an error. However, it could be possible that advanced users may want to be able to create PDs in zones without nodes for regionaldisks for manual failover.
What do you think, Saad? It seems to be a tradeoff between failing fast for standard use-cases, or allowing users to do hypothetical funky things manually that we may not want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW @msau42 and I think surfacing an error for provisioning a PD in a zone without nodes is the way to go here. But I am interested to see if this may have some unintended side effects or if we missed some other use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saad-ali we were wondering if we should restrict both CreateDisk and CreateRegionalDisk to only being able to create disks in zones with nodes in them (a la GetAllCurrentZones). Or if they should be left in the current state (to allow users to create disks in any zone in the region).
In its current state we believe that a user trying to create a Disk in a zone without a node would be most likely a mistake (since the PD would be unusable) and should trigger an error. However, it could be possible that advanced users may want to be able to create PDs in zones without nodes for regionaldisks for manual failover.
What do you think, Saad? It seems to be a tradeoff between failing fast for standard use-cases, or allowing users to do hypothetical funky things manually that we may not want.
FWIW @msau42 and I think surfacing an error for provisioning a PD in a zone without nodes is the way to go here. But I am interested to see if this may have some unintended side effects or if we missed some other use cases.
I agree, provisioning a volume in a zone that does not have any active nodes should result in an error.
test/e2e/ubernetes_lite.go
Outdated
expectedZones, err := gceCloud.GetAllCurrentZones() | ||
Expect(err).NotTo(HaveOccurred()) | ||
framework.Logf("Expected zones: %v\n", expectedZones) | ||
// Get all the zones in this current region |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add more newlines between operations for better readability?
test/e2e/e2e.go
Outdated
if err != nil { | ||
glog.Fatal("Error loading client: ", err) | ||
} | ||
gceCloud.SetClientSet(cs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a production environment, when is clientset set? Can we add it to CloudConfig instead of a new method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to be set using a method "Initialize" in gce.go. It's the line "gce.client = clientBuilder.ClientOrDie("cloud-provider")".
Looks like in production code this is done separately from creating the gceCloud object itself because it is an interface method and is called for any of the many cloud objects. It is called here:
func StartControllers(s *options.CloudControllerManagerServer, kubeconfig *restclient.Config, clientBuilder controller.ControllerClientBuilder, stop <-chan struct{}, recorder record.EventRecorder, cloud cloudprovider.Interface) error { |
Additionally I'm not sure that the clientset is a "config" item. In production code the cloudconfig is generated from reading in a config file and the clientset doesn't really seem to fit into that paradigm.
Setting the clientset with a method (although not the same method) after it is created is consistent with how it works in the production code. I think adding it to the CloudConfig for the e2e tests would actually make it inconsistent with the production code. I am open to adding it to the CloudConfig if you still think it is the right move, just wanted to provide you with additional info before we make a decision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok that's fine then. Thanks for the detailed explanation!
@@ -422,7 +422,7 @@ var _ = SIGDescribe("Dynamic Provisioning", func() { | |||
Expect(err).NotTo(HaveOccurred()) | |||
|
|||
// Get all k8s managed zones | |||
managedZones, err = gceCloud.GetAllZones() | |||
managedZones, err = gceCloud.GetAllManagedZones() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we can't have "GetAllCurrentZones() here?
return zones, nil | ||
} | ||
|
||
// GetAllManagedZones gets all the zones managed by k8s | ||
func (gce *GCECloud) GetAllManagedZones() (sets.String, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need this method? I would really like to get rid of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently there is only one place in test using this method and I personally think that we can use GetAllCurrentZones() there too.
Note that currently "managed_zones" can be only set to two values:
- one particular zones
- all zones from the region
There is no way to set it differently in production, thus we probably shouldn't set it differently in tests too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. I have removed this method and replaced the one use with GetAllCurrentZones
// TODO: Caching, but this is currently only called when we are creating a volume, | ||
// which is a relatively infrequent operation, and this is only # zones API calls | ||
// GetAllCurrentZones returns all the zones in which nodes are currently running | ||
func (gce *GCECloud) GetAllCurrentZones() (sets.String, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a TODO to add some caching to it?
If there will be too many calls to it, it may overload apiaserver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, we would like to make it watch-based.
All comments addressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor comment. Also please squash commits
Other than that LGTM
@@ -421,8 +421,8 @@ var _ = SIGDescribe("Dynamic Provisioning", func() { | |||
gceCloud, err := framework.GetGCECloud() | |||
Expect(err).NotTo(HaveOccurred()) | |||
|
|||
// Get all k8s managed zones | |||
managedZones, err = gceCloud.GetAllZones() | |||
// Get all k8s managed zones (in this case it is just the single zone) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's actually true. If you will be running this test for multizone clusters, these actually can be more than one zone.
Lots of unit test issues because of change to CreateDisk. Have to modify many of them so that may need to be looked at later. |
comment addressed. Modified unit tests and level of indirection for GetAllCurrentZones. Please review new commit changes. |
62a72e5
to
a24264e
Compare
@@ -37,7 +38,8 @@ func TestCreateDisk_Basic(t *testing.T) { | |||
/* Arrange */ | |||
gceProjectId := "test-project" | |||
gceRegion := "fake-region" | |||
fakeManager := newFakeManager(gceProjectId, gceRegion) | |||
zonesWithNodes := []string{"zone1"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the managedZones setting in these tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
many of the tests rely on "managed zones", especially the ones that delete instances. I can remove them on the ones that don't require doing a delete if you think that would be useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nah, I'm just trying to see if there's a way we can avoid duplicate definitions. Maybe can you change the managedZones to use zonesWithNodes?
a24264e
to
38fd65b
Compare
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: davidz627, msau42, saad-ali, wojtek-t Associated issue: 50115 The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
[MILESTONENOTIFIER] Milestone Pull Request Current @davidz627 @foxish @krousey @msau42 @saad-ali Note: If this pull request is not resolved or labeled as Pull Request Labels
|
/retest |
/retest Review the full test history for this PR. |
@davidz627: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
/retest Review the full test history for this PR. |
/test all [submit-queue is verifying that this PR is safe to merge] |
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here. |
Added cherrypick-candidate label - this should be cherrypicked to 1.7 and 1.8 branches. |
@davidz627 - I wanted to do automated cherrypick of it to 1.8 branch but it generated conflicts. Can you please create this cherrypick - I think it's better since you wrote this code. |
Thanks @davidz627 ! |
…322-upstream-release-1.8 Automatic merge from submit-queue. Automated cherry pick of #52322 upstream release 1.8 Automated cherry pick of #52322 upstream release 1.8 #52322: Fixes issue where PVCs using `standard` StorageClass create PDs in disks in wrong zone in multi-zone GKE clusters ```release-note Fix a bug in GCE multizonal clusters where PersistentVolumes were sometimes created in zones without nodes. ```
…322-upstream-release-1.7 Automatic merge from submit-queue. Automated cherry pick of #52322 upstream release 1.7 Cherry pick of #52322 on release-1.7. #52322: Fixes issue where PVCs using `standard` StorageClass create PDs in disks in wrong zone in multi-zone GKE clusters ```release-note Fix a bug in GCE multizonal clusters where PersistentVolumes were sometimes created in zones without nodes. ```
Commit found in the "release-1.7" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked. |
Fixes #50115
Changed GetAllZones to only get zones with nodes that are currently running (renamed to GetAllCurrentZones). Added E2E test to confirm this behavior.