fix azure retry issue when return 2XX with error #78298
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andyzhangx The full list of commands accepted by this bot can be found here. The pull request process is described here
fix comments
95497e0 to 8a45ba1
BTW, this is a fundamental fix for error retry on all ARM calls made by the Azure cloud provider, although I found the issue on VMSS. We should not treat a 200 return code as a successful operation, since the response may still carry an error code, per https://github.com/Azure/azure-resource-manager-rpc/blob/master/v1.0/Addendum.md#creating-or-updating-resources and as confirmed by the ARM team.
/test pull-kubernetes-e2e-gce
/hold cancel
/test pull-kubernetes-e2e-aks-engine-azure
pull-kubernetes-e2e-aks-engine-azure is failing because of Azure/aks-engine#1368.
/lgtm
/skip pull-kubernetes-e2e-aks-engine-azure
/retest Review the full test history for this PR. Silence the bot with an |
@andyzhangx: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
…8298-upstream-release-1.13 Automated cherry pick of #78298: fix azure retry issue when return 2XX with error
…8298-upstream-release-1.14 Automated cherry pick of #78298: fix azure retry issue when return 2XX with error
…8298-upstream-release-1.12 Automated cherry pick of #78298: fix azure retry issue when return 2XX with error
What type of PR is this?
/kind bug
What this PR does / why we need it:
With this PR, when an Azure API call returns <200, error>, the result is treated as an error and the retry continues.
As we saw in the error case, and as confirmed with the VMSS team, a 2XX return code does not mean the operation succeeded (doc: https://github.com/Azure/azure-resource-manager-rpc/blob/master/v1.0/Addendum.md#creating-or-updating-resources). The response can still carry an error condition, e.g.
httpStatusCode 200
resultCode NetworkingInternalOperationError
Which issue(s) this PR fixes:
Fixes #78172
Special notes for your reviewer:
Note: this code change affects all error retries, including the following places:
Does this PR introduce a user-facing change?:
/hold
Let's hold until the VMSS team gives final confirmation and the code review is done.
/kind bug
/assign @feiskyer
/priority important-soon
/sig azure