Infinite loop on canary deployment using Gateway API and EKS #1732
Comments
Can you please try Flagger 1.39? We fixed a drift detection problem for Gateway API.
Thanks @stefanprodan. I have now upgraded but unfortunately still seem to have the same issue:
Any updates on this please, @stefanprodan? In terms of behaviour, it does appear to be a similar bug to the one that was fixed in 1.39.
Describe the bug
I'm attempting to use Flagger on AWS EKS with the Gateway API, using the AWS Gateway API Controller. I have followed the tutorial at https://docs.flagger.app/tutorials/gatewayapi-progressive-delivery, but when I trigger a canary deployment Flagger gets stuck in a loop: it starts the canary deployment, changes the HTTPRoute weights (in this case 10% to the canary, 90% to the primary), and then restarts the canary deployment. It never fails the canary, even after the progress deadline has passed. It doesn't appear to reach the rollout stage, as the logs don't indicate the webhook ever running. The pre-rollout check does run and succeed, but then it runs again on the next pass through the loop. As an experiment I also disabled all metric checks, although I don't think it even gets as far as running them. Looking at the traffic weights in AWS VPC Lattice, I can see them alternate between the 90%/10% split and, briefly, 100%/0% before going round the loop again.
I have also tried setting skipAnalysis to true, which successfully promoted the canary, so the problem seems to be something to do with the analysis stage itself.
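For readers unfamiliar with the setting, here is a minimal sketch of where `skipAnalysis` sits in a Flagger Canary resource. It is illustrative only: the resource names, namespace, port, provider version, and analysis values are placeholder assumptions loosely modelled on the linked tutorial, and it is not the configuration used in this report.

```yaml
# Illustrative sketch only: every name and value below is a placeholder,
# not the reporter's actual configuration.
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: podinfo          # hypothetical workload name
  namespace: test        # hypothetical namespace
spec:
  provider: gatewayapi:v1   # assumed provider version; older setups may use v1beta1
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  # With skipAnalysis set to true, Flagger promotes the canary
  # without running the analysis phase.
  skipAnalysis: true
  service:
    port: 9898              # hypothetical service port
    gatewayRefs:
      - name: gateway       # hypothetical Gateway name
        namespace: default
  analysis:
    interval: 1m
    threshold: 5
    maxWeight: 50
    stepWeight: 10          # a 10% step matches the 90%/10% split described above
```

With `skipAnalysis: false` (the default), the `analysis` block drives the weight shifts on the HTTPRoute, which is the stage where the loop described above occurs.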
My canary configuration is as follows:
Flagger logs (in debug mode):
Any ideas on what might be going wrong?
To Reproduce
Expected behavior
Canary rollout progresses and succeeds
Additional context