[JUJU-484] Stop polling external cmr controller when it is removed #13646
For offers hosted on a different controller, the consuming side needs to poll that controller to keep up to date with any API address changes. However, we weren't stopping the poll operation when all consuming apps were removed, and if the external controller was killed, the polling would fill the logs with errors.
This PR adds ref counting so that we track whether an external controller record is still needed. Once all SAAS applications are removed, the external controller record is also deleted. A watcher in the polling worker then causes the worker to stop. To enable the ref counting, the SAAS application gets a new attribute set to the external controller UUID.
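The ref counting idea can be sketched roughly as below. This is a minimal in-memory illustration, not the code in this PR: the `controllerRefs` type and its `AddSAAS` and `RemoveSAAS` methods are hypothetical names, and the real counts are assumed to live in the controller database rather than in a map.

```go
package main

import "sync"

// controllerRefs is a hypothetical in-memory stand-in for the reference
// counts kept per external controller UUID. All names are illustrative.
type controllerRefs struct {
	mu     sync.Mutex
	counts map[string]int
}

func newControllerRefs() *controllerRefs {
	return &controllerRefs{counts: make(map[string]int)}
}

// AddSAAS records that one more SAAS application uses the controller.
func (r *controllerRefs) AddSAAS(controllerUUID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counts[controllerUUID]++
}

// RemoveSAAS drops one reference and reports whether the external
// controller record is now unreferenced and can be deleted.
func (r *controllerRefs) RemoveSAAS(controllerUUID string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	n, ok := r.counts[controllerUUID]
	if !ok {
		// Nothing was tracked for this controller, so nothing to delete.
		return false
	}
	if n <= 1 {
		delete(r.counts, controllerUUID)
		return true
	}
	r.counts[controllerUUID] = n - 1
	return false
}
```

In this sketch, `RemoveSAAS` returning true is the point at which the external controller record would be deleted, and that deletion is what the watcher in the polling worker would observe.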
An upgrade step is added to set the new external controller UUID attribute on any relevant SAAS applications, and to remove any orphaned external controller records.
In the polling worker, the external controller connection is now made asynchronously so that it cannot block the worker from shutting down if the external controller is removed while the connection is being made. Error handling is also fixed to properly account for not-found and similar errors.
Another fix is that getting the API addresses for the external controller incorrectly included an empty address if there was no public DNS name recorded.
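The shape of that fix might look like the following sketch, which only adds the public DNS name when one is recorded. The function name `apiAddresses` and its signature are assumptions for illustration, and placing the DNS name first is also an assumption rather than something this PR states.

```go
package main

// apiAddresses is a hypothetical helper that builds the address list for
// an external controller. The public DNS name is included only when it is
// non-empty, and empty entries in addrs are skipped as well.
func apiAddresses(publicDNSName string, addrs []string) []string {
	result := make([]string, 0, len(addrs)+1)
	if publicDNSName != "" {
		result = append(result, publicDNSName)
	}
	for _, a := range addrs {
		if a != "" {
			result = append(result, a)
		}
	}
	return result
}
```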
QA steps
- bootstrap two earlier 2.9 controllers with logging for juju.worker.externalcontrollerupdater set to DEBUG
- create a cross controller relation
- upgrade the controllers
- use mongo shell to check that the source-controller-uuid is set for the remote application record on the consuming side
- on the consuming side, run juju remove-saas to delete the consuming application
- check log for "stopping watcher for external controller ..."
- run juju consume to consume the offer again
- check log for "starting watcher for external controller ..."
Bug reference
https://bugs.launchpad.net/juju/+bug/1958446