Handle the case where the state pool returns NotFound error. #10321

Merged: 4 commits, Jun 13, 2019

howbazaar

This happens if a model is dying and being removed from the pool, but isn't yet removed.
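
As a rough illustration of the idea, the sketch below treats a "not found" result from the pool as a signal to drop the entity instead of as a fatal error. The `pool`, `store`, `refresh`, and `errNotFound` names are hypothetical stand-ins, not the Juju types touched by this PR, and the real code uses Juju's own errors package rather than the standard library sentinel shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel standing in for the pool's
// "not found" error.
var errNotFound = errors.New("model not found")

// pool is a hypothetical stand-in for the state pool.
type pool struct {
	models map[string]string // model UUID -> model name
}

// get returns the model name, or an error wrapping errNotFound when the
// model is no longer available from the pool.
func (p *pool) get(uuid string) (string, error) {
	name, ok := p.models[uuid]
	if !ok {
		return "", fmt.Errorf("getting model %q: %w", uuid, errNotFound)
	}
	return name, nil
}

// store is a hypothetical stand-in for the all-watcher's entity store.
type store struct {
	entities map[string]bool
}

// refresh records the entity if its model is available. If the pool
// reports "not found", the entity is removed and no error is returned,
// so the caller keeps running.
func refresh(p *pool, s *store, uuid, id string) error {
	_, err := p.get(uuid)
	if errors.Is(err, errNotFound) {
		// The entity's model is gone, so remove the entity from the store.
		delete(s.entities, id)
		return nil
	}
	if err != nil {
		return err
	}
	s.entities[id] = true
	return nil
}

func main() {
	p := &pool{models: map[string]string{"uuid-1": "default"}}
	s := &store{entities: map[string]bool{}}

	fmt.Println(refresh(p, s, "uuid-1", "machine-0"), s.entities)

	// Simulate the model being destroyed.
	delete(p.models, "uuid-1")
	fmt.Println(refresh(p, s, "uuid-1", "machine-0"), s.entities)
}
```

The point of the sketch is only the branch on the "not found" case: any other error is still returned to the caller.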

Bug reference

Partial fix for https://bugs.launchpad.net/juju/+bug/1832294

@babbageclunk babbageclunk left a comment

Nice catch @manadart!

@@ -3889,7 +3909,7 @@ func (tw *testWatcher) AssertNoChange(c *gc.C) {
 func (tw *testWatcher) AssertChanges(c *gc.C, expected int) {
 	var count int
 	tw.st.StartSync()
-	maxWait := tw.st.clock().After(testing.LongWait)
+	maxWait := time.After(testing.LongWait)

Ha, oops!

// The entity's model is gone so remove the entity
// from the store.
doc.removed(all, modelUUID, id, nil)
// The state pool will return a not found error if the model is

Oof, that's a nasty place to miss a not.

jujubot added a commit that referenced this pull request Jun 13, 2019
#10322

If the watcher backing the cache worker is shut down cleanly, that worker exits; for any other error, a new watcher is created and the worker proceeds.

This branch includes the same patch as #10321 and supersedes #10319.

The code is all taken from #10319, with a test added for the allwatcher and a drive-by fix for a max wait time. From the previous PR:

This patch does 2 main things:
 - If a model is removed, it prevents a running all-watcher from closing.
 - If the watcher backing the cache worker is shut down cleanly, that worker exits; for any other error, a new watcher is created and the worker proceeds.

## QA steps

Reproduce:
 - Bootstrap with Juju 2.6.3.
 - Run `juju debug-log -m controller` in a separate terminal.
 - Run `juju destroy-model default -y` and observe the "model not found" errors.

Confirm:
 - Repeat the steps above with this patch and observe that the cache does not bounce and the "not found" errors are gone.

## Bug reference

https://bugs.launchpad.net/juju/+bug/1832294
@manadart

$$merge$$

@jujubot jujubot merged commit bbb4db7 into juju:2.5 Jun 13, 2019