-
Notifications
You must be signed in to change notification settings - Fork 510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Handle an interesting race condition (bug #1747367) #8385
Conversation
The static analysis showed that this might be a problem, and we're able to set up a case where it really does fail. Essentially, we have one loop which waits to try and send a notification of an event to users. While waiting for that event to be sent, it allows new requests to be made. So if we start watching for an event, but never handle when it triggers, and then queue up a bunch more watch events that cause us to reallocate the event slice, and *then* unwatch the original event, we will never notice that the original event is no longer watched because we'll be hung on an object that doesn't get updated.
!!build!! |
49500bf
to
ba26f5c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a subtle one.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! LGTM with one minor thought.
@@ -353,6 +353,11 @@ func (w *Watcher) flush() { | |||
return | |||
case req := <-w.request: | |||
w.handle(req) | |||
// handle may append to the w.pending array, and/or it may unwatch something that was previously pending |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ISTM that we're already handling the unwatch case (that's the e.ch != nil
loop condition) but that the real issue here is the append, because the append can cause a realloc and thus e might be pointing to the old slice and hence be invalid.
I think I'd probably program this a little more defensively;
for {
// Note: w.pending can be appended to by w.handle so we can't
// safely keep the same e between iterations.
e := &w.pending[i]
if e.ch == nil {
break
}
select {
etc
?
Description of change
The static analysis showed that this might be a problem, and we're able to set
up a case where it really does fail.
Essentially, we have one loop which waits to try and send a notification of an
event to users. While waiting for that event to be sent, it allows new requests
to be made. So if we start watching for an event, but never handle when it
triggers, and then queue up a bunch more watch events that cause us to
reallocate the event slice, and then unwatch the original event, we will
never notice that the original event is no longer watched because we'll be hung
on an object that doesn't get updated.
QA steps
The test properly handles the error conditions. If you uncomment that one line in presence.go, it properly fails the test case (and fails after waiting LongWait).
Documentation changes
None.
Bug reference
lp:1747367