TPL and the case of the !@#(*@! hung process
So, there I was, writing some really fun code, when I found out that I was running into deadlocks. I activated emergency protocols and went into deep debugging mode.
After being really thorough in figuring out several possible causes, I was still left with what is effectively a WTF @!(*!@ DAMN !(@*#!@* YOU !@*!@( outburst and a sudden longing for something to repeatedly hit.
Eventually, however, I figured out what was going on.
I have the following method: Aggregator.AggregateAsync(), inside which we have a call to the PulseAll method. That method will then go and execute the following code:
    public void PulseAll()
    {
        Interlocked.Increment(ref state);
        TaskCompletionSource<object> result;
        while (waiters.TryDequeue(out result))
        {
            result.SetResult(null);
        }
    }
After that, I return from the method. In another piece of the code (Aggregator.Dispose) I am waiting for the task that is running the AggregateAsync method to complete.
Nothing worked! It took me a while before I figured out that I wanted to check the stack, where I found this:
Basically, I had a deadlock because when I called SetResult on the completion source (which freed the Dispose code to run), I actually switched over to that task and allowed it to run. Still on the same thread, but in a different task, I ran through the rest of the code and eventually got to Aggregator.Dispose(). Now, I could only get to it if the PulseAll() method was called. But, because we are on the same thread, the AggregateAsync task hasn’t completed yet, so Dispose ends up waiting for a task that cannot finish until Dispose itself returns!
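To make the paragraph above concrete, here is a minimal sketch of the underlying behavior. It is not the code from my project: InlineContinuationDemo and its methods are names made up for illustration, and it assumes a plain console app with no SynchronizationContext. It shows that a method awaiting a TaskCompletionSource resumes inside the SetResult call, on the caller's thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo type, not part of the original code.
public static class InlineContinuationDemo
{
    // Awaits the completion source and reports which thread resumed it.
    private static async Task<int> WaitAndGetThreadIdAsync(TaskCompletionSource<object> source)
    {
        await source.Task;
        return Thread.CurrentThread.ManagedThreadId;
    }

    // Returns true when the awaiting method resumed on the thread that called SetResult.
    public static bool ResumesOnCallerThread()
    {
        var source = new TaskCompletionSource<object>();
        Task<int> waiter = WaitAndGetThreadIdAsync(source);

        int caller = Thread.CurrentThread.ManagedThreadId;
        source.SetResult(null); // the awaiting method resumes inside this call

        return waiter.Result == caller;
    }

    public static void Main()
    {
        Console.WriteLine(ResumesOnCallerThread());
    }
}
```

Under those assumptions this should print True: the "other task" borrowed the thread that called SetResult. Later versions of .NET (Framework 4.6 and up) let you construct the source with TaskCreationOptions.RunContinuationsAsynchronously, which pushes the continuation elsewhere instead of running it inline.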
In the end, I “solved” that by introducing a DisposeAsync() method, which allowed us to yield the thread, and then the AggregateAsync task was completed, and then we could move on.
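Roughly, the shape of that workaround looks like the sketch below. This is a stripped-down stand-in and not the real Aggregator: Start, WaitForPulseAsync, and the Disposed flag are hypothetical, and the "work" is reduced to a single pulse. The point is only that DisposeAsync awaits the aggregate task instead of blocking on it.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the Aggregator described in the post.
public class Aggregator
{
    private readonly TaskCompletionSource<object> waiter = new TaskCompletionSource<object>();
    private Task aggregateTask;

    public bool Disposed { get; private set; }

    public Task WaitForPulseAsync() => waiter.Task;

    public void Start()
    {
        aggregateTask = AggregateAsync();
    }

    private async Task AggregateAsync()
    {
        await Task.Yield();     // continue on another thread
        waiter.SetResult(null); // like PulseAll: waiters may resume inline, right here
    }

    public async Task DisposeAsync()
    {
        // If we got here inline from SetResult, aggregateTask is not complete yet.
        // Awaiting yields the thread, so AggregateAsync can return and finish.
        await aggregateTask;
        Disposed = true;        // real cleanup (file handles, etc.) would go here
    }
}

public static class Program
{
    public static async Task Main()
    {
        var aggregator = new Aggregator();
        aggregator.Start();
        await aggregator.WaitForPulseAsync(); // may resume inside SetResult
        await aggregator.DisposeAsync();      // a blocking Wait() at this point is what hung
        Console.WriteLine(aggregator.Disposed);
    }
}
```

Swapping the await in DisposeAsync for aggregateTask.Wait() brings the hang back whenever the caller was resumed inline from SetResult, which is exactly the situation I was in.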
But I am really not very happy about this. Any ideas about the proper way to handle async & IDisposable?
Comments
Is it an option to force something to be on a dedicated thread? TaskCreationOptions.LongRunning would do that.
Ashic, But I don't really want that. It would force a complete new thread to be created, something quite expensive.
Judging by the code you posted, I see no compelling reason to have Dispose wait for the end of the background task, so I would simply remove _bg.Wait(). It may force you to do some sanity checks on AddWork to prevent anyone from posting work to a disposed Runner.
For this type of pattern (AsyncRunner/Background worker) I recommend that dispose simply post a 'dispose' task/request and be done with it.
When you venture into the async world you must give up any desire to have controlled lifespan :-)
One approach I quite often use is to have a single dedicated thread spinning on an in-memory queue (think ring buffer style) that carries out things on "a separate thread". You don't have one per item, just one long-running separate thread.
Cyrille, Assume that Dispose needs to actually dispose stuff, like close a file handle.
Ashic, That is a good idea, thanks.
Seems like a bit of re-inventing the wheel here. Why not just use an Rx Subject as your queue? Not sure what the point of Runner is, but you could pretty much get rid of it completely depending...
Otherwise, here's a gist of something that works without nearly as much mess: https://gist.github.com/OniBait/5807457
Don't use locks in TPL tasks. They just aren't designed to be used that way.
TPL Dataflow may have a solution for you. But instead of sending a pulse you use messages to tell different components to wake.
Another option is just going to normal background threads, raw or with an actor model.
Jeremy, Because that code is there merely to demonstrate the problem.
Let me rephrase my proposal then:
Move any cleanup code at the end of Background() (out of the while loop) and remove the _bg.Wait().
Cyrille, That doesn't work if we need to wait until the resources are disposed and then use them again. For example, we may want to delete the directory, but can't do that while files in it are still open.