core/state: lock-free subfetcher
#30782
Conversation
I'm leaning towards rejecting this PR.
Aside from the pretty large change in semantics, I also don't like the nested selects. IMO it's un-intuitive and hard to reason about.
```diff
-		// Wake signal sent
-	default:
-		// Wake signal not sent as a previous one is already queued
+	case sf.tasks <- tasks:
```
This is now blocking. Previously, any number of threads could append to `sf.tasks`, and they could all exit afterwards, since the select was not blocking. Now, instead, they all allocate their own slice of tasks, and each thread needs to wait until it has been delivered over the `sf.tasks` channel.

This seems to be quite a big change in semantics, yet not something you mentioned in the PR description. I'm guessing it was unintentional. Unless I have misunderstood something, I think this highlights why we should reject this PR: it touches some very intricate parts of geth, which currently work fine. It's very easy to slip when making changes here.
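A minimal sketch of the two send patterns being contrasted (illustrative names and a throwaway receiver, not the actual geth code): the old non-blocking notify can drop the signal, while a plain channel send blocks the caller until the receiver takes it.

```go
package main

func main() {
	// Illustrative stand-ins for the subfetcher fields.
	wake := make(chan struct{}, 1) // old: buffered wake signal
	tasks := make(chan [][]byte)   // new: unbuffered task hand-off

	// Old pattern: never blocks; the signal is dropped if one is already queued.
	select {
	case wake <- struct{}{}:
		// Wake signal sent.
	default:
		// Wake signal not sent; a previous one is already queued.
	}

	// New pattern: blocks the caller until some receiver takes the slice.
	// (A throwaway goroutine receives it here so the example terminates.)
	go func() { <-tasks }()
	tasks <- [][]byte{[]byte("key")}
}
```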
I've responded to this in the main conversation thread.
I'm gonna put this back into draft while I try to address your concerns, @holiman. If I don't think I can get to something cleaner then I'll close it.
Thank you for the thorough review and for the pushback, @holiman; both are appreciated and made me completely rethink this. I won't object if you still want to reject the PR based on the slippery terrain, but I at least wanted to offer something better. There's now only a small (non-blocking) "fan-in" goroutine and a channel.

I'll move this out of draft once I've thought it through a little more, but if you still want to close the PR then please do, so I don't waste time.
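Since the branch itself isn't quoted here, the following is only a guess at the general shape of such a fan-in helper (hypothetical names, not the actual code): a single goroutine accepts task slices, so senders only ever wait for the fan-in goroutine itself rather than for the worker, and the accumulated batch is offered to the worker whenever something is pending.

```go
// fanIn is a hypothetical sketch: it buffers incoming task slices and
// forwards the accumulated batch to the worker when one is available.
func fanIn(in <-chan [][]byte, out chan<- [][]byte) {
	var pending [][]byte
	for {
		// Only offer a batch when there is something pending; a nil
		// channel disables that select case otherwise.
		var deliver chan<- [][]byte
		if len(pending) > 0 {
			deliver = out
		}
		select {
		case keys, ok := <-in:
			if !ok {
				return
			}
			pending = append(pending, keys...)
		case deliver <- pending:
			pending = nil
		}
	}
}
```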
Removed
Yes, it was.
This is effectively the same thing, as both wait on `<-sf.term`, but it makes for a clearer test that demonstrates the bug, because `terminate(false)` is documented as blocking until "all disk loads finish".
While redesigning following @holiman's review, I found a race condition in the old implementation that this also fixes. It can be demonstrated by running the new test.
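For context, here is a heavily simplified reconstruction of the hazard being described (hypothetical code, not the geth implementation or the actual test): `schedule` appends under a lock and sends a non-blocking wake, so the hand-off of tasks is guaranteed, but their receipt by the loop is not, and a concurrently completing terminate can win the race.

```go
package main

import (
	"fmt"
	"sync"
)

// fetcher is a hypothetical, simplified sketch of the old wake-channel pattern.
type fetcher struct {
	lock  sync.Mutex
	queue [][]byte
	wake  chan struct{} // capacity 1; excess signals are dropped
	stop  chan struct{}
	term  chan struct{}
}

func newFetcher() *fetcher {
	f := &fetcher{
		wake: make(chan struct{}, 1),
		stop: make(chan struct{}),
		term: make(chan struct{}),
	}
	go f.loop()
	return f
}

func (f *fetcher) schedule(keys [][]byte) {
	f.lock.Lock()
	f.queue = append(f.queue, keys...)
	f.lock.Unlock()
	select {
	case f.wake <- struct{}{}: // wake signal sent
	default: // dropped: a previous signal is already queued
	}
}

// terminate is intended to block until all scheduled work has been processed.
func (f *fetcher) terminate() {
	close(f.stop)
	<-f.term
}

func (f *fetcher) loop() {
	defer close(f.term)
	for {
		select {
		case <-f.wake:
			f.lock.Lock()
			batch := f.queue
			f.queue = nil
			f.lock.Unlock()
			fmt.Println("processed", len(batch), "tasks")
		case <-f.stop:
			// If a wake is still pending, select may pick this case instead,
			// so tasks that were handed off can remain unprocessed even
			// though schedule returned successfully.
			return
		}
	}
}

func main() {
	f := newFetcher()
	f.schedule([][]byte{[]byte("a")})
	f.schedule([][]byte{[]byte("b")}) // hand-off is guaranteed, receipt is not
	f.terminate()                     // may return before anything was processed
}
```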
IIUC, this race happens if new tasks are added while we're closing the thing. However, as far as I know, we add tasks while executing transactions in the block, and only close it after we're done with the execution. So, from a unit-testing perspective, it's a race, but given the context it's being used in, it's fine. Does this sound correct?
@holiman I have rewritten the implementation a bit and the nested select has been removed: https://github.com/rjl493456442/go-ethereum/tree/arr4n/lockless-trie-subfetcher

I am a bit hesitant. You are right that we will never invoke schedule and terminate concurrently. I would like to get some preliminary feedback about my changes first and then decide whether changing the code is worthwhile.
Yup, since you both think that scheduling and termination won't coincide, I agree that the race is theoretical. The test can definitely be excluded; however, I do think the rest of the changes benefit the code w.r.t. clarity and general correctness of implementation vs expectation (that scheduled work is performed and that `terminate(false)` waits for all disk loads to finish).
This is what motivated me to try to find something cleaner. My first attempt wasn't great, but I think the latest version (sans test) removes as much complexity as possible.
@rjl493456442 I see that your changes were on top of my rewrite that also removed the nested select. Was there something in particular that you wanted to improve on?
I find it a little difficult to reason about the replacement and checking of the `done` channel.
Yep, nested select looks a bit hacky to me.
Yes, it's true: at most one batch is in flight at a time. The `done` channel is used so that task buffering is not blocked even while processing is running.
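Purely as a guess at the shape being described (hypothetical names, not the actual branch): the buffering side keeps accepting task slices and only hands the next batch to the worker once a `done` signal says the previous run has finished, so scheduling is never blocked behind processing.

```go
// buffer is a hypothetical sketch of how a done channel can let task
// buffering continue while a batch is being processed.
func buffer(in <-chan [][]byte, start chan<- [][]byte, done <-chan struct{}) {
	var pending [][]byte
	idle := true // true when no batch is currently being processed
	for {
		// Offer a batch only when the worker is idle and work exists;
		// a nil channel disables that select case otherwise.
		var deliver chan<- [][]byte
		if idle && len(pending) > 0 {
			deliver = start
		}
		select {
		case keys, ok := <-in:
			if !ok {
				return
			}
			pending = append(pending, keys...)
		case deliver <- pending:
			pending = nil
			idle = false
		case <-done:
			idle = true
		}
	}
}
```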
So, let's imagine we invoke …

As far as I can tell from the current state of this PR, in @ARR4N's version, …

So IMO that exhibits two behavioural changes that seem to be unintentional side effects. Neither of those behavioural changes is good, IMO. As for @rjl493456442's changes, …
IMO the changes in this PR are not acceptable, so I'll close it. The changes in @rjl493456442's version are, I think, possibly acceptable. I'd rather discuss those in a separate PR, if you want to pursue it.
Thanks for considering it and for the very thorough reviews.
I think you might have mixed up the original nested version with the updated, non-nested one. I'm happy with @holiman's decision given the rationale, so I'm sharing this only in case you want to see an alternate approach, @rjl493456442.
I think he referred to this; such towers are pretty non-intuitive, IMO.
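As an illustration only (this is not the snippet that was quoted, just a hypothetical example of the shape under discussion), a nested-select "tower" tends to look like this, and every added level multiplies the interleavings a reader has to keep in mind:

```go
// forward is a hypothetical example of a nested-select tower; not code
// from this PR.
func forward(in <-chan []byte, out chan<- []byte, quit <-chan struct{}) {
	for {
		select {
		case v := <-in:
			// Try a non-blocking delivery first...
			select {
			case out <- v:
				// Delivered immediately.
			default:
				// ...and if that fails, wait, but stay responsive to quit.
				select {
				case out <- v:
				case <-quit:
					return
				}
			}
		case <-quit:
			return
		}
	}
}
```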
Trie prefetching with a `state.subfetcher` previously relied on a pattern of {lock, set tasks, unlock, send void on `wake` channel}, which I've collapsed into a single channel send. This allows the lock to be removed entirely.

The old implementation had a subtle race condition between `schedule()` and `terminate()` (demonstrated in a test), which is fixed by this new pattern. There was a guarantee of task handoff but not of receipt by `loop()`, because of the `wake` channel's buffering and deliberate dropping of excess sends.

The new `tasks` channel is never closed, just like the old `wake` one wasn't. If that needs to be addressed, I think it's best done in another PR.

I know that #30629 made significant allocation improvements and I don't want to inadvertently make things worse, but I can't find the benchmarks that were used there. Happy to run them if someone can please point me in the right direction.

Note: The code changed significantly after the first review.

See the Go Blog post "Share Memory By Communicating" for background.
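To make the description above concrete, here is a minimal sketch of the lock-free shape it describes, assuming a hypothetical subfetcher-like type (this is not the PR diff itself): scheduling hands the key slice over a channel and the loop owns all task state, so no mutex is needed, in the spirit of "Share Memory By Communicating".

```go
// subfetcher is a hypothetical sketch, not the actual geth type.
type subfetcher struct {
	tasks chan [][]byte // replaces {lock, tasks slice, wake channel}
	stop  chan struct{}
	term  chan struct{}
}

// schedule hands the keys to the loop in a single channel operation.
func (sf *subfetcher) schedule(keys [][]byte) {
	select {
	case sf.tasks <- keys:
	case <-sf.stop:
		// The loop is shutting down; drop the request.
	}
}

// loop owns the task queue outright; no other goroutine touches it.
func (sf *subfetcher) loop(load func(key []byte)) {
	defer close(sf.term)
	for {
		select {
		case keys := <-sf.tasks:
			for _, key := range keys {
				load(key)
			}
		case <-sf.stop:
			return
		}
	}
}
```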