ANN: Optimize host-side refine #1651
Thanks Artem for the PR, it looks good to me (apart from the issue with the unnecessary extra compile flag)!
On the performance note: the new code gives a ridiculous speedup of up to 500x-1000x for small batches (n_query = 1); I'm not sure why. Also, one cannot rely on the perf reporting in the current
/merge
Prior to this change, raft's host-side implementation of the raft::neighbors::refine operation used a suboptimal OpenMP thread configuration by default: it spawned as many threads as there are available cores, even when only one thread could do useful work (per-query parallelism with a batch size of one). This change fixes that and adds a few optimizations alongside:
- tree-optimize compilation flag, in the hope that the compiler does the vectorization
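
To illustrate the thread-config fix described above, here is a minimal sketch of capping the OpenMP team size at the batch size. It is not the code from this PR: `clamp_threads`, `default_threads`, and `refine_top1` are hypothetical names, and the refinement is reduced to a toy top-1 search by squared L2 distance. The only OpenMP features assumed are `omp_get_max_threads()` and the `num_threads` clause; without OpenMP the pragma is ignored and the loop runs serially.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not raft's API): cap the OpenMP team size at the
// number of queries, since per-query parallelism cannot use more threads.
inline int clamp_threads(std::size_t n_queries, int max_threads)
{
  const std::size_t cap = static_cast<std::size_t>(std::max(max_threads, 1));
  return static_cast<int>(std::max<std::size_t>(1, std::min(n_queries, cap)));
}

// Threads OpenMP would use by default; 1 when compiled without OpenMP.
inline int default_threads()
{
#ifdef _OPENMP
  return omp_get_max_threads();
#else
  return 1;
#endif
}

// Toy host-side refinement: for each query, return the dataset row id of the
// closest candidate by squared L2 distance. `candidates` stores n_cand row
// ids per query, laid out query after query.
inline std::vector<int> refine_top1(const std::vector<float>& dataset,
                                    const std::vector<float>& queries,
                                    const std::vector<int>& candidates,
                                    std::size_t dim,
                                    std::size_t n_cand)
{
  assert(dim > 0 && n_cand > 0);
  const std::size_t n_queries = queries.size() / dim;
  std::vector<int> best(n_queries, -1);
  const int n_threads = clamp_threads(n_queries, default_threads());
  (void)n_threads;  // silences "unused" warnings when built without OpenMP

  // With n_queries == 1 this starts a team of one thread instead of one
  // thread per core.
#pragma omp parallel for num_threads(n_threads)
  for (long q = 0; q < static_cast<long>(n_queries); ++q) {
    const std::size_t qi = static_cast<std::size_t>(q);
    float best_dist = std::numeric_limits<float>::max();
    for (std::size_t c = 0; c < n_cand; ++c) {
      const int row = candidates[qi * n_cand + c];
      const std::size_t ri = static_cast<std::size_t>(row);
      float dist = 0.0f;
      for (std::size_t d = 0; d < dim; ++d) {
        const float diff = queries[qi * dim + d] - dataset[ri * dim + d];
        dist += diff * diff;
      }
      if (dist < best_dist) {
        best_dist = dist;
        best[qi] = row;
      }
    }
  }
  return best;
}
```

With a single query, `clamp_threads(1, default_threads())` yields 1, so no idle threads are started; whether that alone explains the speedup reported in the review is not something this sketch can show.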