-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cagra template instantiations #1650
Cagra template instantiations #1650
Conversation
@@ -319,6 +319,14 @@ if(BUILD_TESTS) | |||
test/neighbors/ann_cagra/test_int8_t_uint32_t.cu | |||
test/neighbors/ann_cagra/test_uint8_t_uint32_t.cu | |||
test/neighbors/ann_cagra/test_float_int64_t.cu | |||
src/neighbors/detail/cagra/search_multi_cta_float_uint64_dim128_t8.cu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: to libraft.so we add precompiled cagra kernels for 32-bit index type only. We have test for 64-bit as well, therefore we add extra object files for the tests here.
d3300bd
to
816bc25
Compare
816bc25
to
5e74e7e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks good to me! Just want to make sure the compile time decrease/binary size increase is roughly equivalent to #1428 before I approve this since we add another templated function in this PR.
This PR adds 150 MB to libraft.so. Status before the PR:
After the PR
Changes in Cagra prims bench are not yet measured, expected to be similar to cagra ann bench. |
The binary size will be reduced once #1459 is solved. Here is a list of added compilation units:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good enough in its current state to approve for 23.08. I do think we should continue to work on the binary size and build times, but that's also a longer-term need.
#include "../ann_cagra.cuh" | ||
#include "search_kernel_uint64_t.cuh" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are already testing for uint32_t
here but int64_t
has become the "standard" indexing type across RAFT's ANN algos. Could we test int64_t
here just so we're covering both unsigned and signed types here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I can make the change in a follow up PR. Under the hood IdxT is converted to unsigned type, so it is only a matter of testing the outer wrapper layers.
/merge |
Cagra was introduced header only in #1375. This PR adds a precompiled single- and multi-cta search kernels to libraft.so.
The single- and multi-cta search kernels were moved to separate header files to make it easier to specify extern template instantiations for these.
The macros for dispatching the kernels were replaced by functions. We define explicit instantiations for the top level dispatch functions. (This is in contrast to #1428 where the kernels themselves were instantiated, which resulted in a large number of parameter combinations that had to be explicitly spelled out.)
This PR fixes #1443.