-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support uint64_t in CAGRA index data type #1514
Conversation
Currently, some unit tests fail because recalls of them are smaller than the threshold. |
TODO:
|
Fixed the low recall problem and did the to-do things but found that this update is not necessary to implement the RAFT RNN benchmark for CAGRA (:cry:). However, I will proceed with this PR to ensure the completeness of the index data type. |
@tfeher Can you check the code and let me know if there are any problems? |
Thanks @enp1s0 for this PR! I think it is very useful to update the code to work with other index type (even if we would mainly use with 32bit indices). I will have a look at the changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @enp1s0 for this PR. It is great to have the freedom to customize the index type paramater. I am somewhat concerned about the test size and compile time maybe it would be sufficient to test limited template type combinations in int64 mode? (In practice it is expected that uint32_t
would be sufficient in most of the cases.)
cpp/include/raft/neighbors/detail/cagra/search_multi_kernel.cuh
Outdated
Show resolved
Hide resolved
cpp/test/CMakeLists.txt
Outdated
test/neighbors/ann_cagra/test_float_uint64_t.cu | ||
test/neighbors/ann_cagra/test_int8_t_uint64_t.cu | ||
test/neighbors/ann_cagra/test_uint8_t_uint64_t.cu |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding these effectively doubles our test binary size:
File | Compile time | Size |
---|---|---|
test_float_uint32_t.cu.o | 43:55 min | 106.657 MB |
test_float_uint64_t.cu.o | 42:24 min | 98.597 MB |
test_uint8_t_uint32_t.cu.o | 41:31 min | 117.630 MB |
test_int8_t_uint32_t.cu.o | 41:29 min | 116.698 MB |
test_int8_t_uint64_t.cu.o | 40:04 min | 106.032 MB |
test_uint8_t_uint64_t.cu.o | 39:56 min | 107.312 MB |
See #1499 (comment) for potential impact on packages.
(Note that compile time is improved by #1428 by enabling parallel build for different template parameters, but binary size is not changed there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed test_int8_t_uint64_t
and test_uint8_t_uint64_t
Thanks @tfeher for reviewing the changes.
Yes, I agree with you. I'll remove the ( |
@tfeher I have a question about supporting signed 64-bit int ( While it is possible to modify the kernels to support
This implementation should not have any disadvantages because, if the data type is signed, the index is always positive, and the range of representation for unsigned types is larger than that of signed types. What do you think about it? |
This sounds good to me! |
@tfeher I have fixed the codes! Can you look at them again? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @enp1s0 for the updates, the PR looks good to me!
/merge |
This PR updates CAGRA::search to support
uint64_t
in the index data type. This update is required to implement the RAFT ANN benchmark for CAGRA.