Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add eps-neighbor search via RBC #2028

Merged
merged 30 commits into from
Jan 24, 2024
Merged

Conversation

mfoerste4
Copy link
Collaborator

@mfoerste4 mfoerste4 commented Nov 29, 2023

This PR adds support for an eps-neighborhood search based on random ball cover. The algorithm re-uses the existing rbc index creation.

Changes:

  • add C++ API raft::neighbors::ball_cover::epsUnexpL2NeighborhoodRbc
  • lifted the 2/3-D limitation for eps-neighborhod via RBC (limitation still in place for k-nn queries)
  • pylibraft support for dense brute-force eps-neighborhood
  • pylibraft support for sparse/dense rbc epsneighborhood

Note: The PR also contains a fix for the vertex degree computation in the brute force algorithm spatial::knn::detail::epsUnexpL2SqNeighborhood.

Related to #1984 and #517

@mfoerste4 mfoerste4 requested a review from a team as a code owner November 29, 2023 22:35
@github-actions github-actions bot added the cpp label Nov 29, 2023
@mfoerste4
Copy link
Collaborator Author

FYI, @cjnolet , @tfeher

@cjnolet
Copy link
Member

cjnolet commented Nov 30, 2023

lifted the 2/3-D limitation for RBC (only valid for k-nn)

This is awesome! So can we support arbitrarily large dimensions now or are we still limited to what can be fused into the knn with k-selection?

(I admit I haven't skimmed the whole PR yet)

@mfoerste4
Copy link
Collaborator Author

lifted the 2/3-D limitation for RBC (only valid for k-nn)

This is awesome! So can we support arbitrarily large dimensions now or are we still limited to what can be fused into the knn with k-selection?

(I admit I haven't skimmed the whole PR yet)

Sorry, my formulation was a bit unclear. 2/3-D limitation still holds for k-nn queries, but it is removed for eps-nn.

@mfoerste4 mfoerste4 changed the title [WIP] Add eps-neighbor search via RBC Add eps-neighbor search via RBC Dec 27, 2023
@mfoerste4
Copy link
Collaborator Author

@cjnolet , I believe this is ready for review now.

@cjnolet cjnolet added feature request New feature or request non-breaking Non-breaking change labels Dec 28, 2023
Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these changes @mfoerste4. This is a large PR, though, so I'm going to take a couple different passes. I went through the general API hanges first. I'll look through the algorithmic changes more closely but I wanted to provide some initial feedback before it gets too late for 24.02.

@@ -67,6 +67,28 @@ void knn_query(raft::resources const& handle,
bool perform_post_filtering = true,
float weight = 1.0) RAFT_EXPLICIT;

template <typename idx_t, typename value_t, typename int_t, typename matrix_idx_t>
void epsUnexpL2NeighborhoodRbc(raft::resources const& handle,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have an opportunity here to choose a nicer and more user-friendly name. I'd prefer something like "eps_nn" and we use knn to designate the k-nearest neighbors, since we are already in the ball_cover namespace. Please also use mdspan-based APIs for these. The pointer-based functions are deprecated in the public APIs, so we shouldn't be introducing any new ones.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I have changed the API accordingly.

cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft_runtime/neighbors/eps_neighborhood.hpp Outdated Show resolved Hide resolved
cpp/src/neighbors/ball_cover.cu Outdated Show resolved Hide resolved
cpp/include/raft/neighbors/ball_cover-ext.cuh Outdated Show resolved Hide resolved
cpp/include/raft/neighbors/ball_cover-inl.cuh Outdated Show resolved Hide resolved
cpp/include/raft/neighbors/ball_cover-inl.cuh Outdated Show resolved Hide resolved
template <typename value_idx,
typename value_t,
typename value_int = std::uint32_t,
typename matrix_idx = std::uint32_t>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too wrt new template args. If we're defaulting it to be the same as value_int, can we just reuse it?

cpp/include/raft/spatial/knn/detail/haversine_distance.cuh Outdated Show resolved Hide resolved
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Malte for this work! The main issue that I see actually coincides with what @cjnolet has already pointed out: we would like to simplify the template parameters and reduce the number of instantiations.

I have started to play with the idea of keeping all idx in int64_t in a separate branch. This commit fixes the compilation issues that we have in the CI, but some issues still remain with the test.

I am still looking into some of the implementation details, I will get back to you soon.

cpp/src/neighbors/ball_cover.cu Outdated Show resolved Hide resolved
cpp/include/raft/neighbors/ball_cover-inl.cuh Outdated Show resolved Hide resolved
cpp/include/raft/neighbors/ball_cover-ext.cuh Outdated Show resolved Hide resolved
@mfoerste4
Copy link
Collaborator Author

@cjnolet , @tfeher , thanks for your review. I finally managed to get rid of the #undef RAFT_EXPLICIT_INSTANTIATE_ONLY. Please have a look at the changes based on your suggestions. @cjnolet , please comment on the pylibraft API here so I can adjust it accordingly.

@mfoerste4 mfoerste4 requested review from cjnolet and tfeher January 19, 2024 08:03
@mfoerste4
Copy link
Collaborator Author

@cjnolet , I have removed all double precision instantiations to reduce compile time and binary size. The double-path was only partly used via cuml-dbscan but caused a 2x factor for all major rbc and knn instances.

Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mfoerste4 for the update, looks good now, just a few small things:

cpp/include/raft/neighbors/ball_cover-inl.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/include/raft/spatial/knn/detail/ball_cover.cuh Outdated Show resolved Hide resolved
cpp/test/neighbors/epsilon_neighborhood.cu Outdated Show resolved Hide resolved
cpp/test/neighbors/epsilon_neighborhood.cu Outdated Show resolved Hide resolved
@mfoerste4 mfoerste4 requested a review from tfeher January 23, 2024 12:24
Copy link
Contributor

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mfoerste4 for the update, the PR looks good to me!

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR LGTM! Thanks again @mfoerste4!

@cjnolet
Copy link
Member

cjnolet commented Jan 23, 2024

/merge

@mfoerste4
Copy link
Collaborator Author

@cjnolet , sorry for the last minute fix - just noticed this while building with cuml.

@rapids-bot rapids-bot bot merged commit ac2d1ae into rapidsai:branch-24.02 Jan 24, 2024
61 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CMake cpp feature request New feature or request non-breaking Non-breaking change python
Projects
Development

Successfully merging this pull request may close these issues.

3 participants