Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialization of IVF Flat and IVF PQ #919

Merged
merged 9 commits into from
Jan 3, 2023

Conversation

tfeher
Copy link
Contributor

@tfeher tfeher commented Oct 14, 2022

This PR implements serialization to file for ivf_pq::index and ivf_flat::index structures.

Index building takes time, therefore downstream projects (like cuML) want to save the index (rapidsai/cuml#4743). But downstream project should not depend on the implementation details of the index, therefore RAFT provides methods to serialize and deserialize the index.

This is still experimental:

  • ideally we want to use a general serialization method for mdspan Serialization for commonly used types #770,
  • instead of directly saving to file, raft should provide a byte string and let the downstream project decide how to save it (e.g. pickle for cuML).

Python wrappers are provided for IVF-PQ to save/load the index.

@tfeher tfeher added the DO NOT MERGE Hold off on merging; see PR for details label Oct 14, 2022
@tfeher tfeher force-pushed the fea_ivf_serialization branch from 2195740 to c13a23f Compare October 14, 2022 20:24
@tfeher tfeher force-pushed the fea_ivf_serialization branch from c13a23f to e6aa1be Compare October 24, 2022 07:35
@tfeher tfeher force-pushed the fea_ivf_serialization branch from e6aa1be to db91e6e Compare December 13, 2022 23:40
@cjnolet cjnolet changed the base branch from branch-22.12 to branch-23.02 December 13, 2022 23:41
Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this PR is still in draft and you intend to do more work on it to prepare it for merging but I wanted to provide a first-pass review just to help nudge it along.

@tfeher tfeher added feature request New feature or request non-breaking Non-breaking change and removed DO NOT MERGE Hold off on merging; see PR for details labels Dec 19, 2022
@tfeher tfeher marked this pull request as ready for review December 19, 2022 11:39
@tfeher tfeher requested review from a team as code owners December 19, 2022 11:39
Copy link
Contributor Author

@tfeher tfeher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cjnolet for the review! I have addressed the issues.

Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM, thanks @tfeher!

@cjnolet
Copy link
Member

cjnolet commented Jan 3, 2023

@gpucibot merge

@cjnolet
Copy link
Member

cjnolet commented Jan 3, 2023

rerun tests

@rapids-bot rapids-bot bot merged commit 96578a1 into rapidsai:branch-23.02 Jan 3, 2023
@tfeher tfeher deleted the fea_ivf_serialization branch October 25, 2023 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cpp feature request New feature or request non-breaking Non-breaking change python
Projects
Development

Successfully merging this pull request may close these issues.

2 participants