-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[REVIEW] Improvents to RNG #434
[REVIEW] Improvents to RNG #434
Conversation
Can one of the admins verify this patch? |
add to allowlist |
Not a comment, but a question @vinaydes |
@teju85 If you want each bit to have 50% chance of being either 1 or 0, then |
Yes, that's correct. I'd like to have them 1/0 with 50% probability. Thanks for the confirmation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General comments from me:
cpp/test/random/rng.cu:
- The comments mention sigma being multiplied by 4 (line 101) or 3-4 (line 185), but then it actually seems to multiply by 10 (line 102 and 244-245, 289-290). Is this because we actually have the variance instead of standard deviation? But in this case, why is it done with both mean and variance values?
cpp/include/raft/random/rng.hpp:
- I believe all members of the main
Rng
class are public at the moment. I think it would make sense to make them protected. However, I also think like we should allow users to manually setoffset
andseed
, as well as provide a simple helper function to update theoffset
as needed: the arguments would be the total number of RNG calls and the type of RNG call (some will use the underlying generator twice or more). As of now, what the PR does is to simply reset the seed using the mt19937 generator. This is doable but not ideal since generators on GPU should kind of mimic CPU generators where there is a difference between offsetting (using the same sequence but at a later point in the period) and re-seeding (starting an entirely new sequence).
cpp/include/raft/random/rng_impl.cuh:
- I see that you're using the "lemire" method for bounded integer generation. Does it make sense to add the swift method here? It seems like it performs well, especially with PCG, although I don't mind keeping both methods around since it may depend on exact use cases.
EDIT: I'd be happy to discuss everything in more detail over Slack or a meeting
I have tried to summarize status of all the action items came up with regard to this PR. Some of the items have to deferred in order to get this PR merged sooner. From Matthias
From Corey
Bonus
@MatthiasKohl @cjnolet Let me know if I have missed any point. |
@vinaydes Thank you very much for the detailed overview of items above! A few minor comments from my side, all of which can be deferred to another PR:
|
@cjnolet can we get a final review and merge this PR soon, please? cuML RF and cugraph-ops sampling are blocked on this. |
@teju85 this PR looks fine to go in but it is blocked by rapidsai/cuml#4562 since it's a breaking change on the cuml side. |
Ah got it. Thanks Corey. That PR still seems to have build issues @vinaydes ? |
Working to fix it. It is a tiny change. |
@vinaydes thw cuml build will be broken until this PR is merged: rapidsai/cuml#4548. It might help to merge that into your branch RNG branch for now. |
Right, I saw the breaking changes and for now reverted merge of 22.04 locally. I'll use 4548 once I fix the build issue on older commit. Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just making a note to merge after the corresponding cuml side passes CI.
@MatthiasKohl Addressed your comments with latest push. I am deferring the task of reconciliation of |
Thanks, looks good to me! |
@gpucibot merge |
With this change proposing number of improvements to the Random Number Generator (RNG) class of RAFT