-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NVTX range helpers #416
NVTX range helpers #416
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really happy to host this in RAFT and hopeful other consumers might also be able to make use of it. Mostly minor things and my thoughts on the macro usage.
Also tagging some folks from cugraph for their thoughts about the macro |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look great, I just found one more very small thing as I was doing my final review.
As @teju85 suggested, I probably should also remove the versions of the helpers that synchronize cuda streams. The reason is that modern nsys profiler can map the ranges on gpu streams well, but addition of the lots of synchronize calls may bias the profiler/performance overview. @cjnolet, @wphicks do you have any objections to that? |
If this is the case with all the ctk versions supported by rapids (Eg ctk 11+) then it's fine with me. Probably better not to modify the performance while we are trying to measure the performance. |
No objections here. Makes sense! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since no one else has a strong opinion on the macro issue, I say it's better to be biased toward approving and we should move ahead. Let's take care of the stream synchronization issue and @cjnolet's last note on namespacing, then this looks good as far as I'm concerned.
All done, thanks for the input! Note, I also added a template parameter One creates a domain by declaring a struct (anywhere): raft/cpp/include/raft/common/nvtx.hpp Lines 32 to 34 in d8357b2
And this is how to use it: raft/cpp/include/raft/linalg/svd.cuh Lines 67 to 68 in d8357b2
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes look great, Artem. Thanks for moving this to raft and improving it!
Just a little side note: the usage links are great but it would also be nice to have usage examples right in the doxygen. |
rerun tests |
@gpucibot merge |
Move NVTX range helpers to raft and extend them a little bit. Corresponding raft PR: rapidsai/raft#416 . Authors: - Artem M. Chirkin (https://github.com/achirkin) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: #4445
Move NVTX range helpers to raft and extend them a little bit. Corresponding raft PR: rapidsai/raft#416 . Authors: - Artem M. Chirkin (https://github.com/achirkin) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#4445
Moves the helpers for NVTX ranges from cuml with minor additions.
Also adds the helpers to couple places in the codebase - as a way of testing the new feature.