[REVIEW] Overlap epilog compute with ldg of next grid stride in pairwise distance & fusedL2NN kernels #292

Merged
45 commits merged into rapidsai:branch-21.08 on Jul 15, 2021

Conversation

mdoijade
Contributor

Overlap the epilog compute with the global loads (ldg) for the next grid stride in the pairwise distance base class.
This gives a 2-3% performance improvement for the pairwise distance kernels and the fusedL2NN kernel.
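
The reordering described above can be sketched on the host as a software-pipelined loop. This is only a minimal model in plain C++, not the kernel code from this PR: `loadTile`, `accumulate`, `epilog`, `kTile`, and the `regs`/`smem` buffers are hypothetical stand-ins for the ldg, the main accumulation loop, the epilog, the tile size, and the register/shared-memory staging. It only shows the ordering: loads for the next grid stride are issued into registers before the epilog of the current one runs, and the staged data is written to the shared buffer only afterwards.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical tile size; real kernels take this from their policy.
constexpr std::size_t kTile = 4;

// Stands in for a global-memory load (ldg) of one tile into registers.
static void loadTile(const std::vector<float>& src, std::size_t tile,
                     float (&regs)[kTile]) {
  for (std::size_t i = 0; i < kTile; ++i) regs[i] = src[tile * kTile + i];
}

// Stands in for the store from registers into shared memory.
static void commit(const float (&regs)[kTile], float (&smem)[kTile]) {
  for (std::size_t i = 0; i < kTile; ++i) smem[i] = regs[i];
}

// Stands in for the main accumulation loop over one tile.
static float accumulate(const float (&smem)[kTile]) {
  float acc = 0.f;
  for (float v : smem) acc += v * v;
  return acc;
}

// Stands in for the epilog (final transform of the accumulator).
static float epilog(float acc) { return std::sqrt(acc); }

// Baseline ordering: load, accumulate, epilog, then the next grid stride.
std::vector<float> runSequential(const std::vector<float>& src,
                                 std::size_t start, std::size_t stride) {
  std::vector<float> out;
  const std::size_t numTiles = src.size() / kTile;
  float regs[kTile], smem[kTile];
  for (std::size_t t = start; t < numTiles; t += stride) {
    loadTile(src, t, regs);
    commit(regs, smem);
    out.push_back(epilog(accumulate(smem)));
  }
  return out;
}

// Overlapped ordering: the next stride's loads are issued before the epilog.
std::vector<float> runOverlapped(const std::vector<float>& src,
                                 std::size_t start, std::size_t stride) {
  std::vector<float> out;
  const std::size_t numTiles = src.size() / kTile;
  if (start >= numTiles) return out;
  float regs[kTile], smem[kTile];
  loadTile(src, start, regs);  // prolog for the first grid stride only
  commit(regs, smem);
  for (std::size_t t = start; t < numTiles; t += stride) {
    const float acc = accumulate(smem);
    const std::size_t next = t + stride;
    const bool hasNext = next < numTiles;
    if (hasNext) loadTile(src, next, regs);  // loads go to registers only
    out.push_back(epilog(acc));  // on a GPU this can hide the load latency
    if (hasNext) commit(regs, smem);  // overwrite smem only after the epilog
  }
  return out;
}
```

Both orderings produce the same results; on a GPU the second one gives the memory system useful work while the epilog executes, which is the kind of overlap the description refers to.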

mdoijade added 30 commits May 12, 2021 22:03
…r usage in all contraction based kernels so that n is along x dir and m is along y dir blocks
…kernels. --add launch config generator function to launch optimal grid size kernel for these pairwise dist kernels
…ed up over previous version. -- improve logic of the grid launch config generator for x-dir blocks
… for subsequent gridStrideX variations. this overall improves perf of fusedL2NN to 1.85x over previous version. --Also remove checking keys only check values in fusedL2nn test case, as it may happen a row has multiple keys with same min val
…und in launchConfigGenerator. --Use constexpr in shmemSize.
…e sure next grid stride doesn't pollute shmem before completion of this calculation
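
One of the commit messages above notes that the fusedL2NN test now compares only values, because a row can contain several keys with the same minimum value. A small illustration of why key comparison is fragile follows. It is a sketch under stated assumptions, not the test code from this PR: `KeyValue`, `argminFirst`, `argminLast`, and `sameValues` are hypothetical names, and rows are assumed non-empty.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical (key, value) pair: column index and its distance.
using KeyValue = std::pair<int, float>;

// Returns the first column holding the row minimum (row must be non-empty).
KeyValue argminFirst(const std::vector<float>& row) {
  KeyValue best{0, row[0]};
  for (std::size_t j = 1; j < row.size(); ++j)
    if (row[j] < best.second) best = {static_cast<int>(j), row[j]};
  return best;
}

// Returns the last column holding the row minimum (row must be non-empty).
// A parallel reduction may legitimately land on either tie-break.
KeyValue argminLast(const std::vector<float>& row) {
  KeyValue best{0, row[0]};
  for (std::size_t j = 1; j < row.size(); ++j)
    if (row[j] <= best.second) best = {static_cast<int>(j), row[j]};
  return best;
}

// Value-only comparison: keys are ignored because ties make them ambiguous.
bool sameValues(const std::vector<KeyValue>& a, const std::vector<KeyValue>& b) {
  if (a.size() != b.size()) return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i].second != b[i].second) return false;
  return true;
}
```

For a row such as `{3, 1, 2, 1}` the two functions return different keys (columns 1 and 3) but the same value, so a check on values accepts both results while a check on keys would reject one of them.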
@mdoijade mdoijade requested review from a team as code owners July 13, 2021 12:35
@github-actions github-actions bot added the cpp label Jul 13, 2021
@mdoijade
Contributor Author

@teju85 for help with review.

@dantegd dantegd added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Jul 13, 2021
Member

@teju85 teju85 left a comment

Changes LGTM.

@dantegd
Member

dantegd commented Jul 15, 2021

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 35411a0 into rapidsai:branch-21.08 Jul 15, 2021
Labels
cpp, improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change)
3 participants