-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust Hellinger pairwise distance to vaoid NaNs #189
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pre-approving since it's a small requested change
[=] __device__(value_t input) { return sqrt(1 - input); }, | ||
[=] __device__(value_t input) { | ||
// Adjust to replace NaN in sqrt with 0 if input to sqrt is negative | ||
value_t rectifier = (1 - input) > 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be bool. Just combine this with next line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are supposed to be probabilities but we should just run an L1 normalization on the input here to guarantee it's positive. (that's what UMAP does, which provides this with their API). This will not change the data if it's already probabilities.
Even when the inputs are L1 normalized this can arise. Running A solution could be to run csr_row_normalize_l1 and keep this rectifier for negative epsilon, or to add an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@gpucibot merge |
This change will fix NaNs that arise in Hellinger pairwise distance when the input to sqrt is negative.
These kind of inputs can happen even when the inputs are normalized row-wise to 1 due to accumulation of rounding approximation.