-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix euclidean_dist
in IVF-Flat search
#1122
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, that must have been a tough investigation, good job! You've explained the problem very nice here, but maybe it's worth adding a short gist of it (or a link) in the code?
…underflowing unsigned subtraction with __usad for the absolute difference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Louis for fixing this problem, the PR looks good to me!
Could you update the second part of the PR description? You say
added comment in the unsigned, non-vectorized implementation
But that has changed during the review.
Ah, yes, done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes LGTM. Thanks @Nyrio!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
/merge |
Solves #1058
This is a tricky bug so the fix deserves some explanation. The previous implementation of
euclidean_dist
was the following in vectorized cases, wherex
andy
areint32
vectors of 4int8
each andacc
is a singleint32
number to accumulate the distance in:Now consider the following case:
The difference between -128 and 127 is 255, represented as
FF
(__vabsdiffs4
is smart enough not to computeabs(a-b)
which would result in01
). However, if we call the signed version ofdp4a
,FF
is cast fromint8
toint32
asFFFFFFFF
(or -1). The square of -1 is 1, which is added toacc
(instead of 65025).As the output of
__vabsdiffs4
is correct when considered as an unsigned number, and as addition is the same for signed and unsigned in 2's complement (andacc
is positive anyway), the easiest fix is to use the unsigned version ofdp4a
, which will cast overflowed differences properly to 32 bits. The previous code simply becomes:Additionally, to avoid underflows in the non-vectorized unsigned case, I replaced the subtraction with
__usad
(absolute difference of unsigned numbers). Note that using the subtraction was correct anyway, because the addition/subtraction is the same for unsigned and signed integers, as well as the least significant half of the multiplication (which is the part that is stored), and the square of a number is also the square of its opposite. Consider: