SVC and OneClassSVM fail to fit or have wrong fitted attributes with null sample weights #25380
Comments
This bug report replaces bug #19654, which misattributed the bug to libsvm_sparse specifically.
Thanks for the report. Indeed this could be fixed, so I'll tag this accordingly. It would be nice if you could submit a PR to fix it, @andrewdelong.
/take
I am trying to work on this issue and am trying the workaround suggested by @andrewdelong.
However, the results are not as expected.
I need your suggestions on what to try next, @adrinjalali @andrewdelong.
I certainly wouldn't try to change sample weights as a workaround. We probably want to run svm with those classes omitted in this case.
So one way we can do this is to remove the entries from X, y, and w where w == 0. Or can we find a way to skip them in training?
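The first option above can be sketched in plain Python. This is only an illustration of the idea, not code from scikit-learn: the helper name `drop_zero_weight_samples` and the toy data are hypothetical, and it assumes X, y, and w are equal-length sequences.

```python
def drop_zero_weight_samples(X, y, w):
    """Return X, y, w restricted to the samples whose weight is not zero.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    keep = [i for i, weight in enumerate(w) if weight != 0]
    return [X[i] for i in keep], [y[i] for i in keep], [w[i] for i in keep]


# Toy data (made up): the only sample of class 0 has zero weight.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 1, 2, 1]
w = [0.0, 1.0, 1.0, 1.0]

X_kept, y_kept, w_kept = drop_zero_weight_samples(X, y, w)
```

With NumPy arrays the same filtering would usually be written with a boolean mask such as `mask = w != 0`. Note that after filtering, class 0 no longer appears in `y_kept`, so any per-class output of a model trained on the filtered data would still need to be mapped back to the full set of classes.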
@rand0wn figuring it out would be a part of solving this issue 😁 I haven't jumped deep into that part of the codebase in a while.
Got it, I will try to figure out how this can be done and run the test cases.
On investigation, all the variations except multiclass SVC remove zero weights.
Either we can do the same with multiclass as well, or return an error. For example, in LinearSVC
While with a linear kernel in SVC
Also, when removing zero-weight classes
This gives the same thing without a warning. My suggestion is either to remove zero-weight classes in all variations to make it consistent, or to return an error. Thoughts?
I think making it consistent with the other cases makes sense. Thanks for investigating it, @rand0wn.
Looking at the code, I don't think that the results of the investigation are right. As mentioned in the original post, the dense and sparse cases are both wrong. The sparse case fails because of an inconsistent shape, but the dense case shows the issue: the internal libsvm fit was done on only 2 classes, and thus the reported values are wrong. I did not check the code yet, but we should take care of some offsets that should be applied once the fit is done.
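To make the "offsets" remark concrete, here is a minimal sketch of what re-expanding a per-class result could look like, assuming the internal fit only saw a subset of the classes. The helper `expand_per_class` and the example numbers are hypothetical and are not taken from the scikit-learn codebase or from this report.

```python
def expand_per_class(values, fitted_classes, all_classes, fill=0):
    """Map per-class values from a fit on a subset of classes onto all classes.

    Classes that the internal fit never saw get `fill`.
    Hypothetical helper for illustration; not the actual scikit-learn fix.
    """
    by_class = dict(zip(fitted_classes, values))
    return [by_class.get(label, fill) for label in all_classes]


# Made-up numbers: the internal fit saw classes 1 and 2 only and reported
# one value per fitted class; class 0 was zeroed out by the weights.
expanded = expand_per_class([2, 3], fitted_classes=[1, 2], all_classes=[0, 1, 2])
```

Here `expanded` has one slot per original class, with `0` in the slot of the class that was dropped. A real fix would have to apply the same kind of bookkeeping to every fitted attribute whose shape depends on the number of classes.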
The All of these need to be done in the At least we have a good non-regression case. But as far as I can tell, we should suffer from the same issue for the
Describe the bug
SVC().fit(X, y, w) fails when the targets y are multiclass and the sample_weights w zero out one of the classes.
A warning is emitted (e.g., "class label 0 specified in weight is not found"), but it does not indicate that the arrays on the trained SVC object are incorrect.
This seems to be a case that was not tested by PR #14286.
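A quick way to check whether a given (y, w) pair falls into this situation is to look for classes in which no sample has a positive weight. The sketch below is illustrative only: `classes_without_positive_weight` is a hypothetical helper, and treating negative weights like zero weights is an assumption.

```python
def classes_without_positive_weight(y, w):
    """Return the sorted class labels for which no sample has weight > 0.

    Hypothetical helper for illustration; negative weights are assumed to
    count as "zeroed out", just like zero weights.
    """
    weighted = {label for label, weight in zip(y, w) if weight > 0}
    return sorted(set(y) - weighted)


# Toy targets and weights (made up): class 0 is zeroed out entirely.
y = [0, 0, 1, 1, 2, 2]
w = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
zeroed = classes_without_positive_weight(y, w)
```

If `zeroed` is non-empty for multiclass targets, the fit is in the case described above.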
Workaround
Replace the zero weights (or negative weights) with very small values like 1e-16.
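As a rough sketch of this workaround, assuming the weights are a plain sequence of floats (the helper name `clip_weights` is hypothetical; only the `1e-16` value comes from the description above):

```python
def clip_weights(w, eps=1e-16):
    """Replace zero or negative weights with a tiny positive value.

    Hypothetical helper for illustration; `eps` defaults to the 1e-16
    suggested in the workaround.
    """
    return [weight if weight > 0 else eps for weight in w]


# Made-up weights: one zero, one negative, one regular.
w_clipped = clip_weights([0.0, -1.0, 2.0])
```

The clipped weights would then be passed to `fit` in place of the original ones. This keeps every class visible to the solver, but it does not make the zero-weight samples truly irrelevant to the fit, which is one reason to treat it as a stopgap rather than a fix.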
Steps/Code to Reproduce
Expected Results
The fitted attributes should be
assuming the 'arbitrary' values in dual_coef_ are set to zero.
Actual Results
For dense X, the fitted attributes are actually
For sparse X it raises
with traceback
Versions