
LogisticRegression's regularization is scaled by the dataset size #30308

Open
louisabraham opened this issue Nov 19, 2024 · 7 comments

Comments

@louisabraham

Describe the workflow you want to enable

Other linear models on https://scikit-learn.org/1.5/modules/linear_model.html have regularization that does not depend on the dataset size, unlike LogisticRegression.
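A minimal sketch of the behaviour in the title (assuming scikit-learn is installed): LogisticRegression weighs the penalty against the *summed* per-sample loss, so duplicating every training sample doubles the loss term, weakens the effective regularization, and changes the fitted coefficients even though the data distribution and C are unchanged.

```python
# Sketch: same distribution, same C, different coefficients when the
# dataset is duplicated, because the penalty is fixed while the summed
# loss grows with n_samples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(int)

coef_once = LogisticRegression(C=0.1).fit(X, y).coef_
coef_twice = LogisticRegression(C=0.1).fit(
    np.vstack([X, X]), np.hstack([y, y])
).coef_

print(coef_once)
print(coef_twice)
```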

Describe your proposed solution

It would be good to either change the behavior or document it very clearly: not only in the user guide, as it is now, but also in the model's own documentation.

Describe alternatives you've considered, if relevant

No response

Additional context

No response

@louisabraham louisabraham added Needs Triage Issue requires triage New Feature labels Nov 19, 2024
@glemaitre glemaitre added Documentation and removed New Feature Needs Triage Issue requires triage labels Nov 19, 2024
@glemaitre
Member

I don't think that we are ready to pay the cost of a behaviour change. However, I wouldn't mind adding a note in the docstring of the estimator.

Just pinging @lorentzenchr if he has additional thoughts.

@virchan
Contributor

virchan commented Nov 20, 2024

I'm not speaking on @lorentzenchr's behalf, but I came across a similar discussion on Ridge/Lasso regression where he shared an incredibly helpful comment. I believe it’s relevant to this case as well.

That said, I still think @lorentzenchr is the best person to weigh in on this.

@lorentzenchr
Member

Thanks, @virchan, for your nice words and the link(s), so I don't need to repeat those arguments here.

Documentation:
If someone wants an improvement, then a PR is welcome.

Model Change:

I don't think that we are ready to pay for the change of behaviour cost.

I see it a bit differently than @glemaitre. I'm open to changing the penalty parameter of all linear models to be consistent. I find the differing conventions across estimators quite annoying and error-prone.
This is, however, a decision that needs consensus among core contributors, therefore pinging @scikit-learn/Core-devs.

@adrinjalali
Member

I wouldn't mind the change to make them consistent.

@thomasjpfan
Member

thomasjpfan commented Nov 22, 2024

I'm open to change the penalty parameter of all linear models to be consistent. I find the different ways of different estimators quite annoying and error-prone.

To be backward compatible, could we first add a new parameter like `scale_regularization={"n_samples", "none"}`? Then we can make everything consistent in terms of this parameter (with a deprecation cycle where required for a given estimator).
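One possible reading of that proposal, sketched below. The parameter name comes from the comment above; the helper and its semantics are assumptions for illustration, not an existing scikit-learn API.

```python
# Hypothetical helper (not scikit-learn code): maps a user-facing penalty
# strength to the weight actually applied against the summed loss, under
# the proposed scale_regularization parameter.
def effective_penalty_weight(alpha, n_samples, scale_regularization="none"):
    if scale_regularization == "n_samples":
        # Penalty grows with the dataset, so the per-sample strength of
        # the regularization stays constant as more data is added.
        return alpha * n_samples
    if scale_regularization == "none":
        # Fixed penalty against a growing summed loss: per-sample
        # regularization weakens with dataset size (today's
        # LogisticRegression behaviour, expressed via alpha = 1 / C).
        return alpha
    raise ValueError("scale_regularization must be 'n_samples' or 'none'")
```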

@GaelVaroquaux
Member

GaelVaroquaux commented Nov 22, 2024 via email

@louisabraham
Author

louisabraham commented Nov 22, 2024

Thank you all for paying so much attention to this problem! (And congrats on the creation of probabl.)

If I may add something: what should the default be, `loss.sum() + lambda * reg` or `loss.mean() + lambda * reg`?

I feel the answer should be whichever is more stable when using CV to select the final lambda value, but I'm not sure which one that is, i.e. which one selects the best lambda when evaluating on subsets.
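As a stand-in with a closed-form minimizer (1-d ridge regression instead of logistic regression, so this is only an analogy), the sketch below shows that for a fixed lambda the mean-loss formulation keeps the optimal coefficient invariant under duplicating the dataset, while the sum-loss one does not; this hints that the mean form transfers a CV-selected lambda across subset sizes more stably.

```python
# 1-d ridge with closed-form solutions for both formulations:
#   sum form:  argmin_w sum((y - w*x)^2) + lam * w^2  ->  (x.y)/(x.x + lam)
#   mean form: argmin_w mean((y - w*x)^2) + lam * w^2 ->  (x.y/n)/(x.x/n + lam)
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=100)
y = 2.0 * x + rng.normal(scale=0.1, size=100)
lam = 1.0

def w_sum(x, y, lam):
    return (x @ y) / (x @ x + lam)

def w_mean(x, y, lam):
    n = len(x)
    return (x @ y / n) / (x @ x / n + lam)

x2, y2 = np.tile(x, 2), np.tile(y, 2)
print(w_sum(x, y, lam), w_sum(x2, y2, lam))    # changes with dataset size
print(w_mean(x, y, lam), w_mean(x2, y2, lam))  # identical
```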
