Enhance "Choosing the Right Estimator" Graphic (scikit-learn algorithm cheat sheet) #30354

sylvaincom · 2024-11-27T11:16:31Z

Describe the issue linked to the documentation

In its user guide, scikit-learn offers "Choosing the right estimator", an interactive algorithm cheat sheet that is great.

When thinking about new features for skore, I thought of enhancing the user guide with a pedagogical table which, for each estimator, says:

if its input features need to be scaled,
if it can handle categorical features,
if it can handle missing data,
if it involves some randomness (and where / why),
if it can be parallelized,
etc. (full list to be determined).

EDIT:

The scikit-learn graph / map is great but not sufficient IMHO: I would like to know, for each estimator, whether I need to normalize the data, etc. -> guidelines for each estimator
I would like a table that is separate from the map: it is also a cheat sheet, but it would not appear on the map itself, maybe below the map on the same user guide page

When discussing this with @jeromedockes and @Vincent-Maladiere, they told me about scikit-learn's estimator tags such as is_regressor. It seems that part of this knowledge is already encoded in the tags.

Suggest a potential alternative/fix

Maybe scikit-learn could have a table in the user guide with guidelines for each estimator?
Maybe scikit-learn could hold more tags? And the table could be built from those tags?


virchan · 2024-11-27T13:16:28Z

Pinging @lesteve, @Charlie-XIAO, and @thomasjpfan, as they are more qualified than I am to comment on this. Apologies for the spam!

lesteve · 2024-11-27T13:37:05Z

IMO, the first thing to do is to narrow down and clarify the scope, then try to improve the situation through incremental PRs.

About improving the "Choosing the right estimator" map in a minimal way, see for example some quick recent thoughts in #30283 (comment). Better suggestions are more than welcome!

See #7686 for some attempts at improving the map. There is also #28314.

lesteve · 2024-11-27T14:17:10Z

Also, about having docs that use estimator tags: we kind of already do this in some places, e.g. "Estimators handling NaNs" is actually generated with a Sphinx directive that uses estimator tags.

sylvaincom · 2024-11-27T17:34:51Z

Sorry, let me try to clarify my point (I also edited my initial issue):

The scikit-learn graph / map is great but not sufficient IMHO: I would like to know, for each estimator, whether I need to normalize the data, etc. -> guidelines for each estimator
I would like a table that is separate from the map: it is also a cheat sheet, but it would not appear on the map itself, maybe below the map on the same user guide page
This table could be generated automatically from tags

Thanks for your feedback! Understood, I will try to break this down into more digestible tasks and will look into the links you provided.
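As a rough illustration of the "table generated from tags" idea, the sketch below renders a plain-text table from per-estimator boolean flags. Everything in it is an assumption for illustration: `render_table` is a hypothetical helper, and the estimator names and flag values are placeholders rather than statements about real scikit-learn estimators. In practice the flags would come from the estimator tags.

```python
def render_table(rows, columns):
    """Render {estimator name: {column: bool}} as a plain-text grid."""

    def cell(value):
        # Unknown flags are shown as "?" instead of being guessed.
        if value is None:
            return "?"
        return "yes" if value else "no"

    header = ["estimator", *columns]
    body = [
        [name, *(cell(flags.get(col)) for col in columns)]
        for name, flags in sorted(rows.items())
    ]
    lines = [header, *body]
    widths = [max(len(line[i]) for line in lines) for i in range(len(header))]

    def fmt(line):
        return " | ".join(c.ljust(w) for c, w in zip(line, widths)).rstrip()

    separator = "-+-".join("-" * w for w in widths)
    return "\n".join([fmt(header), separator, *(fmt(line) for line in body)])


# Placeholder rows: names and values are made up for the example.
placeholder_rows = {
    "EstimatorA": {"allow_nan": True, "non_deterministic": False},
    "EstimatorB": {"allow_nan": False},
}
table = render_table(placeholder_rows, ["allow_nan", "non_deterministic"])
print(table)
```

Properties that are not covered by a tag (for example whether features should be scaled) show up as "?" here, which is one way to see which new tags such a table would require.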

sylvaincom added the Documentation and Needs Triage labels Nov 27, 2024

ArturoAmorQ removed the Needs Triage label Nov 29, 2024
