Advanced API #14198

Open
adrinjalali opened this issue Jun 26, 2019 · 9 comments
Labels
API Hard Hard level of difficulty

Comments

@adrinjalali
Member

In a few places we've mentioned certain functions which we would like to have, but they don't necessarily fit the usual fit/transform/predict pattern.

For instance, @jnothman mentioned having an add function for the ColumnTransformer, which we can also have for the Pipeline.
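To make the idea concrete, here is a minimal sketch of what such an `add` method could look like. It uses a toy stand-in class rather than the real `Pipeline` or `ColumnTransformer`; the class name `MiniPipeline`, the method signature, and the placeholder steps are all assumptions for illustration, not an existing scikit-learn API.

```python
class MiniPipeline:
    """Toy stand-in for a pipeline: an ordered list of (name, step) pairs."""

    def __init__(self, steps):
        self.steps = list(steps)

    def add(self, name, step):
        """Hypothetical convenience method: append a named step.

        Only name uniqueness is checked here; any deeper checks on the
        step itself are left to fit.
        """
        if any(existing == name for existing, _ in self.steps):
            raise ValueError(f"step name {name!r} is already used")
        self.steps.append((name, step))
        return self  # returning self allows chaining


# Placeholder strings stand in for real transformers and estimators.
pipe = MiniPipeline([("scale", "scaler-placeholder")])
pipe.add("reduce", "pca-placeholder").add("clf", "classifier-placeholder")
step_names = [name for name, _ in pipe.steps]
```

Returning `self` is one possible design choice; it keeps the helper chainable in the same way `set_params` is.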

We also talked about a prune_tree function in #14038.

One concern seems to be that we would like to keep the API very simple and easy, which I agree with. But it doesn't have to limit us from having more ad-hoc functions, which we could fit in a separate section in the docs and tag them as advanced.

I'm not sure how we could handle that with Sphinx, but what I'm proposing is to tag those functions as advanced, and have Sphinx render them in an "advanced" section below the other functions. This way they would not interfere with the usual experience of a new user who's reading the docs, and yet it would enable us to introduce some rather useful methods.
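One way the tagging half of this could work, sketched under assumptions: a decorator marks a method, and a small helper splits a class's public methods into regular and advanced groups. The names `advanced`, `_is_advanced`, `ToyEstimator`, and `split_public_methods` are hypothetical, and how Sphinx or numpydoc would consume the two groups is left open here.

```python
import inspect


def advanced(func):
    """Hypothetical marker: flag a method as part of the advanced API."""
    func._is_advanced = True
    return func


class ToyEstimator:
    """Toy class, not a scikit-learn estimator."""

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return [0 for _ in X]

    @advanced
    def prune_tree(self):
        return self


def split_public_methods(cls):
    """Return (regular, advanced) lists of public method names."""
    regular, flagged = [], []
    # inspect.getmembers returns members sorted by name.
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue
        if getattr(member, "_is_advanced", False):
            flagged.append(name)
        else:
            regular.append(name)
    return regular, flagged


regular, flagged = split_public_methods(ToyEstimator)
```

A documentation tool could then list `regular` first and `flagged` in a separate section further down.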

I may be missing some historical discussion on this topic though, sorry for that.

@adrinjalali adrinjalali added API Hard Hard level of difficulty labels Jun 26, 2019
@jnothman
Member

jnothman commented Jun 26, 2019 via email

@adrinjalali
Member Author

Since our convention (or even a constraint) is to validate model parameters at fit time, does it matter if the parameters are mutated using set_params or another handier function?
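A small sketch of that argument, using a toy class instead of a real estimator: if validation only happens in `fit`, an invalid value is caught at the same point whether it arrived through `set_params` or through some handier mutator. `ToyTree`, `limit_depth`, and `fit_fails` are made-up names for illustration.

```python
class ToyTree:
    """Toy estimator that validates its parameters only in fit."""

    def __init__(self, max_depth=None):
        self.max_depth = max_depth  # stored as-is, no validation here

    def set_params(self, **params):
        for key, value in params.items():
            if not hasattr(self, key):
                raise ValueError(f"invalid parameter {key!r}")
            setattr(self, key, value)
        return self

    def limit_depth(self, depth):
        """Hypothetical 'handier' mutator; it also defers validation."""
        self.max_depth = depth
        return self

    def fit(self, X, y=None):
        # The single place where parameter values are checked.
        if self.max_depth is not None and self.max_depth < 1:
            raise ValueError("max_depth must be None or >= 1")
        self.fitted_ = True
        return self


def fit_fails(model):
    """Return True if fitting the model raises ValueError."""
    try:
        model.fit([[0.0]])
    except ValueError:
        return True
    return False


via_set_params = fit_fails(ToyTree().set_params(max_depth=0))
via_helper = fit_fails(ToyTree().limit_depth(0))
```

Both routes store the bad value silently and both fail in `fit`, which is the point being made above.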

As for the numpydoc sections, the question is: is it worth it? If it would help us reach a consensus and more easily agree to accept these extra pieces of API, I'd say it is. I'm not sure if it's a concern though.

@thomasjpfan
Member

prune_tree differs from add in that it mutates the model

We can also design it to return a new instance of the DecisionTree* and not mutate the original tree.
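A rough sketch of the non-mutating variant, on a toy model rather than a real `DecisionTree*`. The class `ToyTreeModel`, its `node_depths_` attribute, and the depth-based pruning rule are all assumptions made for the example; the only point is that `prune` works on a copy and leaves the original untouched.

```python
import copy


class ToyTreeModel:
    """Toy 'fitted tree': a mapping from node name to node depth."""

    def __init__(self, node_depths):
        self.node_depths_ = dict(node_depths)

    def prune(self, max_depth):
        """Return a pruned copy; the original model is not modified."""
        pruned = copy.deepcopy(self)
        pruned.node_depths_ = {
            name: depth
            for name, depth in self.node_depths_.items()
            if depth <= max_depth
        }
        return pruned


tree = ToyTreeModel({"root": 0, "left": 1, "right": 1, "left.left": 2})
shallow = tree.prune(max_depth=1)
```

With this design the caller can keep both the full and the pruned model around and compare them, at the cost of an extra copy.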

@amueller
Member

I think we had tons of ad-hoc methods in the past. Some became more standard and some we moved away from.

I don't think there's anything that prevents us from adding ad-hoc methods right now if they are warranted.

@jnothman
Member

jnothman commented Jun 27, 2019 via email

@adrinjalali
Member Author

Yes, but we also almost always make sure the validation is done during fit. So it doesn't really matter which function the user uses to set the parameters, does it?

@jnothman
Member

jnothman commented Jun 27, 2019 via email

@amueller
Member

Thanks for the clarification, that wasn't clear to me from the initial description.
I'm +0 on an add method right now, I think. I haven't found this to be an inconvenience so far.

@jnothman
Member

jnothman commented Jun 28, 2019 via email

4 participants