Advanced API #14198
Comments
Adding new numpydoc sections may not be so easy...
prune_tree differs from add in that it mutates the model, not the
parameters.
|
Since our convention (or even a constraint) is to validate model parameters at As for the numpydoc sections, the question is, is it worth it? If it would make us come to a consensus and agree to accept these extra pieces of API easier, I'd say it is. I'm not sure if it's a concern though. |
We can also design it to return a new instance of the |
I think we had tons of ad-hoc methods in the past. Some became more standard and some we moved away from. I don't think there's anything that prevents us from adding ad-hoc methods right now if they are warranted. |
A builder method, like `ColumnTransformer().add(columns=categorical, transformer=OneHotEncoder()).add(columns=~categorical, transformer=StandardScaler())`, is not like the ad-hoc methods we've had before. We've previously had parameters set only by construction, set_params or setattr.
|
Yes, but we also almost always make sure the validation is done during |
I agree; that's essentially why I proposed it. I just think it's a
departure from existing "ad hoc methods".
|
Thanks for the clarification, that wasn't clear to me from the initial description. |
In ColumnTransformer the benefit of 'add' is that it gets rid of the need
to remember the order of a triple because the method has parameter names.
It means that you can specify or use the default name for a step, rather
than the all-or-nothing approach of make_pipeline vs Pipeline. It would
make it easier to integrate other ways for users to specify the set of
columns too.
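
To make the point about keyword arguments and default names concrete, here is a minimal sketch in plain Python. It is not scikit-learn code: `ColumnSpecBuilder` and the two placeholder transformer classes are hypothetical stand-ins, and the naming rule (lowercased class name, with a numeric suffix on collision) is an assumption chosen for illustration. The sketch only collects `(name, transformer, columns)` triples and does no validation, leaving that to a later step.

```python
class OneHotEncoder:
    """Placeholder standing in for a real transformer (hypothetical)."""


class StandardScaler:
    """Placeholder standing in for a real transformer (hypothetical)."""


class ColumnSpecBuilder:
    """Hypothetical builder that collects (name, transformer, columns) triples."""

    def __init__(self):
        self.transformers = []

    def add(self, *, columns, transformer, name=None):
        # Keyword-only arguments: the caller never has to remember
        # the order of the triple.
        if name is None:
            # Assumed default: lowercased class name, with a numeric
            # suffix if that name is already taken.
            base = type(transformer).__name__.lower()
            taken = {existing for existing, _, _ in self.transformers}
            name, suffix = base, 1
            while name in taken:
                suffix += 1
                name = f"{base}-{suffix}"
        self.transformers.append((name, transformer, columns))
        # Returning self is what allows the chained, builder-style calls.
        return self


builder = (
    ColumnSpecBuilder()
    .add(columns=["city", "country"], transformer=OneHotEncoder())
    .add(columns=["age", "income"], transformer=StandardScaler(), name="scale")
)
names = [name for name, _, _ in builder.transformers]
```

Here the first step gets a default name and the second an explicit one, which is the mix that make_pipeline and Pipeline do not offer on their own.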
|
In a few places we've mentioned certain functions which we would like to have, but they don't necessarily fit the usual `fit`/`transform`/`predict` pattern.
For instance, @jnothman mentioned having an `add` function for the `ColumnTransformer`, which we can also have for the `Pipeline`.
We also talked about a `prune_tree` function in #14038.
One concern seems to be that we would like to keep the API very simple and easy, which I agree with. But it doesn't have to limit us from having more ad-hoc functions, which we could fit in a separate section in the docs and tag them as advanced.
I'm not sure how we could handle that with Sphinx, but what I'm proposing is to tag those functions as advanced, and have Sphinx render them in an "advanced" section below the other functions. This way they would not interfere with the usual experience of a new user who's reading the docs, and yet it would enable us to introduce some rather useful methods.
I may be missing some historical discussion on this topic though, sorry for that.
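
As a rough illustration of the tagging idea, the sketch below marks methods with a decorator and then splits a class's public methods into a regular group and an advanced group. Everything here is hypothetical: the `advanced` decorator, the `_is_advanced` attribute, `split_methods` and `ToyTree` are made-up names, and the sketch says nothing about how Sphinx or numpydoc would consume such a marker. It only shows that the tag itself could be lightweight.

```python
def advanced(func):
    """Mark a method as belonging to the 'advanced' API (hypothetical marker)."""
    func._is_advanced = True
    return func


def split_methods(cls):
    """Return (regular, advanced) lists of public method names defined on cls."""
    regular, tagged = [], []
    for name, member in vars(cls).items():
        # Skip private/dunder attributes and anything that is not callable.
        if name.startswith("_") or not callable(member):
            continue
        if getattr(member, "_is_advanced", False):
            tagged.append(name)
        else:
            regular.append(name)
    return sorted(regular), sorted(tagged)


class ToyTree:
    """Toy estimator used only to exercise the marker."""

    def fit(self, X, y=None):
        return self

    def predict(self, X):
        return [0 for _ in X]

    @advanced
    def prune_tree(self):
        # A real implementation would modify the fitted tree; this is a stub.
        return self


regular, tagged = split_methods(ToyTree)
```

A documentation build step could, in principle, use the two lists to decide which section each method is rendered in.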