Description
Hi,
I was wondering if it is possible to declare the relevant splitter utility functions in the .pxd file so that they are cimportable from 3rd party applications?
Motivation
3rd party applications will typically leverage the Cython code within scikit-learn even though it is not a publicly supported API. Refactoring the tree code has been discussed, but has been moving slowly due to lack of time from maintainers. In the meantime, one might want to copy and modify some of the existing tree code to implement a new tree model.
In the splitter, such a model will typically still rely on common functions such as "sort", "extract_nnz_binary_search", "extract_nnz_index_to_samples", "sparse_swap" (I may be missing others).
Solution
Just include their headers in the sklearn.tree._splitter.pxd file. This is similar to how common utility functions are also included in sklearn.tree._utils.pxd.
This would be a reasonably small PR, with no side effects.
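To illustrate the general mechanism, here is a minimal sketch of how a cdef function becomes cimportable once it is declared in a .pxd file. The package, module, and function names (mylib, _helpers, swap, thirdparty) are hypothetical and chosen only for illustration; these are not the actual scikit-learn signatures, which would need to be copied from the existing _splitter.pyx.

```cython
# --- mylib/_helpers.pxd (hypothetical): declaration only, no body ---
cdef void swap(Py_ssize_t* arr, Py_ssize_t i, Py_ssize_t j) nogil


# --- mylib/_helpers.pyx (hypothetical): definition matching the declaration ---
cdef void swap(Py_ssize_t* arr, Py_ssize_t i, Py_ssize_t j) nogil:
    # Exchange the elements at positions i and j in place.
    cdef Py_ssize_t tmp = arr[i]
    arr[i] = arr[j]
    arr[j] = tmp


# --- thirdparty/_my_splitter.pyx (hypothetical): 3rd party code ---
# This cimport only works because swap is declared in the .pxd above.
from mylib._helpers cimport swap

cdef void reverse(Py_ssize_t* arr, Py_ssize_t n) nogil:
    # Reverse the first n elements using the cimported helper.
    cdef Py_ssize_t i
    for i in range(n // 2):
        swap(arr, i, n - 1 - i)
```

Without the declaration in the .pxd, the function stays private to its own module and the cimport in the third file fails at compile time, which is the situation for the splitter helpers today.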