
Threads awaiting for GIL in Forest estimators #20666

Open
glemaitre opened this issue Aug 3, 2021 Discussed in #20651 · 6 comments
Labels
free-threading (PRs and issues related to support for free-threaded CPython, a.k.a. nogil or no-GIL, PEP 703), module:ensemble, Performance

Comments

@glemaitre (Member)

Discussed in #20651

In forest algorithms, the preferred parallelization backend is threading. However, it no longer appears to be the most appropriate backend. As discussed in that thread, it might be that the GIL is not explicitly released in some parts of the code, blocking the execution of the threads.

We need to investigate further to solve this issue.
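A minimal sketch of the kind of reproduction discussed in #20651 (the dataset size and number of trees below are illustrative assumptions, not the original script): if the GIL is held during tree fitting, the wall-clock time barely improves as `n_jobs` grows.

```python
# Illustrative reproduction sketch (assumed sizes, not the script from #20651):
# time forest fitting with the threading backend at different n_jobs values.
from time import perf_counter

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)

for n_jobs in (1, 2, 4):
    clf = RandomForestClassifier(n_estimators=1_000, n_jobs=n_jobs, random_state=0)
    tic = perf_counter()
    clf.fit(X, y)
    print(f"n_jobs={n_jobs}: {perf_counter() - tic:.2f}s")
```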

@jjerphan (Member) commented Aug 3, 2021

From #20651 (reply in thread), two small optimizations could improve the scalability when using threads:

  • avoiding the redundant check_random_state call in BaseDecisionTree.fit, since the forest has already initialized the random state when bootstrapping the data for that specific tree;
  • avoiding the assert_all_finite call made by _check_sample_weight in BaseDecisionTree.fit (a rough timing of both calls is sketched below).
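
To make the per-tree cost concrete, here is an illustrative micro-timing of those two calls (the array size and tree count are assumptions, not measurements from the trace below); both run pure Python/NumPy code while holding the GIL, so their cost is paid serially once per tree:

```python
# Illustrative timing of the two per-tree validation calls mentioned above
# (n_samples and n_trees are assumptions, not numbers from the issue).
from time import perf_counter

import numpy as np
from sklearn.utils import check_random_state
from sklearn.utils.validation import _check_sample_weight

n_samples, n_trees = 100_000, 1_000
X = np.zeros((n_samples, 1))
sample_weight = np.ones(n_samples)

tic = perf_counter()
for _ in range(n_trees):
    check_random_state(0)                   # re-done in each BaseDecisionTree.fit
    _check_sample_weight(sample_weight, X)  # validates the weights (finiteness check included)
print(f"overhead for {n_trees} trees: {perf_counter() - tic:.2f}s")
```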

For reference, here is an extract of the sequential execution profiling:
[Screenshot: viztracer trace (160 MB JSON) viewed in the Perfetto UI, 2021-08-03]

@ogrisel (Member) commented Aug 4, 2021

I think threading is still the most appropriate backend by default. In #20651 the individual trees are very fast to train (4 ms) and many trees are fitted (1000), which is not representative of the typical use case.

If fitting each tree took 1 s or more, the GIL-holding segments of the code would be negligible and the thread-based scalability would be fine.
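
A rough sketch of that contrast (the sizes are assumptions chosen to illustrate the point, not benchmarks from this thread): many tiny trees versus fewer, more expensive trees, fitted with the same number of threads.

```python
# Illustrative contrast (assumed sizes): thread-based fitting scales much
# better when each tree represents a substantial amount of GIL-free work.
from time import perf_counter

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

configs = {
    "many tiny trees": dict(n_samples=500, n_estimators=1_000),
    "fewer larger trees": dict(n_samples=50_000, n_estimators=100),
}
for name, cfg in configs.items():
    X, y = make_classification(n_samples=cfg["n_samples"], n_features=20, random_state=0)
    timings = {}
    for n_jobs in (1, 4):
        clf = RandomForestClassifier(
            n_estimators=cfg["n_estimators"], n_jobs=n_jobs, random_state=0
        )
        tic = perf_counter()
        clf.fit(X, y)
        timings[n_jobs] = perf_counter() - tic
    print(f"{name}: speedup with 4 threads ~ {timings[1] / timings[4]:.1f}x")
```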

@brianbien commented Aug 4, 2021

> not representative

I agree, it's dataset dependent. Switching load_boston out for fetch_california in my example to get a larger training set, I see the expected speedup cross-platform.
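
The original snippet from #20651 is not reproduced here; below is a rough stand-in for the check described above, assuming fetch_california refers to sklearn.datasets.fetch_california_housing (the dataset is downloaded on first use).

```python
# Rough stand-in for the check described above (not the original snippet from
# #20651); assumes "fetch_california" means sklearn.datasets.fetch_california_housing.
from time import perf_counter

from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import RandomForestRegressor

X, y = fetch_california_housing(return_X_y=True)  # ~20k samples, 8 features

for n_jobs in (1, 4):
    reg = RandomForestRegressor(n_estimators=100, n_jobs=n_jobs, random_state=0)
    tic = perf_counter()
    reg.fit(X, y)
    print(f"n_jobs={n_jobs}: {perf_counter() - tic:.1f}s")
```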

@Bhavay-2001

Hey @ogrisel @brianbien, could you please help me resolve this issue? I would like to contribute to it.

@ogrisel (Member) commented Nov 2, 2021

@Bhavay192 Feel free to have a look at the code of the random forests and the decision trees, and to open a PR that solves one item of #20666 (comment) at a time to keep the PR focused.

Feel free to ask specific questions on gitter if you need more help: https://gitter.im/scikit-learn/scikit-learn

ogrisel added the free-threading label (PRs and issues related to support for free-threaded CPython, a.k.a. nogil or no-GIL, PEP 703) on Oct 4, 2024
@ogrisel (Member) commented Oct 4, 2024

Before opening a PR for this issue, we should reevaluate with a free-threading build of scikit-learn on CPython 3.13.

Here are some resources to get started: https://py-free-threading.github.io/

Scikit-learn's CI also automatically publishes nightly wheels for cp313t, as numpy and scipy do (along with the other nightly wheels):

https://scikit-learn.org/stable/developers/advanced_installation.html#installing-nightly-builds
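
A small sketch for checking the interpreter before re-running the benchmarks; sys._is_gil_enabled() exists on CPython 3.13+, and the pip command in the comment is the nightly-wheel installation documented on the page linked above.

```python
# Sketch for verifying the interpreter before re-running the benchmarks.
# Nightly cp313t wheels can be installed as documented on the page linked above, e.g.:
#   pip install --pre --extra-index-url \
#       https://pypi.anaconda.org/scientific-python-nightly-wheels/simple scikit-learn
import sys
import sysconfig

# True when running a free-threaded (cp313t) build of CPython.
print("free-threaded build:", bool(sysconfig.get_config_var("Py_GIL_DISABLED")))

# On CPython 3.13+, reports whether the GIL is actually enabled at runtime
# (it can be re-enabled by extensions that do not declare free-threading support).
if hasattr(sys, "_is_gil_enabled"):
    print("GIL enabled at runtime:", sys._is_gil_enabled())
```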
