PERF set openmp to use only physical cores by default (#26082)
ogrisel authored Apr 6, 2023
1 parent a7a416f commit 5b46d01
Showing 2 changed files with 33 additions and 5 deletions.
11 changes: 11 additions & 0 deletions doc/whats_new/v1.3.rst
@@ -126,6 +126,17 @@ Changes impacting all modules

   :pr:`25044` by :user:`Julien Jerphanion <jjerphan>`.

+- |Enhancement| All estimators that internally rely on OpenMP multi-threading
+  (via Cython) now use a number of threads equal to the number of physical
+  (instead of logical) cores by default. In the past, we observed that using as
+  many threads as logical cores on SMT hosts could sometimes cause severe
+  performance problems depending on the algorithms and the shape of the data.
+  Note that it is still possible to manually adjust the number of threads used
+  by OpenMP as documented in :ref:`parallelism`.
+
+  :pr:`26082` by :user:`Jérémie du Boisberranger <jeremiedbb>` and
+  :user:`Olivier Grisel <ogrisel>`.
+
 Changelog
 ---------

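The changelog entry notes that the OpenMP thread count can still be adjusted manually. One common way is the standard OpenMP environment variable `OMP_NUM_THREADS`, set before the process starts. The following is a minimal sketch, not part of this commit: the helper name `run_with_omp_num_threads` is hypothetical, and the child process only echoes the variable where a real script would import and use scikit-learn.

```python
import os
import subprocess
import sys


def run_with_omp_num_threads(n_threads, code):
    """Run `code` in a fresh Python process with OMP_NUM_THREADS set.

    Hypothetical helper for illustration only. The variable is set in the
    child's environment so that it is visible before any OpenMP runtime
    is loaded there.
    """
    env = dict(os.environ, OMP_NUM_THREADS=str(n_threads))
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# The child only reports the value it sees.
print(run_with_omp_num_threads(2, "import os; print(os.environ['OMP_NUM_THREADS'])"))
```

Setting the variable in a child process rather than in the current one avoids depending on whether an OpenMP runtime has already been initialized.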
27 changes: 22 additions & 5 deletions sklearn/utils/_openmp_helpers.pyx
@@ -2,6 +2,12 @@ import os
 from joblib import cpu_count


+# Module level cache for cpu_count as we do not expect this to change during
+# the lifecycle of a Python program. This dictionary is keyed by
+# only_physical_cores.
+_CPU_COUNTS = {}
+
+
 def _openmp_parallelism_enabled():
     """Determines whether scikit-learn has been built with OpenMP
@@ -12,7 +18,7 @@ def _openmp_parallelism_enabled():
     return SKLEARN_OPENMP_PARALLELISM_ENABLED


-cpdef _openmp_effective_n_threads(n_threads=None, only_physical_cores=False):
+cpdef _openmp_effective_n_threads(n_threads=None, only_physical_cores=True):
     """Determine the effective number of threads to be used for OpenMP calls

     - For ``n_threads = None``,
@@ -33,6 +39,15 @@ cpdef _openmp_effective_n_threads(n_threads=None, only_physical_cores=False):
     - Raise a ValueError for ``n_threads = 0``.

+    Passing the `only_physical_cores=False` flag makes it possible to use extra
+    threads for SMT/HyperThreading logical cores. It has been empirically
+    observed that using as many threads as available SMT cores can slightly
+    improve the performance in some cases, but can severely degrade
+    performance other times. Therefore it is recommended to use
+    `only_physical_cores=True` unless an empirical study has been conducted to
+    assess the impact of SMT on a case-by-case basis (using various input data
+    shapes, in particular small data shapes).
+
     If scikit-learn is built without OpenMP support, always return 1.
     """
     if n_threads == 0:
@@ -47,10 +62,12 @@ cpdef _openmp_effective_n_threads(n_threads=None, only_physical_cores=False):
         # to exceed the number of cpus.
         max_n_threads = omp_get_max_threads()
     else:
-        max_n_threads = min(
-            omp_get_max_threads(),
-            cpu_count(only_physical_cores=only_physical_cores)
-        )
+        try:
+            n_cpus = _CPU_COUNTS[only_physical_cores]
+        except KeyError:
+            n_cpus = cpu_count(only_physical_cores=only_physical_cores)
+            _CPU_COUNTS[only_physical_cores] = n_cpus
+        max_n_threads = min(omp_get_max_threads(), n_cpus)

     if n_threads is None:
         return max_n_threads
