Description
OpenBLAS v0.3.28 will have a new feature allowing OpenBLAS to use a threadpool chosen by the user (see OpenMathLib/OpenBLAS#4577).
This is very interesting because it would solve a performance issue that occurs when BLAS calls and OpenMP (prange) calls follow each other in quick succession. The issue arises when OpenBLAS and OpenMP don't share the same threadpool: both threadpools are in active wait mode when they're idle, so the idle threads of one pool compete for CPU with the working threads of the other (see OpenMathLib/OpenBLAS#3187 for details). This is currently the case, since numpy and scipy wheels are built against OpenBLAS with the pthreads threading layer.
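To check whether a given environment is in this situation, something like the following could be used. It is only a rough sketch: it assumes `threadpoolctl` is installed and relies on the `user_api`, `internal_api` and `threading_layer` fields reported by `threadpool_info()`. The helper names `mismatched_threadpools` and `check_current_process` are made up for illustration, and the check is a heuristic (an OpenBLAS built with the OpenMP layer could still be linked to a different OpenMP runtime than the one scikit-learn uses).

```python
def mismatched_threadpools(modules):
    """Heuristic: True if an OpenMP runtime is loaded together with an
    OpenBLAS whose threading layer is not OpenMP (e.g. pthreads).

    `modules` is a list of dicts shaped like threadpoolctl.threadpool_info().
    """
    has_openmp = any(m.get("user_api") == "openmp" for m in modules)
    openblas_layers = [
        m.get("threading_layer")
        for m in modules
        if m.get("internal_api") == "openblas"
    ]
    # With no OpenBLAS loaded the list is empty and any() returns False.
    return has_openmp and any(layer != "openmp" for layer in openblas_layers)


def check_current_process():
    """Inspect the libraries loaded in the running process (hypothetical helper)."""
    # Imported lazily so the pure helper above works without these packages.
    import numpy  # noqa: F401  loads the BLAS that numpy is linked against
    import sklearn  # noqa: F401  loads the OpenMP runtime used by scikit-learn
    from threadpoolctl import threadpool_info

    return mismatched_threadpools(threadpool_info())
```

With the current numpy and scipy wheels, `check_current_process()` would be expected to return `True`, since OpenBLAS reports the pthreads threading layer next to scikit-learn's OpenMP runtime.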
This issue currently impacts some estimators and functions such as KMeans (#20642), NMF (#16439), pairwise_distances (#26097), ...
Being able to configure OpenBLAS to use our OpenMP threadpool would let us get rid of this issue even if numpy and scipy keep building their wheels against OpenBLAS with pthreads (which is very likely).
I'm not sure yet whether or how OpenMathLib/OpenBLAS#4577 would make this possible, so I'm opening this issue to track progress on the subject.