PERF float32 propagation in GaussianMixture #30415
Conversation
@ogrisel I added a small adjustment and the float32 tests seem to pass now. Do we have any way to invoke them in the CI as well?
Yes, by pushing a commit whose message contains the appropriate commit message marker; see https://scikit-learn.org/stable/developers/contributing.html#commit-message-markers
Okay nice, let me see if the CI passes first before trying out float32.
I confirm that the tests now pass locally. Let me add a changelog entry.
Let me do a quick benchmark to see the impact of this change.
On my Apple M1 with Accelerate as the BLAS backend, this gives a nice ~3X speed up and a 2X memory efficiency improvement.

Benchmark script:

```python
from sklearn.mixture import GaussianMixture
from sklearn.datasets import make_blobs
import numpy as np
from time import perf_counter

X, y = make_blobs(n_samples=int(1e6), n_features=50, centers=10, random_state=0)
X = X.astype(np.float32)

tic = perf_counter()
GaussianMixture(n_components=10, covariance_type="full", random_state=0).fit(X)
toc = perf_counter()
print(f"Elapsed time: {toc - tic:.1f} s")
```

I used the scalene profiler to report memory usage.
I also tried with OpenBLAS. Memory usage is the same for OpenBLAS and Accelerate.
Even then there is a nice speedup.
Once this is merged, I think this estimator would be a nice candidate for array API support.
Great work! Beyond the performance enhancement, do you obtain identical results in your tests after downcasting X to np.float32?
The existing tests still pass when we run them with the float32 global dtype.
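For illustration only (not part of the original comment), one way to check this by hand, assuming the default kmeans initialization and well-separated data, would be something along these lines:

```python
import numpy as np
from numpy.testing import assert_allclose
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Fit the same model on float64 and float32 views of the same data.
X64, _ = make_blobs(n_samples=2000, n_features=5, centers=3, random_state=0)
X32 = X64.astype(np.float32)

gmm64 = GaussianMixture(n_components=3, random_state=0).fit(X64)
gmm32 = GaussianMixture(n_components=3, random_state=0).fit(X32)

# Results are expected to agree up to float32 precision, not bit-for-bit,
# hence the loose tolerance.
assert_allclose(gmm64.means_, gmm32.means_, rtol=1e-4)
```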
Looks reasonable to me. A few little comments.

One question: is there a unit test that checks that the dtype of `gmm.predict(X)` matches the dtype of `X`, or should it match the dtype of the `X` that was used for fitting? Or do we not need that kind of test for some reason?
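For illustration only (this is not a test from the PR): `predict` returns integer component labels, so the floating-point outputs whose dtype one could assert on are `predict_proba` and `score_samples`, along the lines of:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=500, n_features=4, centers=3, random_state=0)
X32 = X.astype(np.float32)

gmm = GaussianMixture(n_components=3, random_state=0).fit(X32)

# If float32 propagation works as intended, these outputs follow X's dtype.
assert gmm.predict_proba(X32).dtype == np.float32
assert gmm.score_samples(X32).dtype == np.float32
```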
sklearn/mixture/_base.py
Outdated
```diff
@@ -120,15 +120,15 @@ def _initialize_parameters(self, X, random_state):
             resp[np.arange(n_samples), label] = 1
         elif self.init_params == "random":
             resp = random_state.uniform(size=(n_samples, self.n_components))
-            resp /= resp.sum(axis=1)[:, np.newaxis]
+            resp /= resp.sum(axis=1)[:, np.newaxis].astype(X.dtype)
```
I am not sure it matters that much, but it seems like `resp` is always going to be float64 (apparently `.uniform` does not let you specify a `dtype`). In particular, the `.astype(X.dtype)` is not going to affect `resp.dtype`?
Indeed! TIL. If `resp.dtype` is not `float32`, then dividing by a `float32` won't change its dtype :-/ I guess we'd have to add a `resp = resp.astype(X.dtype)` and maybe can remove the `astype` from `resp /= resp.sum(axis=1)[:, np.newaxis].astype(X.dtype)`?
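As a quick illustration (not from the review thread), NumPy's legacy `RandomState` API, which is what the code above uses, behaves as follows:

```python
import numpy as np

rng = np.random.RandomState(0)
resp = rng.uniform(size=(3, 2))   # RandomState.uniform has no dtype argument
print(resp.dtype)                 # float64

# In-place division keeps the dtype of the left-hand operand, so casting
# only the divisor to float32 leaves resp as float64.
resp /= resp.sum(axis=1)[:, np.newaxis].astype(np.float32)
print(resp.dtype)                 # still float64

# Casting resp itself is what actually changes the dtype.
resp = resp.astype(np.float32)
print(resp.dtype)                 # float32
```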
Yes, this was a valid point, thanks for catching it @lesteve. I updated this and also added `global_dtype` in a test that uses this particular initialization.
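For context, a hypothetical test using scikit-learn's `global_dtype` fixture (not the actual test added in this PR) might look like:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def test_gaussian_mixture_random_init_dtype(global_dtype):
    # global_dtype is provided by scikit-learn's test suite and is either
    # np.float64 or np.float32 depending on how the tests are invoked.
    rng = np.random.RandomState(0)
    X = rng.standard_normal(size=(100, 3)).astype(global_dtype)
    gmm = GaussianMixture(
        n_components=2, init_params="random", random_state=0
    ).fit(X)
    assert gmm.means_.dtype == global_dtype
```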
Thanks, @lesteve and @OmarManzoor for the latest round of fixes / testing.
sklearn/mixture/_base.py
Outdated
```python
resp = random_state.uniform(size=(n_samples, self.n_components)).astype(
    X.dtype
)
```
I am not sure this makes a big difference, but we may as well avoid a copy if `X.dtype` is `float64` (which I guess is the most common use case)?
Suggested change:

```diff
-            resp = random_state.uniform(size=(n_samples, self.n_components)).astype(
-                X.dtype
-            )
+            resp = np.asarray(
+                random_state.uniform(size=(n_samples, self.n_components)), dtype=X.dtype
+            )
```
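For illustration (not from the review), the copy behaviour that motivates this suggestion:

```python
import numpy as np

a = np.random.RandomState(0).uniform(size=(4, 2))  # float64

# np.asarray returns the input unchanged when the dtype already matches...
print(np.asarray(a, dtype=np.float64) is a)        # True: no copy
# ...while ndarray.astype copies by default, even for the same dtype.
print(a.astype(np.float64) is a)                   # False: a copy is made

# When a cast is actually needed, both produce a new float32 array.
print(np.asarray(a, dtype=np.float32) is a)        # False
```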
Thanks, I have enabled auto-merge!
This is a draft PR to explore how much work is needed to ensure floating-point precision propagation in `GaussianMixture`, as part of the discussion in #30382. There are still some failing tests to fix.