Description
Most functions in scipy.linalg (e.g. svd, qr, eig, eigh, pinv, pinv2, ...) take a keyword argument check_finite that defaults to True, and in scikit-learn we typically leave it at that default.
As we already validate the input data for most estimators in scikit-learn, this check is redundant and can cause significant overhead, especially at predict / transform time. We should probably always call those functions with an explicit check_finite=False in scikit-learn.
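To illustrate the proposed pattern, here is a minimal sketch rather than actual scikit-learn code. The helper name `validated_svd` and the hand-written finiteness check are assumptions standing in for scikit-learn's own input validation; only `scipy.linalg.svd` and its `check_finite` parameter come from SciPy.

```python
import numpy as np
from scipy import linalg


def validated_svd(X):
    """Hypothetical helper: validate once, then skip SciPy's own check."""
    # Stand-in for the validation scikit-learn already performs on input data.
    X = np.asarray(X, dtype=np.float64)
    if not np.isfinite(X).all():
        raise ValueError("Input contains NaN or infinity.")
    # X is known to be finite here, so the second scan inside SciPy is redundant.
    return linalg.svd(X, full_matrices=False, check_finite=False)


rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))

U, s, Vt = validated_svd(X)

# Reference call with the default check_finite=True, for comparison.
U_ref, s_ref, Vt_ref = linalg.svd(X, full_matrices=False)
```

On finite input the two calls should return the same decomposition; the only intended difference is that the first skips the extra pass over the array.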
This issue will probably need to be addressed in several PRs, likely one per module that imports scipy.linalg.
We should still make sure that the estimators raise a ValueError with the expected error message when fed NumPy arrays containing non-finite values (-np.inf, np.inf or np.nan). This can be checked manually by calling sklearn.utils.estimator_checks.check_estimators_nan_inf on the estimator. That check should be called automatically by sklearn.tests.test_common, but we need to verify that this is actually the case when reviewing such PRs.
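For a quick manual sanity check while reviewing, something like the following hand-rolled sketch could be used. It is not a replacement for check_estimators_nan_inf: the helper `raises_value_error_on` is hypothetical, PCA is picked only as an example of an estimator that relies on scipy.linalg, and the sketch does not verify the error message itself.

```python
import numpy as np
from sklearn.decomposition import PCA


def raises_value_error_on(estimator, X):
    """Hypothetical helper: True if fitting on X raises a ValueError."""
    try:
        estimator.fit(X)
    except ValueError:
        return True
    return False


rng = np.random.default_rng(0)
X_clean = rng.standard_normal((20, 5))

# Inject each kind of non-finite value into an otherwise valid array.
bad_values = {"nan": np.nan, "+inf": np.inf, "-inf": -np.inf}
results = {}
for label, value in bad_values.items():
    X_bad = X_clean.copy()
    X_bad[0, 0] = value
    results[label] = raises_value_error_on(PCA(n_components=2), X_bad)

for label, raised in results.items():
    print(f"{label}: ValueError raised = {raised}")
```

If any entry comes back False after a module has been switched to check_finite=False, the estimator's own input validation is not covering that code path.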