API: Make np.squeeze always return an ndarray for scalar inputs#31561
API: Make np.squeeze always return an ndarray for scalar inputs#31561spandan11106 wants to merge 2 commits into
Conversation
mhvk
left a comment
There was a problem hiding this comment.
Thanks for the PR. Some comments on the tests in-line, but before worrying about those, I guess we'll need to think what the best behaviour is. As you note, the docstring is pretty clear that an array should be returned but of course the actual implementation is to defer to an object's squeeze() method. And a problem with the approach here is that np.squeeze(scalar) and scalar.squeeze() will now no longer do the same thing. I think that would be even more surprising.
I think we'll need to make a call. Either,
- Adjust
np.generic.squeeze()to return an array; or - Adjust the docstring of
np.squeeze()to note explicitly it defers to.squeeze()and that numpy scalars remain scalars.
I'm not sure what is best, though in the face of doubt, I think I slightly prefer to adjust the docstring to match reality rather than the reverse. Let me ping @seberg, @mattip for other opinions.
p.s. Unrelated, but I don't really like that np.squeeze(scalar, axis=0) does not error.
| assert result[()] == scalar | ||
|
|
||
| # axis=0 should also work for scalar inputs | ||
| result = np.squeeze(np.int_(1), axis=0) |
There was a problem hiding this comment.
I don't know that I would bother to test what to me seems surprising behaviour (I would expect an error, like for axis=1...)
| assert_raises(ValueError, a.squeeze, axis=(1,)) | ||
| assert_equal(a.squeeze(axis=(2,)), [[1, 2, 3]]) | ||
|
|
||
| def test_squeeze_scalar_returns_0d_array(self): |
There was a problem hiding this comment.
To me, it would make more sense to move the tests to test_numeric.py, since we're testing a function from fromnumeric
| def test_squeeze_scalar_returns_0d_array(self): | ||
| # np.squeeze should always return an ndarray, even for | ||
| # np.generic scalar inputs (gh-27709) | ||
| for scalar in [np.int_(1), np.float64(2.5), np.complex128(1+2j)]: |
There was a problem hiding this comment.
I'd use @pytest.mark.parametrize on the function. You can then include a python scalar too.
| # np.generic scalar inputs (gh-27709) | ||
| for scalar in [np.int_(1), np.float64(2.5), np.complex128(1+2j)]: | ||
| result = np.squeeze(scalar) | ||
| assert isinstance(result, np.ndarray), ( |
There was a problem hiding this comment.
No need for the explanation in the assert.
Yeah, I asked the same on the issue. I think Joren would prefer to make it typing clear one way or another. For myself, it feels a bit silly for EDIT: We should probably move to the issue, but I am not sure we can move this PR forward until we have a decision there. |
|
@seberg - oops, should have seen that this was to fix an issue, I've now moved discussion there. FWIW, I'm now leaning even more to just have the docstring match the implementation instead of the reverse. |
PR summary
Closes #30109.
Currently,
np.squeezebehaves inconsistently with scalar-like inputs:The issue is that
np.genericsubclasses have a.squeeze()method that returnsself(the scalar itself). The currentsqueezeimplementation infromnumeric.pytriesa.squeezefirst, which succeeds fornp.genericscalars and returns the scalar unchanged. This contradicts the docstring, which states:Changes:
np.genericinputs to 0d arrays innumpy/_core/fromnumeric.py.numpy/_core/fromnumeric.pyito remove the scalar-returning overload ofsqueeze(so it falls back to returningNDArray[ScalarT]).numpy/_core/tests/test_multiarray.pychecking various scalar types,axis=0, and Python scalars.doc/release/upcoming_changes/27709.compatibility.rst.First time committer introduction
Hi everyone! I am Spandan. I use NumPy for development/data science work, and noticed the inconsistent behavior when squeezing NumPy scalars compared to Python scalars. I wanted to contribute to the repository to clean up this edge case.
AI Disclosure
Antigravity was used to assist in researching the issue and drafting the test cases.