
I can not achieve inplace scaling by using sklearn.preprocessing.minmax_scale #27307

Closed
guanjiesun opened this issue Sep 6, 2023 · 6 comments · Fixed by #27691

@guanjiesun

Describe the bug

With copy=False, the input ndarray is unexpectedly left unchanged.

image

Steps/Code to Reproduce

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sklearn.preprocessing as pre

np.random.seed(10)
data = np.random.randint(1, 10, size=(5, 3))
print(data)
pre.minmax_scale(data, feature_range=(0, 1), axis=0, copy=False)
print(data)

Expected Results

A reasonable explanation of the copy parameter of the minmax_scale function.

Actual Results

There are no warnings or errors, but the result is wrong: data is unchanged.

image

Versions

Python dependencies:
      sklearn: 1.3.0
          pip: 23.2.1
   setuptools: 65.5.0
        numpy: 1.25.2
        scipy: 1.11.2
       Cython: None
       pandas: 2.0.3
   matplotlib: 3.7.2
       joblib: 1.3.2
threadpoolctl: 3.2.0
@guanjiesun guanjiesun added Bug Needs Triage Issue requires triage labels Sep 6, 2023
@lesteve
Member

lesteve commented Sep 6, 2023

copy=False only works if the input array dtype is a float dtype, i.e. float64, float32 or float16 right now. I guess maybe the documentation could be improved to mention this?

In your case the input array dtype is an int dtype.
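
The dtype dependence described above can be checked with a small sketch. The array values below are illustrative, not the reporter's random data, and the behavior shown is what is expected from the explanation in this comment rather than a documented guarantee.

```python
import numpy as np
from sklearn.preprocessing import minmax_scale

# Illustrative data (an assumption, not the reporter's array): same values, two dtypes.
int_data = np.array([[1, 8, 3], [4, 2, 9], [7, 5, 6]])  # integer dtype
float_data = int_data.astype(np.float64)                 # float dtype

int_before = int_data.copy()

# Integer input: converting to float creates a new array, so the
# original is left untouched even with copy=False.
minmax_scale(int_data, feature_range=(0, 1), axis=0, copy=False)
int_unchanged = np.array_equal(int_data, int_before)

# Float input: no conversion is needed, so the scaling can happen in place.
minmax_scale(float_data, feature_range=(0, 1), axis=0, copy=False)
float_scaled = bool(
    np.allclose(float_data.min(axis=0), 0.0)
    and np.allclose(float_data.max(axis=0), 1.0)
)

print("int input unchanged:", int_unchanged)
print("float input scaled in place:", float_scaled)
```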

@lesteve lesteve added Documentation help wanted and removed Bug Needs Triage Issue requires triage labels Sep 7, 2023
@TaiJuWu

TaiJuWu commented Sep 7, 2023

After this line
there is a new instance of X, and the dtype of the new one is float.
So maybe you should modify your code as below:
data=pre.minmax_scale(data, feature_range=(0, 1), axis=0, copy=False)
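
A minimal sketch of this workaround, using illustrative values and a hypothetical `original` name that is kept only to show the input stays an integer array:

```python
import numpy as np
from sklearn.preprocessing import minmax_scale

# Illustrative integer data (an assumption, not the reporter's array).
original = np.array([[1, 8, 3], [4, 2, 9], [7, 5, 6]])
data = original

# Rebind the name to the returned array instead of relying on in-place mutation.
data = minmax_scale(data, feature_range=(0, 1), axis=0, copy=False)

print(original.dtype)  # still an integer dtype
print(data.dtype)      # a float dtype holding the scaled values
```

Rebinding works for any input dtype, at the cost of the extra float array that the integer-to-float conversion has to allocate anyway.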

@guanjiesun
Author

> copy=False only works if the input array dtype is a float dtype, i.e. float64, float32 or float16 right now. I guess maybe the documentation could be improved to mention this?
>
> In your case the input array dtype is an int dtype.

Yes, thanks for your answer! The official doc really needs an improvement.

@guanjiesun
Author

> After this line There is a new instance of X and the data type of new one is float. So maybe you should modify your code to below. data=pre.minmax_scale(data, feature_range=(0, 1), axis=0, copy=False)

Thanks for your reply, but I think the function shouldn't return anything and should just normalize the data in place.

After I set the dtype of data to a float dtype, data is normalized in place, but the function still returns a useless copy of the data, i.e., data_copy. This makes no sense and is not consistent with the official doc, which says "copy=False would avoid a copy of data".

image
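
Whether the returned array is really a separate copy can be checked rather than assumed. The sketch below uses illustrative float data and `np.shares_memory`; the outcome may depend on the scikit-learn version, so no expected output is shown.

```python
import numpy as np
from sklearn.preprocessing import minmax_scale

# Illustrative float data (an assumption, not the reporter's array).
data = np.array([[1.0, 8.0, 3.0], [4.0, 2.0, 9.0], [7.0, 5.0, 6.0]])

result = minmax_scale(data, feature_range=(0, 1), axis=0, copy=False)

# True would mean the result reuses the input's buffer instead of a fresh copy.
shares = np.shares_memory(result, data)

print("result shares memory with input:", shares)
print("input was scaled in place:", np.allclose(data.max(axis=0), 1.0))
```

If the two arrays share memory, the return value is just another reference to the scaled input, which keeps the function usable in expressions without allocating extra memory.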

Official doc about the copy parameter of function minmax_scale

image

@karthic25

/take

@konstantinos-p
Contributor

This issue has stalled, to the best of my knowledge.
