Commit 57db1ed

SultanOrazbayev authored and jinglinpeng committed
fix(eda):type error for npartitions
Using `np.ceil` returns a float, which triggers errors downstream.

```python
In [5]: import numpy as np

In [6]: np.ceil(1.0001)
Out[6]: 2.0
```

For a reproducible error example, see this question: https://stackoverflow.com/questions/72453608/dataprep-eda-typeerror-please-provide-npartitions-as-an-int-or-possibly-as-non.
1 parent 0686ff5 commit 57db1ed

1 file changed, +2 -1 lines changed

dataprep/clean/utils.py

Lines changed: 2 additions & 1 deletion
@@ -6,6 +6,7 @@
 import dask.dataframe as dd
 import numpy as np
 import pandas as pd
+from math import ceil
 
 
 NULL_VALUES = {
@@ -69,7 +70,7 @@ def to_dask(df: Union[pd.DataFrame, dd.DataFrame]) -> dd.DataFrame:
         return df
 
     df_size = df.memory_usage(deep=True).sum()
-    npartitions = np.ceil(df_size / 128 / 1024 / 1024)  # 128 MB partition size
+    npartitions = ceil(df_size / 128 / 1024 / 1024)  # 128 MB partition size
     return dd.from_pandas(df, npartitions=npartitions)
 
 
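The partition-count arithmetic in this change can be tried in isolation. The following is a minimal sketch using only the standard library; `PARTITION_BYTES` and `partition_count` are hypothetical names introduced for illustration and are not part of `dataprep`. It shows that `math.ceil` yields a plain `int`, which is the type `npartitions` is expected to have.

```python
from math import ceil

# 128 MB partition size, matching the comment in the diff.
PARTITION_BYTES = 128 * 1024 * 1024


def partition_count(size_bytes: int) -> int:
    """Hypothetical helper: number of 128 MB partitions needed for size_bytes."""
    # math.ceil returns an int in Python 3, unlike np.ceil, which returns a float.
    return ceil(size_bytes / PARTITION_BYTES)


small = partition_count(10_000)              # far below one partition
large = partition_count(300 * 1024 * 1024)   # 300 MB spans three partitions
print(type(small).__name__, small, large)
```

Wrapping the old expression as `int(np.ceil(...))` would likely have worked as well; using `math.ceil` avoids the extra conversion.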
