@marcelkore thanks for taking the time to drop a comment -- it helps to know I am not talking to myself and someone finds this useful! =)
I ran into a problem with the Hashing Encoder, and it may affect your code too, since I used your code exactly as written.
Would you mind visiting my GitHub and taking a look at the problem?
Here is the URL : https://github.com/HaeHwan/hello-world/blob/master/Hashing(2).ipynb
The main problem is that HashingEncoder doesn't change the columns at all, as you can see at the URL above.
Thanks.
Hi @HaeHwan
That's strange... I cannot reproduce your error. If you try running the following script in the terminal, what do you get?
import pandas as pd
import category_encoders as ce
print(f"""
Version check:
--------------
Pandas version: {pd.__version__}
Category Encoders version: {ce.__version__}
""")
df_train = pd.read_csv('https://raw.githubusercontent.com/kiwidamien/StackedTurtles/master/content/preprocessing/simple_loan_example.csv')
encoder_purpose = ce.HashingEncoder(n_components=3, cols=['purpose'])
df_transform = encoder_purpose.fit_transform(df_train)
print(df_transform)
For reference, my output is
Version check:
--------------
Pandas version: 0.24.2
Category Encoders version: 2.1.0
col_0 col_1 col_2 annual_income debt_to_income loan_amount grade repaid
0 0 0 1 120000 0.100 3500 A True
1 0 0 1 130000 0.500 13800 C False
2 0 0 1 220000 0.400 33500 B False
3 0 0 1 65000 0.250 2000 B False
4 0 0 1 60000 0.200 2200 B True
5 1 0 0 45000 0.312 5500 D True
6 1 0 0 75000 0.111 2000 B True
7 0 1 0 24000 0.400 500 C False
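In case it helps to see why each row above has exactly one 1 across col_0 to col_2, here is a minimal, standard-library-only sketch of the hashing trick. It is an illustration under assumptions, not the library's actual code: the helper names `hash_bucket` and `hash_encode` and the sample values are hypothetical, and I am assuming an md5-based hash taken modulo the number of components.

```python
import hashlib


def hash_bucket(value, n_components=3, hash_method="md5"):
    """Map a category value to a bucket index in [0, n_components)."""
    # Hash the string form of the value, then reduce it modulo the bucket count.
    digest = hashlib.new(hash_method, str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % n_components


def hash_encode(values, n_components=3):
    """Return one row per value, with a single 1 in the hashed bucket."""
    rows = []
    for value in values:
        row = [0] * n_components
        row[hash_bucket(value, n_components)] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Hypothetical category values, not taken from the CSV above.
    for value, row in zip(["car", "wedding", "car"], hash_encode(["car", "wedding", "car"])):
        print(value, row)
```

Equal values always land in the same bucket, and distinct values can collide when there are more categories than components, which is the trade-off you accept for a fixed number of output columns.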
Oh, I finally solved it! The problem was probably the number of processes used on my laptop. I set "max_process=1" on the encoder and now it works. Thank you for your kindness.
Hi @kiwidamien, Thanks for sharing this. It helps me a lot. I'm unable to open the link to "An introduction to pipelines". Can you please look into this?
Your notebooks/posts are so easy to follow! I have spent the better part of this afternoon reviewing your posts. Thanks for sharing!