Feature/expected categories #1597

Open
wants to merge 10 commits into
base: main
fix mypy complaints
ColdTeapot273K committed Aug 24, 2024
commit 4ff7bb2f53715cd4cf25d0f6a524190c5a12ec72
9 changes: 5 additions & 4 deletions river/preprocessing/ordinal.py
@@ -82,7 +82,7 @@ class OrdinalEncoder(base.MiniBatchTransformer):
{'country': 0, 'place': 3}
{'country': 0, 'place': 0}
{'country': -1, 'place': -1}

>>> import pandas as pd
>>> xb1 = pd.DataFrame(X[0:4], index=[0, 1, 2, 3])
>>> xb2 = pd.DataFrame(X[4:8], index=[4, 5, 6, 7])
@@ -107,13 +107,14 @@ class OrdinalEncoder(base.MiniBatchTransformer):

     def __init__(
         self,
-        categories: str | dict = "auto",
+        categories = "auto",
         unknown_value: int | None = 0,
         none_value: int = -1,
     ):
         self.unknown_value = unknown_value
         self.none_value = none_value
         self.categories = categories
+        self.values: collections.defaultdict | dict | None = None

         if self.categories == "auto":
             # We're going to have one auto-incrementing counter per feature. This counter will generate
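The comment above describes one auto-incrementing counter per feature. The hunk is cut off before the counter itself, so the following is only a minimal sketch of that idea, not the code from this pull request. The names `make_counter`, `counters`, `values`, `learn` and `encode` are hypothetical, and skipping the reserved codes (`unknown_value`, `none_value`) is an assumption about the intended behaviour.

```python
import collections
import functools
import itertools


def make_counter(reserved):
    """Yield 0, 1, 2, ... while skipping reserved codes (hypothetical helper)."""
    return (i for i in itertools.count() if i not in reserved)


unknown_value, none_value = 0, -1

# One lazily created counter per feature name.
counters = collections.defaultdict(
    functools.partial(make_counter, {unknown_value, none_value})
)
# Outer dict: feature -> inner dict. Inner dict: category -> integer code.
values = collections.defaultdict(dict)


def learn(x):
    """Assign the next free code to every category not seen before."""
    for feature, category in x.items():
        if category is not None and category not in values[feature]:
            values[feature][category] = next(counters[feature])


def encode(x):
    """Map categories to codes, falling back to the reserved values."""
    return {
        feature: none_value
        if category is None
        else values[feature].get(category, unknown_value)
        for feature, category in x.items()
    }


learn({"country": "France", "place": "Taco Bell"})
learn({"country": "Sweden", "place": "Burger King"})
enc_known = encode({"country": "Sweden", "place": "Taco Bell"})
enc_unknown = encode({"country": "Japan", "place": None})
```

In this sketch each feature draws codes from its own generator, so codes are dense per feature and never collide with the two reserved values.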
@@ -124,9 +125,9 @@ def __init__(

             # We're going to store the categories in a dict of dicts. The outer dict will map each
             # feature to its inner dict. The inner dict will map each category to its code.
-            self.values: collections.defaultdict = collections.defaultdict(dict)
+            self.values = collections.defaultdict(dict)
         else:
-            self.values: dict = self.categories
+            self.values = self.categories


     def transform_one(self, x):
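For context on the commit message: mypy typically rejects annotating the same attribute twice with different types, once in each branch of an `if`/`else`, as a redefinition. The change in this commit appears to follow the usual remedy, which is to declare the attribute once with a type wide enough for both branches and then assign without annotations. Below is a minimal sketch of that pattern. `Encoder` is a hypothetical stand-in for the class in the diff, and leaving `categories` unannotated mirrors the diff but is presumably a workaround rather than a requirement.

```python
from __future__ import annotations

import collections


class Encoder:
    """Hypothetical stand-in for the class touched by this commit."""

    def __init__(self, categories="auto"):
        self.categories = categories
        # Declare the attribute once, with a union covering both branches.
        self.values: collections.defaultdict | dict | None = None

        if self.categories == "auto":
            # Learned mapping: feature -> {category: code}.
            self.values = collections.defaultdict(dict)
        else:
            # User-supplied mapping, used as given.
            self.values = self.categories


auto_encoder = Encoder()
fixed_encoder = Encoder(categories={"country": {"France": 1}})
```

A narrower alternative would be to keep `categories` annotated and narrow it with `isinstance` before the assignment, but that is a design choice the diff does not make.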