Commit 0150278

feat: Adding docs outlining native Python transformations on singletons (feast-dev#4741)
1 parent 4a89252 commit 0150278

1 file changed

Lines changed: 105 additions & 71 deletions

Original file line number | Diff line number | Diff line change
@@ -1,122 +1,147 @@
1-
# \[Beta] On demand feature view
1+
# [Beta] On Demand Feature Views
22

3-
**Warning**: This is an experimental feature. To our knowledge, this is stable, but there are still rough edges in the experience. Contributions are welcome!
3+
**Warning**: This is an experimental feature. While it is stable to our knowledge, there may still be rough edges in the experience. Contributions are welcome!
44

55
## Overview
66

7-
On Demand Feature Views (ODFVs) allow data scientists to use existing features and request-time data (features only
8-
available at request time) to transform and create new features. Users define Python transformation logic which is
9-
executed during both historical retrieval and online retrieval. Additionally, ODFVs provide flexibility in
10-
applying transformations either during data ingestion (at write time) or during feature retrieval (at read time),
11-
controlled via the `write_to_online_store` parameter.
7+
On Demand Feature Views (ODFVs) allow data scientists to use existing features and request-time data to transform and
8+
create new features. Users define transformation logic that is executed during both historical and online retrieval.
9+
Additionally, ODFVs provide flexibility in applying transformations either during data ingestion (at write time) or
10+
during feature retrieval (at read time), controlled via the `write_to_online_store` parameter.
1211

1312
By setting `write_to_online_store=True`, transformations are applied during data ingestion, and the transformed
1413
features are stored in the online store. This can improve online feature retrieval performance by reducing computation
1514
during reads. Conversely, if `write_to_online_store=False` (the default if omitted), transformations are applied during
1615
feature retrieval.
1716

18-
### Why use on demand feature views?
17+
### Why Use On Demand Feature Views?
1918

20-
This enables data scientists to easily impact the online feature retrieval path. For example, a data scientist could
19+
ODFVs enable data scientists to easily impact the online feature retrieval path. For example, a data scientist could:
2120

22-
1. Call `get_historical_features` to generate a training dataframe
23-
2. Iterate in notebook on feature engineering in Pandas/Python
24-
3. Copy transformation logic into ODFVs and commit to a development branch of the feature repository
25-
4. Verify with `get_historical_features` (on a small dataset) that the transformation gives expected output over historical data
21+
1. Call `get_historical_features` to generate a training dataset.
22+
2. Iterate in a notebook and do your feature engineering using Pandas or native Python.
23+
3. Copy transformation logic into ODFVs and commit to a development branch of the feature repository.
24+
4. Verify with `get_historical_features` (on a small dataset) that the transformation gives the expected output over historical data.
2625
5. Decide whether to apply the transformation on writes or on reads by setting the `write_to_online_store` parameter accordingly.
27-
6. Verify with `get_online_features` on dev branch that the transformation correctly outputs online features
28-
7. Submit a pull request to the staging / prod branches which impact production traffic
26+
6. Verify with `get_online_features` on the development branch that the transformation correctly outputs online features.
27+
7. Submit a pull request to the staging or production branches, which affects production traffic.
2928

30-
## CLI
29+
## Transformation Modes
3130

32-
There are new CLI commands:
31+
When defining an ODFV, you can specify the transformation mode using the `mode` parameter. Feast supports the following modes:
3332

34-
* `feast on-demand-feature-views list` lists all registered on demand feature view after `feast apply` is run
35-
* `feast on-demand-feature-views describe [NAME]` describes the definition of an on demand feature view
33+
- **Pandas Mode (`mode="pandas"`)**: The transformation function takes a Pandas DataFrame as input and returns a Pandas DataFrame as output. This mode is useful for batch transformations over multiple rows.
34+
- **Native Python Mode (`mode="python"`)**: The transformation function uses native Python and can operate on inputs as lists of values or as single dictionaries representing a singleton (single row).
3635

37-
## Example
36+
### Singleton Transformations in Native Python Mode
37+
38+
Native Python mode supports transformations on singleton dictionaries by setting `singleton=True`. This allows you to
39+
write transformation functions that operate on a single row at a time, making the code more intuitive and aligning with
40+
how data scientists typically think about data transformations.
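
The difference between the two input shapes can be sketched in plain Python, with no Feast dependency. This is an illustration only: `apply_rowwise` is a hypothetical helper, not a Feast API, and it does not describe how Feast executes singleton transformations internally. It shows that a function written for one row can be lifted to column-oriented (list) inputs and give the same per-row results.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]            # one value per feature (singleton shape)
Columns = Dict[str, List[Any]]  # one list of values per feature (list shape)


def conv_rate_plus_acc(row: Row) -> Row:
    """Singleton-style transformation: one input row in, one output row out."""
    return {"conv_rate_plus_acc_singleton": row["conv_rate"] + row["acc_rate"]}


def apply_rowwise(fn: Callable[[Row], Row], columns: Columns) -> Columns:
    """Hypothetical helper (not part of Feast): run a row function over list inputs."""
    names = list(columns)
    output: Columns = {}
    # Walk the columns in lockstep, producing one tuple of values per row.
    for values in zip(*(columns[name] for name in names)):
        row = dict(zip(names, values))
        for key, value in fn(row).items():
            output.setdefault(key, []).append(value)
    return output


# A single row is passed directly to the function.
single = conv_rate_plus_acc({"conv_rate": 0.5, "acc_rate": 0.25})

# The same function applied to two rows stored as lists of values.
batch = apply_rowwise(
    conv_rate_plus_acc,
    {"conv_rate": [0.5, 1.0], "acc_rate": [0.25, 2.0]},
)

print(single)
print(batch)
```

Writing the logic against a single row keeps the function free of `zip` and list comprehensions, which is the readability benefit this section describes.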
3841

42+
## Example
3943
See [https://github.com/feast-dev/on-demand-feature-views-demo](https://github.com/feast-dev/on-demand-feature-views-demo) for an example on how to use on demand feature views.
4044

41-
### **Registering transformations**
4245

43-
On Demand Transformations support transformations using Pandas and native Python. Note, Native Python is much faster
44-
but not yet tested for offline retrieval.
46+
## Registering Transformations
4547

46-
When defining an ODFV, you can control when the transformation is applied using the write_to_online_store parameter:
48+
When defining an ODFV, you can control when the transformation is applied using the `write_to_online_store` parameter:
4749

4850
- `write_to_online_store=True`: The transformation is applied during data ingestion (on write), and the transformed features are stored in the online store.
49-
- `write_to_online_store=False` (default when omitted): The transformation is applied during feature retrieval (on read).
51+
- `write_to_online_store=False` (default): The transformation is applied during feature retrieval (on read).
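
The trade-off between the two settings can be sketched with a toy in-memory store. Everything here is an assumption made for illustration: `ToyOnlineStore` and its methods are hypothetical and do not reflect how Feast stores or serves data. The sketch only counts how often the transformation runs in each mode.

```python
from typing import Callable, Dict

Row = Dict[str, float]


def adjust_conv_rate(row: Row) -> Row:
    """Toy transformation: adjust conv_rate by 10%."""
    return {"conv_rate_adjusted": row["conv_rate"] * 1.1}


class ToyOnlineStore:
    """Hypothetical in-memory store; not Feast's implementation."""

    def __init__(
        self,
        transform: Callable[[Row], Row],
        write_to_online_store: bool = False,
    ) -> None:
        self.transform = transform
        self.write_to_online_store = write_to_online_store
        self.rows: Dict[int, Row] = {}
        self.transform_calls = 0

    def _run(self, row: Row) -> Row:
        self.transform_calls += 1
        return self.transform(row)

    def write(self, key: int, row: Row) -> None:
        # On write: transform once at ingestion and persist the derived features.
        # Otherwise: persist the raw input features unchanged.
        self.rows[key] = self._run(row) if self.write_to_online_store else dict(row)

    def read(self, key: int) -> Row:
        stored = self.rows[key]
        # On read: the transformation runs on every retrieval.
        return dict(stored) if self.write_to_online_store else self._run(stored)


on_write = ToyOnlineStore(adjust_conv_rate, write_to_online_store=True)
on_read = ToyOnlineStore(adjust_conv_rate)

for store in (on_write, on_read):
    store.write(1001, {"conv_rate": 0.5})
    store.read(1001)
    store.read(1001)

# One write followed by two reads: 1 transformation on write, 2 on read.
print(on_write.transform_calls, on_read.transform_calls)
```

Both stores return the same feature values; the write-time variant simply moves the computation out of the read path, at the cost of storing derived values.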
5052

51-
We register `RequestSource` inputs and the transform in `on_demand_feature_view`:
53+
### Examples
5254

53-
## Example of an On Demand Transformation on Read
55+
#### Example 1: On Demand Transformation on Read Using Pandas Mode
5456

5557
```python
56-
from feast import Field, RequestSource
58+
from feast import Field, RequestSource, on_demand_feature_view
5759
from feast.types import Float64, Int64
58-
from typing import Any, Dict
5960
import pandas as pd
6061

61-
# Define a request data source which encodes features / information only
62-
# available at request time (e.g. part of the user initiated HTTP request)
62+
# Define a request data source for request-time features
6363
input_request = RequestSource(
6464
    name="vals_to_add",
6565
    schema=[
66-
        Field(name='val_to_add', dtype=Int64),
67-
        Field(name='val_to_add_2', dtype=Int64)
68-
    ]
66+
        Field(name="val_to_add", dtype=Int64),
67+
        Field(name="val_to_add_2", dtype=Int64),
68+
    ],
6969
)
7070

71-
# Use the input data and feature view features to create new features Pandas mode
71+
# Use input data and feature view features to create new features in Pandas mode
7272
@on_demand_feature_view(
73-
    sources=[
74-
        driver_hourly_stats_view,
75-
        input_request
76-
    ],
77-
    schema=[
78-
        Field(name='conv_rate_plus_val1', dtype=Float64),
79-
        Field(name='conv_rate_plus_val2', dtype=Float64)
80-
    ],
81-
    mode="pandas",
73+
    sources=[driver_hourly_stats_view, input_request],
74+
    schema=[
75+
        Field(name="conv_rate_plus_val1", dtype=Float64),
76+
        Field(name="conv_rate_plus_val2", dtype=Float64),
77+
    ],
78+
    mode="pandas",
8279
)
8380
def transformed_conv_rate(features_df: pd.DataFrame) -> pd.DataFrame:
8481
    df = pd.DataFrame()
85-
    df['conv_rate_plus_val1'] = (features_df['conv_rate'] + features_df['val_to_add'])
86-
    df['conv_rate_plus_val2'] = (features_df['conv_rate'] + features_df['val_to_add_2'])
82+
    df["conv_rate_plus_val1"] = features_df["conv_rate"] + features_df["val_to_add"]
83+
    df["conv_rate_plus_val2"] = features_df["conv_rate"] + features_df["val_to_add_2"]
8784
    return df
85+
```
86+
87+
#### Example 2: On Demand Transformation on Read Using Native Python Mode (List Inputs)
88+
89+
```python
90+
from feast import Field, on_demand_feature_view
91+
from feast.types import Float64
92+
from typing import Any, Dict
8893

89-
# Use the input data and feature view features to create new features Python mode
94+
# Use input data and feature view features to create new features in Native Python mode
9095
@on_demand_feature_view(
91-
    sources=[
92-
        driver_hourly_stats_view,
93-
        input_request
94-
    ],
96+
    sources=[driver_hourly_stats_view, input_request],
9597
    schema=[
96-
        Field(name='conv_rate_plus_val1_python', dtype=Float64),
97-
        Field(name='conv_rate_plus_val2_python', dtype=Float64),
98+
        Field(name="conv_rate_plus_val1_python", dtype=Float64),
99+
        Field(name="conv_rate_plus_val2_python", dtype=Float64),
98100
    ],
99101
    mode="python",
100102
)
101103
def transformed_conv_rate_python(inputs: Dict[str, Any]) -> Dict[str, Any]:
102-
    output: Dict[str, Any] = {
104+
    output = {
103105
        "conv_rate_plus_val1_python": [
104106
            conv_rate + val_to_add
105-
            for conv_rate, val_to_add in zip(
106-
                inputs["conv_rate"], inputs["val_to_add"]
107-
            )
107+
            for conv_rate, val_to_add in zip(inputs["conv_rate"], inputs["val_to_add"])
108108
        ],
109109
        "conv_rate_plus_val2_python": [
110110
            conv_rate + val_to_add
111111
            for conv_rate, val_to_add in zip(
112112
                inputs["conv_rate"], inputs["val_to_add_2"]
113113
            )
114-
        ]
114+
        ],
115+
    }
116+
    return output
117+
```
118+
119+
#### **New** Example 3: On Demand Transformation on Read Using Native Python Mode (Singleton Input)
120+
121+
```python
122+
from feast import Field, on_demand_feature_view
123+
from feast.types import Float64
124+
from typing import Any, Dict
125+
126+
# Use input data and feature view features to create new features in Native Python mode with singleton input
127+
@on_demand_feature_view(
128+
    sources=[driver_hourly_stats_view, input_request],
129+
    schema=[
130+
        Field(name="conv_rate_plus_acc_singleton", dtype=Float64),
131+
    ],
132+
    mode="python",
133+
    singleton=True,
134+
)
135+
def transformed_conv_rate_singleton(inputs: Dict[str, Any]) -> Dict[str, Any]:
136+
    output = {
137+
        "conv_rate_plus_acc_singleton": inputs["conv_rate"] + inputs["acc_rate"]
115138
    }
116139
    return output
117140
```
118141

119-
## Example of an On Demand Transformation on Write
142+
In this example, `inputs` is a dictionary representing a single row, and the transformation function returns a dictionary of transformed features for that single row. This approach is more intuitive and aligns with how data scientists typically process single data records.
143+
144+
#### Example 4: On Demand Transformation on Write Using Pandas Mode
120145

121146
```python
122147
from feast import Field, on_demand_feature_view
@@ -126,22 +151,22 @@ import pandas as pd
126151
# Existing Feature View
127152
driver_hourly_stats_view = ...
128153

129-
# Define an ODFV without RequestSource
154+
# Define an ODFV applying transformation during write time
130155
@on_demand_feature_view(
131156
    sources=[driver_hourly_stats_view],
132157
    schema=[
133-
        Field(name='conv_rate_adjusted', dtype=Float64),
158+
        Field(name="conv_rate_adjusted", dtype=Float64),
134159
    ],
135160
    mode="pandas",
136161
    write_to_online_store=True,  # Apply transformation during write time
137162
)
138163
def transformed_conv_rate(features_df: pd.DataFrame) -> pd.DataFrame:
139164
    df = pd.DataFrame()
140-
    df['conv_rate_adjusted'] = features_df['conv_rate'] * 1.1  # Adjust conv_rate by 10%
165+
    df["conv_rate_adjusted"] = features_df["conv_rate"] * 1.1  # Adjust conv_rate by 10%
141166
    return df
142167
```
143-
Then to ingest the data with the new feature view make sure to include all of the input features required for the
144-
transformations:
168+
169+
To ingest data with the new feature view, include all input features required for the transformations:
145170

146171
```python
147172
from feast import FeatureStore
@@ -160,17 +185,17 @@ data = pd.DataFrame({
160185

161186
# Ingest data to the online store
162187
store.push("driver_hourly_stats_view", data)
163-
```
188+
```
164189

165-
### **Feature retrieval**
190+
### Feature Retrieval
166191

167192
{% hint style="info" %}
168-
The on demand feature view's name is the function name (i.e. `transformed_conv_rate`).
193+
**Note**: The name of the on demand feature view is the function name (e.g., `transformed_conv_rate`).
169194
{% endhint %}
170195

171-
172196
#### Offline Features
173-
And then to retrieve historical, we can call this in a feature service or reference individual features:
197+
198+
Retrieve historical features by referencing individual features or using a feature service:
174199

175200
```python
176201
training_df = store.get_historical_features(
@@ -181,14 +206,14 @@ training_df = store.get_historical_features(
181206
        "driver_hourly_stats:avg_daily_trips",
182207
        "transformed_conv_rate:conv_rate_plus_val1",
183208
        "transformed_conv_rate:conv_rate_plus_val2",
209+
        "transformed_conv_rate_singleton:conv_rate_plus_acc_singleton",
184210
    ],
185211
).to_df()
186-
187212
```
188213

189214
#### Online Features
190215

191-
And then to retrieve online, we can call this in a feature service or reference individual features:
216+
Retrieve online features by referencing individual features or using a feature service:
192217

193218
```python
194219
entity_rows = [
@@ -206,6 +231,15 @@ online_response = store.get_online_features(
206231
        "driver_hourly_stats:acc_rate",
207232
        "transformed_conv_rate_python:conv_rate_plus_val1_python",
208233
        "transformed_conv_rate_python:conv_rate_plus_val2_python",
234+
        "transformed_conv_rate_singleton:conv_rate_plus_acc_singleton",
209235
    ],
210236
).to_dict()
211237
```
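
A short sketch of consuming the result follows. It assumes that `.to_dict()` yields a dictionary mapping each entity or feature name to a list with one value per entity row. The `driver_id` key and all values below are illustrative placeholders, since the entity rows are elided in this diff, and `to_rows` is a hypothetical helper rather than a Feast function.

```python
from typing import Any, Dict, List

# Illustrative stand-in for the dictionary returned by `.to_dict()`.
# Keys and values are assumptions; a real response depends on your store.
online_response: Dict[str, List[Any]] = {
    "driver_id": [1001, 1002],
    "conv_rate_plus_val1_python": [1.5, 2.5],
    "conv_rate_plus_acc_singleton": [0.75, 1.25],
}


def to_rows(columns: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Hypothetical helper: turn a dict of lists into a list of per-row dicts."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


rows = to_rows(online_response)
for row in rows:
    print(row)
```

Each resulting dictionary has the same shape as the single-row input that a singleton transformation receives, which can make spot-checking individual entities easier.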
238+
239+
## CLI Commands
240+
There are new CLI commands to manage on demand feature views:
241+
242+
- `feast on-demand-feature-views list`: Lists all registered on demand feature views after `feast apply` is run.
243+
- `feast on-demand-feature-views describe [NAME]`: Describes the definition of an on demand feature view.
244+
245+
