3 changes: 1 addition & 2 deletions docs/SUMMARY.md
@@ -51,7 +51,6 @@
* [Load data into the online store](how-to-guides/feast-snowflake-gcp-aws/load-data-into-the-online-store.md)
* [Read features from the online store](how-to-guides/feast-snowflake-gcp-aws/read-features-from-the-online-store.md)
* [Running Feast in production](how-to-guides/running-feast-in-production.md)
* [Deploying a Java feature server on Kubernetes](how-to-guides/fetching-java-features-k8s.md)
* [Upgrading from Feast 0.9](https://docs.google.com/document/u/1/d/1AOsr\_baczuARjCpmZgVd8mCqTF4AZ49OEyU4Cn-uTT0/edit)
* [Adding a custom provider](how-to-guides/creating-a-custom-provider.md)
* [Adding a custom batch materialization engine](how-to-guides/creating-a-custom-materialization-engine.md)
@@ -93,7 +92,7 @@
* [.feastignore](reference/feature-repository/feast-ignore.md)
* [Feature servers](reference/feature-servers/README.md)
* [Python feature server](reference/feature-servers/python-feature-server.md)
* [Go-based feature retrieval](reference/feature-servers/go-feature-retrieval.md)
* [Go feature server](reference/feature-servers/go-feature-server.md)
* [\[Alpha\] Web UI](reference/alpha-web-ui.md)
* [\[Alpha\] Data quality monitoring](reference/dqm.md)
* [\[Alpha\] On demand feature view](reference/alpha-on-demand-feature-view.md)
15 changes: 0 additions & 15 deletions docs/how-to-guides/fetching-java-features-k8s.md

This file was deleted.

24 changes: 12 additions & 12 deletions docs/how-to-guides/running-feast-in-production.md
@@ -242,14 +242,12 @@ This service will provide an HTTP API with JSON I/O, which can be easily used wi

[Read more about this feature](../reference/alpha-aws-lambda-feature-server.md)

### 4.3. Java based Feature Server deployed on Kubernetes
### 4.3. Go feature server deployed on Kubernetes

For users with very latency-sensitive and high QPS use-cases, Feast offers a high-performance Java feature server.
Besides the benefits of running on JVM, this implementation also provides a gRPC API, which guarantees good connection utilization and
small request / response body size (compared to JSON).
You will need the Feast Java SDK to retrieve features from this service. This SDK wraps all the gRPC logic for you and provides more convenient APIs.
For users with very latency-sensitive and high QPS use-cases, Feast offers a high-performance [Go feature server](../reference/feature-servers/go-feature-server.md).
It can use either HTTP or gRPC.

The Java based feature server can be deployed to Kubernetes cluster via Helm charts in a few simple steps:
The Go feature server can be deployed to a Kubernetes cluster via Helm charts in a few simple steps:

1. Install [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) and [helm 3](https://helm.sh/)
2. Add the Feast Helm repository and download the latest charts:
@@ -259,17 +257,19 @@ helm repo update
```
3. Run Helm Install
```
helm install feast-release feast-charts/feast \
helm install feast-release feast-charts/feast-python-server \
--set global.registry.path=s3://feast/registries/prod \
--set global.project=<project name>
```

This chart will deploy two services: `feature-server` and `transformation-service`.
Both must have read access to the registry file on cloud storage. Both will keep a copy of the registry in their memory and periodically refresh it, so expect some delays in update propagation in exchange for better performance.
This chart will deploy a single service.
The service must have read access to the registry file on cloud storage.
It will keep a copy of the registry in its memory and periodically refresh it, so expect some delays in update propagation in exchange for better performance.
In order for the Go feature server to be enabled, you should set `go_feature_serving: True` in the `feature_store.yaml`.

#### Load balancing

The next step would be to install an L7 Load Balancer (eg, [Envoy](https://www.envoyproxy.io/)) in front of the Java feature server.
The next step would be to install an L7 Load Balancer (eg, [Envoy](https://www.envoyproxy.io/)) in front of the Go feature server.
For seamless integration with Kubernetes (including services created by Feast Helm chart) we recommend using [Istio](https://istio.io/) as Envoy's orchestrator.

## 5. Ingesting features from a stream source
Expand Down Expand Up @@ -344,8 +344,8 @@ Summarizing it all together we want to show several options of architecture that
* Feast SDK is being triggered by CI (eg, Github Actions). It applies the latest changes from the feature repo to the Feast registry
* Airflow manages materialization jobs to ingest data from DWH to the online store periodically
* For the stream ingestion Feast Python SDK is used in the existing Spark / Beam pipeline
* Online features are served via either a Python feature server or a high performance Java feature server
* Both the Java feature server and the transformation server are deployed on Kubernetes cluster (via Helm charts)
* Online features are served via either a Python feature server or a high performance Go feature server
* The Go feature server can be deployed on a Kubernetes cluster (via Helm charts)
* Feast Python SDK is called locally to generate a training dataset

![From Repository to Production: Feast Production Architecture](production-spark.png)
85 changes: 0 additions & 85 deletions docs/reference/feature-servers/go-feature-retrieval.md

This file was deleted.

93 changes: 93 additions & 0 deletions docs/reference/feature-servers/go-feature-server.md
@@ -0,0 +1,93 @@
# Go feature server

## Overview

The Go feature server is an HTTP/gRPC endpoint that serves features.
It is written in Go, and is therefore significantly faster than the Python feature server.
See this [blog post](https://feast.dev/blog/go-feature-server-benchmarks/) for more details on the comparison between Python and Go.
In general, we recommend the Go feature server for all production use cases that require extremely low-latency feature serving.
Currently only the Redis and SQLite online stores are supported.

## CLI

By default, the Go feature server is turned off.
To turn it on you can add `go_feature_serving: True` to your `feature_store.yaml`:

{% code title="feature_store.yaml" %}
```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
online_store:
  type: redis
  connection_string: "localhost:6379"
go_feature_serving: True
```
{% endcode %}

Then the `feast serve` CLI command will start the Go feature server.
As with Python, the Go feature server uses port 6566 by default; the port can be overridden with a `--port` flag.
Moreover, the server uses HTTP by default, but can be set to use gRPC with `--type=grpc`.

Alternatively, if you wish to experiment with the Go feature server instead of permanently turning it on, you can just run `feast serve --go`.
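
The flags in this section can be combined. As a minimal sketch, the following Python assembles a `feast serve` invocation from the options described above. The helper name `build_serve_command` is hypothetical and not part of Feast; it only builds the argument list and does not start a server.

```python
from typing import List, Optional


def build_serve_command(
    port: Optional[int] = None,
    grpc: bool = False,
    use_go: bool = False,
) -> List[str]:
    """Build the argv list for `feast serve` (hypothetical helper, illustration only)."""
    cmd = ["feast", "serve"]
    if use_go:
        # One-off opt-in, instead of setting `go_feature_serving: True`
        # in feature_store.yaml.
        cmd.append("--go")
    if port is not None:
        # Without --port, the server listens on the default port 6566.
        cmd.append(f"--port={port}")
    if grpc:
        # Without --type=grpc, the server speaks HTTP.
        cmd.append("--type=grpc")
    return cmd


print(" ".join(build_serve_command(port=8080, grpc=True, use_go=True)))
# prints: feast serve --go --port=8080 --type=grpc
```

The resulting list can be passed to `subprocess.run` from a launcher script if you prefer not to call the CLI by hand.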

## Installation

The Go component comes pre-compiled when you install Feast with Python versions 3.8-3.10 on macOS or Linux (on x86).
In order to install the additional Python dependencies, you should install Feast with
```
pip install feast[go]
```
You must also install the Apache Arrow C++ libraries.
This is because the Go feature server uses the cgo memory allocator from the Apache Arrow C++ library for interoperability between Go and Python, to prevent memory from being accidentally garbage collected when executing on-demand feature views.
You can read more about the usage of the cgo memory allocator in these [docs](https://pkg.go.dev/github.com/apache/arrow/go/[email protected]/cdata#ExportArrowRecordBatch).

For macOS, run `brew install apache-arrow`.
For Linux users, you have to install `libarrow-dev`.
```
sudo apt update
sudo apt install -y -V ca-certificates lsb-release wget
wget https://apache.jfrog.io/artifactory/arrow/$(lsb_release --id --short | tr 'A-Z' 'a-z')/apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt install -y -V ./apache-arrow-apt-source-latest-$(lsb_release --codename --short).deb
sudo apt update
sudo apt install -y -V libarrow-dev # For C++
```
For developers, if you want to build from source, run `make compile-go-lib` to build and compile the Go server. In order to build the Go binaries, you will need to install the `apache-arrow` C++ libraries.

## Alpha features

### Feature logging

The Go feature server can log all requested entities and served features to a configured destination inside an offline store.
This allows users to create new datasets from features served online. Those datasets could be used for future trainings or for
feature validations. To enable feature logging we need to edit `feature_store.yaml`:
```yaml
project: my_feature_repo
registry: data/registry.db
provider: local
online_store:
  type: redis
  connection_string: "localhost:6379"
go_feature_serving: True
feature_server:
  feature_logging:
    enable: True
```

The feature logging configuration in `feature_store.yaml` also lets you tweak some low-level parameters to achieve the best performance:
```yaml
feature_server:
  feature_logging:
    enable: True
    flush_interval_secs: 300
    write_to_disk_interval_secs: 30
    emit_timeout_micro_secs: 10000
    queue_capacity: 10000
```
All these parameters are optional.

### Python SDK retrieval

The logic for the Go feature server can also be used to retrieve features during a Python `get_online_features` call.
To enable this behavior, you must add `go_feature_retrieval: True` to your `feature_store.yaml`.
You must also have all the dependencies installed as detailed above.
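
With `go_feature_retrieval: True` set, the Python call itself does not change. The sketch below shows what such a call might look like. The feature references and the `driver_id` join key are assumptions based on the default `feast init` template, and `fetch_driver_features` is a hypothetical wrapper, so substitute the names from your own repo.

```python
from typing import Any, Dict, List


def fetch_driver_features(store: Any, driver_ids: List[int]) -> Dict[str, list]:
    """Fetch online features for the given drivers through the Python SDK.

    `store` is expected to be a `feast.FeatureStore`. The feature references
    and the `driver_id` key are assumed from the default template.
    """
    response = store.get_online_features(
        features=[
            "driver_hourly_stats:conv_rate",
            "driver_hourly_stats:acc_rate",
        ],
        entity_rows=[{"driver_id": driver_id} for driver_id in driver_ids],
    )
    # Convert the response into a plain dict of column name -> list of values.
    return response.to_dict()


# Usage (requires Feast, a materialized feature repo, and the Go
# dependencies described above):
#
#     from feast import FeatureStore
#     store = FeatureStore(repo_path=".")
#     print(fetch_driver_features(store, [1001, 1002]))
```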
29 changes: 13 additions & 16 deletions docs/reference/feature-servers/python-feature-server.md
@@ -2,23 +2,22 @@

## Overview

The feature server is an HTTP endpoint that serves features with JSON I/O. This enables users to write + read features from Feast online stores using any programming language that can make HTTP requests.
The Python feature server is an HTTP endpoint that serves features with JSON I/O. This enables users to write and read features from the online store using any programming language that can make HTTP requests.

## CLI

There is a CLI command that starts the server: `feast serve`. By default, Feast uses port 6566; the port can be overridden by a `--port` flag.
There is a CLI command that starts the server: `feast serve`. By default, Feast uses port 6566; the port can be overridden with a `--port` flag.

## Deploying as a service

One can also deploy a feature server by building a docker image that bundles in the project's `feature_store.yaml`. See [helm chart](https://github.com/feast-dev/feast/blob/master/infra/charts/feast-python-server) for example.

A [remote feature server](../alpha-aws-lambda-feature-server.md) on AWS Lambda is available. A remote feature server on GCP Cloud Run is currently being developed.
One can deploy a feature server by building a docker image that bundles in the project's `feature_store.yaml`. See this [helm chart](https://github.com/feast-dev/feast/blob/master/infra/charts/feast-python-server) for an example.

A [remote feature server](../alpha-aws-lambda-feature-server.md) on AWS Lambda is also available.

## Example

### Initializing a feature server
Here's the local feature server usage example with the local template:
Here's an example of how to start the Python feature server with a local feature repo:

```bash
$ feast init feature_repo
@@ -27,9 +26,11 @@ Creating a new Feast repository in /home/tsotne/feast/feature_repo.
$ cd feature_repo

$ feast apply
Registered entity driver_id
Registered feature view driver_hourly_stats
Deploying infrastructure for driver_hourly_stats
Created entity driver
Created feature view driver_hourly_stats
Created feature service driver_activity

Created sqlite table feature_repo_driver_hourly_stats

$ feast materialize-incremental $(date +%Y-%m-%d)
Materializing 1 feature views to 2021-09-09 17:00:00-07:00 into the sqlite online store.
@@ -38,8 +39,6 @@ driver_hourly_stats from 2021-09-09 16:51:08-07:00 to 2021-09-09 17:00:00-07:00:
100%|████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 295.24it/s]

$ feast serve
This is an experimental feature. It's intended for early testing and feedback, and could change without warnings in future releases.
INFO: Started server process [8889]
09/10/2021 10:42:11 AM INFO:Started server process [8889]
INFO: Waiting for application startup.
09/10/2021 10:42:11 AM INFO:Waiting for application startup.
@@ -49,7 +48,7 @@ INFO: Uvicorn running on http://127.0.0.1:6566 (Press CTRL+C to quit)
09/10/2021 10:42:11 AM INFO:Uvicorn running on http://127.0.0.1:6566 (Press CTRL+C to quit)
```

### Retrieving features from the online store
### Retrieving features
After the server starts, we can execute cURL commands from another terminal tab:

```bash
@@ -153,11 +152,9 @@ curl -X POST \
```

### Pushing features to the online and offline stores
You can push data corresponding to a push source to the online and offline stores (note that timestamps need to be strings):

You can also define a pushmode to push stream or batch data, either to the online store, offline store, or both. The feature server will throw an error if the online/offline store doesn't support the push api functionality.
The Python feature server also exposes an endpoint for [push sources](../../data-sources/push.md). This endpoint allows you to push data to the online and/or offline store.

The request definition for pushmode is a string parameter `to` where the options are: ["online", "offline", "online_and_offline"].
The request definition for pushmode is a string parameter `to` where the options are: ["online", "offline", "online_and_offline"]. Note that timestamps need to be strings.
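
If you build the request body programmatically, the two constraints above (a valid `to` value and string timestamps) are easy to enforce up front. The following is a minimal sketch: `build_push_body` is a hypothetical helper, and the column-wise `df` key and the timestamp format are assumptions rather than something this page specifies.

```python
import json
from datetime import datetime
from typing import Any, Dict, List

# The three values the `to` parameter accepts.
PUSH_MODES = ("online", "offline", "online_and_offline")


def build_push_body(
    push_source_name: str, columns: Dict[str, List[Any]], to: str
) -> str:
    """Serialize a push request body (hypothetical helper, illustration only)."""
    if to not in PUSH_MODES:
        raise ValueError(f"`to` must be one of {PUSH_MODES}, got {to!r}")

    def as_json_value(value: Any) -> Any:
        # Timestamps need to be strings; the format used here is an assumption.
        if isinstance(value, datetime):
            return value.strftime("%Y-%m-%d %H:%M:%S")
        return value

    body = {
        "push_source_name": push_source_name,
        # Assumption: rows are sent column-wise under a "df" key.
        "df": {
            name: [as_json_value(v) for v in values]
            for name, values in columns.items()
        },
        "to": to,
    }
    return json.dumps(body)


print(
    build_push_body(
        "driver_hourly_stats_push_source",
        {"driver_id": [1001], "event_timestamp": [datetime(2022, 5, 13, 10, 59, 42)]},
        to="online_and_offline",
    )
)
```

The returned string can be sent as the body of a `POST` to the `/push` endpoint.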
```text
curl -X POST "http://localhost:6566/push" -d '{
"push_source_name": "driver_hourly_stats_push_source",