
Feast offline feature store implementation backed by the IBM Cloud Data Engine

Feast plugin for IBM Cloud.
This plugin implements Feast's offline store, backed by IBM Cloud Data Engine and IBM Cloud Object Storage.

Installation

Project dependencies can be installed in a dedicated virtual environment by running the following command:

poetry install

Testing and Linting

poetry run pytest tests/
poetry run pylint ibm_data_engine

Test with Feast

To use the plugin with Feast, define your offline store and data sources. The instructions below show how to use it in feast-ibm-quickstart.

Define dependency

This library is currently not published on PyPI, so you have to point to the repository directly. The easiest way is to clone the repository and declare the dependency as a path in feast-ibm-quickstart.

ibm-data-engine = { path = "/path/to/ibm-data-engine" }

After running poetry update, you should be able to use the IBM Cloud Data Engine offline store.

Define data source

Modify the src/feature_repo/example_repo.py file to use the new data source. Below is a minimal example of the file:

from ibm_data_engine import DataEngineDataSource
driver_stats_source = DataEngineDataSource(
    name="driver_hourly_stats_source",
    table="driver_stats_demo",
    timestamp_field="event_timestamp",
)

Define offline store

Next, configure the offline store in feature_repo/feature_store.yaml:

project: test_plugin
entity_key_serialization_version: 2
registry: data/registry.db
provider: local
online_store:
    type: redis
    connection_string: ${REDIS_HOST}:${REDIS_PORT},username=${REDIS_USERNAME},password=${REDIS_PASSWORD},ssl=true,ssl_ca_certs=${REDIS_CERT_PATH},db=0

offline_store:
    type: ibm_data_engine.DataEngineOfflineStore
    api_key: ${DATA_ENGINE_API_KEY}
    instance_crn: ${DATA_ENGINE_INSTANCE_CRN}
    target_cos_url: ${IBM_CLOUD_OBJECT_STORE_URL}

The configuration above reads the following environment variables, all of which must be defined:

  • IBM_CLOUD_OBJECT_STORE_URL
  • REDIS_HOST
  • REDIS_PORT
  • REDIS_USERNAME
  • REDIS_PASSWORD
  • REDIS_CERT_PATH
  • DATA_ENGINE_API_KEY
  • DATA_ENGINE_INSTANCE_CRN
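
A missing variable tends to surface only as a confusing connection error later on, so it can help to check the environment before running Feast. The following is a minimal sketch, not part of the plugin: the helper name missing_env_vars is hypothetical, and the variable names are the ones referenced in the feature_store.yaml above.

```python
import os

# Variable names referenced by the feature_store.yaml shown above.
REQUIRED_ENV_VARS = [
    "IBM_CLOUD_OBJECT_STORE_URL",
    "REDIS_HOST",
    "REDIS_PORT",
    "REDIS_USERNAME",
    "REDIS_PASSWORD",
    "REDIS_CERT_PATH",
    "DATA_ENGINE_API_KEY",
    "DATA_ENGINE_INSTANCE_CRN",
]


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty.

    `environ` defaults to os.environ; pass a dict to check another mapping.
    This helper is hypothetical and only illustrates the check.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


# Example usage before calling `feast apply`:
#   missing = missing_env_vars()
#   if missing:
#       raise SystemExit("Missing environment variables: " + ", ".join(missing))
```

An empty string is treated as missing, because an empty value in the connection string or credentials would be as unusable as an unset one.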

Apply

To apply the definitions to the registry, run:

poetry run feast -c ./feature_repo apply

Training

Run training, which retrieves historical feature data from the feature store:

poetry run python training.py
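
The contents of training.py are not shown here. As a rough sketch of what retrieving historical features through Feast might look like, the example below uses Feast's FeatureStore.get_historical_features. The entity column driver_id, the feature reference in the usage comment, and both helper names are assumptions for illustration, not taken from the quickstart.

```python
from datetime import datetime


def build_entity_rows(driver_ids, event_timestamp):
    """Build one entity row per driver for a point-in-time lookup.

    `driver_id` is an assumed entity column name; use the join key that
    your feature views actually define.
    """
    return [
        {"driver_id": driver_id, "event_timestamp": event_timestamp}
        for driver_id in driver_ids
    ]


def fetch_training_frame(repo_path, entity_rows, features):
    """Retrieve historical features as a pandas DataFrame (hypothetical helper)."""
    # Imported lazily so the row builder above works without Feast installed.
    import pandas as pd
    from feast import FeatureStore

    store = FeatureStore(repo_path=repo_path)
    entity_df = pd.DataFrame(entity_rows)
    job = store.get_historical_features(entity_df=entity_df, features=features)
    return job.to_df()


# Example usage (the feature reference is an assumption):
#   rows = build_entity_rows([1001, 1002], datetime(2024, 1, 1))
#   df = fetch_training_frame("./feature_repo", rows,
#                             ["driver_hourly_stats:conv_rate"])
```

With this plugin configured as the offline store, the retrieval is expected to run against IBM Cloud Data Engine, so the call needs the environment variables from the configuration step.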

Materialize

To materialize to Redis, run:

poetry run feast -c ./ materialize '<START_TIMESTAMP>' '<END_TIMESTAMP>'
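
The two timestamps bound the window of feature values copied to the online store. Feast's CLI documentation uses ISO 8601 values such as 2021-04-07T00:00:00. The sketch below builds the command for a window of a given number of days; the helper name materialize_command is hypothetical, and the timestamp format is an assumption based on those examples.

```python
from datetime import datetime, timedelta
import shlex


def materialize_command(end, days, repo_path="./"):
    """Build a `feast materialize` command covering `days` days up to `end`.

    Hypothetical helper: it only formats the command line and does not run it.
    """
    start = end - timedelta(days=days)
    fmt = "%Y-%m-%dT%H:%M:%S"  # ISO 8601 without a timezone offset
    args = [
        "poetry", "run", "feast", "-c", repo_path, "materialize",
        start.strftime(fmt), end.strftime(fmt),
    ]
    # shlex.join quotes arguments only where the shell requires it.
    return shlex.join(args)


# Example usage:
#   print(materialize_command(datetime(2024, 1, 8), days=7))
```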

Inference

poetry run python inference.py
