Feast Teradata Connector

Overview

We recommend you familiarize yourself with the terminology and concepts of feast by reading the official feast documentation.

The feast-teradata library adds support for Teradata as

  • OfflineStore
  • OnlineStore

Additionally, Teradata can serve as the registry (catalog) through the existing registry_type: sql support, and our examples are configured this way. This means that everything can be located in Teradata. However, depending on your requirements and installation, Teradata can be mixed and matched with other systems as appropriate.

Getting Started

To get started, install the feast-teradata library

pip install feast-teradata

Let's create a simple feast setup with Teradata using the standard drivers dataset. Note that you cannot use feast init, as that command only works for templates that are part of the core feast library. We intend to get this library merged into feast core eventually, but for now you will need to use the following CLI command for this specific task. All other feast CLI commands work as expected.

feast-td init-repo

This prompts you for the required Teradata connection information and uploads the example dataset. Let's assume you used the repo name demo when running the above command. The generated repository files sit alongside a file called test_workflow.py. Running test_workflow.py executes a complete feast workflow with Teradata as the Registry, OfflineStore and OnlineStore.

demo/
    feature_repo/
        driver_repo.py
        feature_store.yaml
    test_workflow.py

From within the demo/feature_repo directory, execute the following feast command to apply (import/update) the repo definition into the registry. After running this command, the registry metadata tables are visible in the Teradata database.

feast apply

To see the registry information in the feast UI, run the following command. Setting --registry_ttl_sec is important because, by default, the registry is refreshed every 5 seconds.

feast ui --registry_ttl_sec=120

Example Usage

Now, let's batch read some features for training, using only the entities (population) for which we have seen an event in the last 60 days. The predicates (filters) can be on anything relevant to selecting the entities (population) for the given training dataset. The event_timestamp filter is for example purposes only.

from feast import FeatureStore


# Point the store at the repository directory that holds feature_store.yaml.
store = FeatureStore(repo_path="feature_repo")

# The entity SQL selects the population and the timestamps to join features at.
training_df = store.get_historical_features(
    entity_df="""
            SELECT
                driver_id,
                event_timestamp
            FROM demo_feast_driver_hourly_stats
            WHERE event_timestamp BETWEEN (CURRENT_TIMESTAMP - INTERVAL '60' DAY) AND CURRENT_TIMESTAMP
        """,
    features=[
        "driver_hourly_stats:conv_rate",
        "driver_hourly_stats:acc_rate",
        "driver_hourly_stats:avg_daily_trips",
    ],
).to_df()
print(training_df.head())
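
If the look-back window changes between training runs, the entity query can be generated rather than hand-edited. The following is a minimal sketch, not part of feast-teradata: the helper name build_entity_sql is hypothetical, and the 1 to 99 day limit is a cautious assumption, since wider day intervals may need an explicit precision in Teradata.

```python
def build_entity_sql(table: str, days: int) -> str:
    """Return an entity query selecting rows with an event in the last `days` days.

    Hypothetical helper for illustration; the column names follow the demo
    drivers dataset used above.
    """
    # Assumption: keep the window to two digits; wider intervals may need
    # an explicit precision, so they are rejected here rather than guessed.
    if not 1 <= days <= 99:
        raise ValueError("days must be between 1 and 99")
    # Only allow plain identifiers, since the name is interpolated into SQL.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return (
        "SELECT driver_id, event_timestamp\n"
        f"FROM {table}\n"
        "WHERE event_timestamp BETWEEN "
        f"(CURRENT_TIMESTAMP - INTERVAL '{days}' DAY) AND CURRENT_TIMESTAMP"
    )


entity_sql = build_entity_sql("demo_feast_driver_hourly_stats", 60)
print(entity_sql)
```

The resulting string can be passed as the entity_df argument of get_historical_features in place of the inline SQL.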

The feast-teradata library allows you to use the complete set of feast APIs and functionality. Please refer to the official feast quickstart for more details on the various things you can do.

Additionally, if you want to see a complete (but not real-world) end-to-end example workflow, see the demo/test_workflow.py script. This is used for testing the complete feast functionality.
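
For orientation, here is a rough sketch of the online half of such a workflow: materializing features into the OnlineStore and reading them back. It is not taken from test_workflow.py. The helper name read_online_features is hypothetical, and the store argument is assumed to be a feast FeatureStore, whose materialize_incremental and get_online_features methods are used as in the official feast quickstart.

```python
from datetime import datetime

# Feature references from the demo repository's driver_hourly_stats view.
FEATURES = [
    "driver_hourly_stats:conv_rate",
    "driver_hourly_stats:acc_rate",
    "driver_hourly_stats:avg_daily_trips",
]


def read_online_features(store, driver_ids):
    """Materialize the latest values, then fetch them for the given drivers.

    `store` is assumed to be a feast FeatureStore (hypothetical helper).
    """
    # Load feature values up to now from the offline store into the online store.
    store.materialize_incremental(end_date=datetime.now())
    # Look up the current feature values for each requested entity key.
    response = store.get_online_features(
        features=FEATURES,
        entity_rows=[{"driver_id": driver_id} for driver_id in driver_ids],
    )
    return response.to_dict()
```

With the demo repository, a call would look like read_online_features(FeatureStore(repo_path="feature_repo"), [1001, 1002]), where the driver ids are placeholders for ids present in your dataset.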

Repo Configuration

A feast repository is configured via feature_store.yaml. Three sections of this file can be configured to use Teradata:

  • Registry
  • OfflineStore
  • OnlineStore

To configure Teradata as the OnlineStore, use the following configuration

online_store:
    type: feast_teradata.online.teradata.TeradataOnlineStore
    host: <host>
    database: <db>
    user: <user>
    password: <password>
    log_mech: <TDNEGO|LDAP|etc>

To configure Teradata as the OfflineStore, use the following configuration

offline_store:
    type: feast_teradata.offline.teradata.TeradataOfflineStore
    host: <host>
    database: <db>
    user: <user>
    password: <password>
    log_mech: <TDNEGO|LDAP|etc>
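
The online and offline store sections share the same connection fields, so both can be generated from one set of values. The sketch below renders a section from environment variables instead of hard-coding credentials. The helper name render_store_section and the TD_* variable names are assumptions for illustration, not something feast-teradata defines.

```python
import json
import os

# Connection fields shared by the online_store and offline_store sections.
FIELDS = ("host", "database", "user", "password", "log_mech")


def render_store_section(section, store_type, env=os.environ):
    """Render one store section of feature_store.yaml as text.

    Hypothetical helper: values are read from TD_HOST, TD_DATABASE, TD_USER,
    TD_PASSWORD and TD_LOG_MECH (assumed names).
    """
    lines = [f"{section}:", f"    type: {store_type}"]
    for field in FIELDS:
        var = f"TD_{field.upper()}"
        if var not in env:
            raise KeyError(f"missing environment variable {var}")
        # A JSON string is also a valid double-quoted YAML scalar, so
        # json.dumps safely quotes values containing special characters.
        lines.append(f"    {field}: {json.dumps(env[var])}")
    return "\n".join(lines)


example_env = {
    "TD_HOST": "td.example.com",
    "TD_DATABASE": "feast_db",
    "TD_USER": "feast_user",
    "TD_PASSWORD": "s3cret: #1",
    "TD_LOG_MECH": "TDNEGO",
}
print(render_store_section(
    "offline_store",
    "feast_teradata.offline.teradata.TeradataOfflineStore",
    example_env,
))
```

Calling the helper a second time with online_store and the TeradataOnlineStore type yields the matching online section.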

To configure Teradata as the Registry, set registry_type to sql and set path to the SQLAlchemy URL for Teradata, as follows

registry:
    registry_type: sql
    path: teradatasql://<user>:<password>@<host>/?database=<database>&LOGMECH=<TDNEGO|LDAP|etc>
    cache_ttl_seconds: 120
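
Because the registry path is a URL, a user name or password containing characters such as @ or / has to be percent-encoded before it is placed in the path. A minimal sketch of building the URL in the shape shown above follows. The function name teradata_registry_url and the sample credentials are made up for illustration.

```python
from urllib.parse import quote


def teradata_registry_url(user, password, host, database, logmech="TDNEGO"):
    """Build a registry path in the teradatasql:// form used above.

    Hypothetical helper; every user-supplied part is percent-encoded so
    reserved characters cannot be misread as URL delimiters.
    """
    return (
        f"teradatasql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}/?database={quote(database, safe='')}"
        f"&LOGMECH={quote(logmech, safe='')}"
    )


url = teradata_registry_url("feast_user", "p@ss/word", "td.example.com", "feast_db")
print(url)
# teradatasql://feast_user:p%40ss%2Fword@td.example.com/?database=feast_db&LOGMECH=TDNEGO
```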

Release Notes

1.0.4

  • Update: bump Feast dependency to 0.31.1

1.0.3

  • Fix: Added string mapping for columns.

1.0.2

  • Doc: Improve README with details on repo configuration
  • Fix: Fix Github Release on CI Release
  • Fix: Updated path variable to become OS independent.

1.0.1

  • Doc: Improve README with better getting started information.
  • Fix: Remove pytest from requirements.txt
  • Fix: Set minimum python version to 3.8 due to feast dependency on pandas>=1.4.3
  • Fix: Updated feast-td types conversion

1.0.0

  • Feature: Initial implementation of feast-teradata library
