Python client for Orion Feature Store to push model features and retrieve feature metadata

Project description

Orion Python Client

A lightweight Python client for interacting with Orion Feature Store. This client provides functionality for feature metadata retrieval, protobuf serialization, and Kafka integration. 🚀

It helps push an ML model's features from offline sources (such as tables or cloud storage objects in Parquet or Delta format) to Orion Feature Store.

Key Features

  • Feature metadata retrieval
  • Protobuf serialization of feature values, which are then produced to Apache Kafka
  • Support for features of various data types:
    • Scalar types (FP32, FP64, Int32, Int64, UInt32, UInt64, String, Bool)
    • Vector types (Vectors of each of the above Scalar Types)
  • Kafka integration with configurable settings

📥 Installation

pip install orion-py-client==0.1.12

Prerequisites

  • Python 3.7+
  • (Optional) Apache Spark 3.0+ & spark-sql-kafka for Kafka feature push functionality
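
A quick way to check these prerequisites before running a job is sketched below. It uses only the standard library; the helper name `check_prerequisites` is illustrative and not part of the client.

```python
import importlib.util
import sys


def check_prerequisites(min_python=(3, 7)):
    """Return a dict describing whether the documented prerequisites are met."""
    return {
        # The client documents Python 3.7+ as the minimum version.
        "python_ok": sys.version_info >= min_python,
        # PySpark is only needed for the optional Spark -> Kafka push.
        "pyspark_available": importlib.util.find_spec("pyspark") is not None,
    }


if __name__ == "__main__":
    print(check_prerequisites())
```

This only detects whether PySpark is importable; it does not verify the Spark version or that the spark-sql-kafka package is on the classpath.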

Usage

Basic Usage

from orion_py_client import OrionPyClient

# Initialize the client
opy_client = OrionPyClient(
    features_metadata_source_url="your_features_metadata_source_url",
    job_id="your_job_id",
    job_token="your_job_token"
)

# Get feature details
(
    offline_src_type_columns,
    offline_col_to_default_values_map,
    entity_column_names
) = opy_client.get_features_details()
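
The name `offline_col_to_default_values_map` suggests a mapping from offline column names to default values. Assuming that shape (the README does not confirm it), the plain-Python sketch below shows how such a map could fill in missing values for a single record. The helper `apply_defaults` and the sample column names are hypothetical.

```python
def apply_defaults(row, col_to_default):
    """Return a copy of `row` with missing or None columns replaced by defaults."""
    filled = dict(row)
    for column, default in col_to_default.items():
        if filled.get(column) is None:
            filled[column] = default
    return filled


# Hypothetical values; a real map would come from get_features_details().
defaults = {"clicks_7d": 0, "avg_price": 0.0}
row = {"user_id": "u1", "clicks_7d": None}
print(apply_defaults(row, defaults))
```

In a Spark job the same idea is usually expressed with `DataFrame.fillna`, which accepts a dict of column names to replacement values.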

Push Feature Values from Offline sources to Orion via Spark -> Kafka

Supported Offline Sources

  1. Table (Hive/Delta)
  2. Parquet folder stored in Cloud Storage (AWS/GCS/ADLS)
  3. Delta folder stored in Cloud Storage (AWS/GCS/ADLS)
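
The three source types above map onto standard PySpark reader calls. The following is a minimal sketch, not the client's own loader: the function name `read_offline_source` and the `source_type` labels are assumptions made for illustration, and `spark` is expected to be an existing `SparkSession`.

```python
def read_offline_source(spark, source_type, location):
    """Load one offline source into a Spark DataFrame.

    `source_type` is one of the hypothetical labels "table", "parquet",
    or "delta"; `location` is a table name or a cloud storage path.
    """
    if source_type == "table":
        # Hive or Delta table registered in the metastore.
        return spark.read.table(location)
    if source_type == "parquet":
        # Folder of Parquet files in cloud storage.
        return spark.read.parquet(location)
    if source_type == "delta":
        # Requires the Delta Lake package to be available to Spark.
        return spark.read.format("delta").load(location)
    raise ValueError(f"Unsupported offline source type: {source_type!r}")
```

Reading from cloud storage also requires the matching filesystem connector and credentials to be configured on the Spark cluster.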

Refer to the examples for a detailed walkthrough of how to configure a job and push feature values.

The following is a simple outline of the steps involved in that example:

# Create a new Orion client
opy_client = OrionPyClient(features_metadata_source_url, job_id, job_token)

# Get the feature details
(
    feature_mapping,
    offline_col_to_default_values_map,
    onfs_fg_to_onfs_feat_map,
    onfs_fg_to_ofs_feat_map,
    fg_to_datatype_map,
    entity_label,
    entity_column_names,
) = opy_client.get_features_details(fgs_to_consider)

# Read the data from the different offline sources
df = get_features_from_all_sources(spark, entity_column_names, feature_mapping, offline_col_to_default_values_map)

# Serialize the feature values to protobuf binary
proto_df = opy_client.generate_df_with_protobuf_messages(df, intra_batch_size=20)

# Produce the data to Kafka so that consumers write the features to Orion Feature Store
opy_client.write_protobuf_df_to_kafka(proto_df, kafka_bootstrap_servers, kafka_topic, additional_options)
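
For orientation, the sketch below shows how Spark's built-in Kafka sink is generally driven with a bootstrap server list, a topic, and extra options. Whether `write_protobuf_df_to_kafka` works exactly this way internally is an assumption; the helper names `build_kafka_write_options` and `write_to_kafka` are hypothetical and not part of the client.

```python
def build_kafka_write_options(bootstrap_servers, topic, additional_options=None):
    """Merge the required Kafka sink options with optional extra settings."""
    options = {
        "kafka.bootstrap.servers": bootstrap_servers,
        "topic": topic,
    }
    # Extra settings are applied last, so a reused key replaces the value set above.
    options.update(additional_options or {})
    return options


def write_to_kafka(proto_df, options):
    """Batch-write a DataFrame to Kafka using Spark's Kafka sink.

    The sink expects a `value` column (string or binary); `key` is optional.
    """
    proto_df.write.format("kafka").options(**options).save()
```

Security settings such as `kafka.security.protocol` are the kind of entry that would typically be passed through `additional_options`.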

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For support, please create an issue
