Python client for Orion Feature Store to push (produce) model features and retrieve feature metadata

Orion Python Client

A lightweight Python client for interacting with Orion Feature Store. This client provides functionality for feature metadata retrieval, protobuf serialization, and Kafka integration. 🚀

It pushes ML model features stored in offline sources (such as tables or cloud storage objects in Parquet or Delta format) to Orion Feature Store.

Key Features

  • Feature metadata retrieval
  • Protobuf serialization of feature values and production to Apache Kafka
  • Support for features of various data types:
    • Scalar types (FP32, FP64, Int32, Int64, UInt32, UInt64, String, Bool)
    • Vector types (Vectors of each of the above Scalar Types)
  • Kafka integration with configurable settings
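
To make the scalar and vector types listed above concrete, here is a minimal sketch of checking a Python value against one of those type names before pushing it. The helpers `fits_scalar_type` and `fits_vector_type` are hypothetical and not part of the client; only the type names come from the list above, and the floating-point check is deliberately loose (it does not test FP32 precision).

```python
# Hypothetical helpers, not part of orion_py_client: they check whether a
# Python value can be represented by one of the scalar type names listed above.
INT_RANGES = {
    "Int32": (-2**31, 2**31 - 1),
    "Int64": (-2**63, 2**63 - 1),
    "UInt32": (0, 2**32 - 1),
    "UInt64": (0, 2**64 - 1),
}


def fits_scalar_type(type_name, value):
    if type_name in INT_RANGES:
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            return False
        low, high = INT_RANGES[type_name]
        return low <= value <= high
    if type_name in ("FP32", "FP64"):
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "String":
        return isinstance(value, str)
    if type_name == "Bool":
        return isinstance(value, bool)
    raise ValueError(f"unknown scalar type: {type_name}")


def fits_vector_type(type_name, values):
    # A vector is treated here as a list whose elements all fit the scalar type
    return isinstance(values, list) and all(
        fits_scalar_type(type_name, v) for v in values
    )
```

A check like this can catch out-of-range integers in an offline table before serialization is attempted.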

📥 Installation

pip install orion-py-client==0.1.1

Prerequisites

  • Python 3.7+
  • (Optional) Apache Spark 3.0+ & spark-sql-kafka for Kafka feature push functionality
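
A quick way to confirm these prerequisites in a given environment is sketched below. `check_prerequisites` is a hypothetical helper, not part of the client. It only checks the interpreter version and whether the `pyspark` package can be imported; it does not verify the Spark version or the presence of the spark-sql-kafka package, which would need a running Spark session.

```python
import importlib.util
import sys


def check_prerequisites():
    """Return a dict describing whether the documented prerequisites are met."""
    return {
        "python_ok": sys.version_info >= (3, 7),
        # PySpark is optional; it is only needed for the Kafka push path
        "pyspark_available": importlib.util.find_spec("pyspark") is not None,
    }
```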

Usage

Basic Usage

from orion_py_client import OrionPyClient

# Initialize the client
opy_client = OrionPyClient(
    features_metadata_source_url="your_features_metadata_source_url",
    job_id="your_job_id",
    job_token="your_job_token"
)

# Get feature details
(
    offline_src_type_columns,
    offline_col_to_default_values_map,
    entity_column_names
) = opy_client.get_features_details()
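
As an illustration of how a column-to-default map such as `offline_col_to_default_values_map` might be used, the sketch below fills missing values in a single row. It assumes the map is a plain dict keyed by offline column name, which this README does not confirm; `apply_defaults` is a hypothetical helper and not part of the client.

```python
def apply_defaults(row, col_to_default):
    """Return a copy of `row` with None or missing columns replaced by defaults."""
    filled = dict(row)
    for column, default in col_to_default.items():
        if filled.get(column) is None:
            filled[column] = default
    return filled
```

For example, `apply_defaults({"user_id": 1, "age": None}, {"age": 0})` leaves `user_id` untouched and sets `age` to the default.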

Push Feature Values from Offline sources to Orion via Spark -> Kafka

Supported Offline Sources

  1. Table (Hive/Delta)
  2. Parquet folder stored in Cloud Storage (AWS/GCS/ADLS)
  3. Delta folder stored in Cloud Storage (AWS/GCS/ADLS)
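
For orientation, the three source kinds above map onto standard Spark read calls roughly as sketched below. This is not the client's own loading code. The function name and the `source_type` labels are invented for this sketch, `spark` is assumed to be an existing SparkSession, and reading Delta assumes the Delta Lake package is available on the cluster.

```python
def read_offline_source(spark, source_type, location):
    """Load a DataFrame from one of the three offline source kinds.

    `source_type` is a label invented for this sketch:
    "table", "parquet", or "delta".
    """
    if source_type == "table":
        # Hive or Delta table registered in the metastore
        return spark.table(location)
    if source_type == "parquet":
        # Folder of Parquet files, e.g. a cloud storage path
        return spark.read.parquet(location)
    if source_type == "delta":
        # Folder in Delta format; needs the Delta Lake package
        return spark.read.format("delta").load(location)
    raise ValueError(f"unsupported source type: {source_type}")
```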

Refer to the examples for a detailed walkthrough of how to configure a job and push feature values.

The following is a simple outline of the steps involved in that example:

# create a new orion client
opy_client = OrionPyClient(features_metadata_source_url, job_id, job_token) 

# get the features details
(
    feature_mapping,
    offline_col_to_default_values_map,
    onfs_fg_to_onfs_feat_map,
    onfs_fg_to_ofs_feat_map,
    fg_to_datatype_map,
    entity_label,
    entity_column_names,
) = opy_client.get_features_details(fgs_to_consider)

# read the data from different sources
df = get_features_from_all_sources(spark, entity_column_names, feature_mapping, offline_col_to_default_values_map)

# serialize the rows to protobuf binary messages
proto_df = opy_client.generate_df_with_protobuf_messages(df, intra_batch_size=20)

# Produce data to kafka so that consumers write features to Orion Feature Store
opy_client.write_protobuf_df_to_kafka(proto_df, kafka_bootstrap_servers, kafka_topic, additional_options)
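
What `write_protobuf_df_to_kafka` does internally is not documented here. For orientation only, a plain Spark batch write to Kafka uses the `kafka` data source with the `kafka.bootstrap.servers` and `topic` options and expects a `value` column. The sketch below shows that pattern; both function names are hypothetical, and treating `additional_options` as a dict of extra Kafka options is an assumption.

```python
def build_kafka_write_options(bootstrap_servers, topic, additional_options=None):
    """Merge the required Kafka sink options with caller-supplied extras."""
    options = {
        "kafka.bootstrap.servers": bootstrap_servers,
        "topic": topic,
    }
    # Required keys win if an extra option uses the same name
    for key, value in (additional_options or {}).items():
        options.setdefault(key, value)
    return options


def write_df_to_kafka(df, bootstrap_servers, topic, additional_options=None):
    """Batch-write a Spark DataFrame that has a binary or string `value` column."""
    options = build_kafka_write_options(bootstrap_servers, topic, additional_options)
    df.write.format("kafka").options(**options).save()
```

Keeping the option merging in its own function makes it possible to inspect the final settings (for example security options) without a Spark session.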

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For support, please create an issue.
