An ML platform for your local machine that uses cheap cloud services for scalable resources.
mlforge
A simple feature store SDK for machine learning workflows.
Installation
pip install mlforge-sdk
Or with uv:
uv add mlforge-sdk
Quick Start
Define features using the @feature decorator:
import polars as pl
from mlforge import feature, entity_key

# Create reusable entity key transforms
with_user_id = entity_key("first", "last", "dob", alias="user_id")

@feature(
    keys=["user_id"],
    source="data/transactions.parquet",
    description="Total spend by user"
)
def user_total_spend(df: pl.DataFrame) -> pl.DataFrame:
    return (
        df.pipe(with_user_id)
        .group_by("user_id")
        .agg(pl.col("amt").sum().alias("total_spend"))
    )
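To illustrate the idea behind entity keys, here is a minimal standard-library sketch of deriving a surrogate key from natural key columns and aggregating on it. It is not mlforge's implementation: the helper name `surrogate_key`, the `|` separator, the choice of SHA-256, and the sample rows are all assumptions made for this example.

```python
import hashlib


def surrogate_key(row: dict, columns: list, sep: str = "|") -> str:
    """Hash the chosen natural-key columns of one row into a stable ID.

    Hypothetical helper for illustration; not part of mlforge.
    """
    # Join the values in a fixed column order so the same inputs
    # always produce the same key.
    raw = sep.join(str(row[col]) for col in columns)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


# Made-up sample rows with the natural keys used in the Quick Start.
rows = [
    {"first": "Ada", "last": "Lovelace", "dob": "1815-12-10", "amt": 12.5},
    {"first": "Ada", "last": "Lovelace", "dob": "1815-12-10", "amt": 7.5},
    {"first": "Alan", "last": "Turing", "dob": "1912-06-23", "amt": 3.0},
]

# Attach a user_id to every row, then sum spend per user_id.
totals = {}
for row in rows:
    user_id = surrogate_key(row, ["first", "last", "dob"])
    totals[user_id] = totals.get(user_id, 0.0) + row["amt"]
```

Rows that share the same natural key values end up under the same `user_id`, which is what makes the grouped aggregation possible.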
Register features and materialize them:
from mlforge import Definitions, LocalStore
import my_features

defs = Definitions(
    name="my-project",
    features=[my_features],
    offline_store=LocalStore("./feature_store")
)

# Materialize features to storage
defs.materialize()
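The registration step passes a whole module rather than individual functions. One way such discovery could work is sketched below with the standard library only. This is an assumption about the general pattern, not mlforge's code: `FeatureSpec`, `sketch_feature`, and `discover` are hypothetical names invented for this example.

```python
import types
from dataclasses import dataclass
from typing import Callable


@dataclass
class FeatureSpec:
    """Metadata captured for one feature (hypothetical, not mlforge's class)."""
    name: str
    keys: list
    source: str
    fn: Callable


def sketch_feature(keys, source):
    """Decorator factory that wraps a function in a FeatureSpec."""
    def wrap(fn):
        return FeatureSpec(name=fn.__name__, keys=list(keys), source=source, fn=fn)
    return wrap


def discover(module):
    """Collect every FeatureSpec defined at the top level of a module."""
    return {
        obj.name: obj
        for obj in vars(module).values()
        if isinstance(obj, FeatureSpec)
    }


# Build a throwaway module so the example is self-contained.
demo_features = types.ModuleType("demo_features")


@sketch_feature(keys=["user_id"], source="data/transactions.parquet")
def user_total_spend(rows):
    return sum(row["amt"] for row in rows)


demo_features.user_total_spend = user_total_spend
registry = discover(demo_features)
```

Because the decorator returns a tagged object, scanning a module's attributes is enough to find every feature without a separate registration call.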
Retrieve features for training:
from mlforge import get_training_data

# `transactions` is assumed to be an entity DataFrame loaded earlier
training_df = get_training_data(
    features=["user_total_spend"],
    entity_df=transactions,
    entities=[with_user_id],
    timestamp="trans_date_trans_time"  # Point-in-time correct joins
)
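A point-in-time correct join attaches to each entity row the most recent feature value recorded at or before that row's timestamp, so no future information leaks into the training data. The sketch below shows that lookup with the standard library only. The function name `point_in_time_lookup` and the sample data are hypothetical, and mlforge's `get_training_data` may work differently internally.

```python
from bisect import bisect_right


def point_in_time_lookup(history, as_of):
    """Return the latest value whose timestamp is <= as_of, else None.

    history: list of (timestamp, value) pairs sorted by timestamp.
    Hypothetical helper for illustration; not part of mlforge.
    """
    timestamps = [ts for ts, _ in history]
    # bisect_right gives the number of timestamps that are <= as_of.
    idx = bisect_right(timestamps, as_of)
    return history[idx - 1][1] if idx else None


# Feature values per user, each stamped with when it became known.
# ISO date strings sort in chronological order.
feature_history = {
    "u1": [("2024-01-01", 10.0), ("2024-02-01", 25.0), ("2024-03-01", 40.0)],
}

# Entity rows: the events we want to build training examples for.
entity_rows = [
    {"user_id": "u1", "ts": "2023-12-15"},  # before any feature value
    {"user_id": "u1", "ts": "2024-02-10"},  # between two feature values
    {"user_id": "u1", "ts": "2024-03-01"},  # exactly on a feature timestamp
]

training_rows = [
    {
        **row,
        "total_spend": point_in_time_lookup(
            feature_history.get(row["user_id"], []), row["ts"]
        ),
    }
    for row in entity_rows
]
```

The first row gets no value because nothing was known yet, and the second row gets the February value rather than the later March one.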
Features
- Simple API: Define features with a @feature decorator
- Entity Keys: Generate surrogate keys from natural keys using entity_key()
- Local Storage: Persist features to Parquet with LocalStore
- Point-in-Time Joins: Retrieve training data with temporal correctness using get_training_data()
- CLI: Build and list features from the command line
CLI Usage
Build features:
mlforge build definitions.py
Build specific features:
mlforge build definitions.py --features user_total_spend,merchant_total_spend
List registered features:
mlforge list definitions.py
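For readers curious how a command line of this shape can be parsed, here is a small `argparse` sketch that accepts the same `build` and `list` subcommands and a comma-separated `--features` value. It is an illustration only and not the mlforge CLI source: the program name `mlforge-sketch` and the helpers `split_names` and `parse_args` are hypothetical.

```python
import argparse


def split_names(value):
    """Turn 'a,b,c' into ['a', 'b', 'c'], ignoring empty entries."""
    return [name.strip() for name in value.split(",") if name.strip()]


def parse_args(argv):
    """Parse a build/list style command line (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="mlforge-sketch")
    sub = parser.add_subparsers(dest="command", required=True)

    build = sub.add_parser("build", help="materialize features")
    build.add_argument("target", help="path to the definitions file")
    build.add_argument("--features", type=split_names, default=None,
                       help="comma-separated feature names; all if omitted")

    listing = sub.add_parser("list", help="list registered features")
    listing.add_argument("target", help="path to the definitions file")

    return parser.parse_args(argv)


args = parse_args([
    "build", "definitions.py",
    "--features", "user_total_spend,merchant_total_spend",
])
```

Leaving `--features` out yields `None`, which a build routine could treat as "build everything".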
Documentation
Full documentation is available at https://chonalchendo.github.io/mlforge
Contributing
Contributions are welcome! Please see the repository for development setup and guidelines.
License
MIT License - see LICENSE for details.
File details
Details for the file mlforge_sdk-0.3.0-py3-none-any.whl.
File metadata
- Download URL: mlforge_sdk-0.3.0-py3-none-any.whl
- Upload date:
- Size: 20.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c4dc1b7aa1d49d89a76c6154d64f51ccdf8e42b564e2380813518580504857c8 |
| MD5 | 5b53768fbb90c30be57ba35c9d78a310 |
| BLAKE2b-256 | 32a23da050ad518ac6101bc92e5c0f912cb56de97ad9275a6f8fa2c98eccb040 |
Provenance
The following attestation bundles were made for mlforge_sdk-0.3.0-py3-none-any.whl:
Publisher: publish.yaml on chonalchendo/mlforge
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: mlforge_sdk-0.3.0-py3-none-any.whl
  - Subject digest: c4dc1b7aa1d49d89a76c6154d64f51ccdf8e42b564e2380813518580504857c8
- Sigstore transparency entry: 767792608
- Sigstore integration time:
- Permalink: chonalchendo/mlforge@ae6bc0b56b9dcba8f38d5755c035db53b5165efd
- Branch / Tag: refs/tags/v0.3.0
- Owner: https://github.com/chonalchendo
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@ae6bc0b56b9dcba8f38d5755c035db53b5165efd
- Trigger Event: push
- Statement type: