Lightweight framework for building data and ML workflows with class-based Python syntax
AXL Workflows (axl) is a lightweight framework for building data and ML workflows with a class-based Python syntax. Build a workflow once, then run it locally or on Argo/Kubeflow:
- Local runtime → fast iteration on your machine.
- Argo Workflows YAML → run on Kubernetes; compatible with Kubeflow Pipelines (KFP) environments.
Write once → run anywhere (locally or Argo/Kubeflow in production).
🚀 Quick Start
```bash
# Install
pip install axl-workflows

# Or with uv
uv pip install axl-workflows

# Create your first workflow
axl --help
```
✨ Key Features
- Class-based DSL: Define workflows as Python classes, with steps as methods and a `graph()` method to wire them.
- Simple params: Treat parameters as a normal step that returns a Python object (e.g., a Pydantic model or dict). No special Param/Artifact classes.
- IO Handlers: Steps return plain Python objects; axl persists/loads them via an `io_handler` (default: pickle).
  - Per-step override (`@step(io_handler=...)`)
  - Input modes: receive objects by default, or file paths with `input_mode="path"`
- Intermediate Representation (IR): Backend-agnostic DAG model (nodes, edges, resources, IO metadata).
- Multiple backends:
  - Local runtime → develop and iterate quickly.
  - Argo/KFP → YAML generation for production pipelines.
- Unified runner image: One container executes steps locally and in Argo pods.
- Resource & retry hints: Declare CPU, memory, caching, retries, and conditions at the step level.
- CLI tools: Compile, validate, run locally, or render DAGs.
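To make the io_handler idea concrete, here is a minimal sketch of what a pickle-based handler could look like. The `save`/`load` interface and the `PickleHandler` name are illustrative assumptions, not axl's actual API:

```python
import pickle
from pathlib import Path


class PickleHandler:
    """Hypothetical io_handler: persists a step's output and loads it back."""

    extension = ".pkl"

    def save(self, obj, path: str) -> None:
        # Serialize the in-memory object to a file for downstream steps.
        Path(path).parent.mkdir(parents=True, exist_ok=True)
        with open(path, "wb") as f:
            pickle.dump(obj, f)

    def load(self, path: str):
        # Deserialize the file back into a Python object.
        with open(path, "rb") as f:
            return pickle.load(f)


handler = PickleHandler()
handler.save({"rows": 3}, "/tmp/axl_demo/out.pkl")
print(handler.load("/tmp/axl_demo/out.pkl"))  # {'rows': 3}
```

A Parquet or npy handler would implement the same two methods with a different serialization format, which is why steps themselves never need to know how their outputs are stored.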
📦 Example Workflow (params as a step, with Pydantic)
```python
# examples/churn_workflow.py
from axl import Workflow, step
from pydantic import BaseModel


# Parameters are just a normal step output (typed with Pydantic for convenience).
class TrainParams(BaseModel):
    seed: int = 42
    input_path: str = "data/raw.csv"


class ChurnTrain(Workflow):
    # Workflow configuration via class attributes
    name = "churn-train"
    image = "ghcr.io/you/axl-runner:0.1.0"
    io_handler = "pickle"

    @step
    def params(self) -> TrainParams:
        # Use defaults here; optionally read from YAML/env if you prefer.
        return TrainParams()

    @step  # default io_handler = pickle
    def preprocess(self, p: TrainParams):
        import pandas as pd

        df = pd.read_csv(p.input_path)
        # ... feature engineering ...
        return df  # persisted via pickle (default)

    @step
    def train(self, features, p: TrainParams):
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        X = features.select_dtypes(include=[np.number]).fillna(0)
        y = (X.sum(axis=1) > X.sum(axis=1).median()).astype(int)  # synthetic label for the demo
        model = RandomForestClassifier(n_estimators=50, random_state=p.seed).fit(X, y)
        return model  # persisted via pickle

    @step
    def evaluate(self, model) -> float:
        # pretend evaluation
        return 0.9123

    def graph(self):
        p = self.params()
        feats = self.preprocess(p)
        model = self.train(feats, p)
        return self.evaluate(model)
```
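Under the hood, a `graph()` method like the one above can be resolved by calling it once and recording which step feeds which. The sketch below is not axl's implementation, just a minimal illustration of the call-recording pattern with placeholder objects:

```python
class Ref:
    """Placeholder returned by a recorded step call; remembers its producer."""

    def __init__(self, step_name):
        self.step_name = step_name


def record_edges(graph_fn, step_names):
    """Call graph_fn with a recording stand-in for self; return {step: upstream steps}."""
    edges = {name: [] for name in step_names}

    class Recorder:
        def __getattr__(self, name):
            def call(*args):
                # Every Ref argument marks an upstream dependency of `name`.
                edges[name] = [a.step_name for a in args if isinstance(a, Ref)]
                return Ref(name)

            return call

    graph_fn(Recorder())
    return edges


# Wire the same DAG as ChurnTrain.graph():
def graph(self):
    p = self.params()
    feats = self.preprocess(p)
    model = self.train(feats, p)
    return self.evaluate(model)


print(record_edges(graph, ["params", "preprocess", "train", "evaluate"]))
# {'params': [], 'preprocess': ['params'], 'train': ['preprocess', 'params'], 'evaluate': ['train']}
```

Because steps only exchange placeholders at wiring time, the dependency graph can be extracted without actually reading data or training a model.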
Variations
- Receive a file path instead of an object:

  ```python
  from pathlib import Path

  @step(input_mode={"features": "path"})
  def profile(self, features: Path) -> dict:
      return {"bytes": Path(features).stat().st_size}
  ```
- Override the io handler (e.g., Parquet for DataFrames):

  ```python
  from axl.io.parquet_io import parquet_io_handler

  @step(io_handler=parquet_io_handler)
  def preprocess(self, p: TrainParams):
      import pandas as pd

      return pd.read_csv(p.input_path)  # saved as .parquet; downstream gets a DataFrame
  ```
🛠 CLI
```bash
# Compile to Argo YAML
axl compile -m examples/churn_workflow.py:ChurnTrain --target argo --out churn.yaml

# Compile to Dagster job (Python module output)
axl compile -m examples/churn_workflow.py:ChurnTrain --target dagster --out dagster_job.py

# Run locally
axl run local -m examples/churn_workflow.py:ChurnTrain

# Validate workflow definition
axl validate -m examples/churn_workflow.py:ChurnTrain

# Render DAG graph
axl render -m examples/churn_workflow.py:ChurnTrain --out dag.png
```
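The `-m path/to/file.py:ClassName` locator used by these commands is a common CLI pattern. One plausible way to resolve it, shown here as an illustrative sketch rather than axl's actual loader, is via `importlib`:

```python
import importlib.util
from pathlib import Path


def load_workflow(spec: str):
    """Resolve a 'path/to/file.py:ClassName' spec to the workflow class."""
    file_part, _, class_name = spec.partition(":")
    path = Path(file_part)
    # Load the file as a throwaway module named after its stem.
    module_spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(module_spec)
    module_spec.loader.exec_module(module)
    return getattr(module, class_name)
```

With a resolver like this, every subcommand (compile, run, validate, render) can share one entry point for locating workflow definitions.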
📐 Architecture
- Authoring Layer
  - Python DSL: `@step` decorator, `Workflow` base class
  - Params are a normal step (often a Pydantic model)
  - Configuration via class attributes (name, image, io_handler)
  - IO handled by io_handlers (default: pickle)
  - Dependencies wired via `graph()`
- IR (Intermediate Representation)
  - Abstract DAG: nodes, edges, inputs/outputs, resources, retry policies, IO metadata
- Compilers
  - Argo: generates YAML to run on Argo Workflows
  - Kubeflow: generates pipeline YAML to run on Kubeflow Pipelines
- Runtime
  - Unified runner image (`axl-runner`) executes steps
  - Handles env (via uv), IO handler save/load, logging, retries
- CLI
  - Single interface for compile, run, validate, render
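As an illustration of what a backend-agnostic IR can carry, the sketch below models nodes as plain dataclasses and derives an execution order any backend could consume. The field names (`image`, `retries`, `deps`, and so on) are assumptions for illustration, not axl's actual schema:

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Node:
    name: str
    image: str = ""
    retries: int = 0
    io_handler: str = "pickle"
    deps: list = field(default_factory=list)  # upstream node names (the edges)


@dataclass
class WorkflowIR:
    name: str
    nodes: dict

    def execution_order(self):
        # Both a local runner and an Argo YAML compiler can consume this order.
        ts = TopologicalSorter({n.name: n.deps for n in self.nodes.values()})
        return list(ts.static_order())


ir = WorkflowIR(
    name="churn-train",
    nodes={
        "params": Node("params"),
        "preprocess": Node("preprocess", deps=["params"]),
        "train": Node("train", deps=["preprocess", "params"]),
        "evaluate": Node("evaluate", deps=["train"]),
    },
)
print(ir.execution_order())  # ['params', 'preprocess', 'train', 'evaluate']
```

Keeping resources, retries, and IO metadata on the nodes rather than in backend syntax is what lets one IR feed several compilers.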
📂 Project Structure
```
axl/
  core/      # DSL: decorators, base classes, typing
  io/        # io_handlers (pickle default; parquet/npy/torch optional)
  ir/        # Intermediate Representation (nodes, edges, workflows)
  compiler/  # Backend compilers (Argo, Kubeflow)
  runtime/   # Runner container + IO + env setup (uv)
  cli.py     # CLI entrypoint
examples/
  churn_workflow.py
tests/
  test_core.py  # Tests for DSL components
  test_ir.py    # Tests for IR components
pyproject.toml
README.md
```
🎯 Why AXL Workflows?
- Local development is fast and simple.
- Kubeflow Pipelines/Argo are production-grade, but the YAML is verbose and getting started is hard.
- axl bridges the gap:
  - Simple, class-based DSL
  - Params as a normal step
  - IO handlers for painless object ↔ file persistence
  - Backend-agnostic IR
  - Compile once, run anywhere
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.