
Lightweight framework for building data and ML workflows with class-based Python syntax

AXL Workflows (axl) is the workflow authoring layer of the AXL ML Platform, a full AI platform that spans development through deployment. It lets teams define data and ML workflows as plain Python classes and compile them to any supported runtime:

  • Local runtime → fast iteration on your machine.
  • Argo Workflows → production Kubernetes pipelines.
  • Kubeflow Pipelines → KFP-native execution (coming in v0.4.0).

Write once → compile to any supported runtime. No hand-written YAML, no vendor lock-in.

axl-workflows is one product in the AXL ML Platform. For cluster ops and infrastructure bootstrap, see axlctl.


🚀 Quick Start

# Install
pip install axl-workflows

# Or with uv
uv pip install axl-workflows

# Verify the installation and list the available commands
axl --help

✨ Key Features

  • Class-based DSL: Define workflows as Python classes, with steps as methods and a dag() to wire them.

  • Simple params: Treat parameters as a normal step that returns a Python object (e.g., a Pydantic model or dict). No special Param/Artifact classes.

  • IO Handlers: Steps return plain Python objects; axl persists/loads them via an io_handler (default: pickle).

    • Per-step override (@step(io_handler=...))
    • Input modes: steps receive loaded objects by default, or file paths with input_mode="path" (this can also be set per argument, e.g. input_mode={"features": "path"}).
  • Intermediate Representation (IR): Backend-agnostic DAG model (nodes, edges, resources, IO metadata).

  • Multiple backends:

    • Local runtime → develop and iterate quickly.
    • Argo Workflows → YAML generation for production Kubernetes pipelines.
    • Kubeflow Pipelines → KFP pipeline packages (coming in v0.4.0).
  • Unified runner image: One container executes steps locally and in Argo pods.

  • Resource & retry hints: Declare CPU, memory, caching, retries, and conditions at the step level.

  • CLI tools: Compile, validate, run locally, or render DAGs.
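
As a rough picture of what a local runtime with step-level retries involves, the snippet below runs plain functions in dependency order using the standard library's graphlib and retries a failing step a fixed number of times. It is a sketch under assumptions, not axl's runner: `run_local`, the layout of the `steps` mapping, and the `retries` field are all hypothetical.

```python
from graphlib import TopologicalSorter


def run_local(steps):
    """Run steps in dependency order (hypothetical helper, not axl's API).

    `steps` maps a step name to a dict with:
      fn      - callable that takes upstream outputs as positional args
      needs   - upstream step names, in the order fn expects them
      retries - extra attempts allowed after the first failure
    """
    graph = {name: set(spec.get("needs", [])) for name, spec in steps.items()}
    outputs, attempts = {}, {}
    for name in TopologicalSorter(graph).static_order():
        spec = steps[name]
        retries = spec.get("retries", 0)
        args = [outputs[dep] for dep in spec.get("needs", [])]
        for attempt in range(1, retries + 2):
            attempts[name] = attempt
            try:
                outputs[name] = spec["fn"](*args)
                break
            except Exception:
                # Re-raise once the retry budget is exhausted.
                if attempt > retries:
                    raise
    return outputs, attempts


# Demo: a step that fails twice before succeeding.
calls = {"n": 0}


def flaky_load():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return [1, 2, 3]


steps = {
    "load": {"fn": flaky_load, "retries": 2},
    "total": {"fn": sum, "needs": ["load"]},
}
outputs, attempts = run_local(steps)
print(outputs["total"], attempts)
```

Keeping retries in the step description rather than in the step body is what lets a compiler translate the same hint into a backend-specific retry policy later.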


📦 Example Workflow (params as a step, with Pydantic)

# examples/churn_workflow.py
from axl import Workflow, step
from pydantic import BaseModel

# Parameters are just a normal step output (typed with Pydantic for convenience).
class TrainParams(BaseModel):
    seed: int = 42
    input_path: str = "data/raw.csv"

class ChurnTrain(Workflow):
    # Workflow configuration via class attributes
    name = "churn-train"
    image = "ghcr.io/axl-platform/axl-workflows/runner:0.3.0"
    io_handler = "pickle"

    @step
    def params(self) -> TrainParams:
        # Use defaults here; optionally read from YAML/env if you prefer.
        return TrainParams()

    @step  # default io_handler = pickle
    def preprocess(self, p: TrainParams):
        import pandas as pd
        df = pd.read_csv(p.input_path)
        # ... feature engineering ...
        return df  # persisted via pickle (default)

    @step
    def train(self, features, p: TrainParams):
        from sklearn.ensemble import RandomForestClassifier
        import numpy as np
        # Keep numeric columns only so the row sums below are well defined.
        X = features.select_dtypes(include=[np.number]).fillna(0)
        # Synthetic label for the demo: 1 if the row sum is above the median.
        row_sum = X.sum(axis=1)
        y = (row_sum > row_sum.median()).astype(int)
        model = RandomForestClassifier(n_estimators=50, random_state=p.seed).fit(X, y)
        return model  # persisted via pickle

    @step
    def evaluate(self, model) -> float:
        # pretend evaluation
        return 0.9123

    def dag(self):
        p = self.params()
        feats = self.preprocess(p)
        model = self.train(feats, p)
        return self.evaluate(model)
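
The dag() method above reads like ordinary Python, yet all it has to do is describe dependencies. One way such wiring could be captured is to have each step call return a lightweight placeholder while tracing, and to record which placeholders flow into which step. The following is a minimal sketch of that idea, not axl's actual implementation; `Ref`, `traced_step`, and `MiniWorkflow` are hypothetical names.

```python
from dataclasses import dataclass
from functools import wraps


@dataclass(frozen=True)
class Ref:
    """Placeholder for the future output of a step (hypothetical)."""
    step: str


def traced_step(fn):
    """Record the call as a graph node instead of running the step body."""
    @wraps(fn)
    def wrapper(self, *args, **kwargs):
        upstream = [a.step for a in (*args, *kwargs.values()) if isinstance(a, Ref)]
        self.nodes.append(fn.__name__)
        self.edges.extend((src, fn.__name__) for src in upstream)
        return Ref(fn.__name__)
    return wrapper


class MiniWorkflow:
    def __init__(self):
        self.nodes = []  # step names in call order
        self.edges = []  # (upstream, downstream) pairs

    @traced_step
    def params(self): ...

    @traced_step
    def preprocess(self, p): ...

    @traced_step
    def train(self, features, p): ...

    @traced_step
    def evaluate(self, model): ...

    def dag(self):
        p = self.params()
        feats = self.preprocess(p)
        model = self.train(feats, p)
        return self.evaluate(model)


wf = MiniWorkflow()
wf.dag()
print(wf.nodes)
print(wf.edges)
```

Because the step bodies never run during tracing, the graph can be built without pandas or scikit-learn installed, which is the property a compile step needs.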

Variations

  • Receive a file path instead of an object:

    from pathlib import Path
    
    @step(input_mode={"features": "path"})
    def profile(self, features: Path) -> dict:
        return {"bytes": Path(features).stat().st_size}
    
  • Override the io handler (e.g., Parquet for DataFrames):

    from axl.io.parquet_io import parquet_io_handler
    
    @step(io_handler=parquet_io_handler)
    def preprocess(self, p: TrainParams):
        import pandas as pd
        return pd.read_csv(p.input_path)  # saved as .parquet; downstream gets a DataFrame
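
Both variations come down to what happens between steps: the upstream return value is written to a file by an io handler, and the downstream step receives either the loaded object or the file path. The sketch below illustrates that idea with the standard library only. `PickleHandler`, `JsonHandler`, `persist`, and `resolve_input` are hypothetical names, and axl's real handler interface may differ.

```python
import json
import pickle
import tempfile
from pathlib import Path


class PickleHandler:
    """Hypothetical handler: persists any picklable object."""
    suffix = ".pkl"

    def save(self, obj, path: Path) -> None:
        path.write_bytes(pickle.dumps(obj))

    def load(self, path: Path):
        return pickle.loads(path.read_bytes())


class JsonHandler:
    """Hypothetical handler: persists JSON-serializable objects as text."""
    suffix = ".json"

    def save(self, obj, path: Path) -> None:
        path.write_text(json.dumps(obj))

    def load(self, path: Path):
        return json.loads(path.read_text())


def persist(obj, handler, workdir: Path, name: str) -> Path:
    """Write a step output to disk and return where it landed."""
    path = workdir / f"{name}{handler.suffix}"
    handler.save(obj, path)
    return path


def resolve_input(path: Path, handler, mode: str = "object"):
    """Return what the downstream step would receive for this input."""
    if mode == "path":
        return path
    if mode == "object":
        return handler.load(path)
    raise ValueError(f"unknown input mode: {mode!r}")


with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    features = {"rows": 3, "cols": ["a", "b"]}

    artifact = persist(features, JsonHandler(), workdir, "features")
    as_object = resolve_input(artifact, JsonHandler())
    as_path = resolve_input(artifact, JsonHandler(), mode="path")
    size_bytes = as_path.stat().st_size

    pickled = persist(features, PickleHandler(), workdir, "features")
    roundtrip = resolve_input(pickled, PickleHandler())

print(as_object, as_path.name, size_bytes > 0, roundtrip == features)
```

The step function itself never touches serialization in this design, which is why swapping pickle for another format can be a one-argument change on the decorator.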
    

🛠 CLI

# Compile to Argo Workflows YAML
axl compile -m examples/churn_workflow.py:ChurnTrain --target argo --out churn.yaml

# Compile to Kubeflow Pipelines package (v0.4.0+)
axl compile -m examples/churn_workflow.py:ChurnTrain --target kfp --out pipeline.yaml

# Run locally
axl run local -m examples/churn_workflow.py:ChurnTrain

# Validate workflow definition
axl validate -m examples/churn_workflow.py:ChurnTrain

# Render DAG graph
axl render -m examples/churn_workflow.py:ChurnTrain --out dag.png
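
Every command above takes the workflow as `path/to/file.py:ClassName`. As an illustration of how such a spec can be resolved, the standard library's importlib can load the file and look up the class. This is an assumption about the mechanics, not a description of axl's CLI internals; `load_workflow_class` is a hypothetical helper.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_workflow_class(spec: str):
    """Resolve 'path/to/file.py:ClassName' to the class object."""
    # Split on the last colon so drive letters such as 'C:' stay in the path.
    file_part, sep, class_name = spec.rpartition(":")
    if not sep or not file_part or not class_name:
        raise ValueError(f"expected 'file.py:ClassName', got {spec!r}")

    path = Path(file_part)
    module_spec = importlib.util.spec_from_file_location(path.stem, path)
    if module_spec is None or module_spec.loader is None:
        raise ImportError(f"cannot load a module from {path}")

    module = importlib.util.module_from_spec(module_spec)
    module_spec.loader.exec_module(module)
    try:
        return getattr(module, class_name)
    except AttributeError:
        raise ImportError(f"{path} does not define {class_name}") from None


# Demo with a throwaway module so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "churn_workflow.py"
    source.write_text("class ChurnTrain:\n    name = 'churn-train'\n")
    cls = load_workflow_class(f"{source}:ChurnTrain")

print(cls.__name__, cls.name)
```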

For cluster lifecycle, storage init, and runner image ops, use axlctl.


📐 Architecture

axl-workflows is Layer 1 of the AXL ML Platform:

┌─────────────────────────────────────────────────┐
│  LAYER 5: MONITOR     (future: axl-monitor)     │
├─────────────────────────────────────────────────┤
│  LAYER 4: SERVE       axl-serving               │
├─────────────────────────────────────────────────┤
│  LAYER 3: MANAGE      axl-model-registry        │
├─────────────────────────────────────────────────┤
│  LAYER 2: EXECUTE     axl-etl                   │
├─────────────────────────────────────────────────┤
│  LAYER 1: AUTHOR → COMPILE → RUN  ← (here)      │
├─────────────────────────────────────────────────┤
│  OPS (cross-cutting)  axlctl (separate repo)    │
└─────────────────────────────────────────────────┘

Within this repo, the layers are:

  1. Authoring Layer

    • Python DSL: @step decorator, Workflow base class
    • Params are a normal step (often a Pydantic model)
    • Configuration via class attributes (name, image, io_handler)
    • IO handled by io_handlers (default: pickle)
    • Wire dependencies via dag() (auto-inferred in v0.3.0+)
  2. IR (Intermediate Representation)

    • Backend-agnostic DAG: nodes, edges, inputs/outputs, resources, retry policies, IO metadata
  3. Compilers

    • Argo: IR → Argo Workflow YAML
    • KFP: IR → Kubeflow Pipelines package (v0.4.0+)
    • Plugin architecture (v0.4.0+) — add any target via entry points
  4. Runtime

    • Unified runner image (axl-runner) executes steps in pods and locally
    • Handles env (via uv), IO handler save/load, structured logging, retries
  5. CLI

    • axl compile, axl run local, axl validate, axl render
    • axl pack, axl build-image (v0.5.0+)
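
To make the authoring → IR → compiler flow concrete, here is a small sketch of a backend-agnostic IR and a function that turns it into a dictionary shaped like an Argo Workflow manifest with a DAG template. The `Node` and `WorkflowIR` dataclasses, the `to_argo` function, the default resource values, and the runner command line are hypothetical. Only the manifest field names (apiVersion, kind, spec.entrypoint, templates, dag.tasks, dependencies, retryStrategy) follow Argo's published schema.

```python
import json
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Node:
    """One step in the IR (hypothetical shape)."""
    name: str
    cpu: str = "500m"
    memory: str = "512Mi"
    retries: int = 0


@dataclass
class WorkflowIR:
    name: str
    image: str
    nodes: list[Node] = field(default_factory=list)
    edges: list[tuple[str, str]] = field(default_factory=list)  # (upstream, downstream)

    def upstream_of(self, node_name: str) -> list[str]:
        return [src for src, dst in self.edges if dst == node_name]

    def execution_order(self) -> list[str]:
        graph = {n.name: set(self.upstream_of(n.name)) for n in self.nodes}
        return list(TopologicalSorter(graph).static_order())


def to_argo(ir: WorkflowIR) -> dict:
    """Translate the IR into an Argo-style manifest dictionary."""
    tasks = [
        {"name": n.name, "template": n.name, "dependencies": ir.upstream_of(n.name)}
        for n in ir.nodes
    ]
    step_templates = [
        {
            "name": n.name,
            "retryStrategy": {"limit": n.retries},
            "container": {
                "image": ir.image,
                # Hypothetical runner invocation, for illustration only.
                "command": ["axl-runner", "--step", n.name],
                "resources": {"requests": {"cpu": n.cpu, "memory": n.memory}},
            },
        }
        for n in ir.nodes
    ]
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": f"{ir.name}-"},
        "spec": {
            "entrypoint": "main",
            "templates": [{"name": "main", "dag": {"tasks": tasks}}, *step_templates],
        },
    }


ir = WorkflowIR(
    name="churn-train",
    image="ghcr.io/axl-platform/axl-workflows/runner:0.3.0",
    nodes=[Node("params"), Node("preprocess"), Node("train", cpu="2", retries=2), Node("evaluate")],
    edges=[("params", "preprocess"), ("preprocess", "train"), ("params", "train"), ("train", "evaluate")],
)
order = ir.execution_order()
manifest = to_argo(ir)
print(order)
print(json.dumps(manifest["spec"]["templates"][0], indent=2))
```

The same `WorkflowIR` feeds both `execution_order()` (what a local runtime needs) and `to_argo()` (what a Kubernetes backend needs), which is the point of keeping the IR free of backend details.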

📂 Project Structure

axl/
  core/          # DSL: decorators, base classes, typing
  io/            # io_handlers (pickle default; parquet/npy/torch optional)
  ir/            # Intermediate Representation (nodes, edges, workflows)
  compiler/      # Backend compilers (Argo, Kubeflow)
  runtime/       # Runner container + IO + env setup (uv)
  cli.py         # CLI entrypoint
examples/
  churn_workflow.py
tests/
  test_core.py   # Tests for DSL components
  test_ir.py     # Tests for IR components
pyproject.toml
README.md

🎯 Why AXL Workflows?

  • Local development is fast and simple.

  • Argo and KFP are production-grade, but their YAML is verbose and hard to get started with.

  • axl bridges the gap:

    • Simple, class-based DSL — no YAML, no vendor-specific decorators
    • Params as a normal step — no special Param/Artifact classes
    • IO handlers for painless object ↔ file persistence
    • Backend-agnostic IR — one workflow definition, multiple compile targets
    • Write once, compile to any supported runtime

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
