
Next-Generation ML Pipeline Framework


🌊 flowyml

The Enterprise-Grade ML Pipeline Framework for Humans



FlowyML is a lightweight yet powerful ML pipeline orchestration framework. It bridges the gap between rapid experimentation and enterprise production by making assets first-class citizens. Write pipelines in pure Python, and scale them to production without changing a single line of code.

🚀 Why FlowyML?

| Feature | FlowyML | Traditional Orchestrators |
| --- | --- | --- |
| Developer Experience | 🐍 Native Python - No DSLs, no YAML hell. | 📜 Complex YAML or rigid DSLs. |
| Type-Based Routing | 🧠 Auto-Routing - Define WHAT, we handle WHERE. | 🔌 Manual wiring to cloud buckets. |
| Smart Caching | Multi-Level - Smart content-hashing skips re-runs. | 🐢 Basic file-timestamp checking. |
| Asset Management | 📦 First-Class Assets - Models & datasets with lineage. | 📁 Generic file paths only. |
| Multi-Stack | 🌍 Abstract Infra - Switch local/prod with one env var. | 🔒 Vendor lock-in or complex setup. |
| GenAI Ready | 🤖 LLM Tracing - Built-in token & cost tracking. | 🧩 Requires external tools. |
| Build-Time Validation | Type Safety - Catches mismatches at build time. | 💥 Runtime errors only. |
| Map Tasks | 🗺️ Parallel Maps - @map_task with retries & concurrency. | 🔁 Manual parallelism boilerplate. |
| Dynamic Workflows | 🔀 Runtime DAGs - Generate pipelines based on data. | 📐 Static definitions only. |
| GenAI Assets | 🎯 Prompt & Checkpoint - First-class prompt versioning and training resumability. | 📝 Unmanaged text files. |
| Stack Hydration | 🏗️ YAML → Live Stack - StackConfig.to_stack() wires infra automatically. | ⚙️ Manual component assembly. |

⚡️ Quick Start

This is a complete, multi-step ML pipeline with auto-injected context:

from flowyml import Pipeline, step, context

@step(outputs=["dataset"])
def load_data(batch_size: int = 32):
    return [i for i in range(batch_size)]

@step(inputs=["dataset"], outputs=["model"])
def train_model(dataset, learning_rate: float = 0.01):
    print(f"Training on {len(dataset)} items with lr={learning_rate}")
    return "model_v1"

# Configure and Run
ctx = context(learning_rate=0.05, batch_size=64)
pipeline = Pipeline("quickstart", context=ctx)
pipeline.add_step(load_data).add_step(train_model)

pipeline.run()
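Under the hood, context injection boils down to matching step parameter names against the values held by the context. A minimal sketch of that idea (the `inject_context` helper below is illustrative, not flowyml's actual internals):

```python
import inspect

def inject_context(func, context):
    """Call func, filling parameters by name from a context dict.

    Illustrative only: flowyml's real injection logic is internal;
    this just shows the name-matching idea behind it.
    """
    kwargs = {}
    for name, param in inspect.signature(func).parameters.items():
        if name in context:
            kwargs[name] = context[name]          # context wins over defaults
        elif param.default is not inspect.Parameter.empty:
            kwargs[name] = param.default          # fall back to the default
        else:
            raise TypeError(f"missing value for parameter {name!r}")
    return func(**kwargs)

ctx = {"learning_rate": 0.05, "batch_size": 64}

def load_data(batch_size: int = 32):
    return list(range(batch_size))

dataset = inject_context(load_data, ctx)
print(len(dataset))  # 64, because ctx overrides the default of 32
```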

🌟 Key Features

1. 🧠 Type-Based Artifact Routing (New in 1.8.0)

Define artifact types in code, and FlowyML automatically routes them to your cloud infrastructure.

@step
def train(...) -> Model:
    # Auto-saved to GCS/S3 and registered to Vertex AI / SageMaker
    return Model(obj, name="classifier")
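Conceptually, type-based routing is a dispatch table keyed on a step's return annotation. A minimal sketch of the mechanism, assuming a hypothetical `ROUTES` registry (flowyml's real router targets cloud storage and model registries rather than returning strings):

```python
from typing import get_type_hints

class Model:
    """Toy stand-in for flowyml's Model asset type."""
    def __init__(self, obj, name):
        self.obj, self.name = obj, name

# Hypothetical routing table: artifact type -> handler.
ROUTES = {}

def route(artifact_type):
    def register(handler):
        ROUTES[artifact_type] = handler
        return handler
    return register

@route(Model)
def save_model(model):
    # In a real stack this would upload to GCS/S3 and register the model.
    return f"registered model {model.name!r}"

def run_step(func, *args):
    result = func(*args)
    hint = get_type_hints(func).get("return")   # read the return annotation
    handler = ROUTES.get(hint)
    return handler(result) if handler else result

def train() -> Model:
    return Model(object(), name="classifier")

print(run_step(train))  # registered model 'classifier'
```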

2. 🌍 Multi-Stack Configuration

Manage local, staging, and production environments in a single flowyml.yaml.

export FLOWYML_STACK=production
python pipeline.py  # Now runs on Vertex AI with GCS storage
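A `flowyml.yaml` for this setup might look roughly like the following. The keys shown here are illustrative only; consult the flowyml docs for the actual schema:

```yaml
# Illustrative layout, not the real schema.
stacks:
  local:
    orchestrator: local
    artifact_store: ./artifacts
  production:
    orchestrator: vertex_ai
    artifact_store: gs://my-bucket/artifacts
```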

3. 🛡️ Intelligent Step Grouping

Group consecutive steps to run in the same container. Perfect for reducing overhead while maintaining clear step boundaries.

4. 📊 Built-in Observability

Beautiful dark-mode dashboard to monitor pipelines, visualize DAGs, and inspect artifacts in real-time.

5. 🎯 Evaluations Framework

Production-grade evaluation system with 29+ scorers — classification, regression, GenAI (LLM-as-a-judge), and adapters for DeepEval, RAGAS, and Phoenix:

from flowyml.evals import evaluate, EvalDataset, get_scorer

data = EvalDataset.create_genai("my_test", examples=[...])
result = evaluate(data=data, scorers=[get_scorer("relevance"), get_scorer("ragas.faithfulness")])
result.notify_if_regression(threshold=0.05)
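The regression check amounts to comparing each scorer's result against a stored baseline and flagging drops larger than the threshold. A hedged sketch of that comparison (the standalone `notify_if_regression` function below is a stand-in, not the library's implementation, which compares against its own tracked history):

```python
def notify_if_regression(current, baseline, threshold=0.05):
    """Return scorers whose score dropped by more than `threshold`
    relative to the baseline. Illustrative sketch only."""
    regressions = {
        name: (baseline[name], score)
        for name, score in current.items()
        if name in baseline and baseline[name] - score > threshold
    }
    for name, (old, new) in regressions.items():
        print(f"regression in {name}: {old:.2f} -> {new:.2f}")
    return regressions

baseline = {"relevance": 0.91, "faithfulness": 0.88}
current = {"relevance": 0.84, "faithfulness": 0.87}
flagged = notify_if_regression(current, baseline)  # only relevance dropped > 0.05
```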

6. 🗺️ Map Tasks & Dynamic Workflows

Distribute work over collections with @map_task and generate pipelines at runtime with @dynamic:

from flowyml import map_task, dynamic

@map_task(concurrency=8, retries=2, min_success_ratio=0.95)
def process_document(doc: dict) -> dict:
    return transform(doc)

@dynamic(outputs=["best_model"])
def hyperparameter_search(config: dict):
    sub = Pipeline("hp_search")
    for lr in config["learning_rates"]:
        sub.add_step(train_with_lr(lr))
    return sub
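Semantically, a map task is a parallel fan-out with per-item retries that fails only when too few items succeed. A plain-Python sketch of those semantics (names here are illustrative; flowyml's executor handles all of this for you):

```python
from concurrent.futures import ThreadPoolExecutor

def run_map(func, items, concurrency=8, retries=2, min_success_ratio=0.95):
    """Apply func to each item in parallel, retrying failures,
    and fail the whole task if the success ratio falls too low.
    Uses None as a failure sentinel for simplicity."""
    def attempt(item):
        for i in range(retries + 1):
            try:
                return func(item)
            except Exception:
                if i == retries:
                    return None  # exhausted retries for this item

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(attempt, items))  # preserves input order

    ok = [r for r in results if r is not None]
    if len(ok) / len(items) < min_success_ratio:
        raise RuntimeError("map task failed: success ratio below threshold")
    return ok

docs = [{"id": i} for i in range(10)]
out = run_map(lambda d: {**d, "processed": True}, docs)
print(len(out))  # 10
```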

7. 📦 Artifact Catalog with Lineage

Centralized artifact discovery, tagging, and lineage tracking — works local and remote:

from flowyml import ArtifactCatalog

catalog = ArtifactCatalog()  # Auto-selects local SQLite or remote API
catalog.register(name="classifier", artifact_type="Model", parent_ids=[dataset_id])
lineage = catalog.get_lineage(model_id)  # Full parent→child graph
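Lineage retrieval is essentially a walk over parent pointers. A toy sketch of that traversal, using a plain dict in place of the catalog's stored relationships:

```python
def ancestors(artifact_id, parents):
    """Collect the full set of ancestor ids by walking a
    parent-pointer graph upward from artifact_id.
    `parents` maps artifact id -> list of parent ids."""
    seen, stack = set(), [artifact_id]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

parents = {
    "model:classifier": ["dataset:train"],
    "dataset:train": ["dataset:raw"],
}
print(sorted(ancestors("model:classifier", parents)))
# ['dataset:raw', 'dataset:train']
```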

📦 Installation

# Install core
pip install flowyml

# Install with everything (recommended)
pip install "flowyml[all]"

📚 Documentation

Visit the FlowyML Docs for full guides, examples, and the API reference.


Built with ❤️ by UnicoLab



Download files

Download the file for your platform.

Source Distribution

flowyml-1.9.2.tar.gz (4.6 MB)


Built Distribution


flowyml-1.9.2-py3-none-any.whl (4.8 MB)


File details

Details for the file flowyml-1.9.2.tar.gz.

File metadata

  • Download URL: flowyml-1.9.2.tar.gz
  • Upload date:
  • Size: 4.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.2 CPython/3.11.15 Linux/6.17.0-1008-azure

File hashes

Hashes for flowyml-1.9.2.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | ed59f02ed5cd262baa3f02a866574bde5a7becaabe35a536592b0a8c7adbc83b |
| MD5 | c49322151bc014d2d0a7e21043c92092 |
| BLAKE2b-256 | c5501432e6bb3c5f016320b74995541537dec54f1aa0e420bdf22d775a4a9163 |


File details

Details for the file flowyml-1.9.2-py3-none-any.whl.

File metadata

  • Download URL: flowyml-1.9.2-py3-none-any.whl
  • Upload date:
  • Size: 4.8 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.3.2 CPython/3.11.15 Linux/6.17.0-1008-azure

File hashes

Hashes for flowyml-1.9.2-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 81c605c44ffaafdd15883eddabc10975e0e8b58d5922ff654c7b67288e4e48bc |
| MD5 | 4dc7faaa667db6d12c60a80d835911fb |
| BLAKE2b-256 | d9e625e1284c2df942bf7126ddfd6fc8bb2fb06a49baa2d026d4bb27139ea7ac |

