Next-Generation ML Pipeline Framework

Project description

🌊 flowyml

The Enterprise-Grade ML Pipeline Framework for Humans

Badges: CI Status | PyPI Version | Python 3.10+ | License | UnicoLab


FlowyML is a lightweight yet powerful ML pipeline orchestration framework. It bridges the gap between rapid experimentation and enterprise production by making assets first-class citizens. Write pipelines in pure Python, and scale them to production without changing a single line of code.

🚀 Why FlowyML?

| Feature | FlowyML | Traditional Orchestrators |
| --- | --- | --- |
| Developer Experience | 🐍 Native Python - No DSLs, no YAML hell. | 📜 Complex YAML or rigid DSLs. |
| Type-Based Routing | 🧠 Auto-Routing - Define WHAT, we handle WHERE. | 🔌 Manual wiring to cloud buckets. |
| Smart Caching | Multi-Level - Smart content-hashing skips re-runs. | 🐢 Basic file-timestamp checking. |
| Asset Management | 📦 First-Class Assets - Models & Datasets with lineage. | 📁 Generic file paths only. |
| Multi-Stack | 🌍 Abstract Infra - Switch local/prod with one env var. | 🔒 Vendor lock-in or complex setup. |
| GenAI Ready | 🤖 LLM Tracing - Built-in token & cost tracking. | 🧩 Requires external tools. |
| Build-Time Validation | Type Safety - Catches mismatches at build time. | 💥 Runtime errors only. |
| Map Tasks | 🗺️ Parallel Maps - @map_task with retries & concurrency. | 🔁 Manual parallelism boilerplate. |
| Dynamic Workflows | 🔀 Runtime DAGs - Generate pipelines based on data. | 📐 Static definitions only. |
| GenAI Assets | 🎯 Prompt & Checkpoint - First-class prompt versioning and training resumability. | 📝 Unmanaged text files. |
| Stack Hydration | 🏗️ YAML → Live Stack - StackConfig.to_stack() wires infra automatically. | ⚙️ Manual component assembly. |
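
To make the Smart Caching row concrete, here is a minimal sketch of content-hash caching using only the standard library. `cache_key` and `run_cached` are hypothetical helpers written for illustration; they are not FlowyML's API and do not claim to mirror its internal caching.

```python
import hashlib
import json

_cache = {}   # hypothetical in-memory cache: key -> result
calls = []    # records which invocations actually executed


def cache_key(func, kwargs):
    """Hash the step's bytecode together with its inputs."""
    digest = hashlib.sha256()
    # Bytecode is a rough proxy for the step's logic (constants are not included).
    digest.update(func.__code__.co_code)
    digest.update(json.dumps(kwargs, sort_keys=True, default=repr).encode("utf-8"))
    return digest.hexdigest()


def run_cached(func, **kwargs):
    """Run func only if this exact code + input combination is new."""
    key = cache_key(func, kwargs)
    if key not in _cache:
        _cache[key] = func(**kwargs)
    return _cache[key]


def load_data(batch_size=32):
    calls.append(batch_size)
    return list(range(batch_size))


first = run_cached(load_data, batch_size=4)
second = run_cached(load_data, batch_size=4)  # same hash: step body is skipped
third = run_cached(load_data, batch_size=8)   # different input: step re-runs
```

Unlike a timestamp check, the key only changes when the code or the inputs change, so touching a file without editing it would not trigger a re-run.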

⚡️ Quick Start

This is a complete, multi-step ML pipeline with auto-injected context:

from flowyml import Pipeline, step, context

@step(outputs=["dataset"])
def load_data(batch_size: int = 32):
    return [i for i in range(batch_size)]

@step(inputs=["dataset"], outputs=["model"])
def train_model(dataset, learning_rate: float = 0.01):
    print(f"Training on {len(dataset)} items with lr={learning_rate}")
    return "model_v1"

# Configure and Run
ctx = context(learning_rate=0.05, batch_size=64)
pipeline = Pipeline("quickstart", context=ctx)
pipeline.add_step(load_data).add_step(train_model)

pipeline.run()
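
One way to picture the auto-injected context is name matching: values such as learning_rate and batch_size are handed to whichever step declares a parameter with that name. The sketch below imitates this with `inspect.signature`. The `inject` helper is hypothetical, and FlowyML's actual resolution rules may differ.

```python
import inspect


def inject(func, context, **explicit):
    """Call func, filling parameters from context when the names match."""
    params = inspect.signature(func).parameters
    kwargs = {name: context[name] for name in params if name in context}
    kwargs.update(explicit)  # explicitly passed values (e.g. upstream outputs) win
    return func(**kwargs)


def load_data(batch_size: int = 32):
    return list(range(batch_size))


def train_model(dataset, learning_rate: float = 0.01):
    return f"trained on {len(dataset)} items with lr={learning_rate}"


ctx = {"learning_rate": 0.05, "batch_size": 64}
dataset = inject(load_data, ctx)                      # batch_size comes from ctx
summary = inject(train_model, ctx, dataset=dataset)   # learning_rate comes from ctx
print(summary)  # trained on 64 items with lr=0.05
```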

🌟 Key Features

1. 🧠 Type-Based Artifact Routing (New in 1.8.0)

Define artifact types in code, and FlowyML automatically routes them to your cloud infrastructure.

@step
def train(...) -> Model:
    # Auto-saved to GCS/S3 and registered to Vertex AI / SageMaker
    return Model(obj, name="classifier")
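
The routing idea can be sketched in plain Python: read a step's return annotation and look up a destination for that type. Everything here is an assumption made for illustration. The `Model` and `Dataset` classes are stand-ins rather than FlowyML's own types, and `ROUTES` and `destination_for` are hypothetical names.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class Model:  # stand-in artifact type, not flowyml's Model
    name: str


@dataclass
class Dataset:  # stand-in artifact type
    name: str


# Hypothetical routing table: artifact type -> destination label
ROUTES = {Model: "model-registry", Dataset: "object-storage"}


def destination_for(step_func):
    """Choose a destination from the step's return annotation."""
    return_type = get_type_hints(step_func).get("return")
    return ROUTES.get(return_type, "default-store")


def train() -> Model:
    return Model(name="classifier")


def summarize() -> dict:
    return {}


print(destination_for(train))      # model-registry
print(destination_for(summarize))  # default-store
```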

2. 🌍 Multi-Stack Configuration

Manage local, staging, and production environments in a single flowyml.yaml.

export FLOWYML_STACK=production
python pipeline.py  # Now runs on Vertex AI with GCS storage
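
The schema of flowyml.yaml is not shown here, so the following sketch uses a plain dictionary with hypothetical keys to illustrate how one environment variable can select a stack. Only the `FLOWYML_STACK` variable name comes from the text above; `STACKS` and `active_stack` are illustrative.

```python
import os

# Hypothetical stack definitions; the real flowyml.yaml schema may differ.
STACKS = {
    "local": {"orchestrator": "local", "artifact_store": "./artifacts"},
    "production": {"orchestrator": "vertex_ai", "artifact_store": "gcs"},
}


def active_stack(environ=os.environ, default="local"):
    """Pick the stack named by FLOWYML_STACK, falling back to a default."""
    name = environ.get("FLOWYML_STACK", default)
    if name not in STACKS:
        raise KeyError(f"Unknown stack {name!r}; expected one of {sorted(STACKS)}")
    return name, STACKS[name]


name, config = active_stack({"FLOWYML_STACK": "production"})
print(name, config["orchestrator"])  # production vertex_ai
```

Passing the environment as an argument keeps the lookup easy to test without mutating the real process environment.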

3. 🛡️ Intelligent Step Grouping

Group consecutive steps so they run in the same container, which reduces overhead while keeping step boundaries clear.
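
As a rough illustration of grouping consecutive steps, the sketch below merges adjacent steps that share a group label into one execution unit. The step list, the labels, and `plan_containers` are hypothetical; this is not how FlowyML declares or schedules groups.

```python
from itertools import groupby

# (step name, hypothetical group label); None means "run on its own"
steps = [
    ("load_data", "prep"),
    ("clean_data", "prep"),
    ("train_model", None),
    ("evaluate", "report"),
    ("publish", "report"),
]


def plan_containers(steps):
    """Merge consecutive steps sharing a group label into one execution unit."""
    units = []
    for label, members in groupby(steps, key=lambda s: s[1]):
        names = [name for name, _ in members]
        if label is None:
            units.extend([n] for n in names)  # ungrouped steps stay separate
        else:
            units.append(names)
    return units


units = plan_containers(steps)
print(units)
# [['load_data', 'clean_data'], ['train_model'], ['evaluate', 'publish']]
```

Five steps collapse into three units here, while each step name stays visible inside its unit.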

4. 📊 Built-in Observability

A dark-mode dashboard for monitoring pipelines, visualizing DAGs, and inspecting artifacts in real time.

5. 🎯 Evaluations Framework

A production-grade evaluation system with 29+ scorers covering classification, regression, and GenAI (LLM-as-a-judge), plus adapters for DeepEval, RAGAS, and Phoenix:

from flowyml.evals import evaluate, EvalDataset, get_scorer

data = EvalDataset.create_genai("my_test", examples=[...])
result = evaluate(data=data, scorers=[get_scorer("relevance"), get_scorer("ragas.faithfulness")])
result.notify_if_regression(threshold=0.05)
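
The exact semantics of notify_if_regression are not described here. One plausible reading, sketched below as an assumption, is that a scorer counts as regressed when its score falls more than `threshold` (absolute) below a baseline. `find_regressions` and the score dictionaries are hypothetical.

```python
def find_regressions(baseline, current, threshold=0.05):
    """Return scorers whose score dropped by more than `threshold` (absolute)."""
    return {
        name: round(baseline[name] - score, 4)
        for name, score in current.items()
        if name in baseline and baseline[name] - score > threshold
    }


baseline = {"relevance": 0.90, "faithfulness": 0.80}
current = {"relevance": 0.82, "faithfulness": 0.79}

regressions = find_regressions(baseline, current)
print(regressions)  # {'relevance': 0.08}
```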

6. 🗺️ Map Tasks & Dynamic Workflows

Distribute work over collections with @map_task and generate pipelines at runtime with @dynamic:

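from flowyml import Pipeline, map_task, dynamic

# transform() and train_with_lr() are placeholders for user-defined code.

# Map over a collection of documents with bounded concurrency and retries.
@map_task(concurrency=8, retries=2, min_success_ratio=0.95)
def process_document(doc: dict) -> dict:
    return transform(doc)

# Build a sub-pipeline at runtime, adding one training step per learning rate.
@dynamic(outputs=["best_model"])
def hyperparameter_search(config: dict):
    sub = Pipeline("hp_search")
    for lr in config["learning_rates"]:
        sub.add_step(train_with_lr(lr))
    return sub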
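
To show what the concurrency, retries, and min_success_ratio parameters could mean in practice, here is a standard-library sketch built on `ThreadPoolExecutor`. `run_map` is a hypothetical helper, and the semantics it assigns to each parameter are assumptions rather than FlowyML's documented behavior.

```python
from concurrent.futures import ThreadPoolExecutor


def run_map(func, items, concurrency=8, retries=2, min_success_ratio=0.95):
    """Apply func to every item in parallel, retrying failures per item."""

    def attempt(item):
        last_error = None
        for _ in range(retries + 1):  # first try plus `retries` retries
            try:
                return True, func(item)
            except Exception as exc:
                last_error = exc
        return False, last_error

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        outcomes = list(pool.map(attempt, items))

    successes = [value for ok, value in outcomes if ok]
    ratio = len(successes) / len(items) if items else 1.0
    if ratio < min_success_ratio:
        raise RuntimeError(f"Only {ratio:.0%} of items succeeded")
    return successes


def process_document(doc):
    if doc["id"] < 0:
        raise ValueError("bad document")
    return {**doc, "processed": True}


docs = [{"id": i} for i in range(20)]
results = run_map(process_document, docs, concurrency=4)
print(len(results))  # 20
```

The success-ratio check lets a large batch tolerate a few bad items while still failing loudly when too many go wrong.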

7. 📦 Artifact Catalog with Lineage

Centralized artifact discovery, tagging, and lineage tracking that works both locally and remotely:

from flowyml import ArtifactCatalog

catalog = ArtifactCatalog()  # Auto-selects local SQLite or remote API
catalog.register(name="classifier", artifact_type="Model", parent_ids=[dataset_id])
lineage = catalog.get_lineage(model_id)  # Full parent→child graph
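
Lineage tracking boils down to recording each artifact's parents and walking that graph. The toy class below sketches the idea in memory. `TinyCatalog` and its methods are hypothetical and unrelated to the real ArtifactCatalog implementation.

```python
class TinyCatalog:
    """Toy in-memory catalog; not flowyml's ArtifactCatalog."""

    def __init__(self):
        self.parents = {}  # artifact id -> list of parent ids

    def register(self, artifact_id, parent_ids=()):
        self.parents[artifact_id] = list(parent_ids)
        return artifact_id

    def ancestors(self, artifact_id):
        """Collect every upstream artifact, nearest parents first."""
        seen = []
        queue = list(self.parents.get(artifact_id, []))
        while queue:
            current = queue.pop(0)
            if current not in seen:
                seen.append(current)
                queue.extend(self.parents.get(current, []))
        return seen


catalog = TinyCatalog()
catalog.register("raw_data")
catalog.register("clean_data", parent_ids=["raw_data"])
catalog.register("classifier", parent_ids=["clean_data"])

lineage = catalog.ancestors("classifier")
print(lineage)  # ['clean_data', 'raw_data']
```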

📦 Installation

# Install core
pip install flowyml

# Install with everything (recommended)
pip install "flowyml[all]"

📚 Documentation

Visit the FlowyML Docs for details.


Built with ❤️ by UnicoLab

Download files

Source Distribution: flowyml-1.10.0.tar.gz (4.7 MB)

Built Distribution: flowyml-1.10.0-py3-none-any.whl (4.9 MB, Python 3)

