
FlowMesh: A Multi-tenant Service Fabric for LLM Agentic Workflows


A service fabric for running LLM agentic workflows on distributed GPU workers.

FlowMesh accepts workflow definitions (YAML, JSON, or n8n graph format), parses them into a DAG of tasks, schedules and dispatches each task to a suitable worker, and collects results and artifacts. It supports inference (vLLM, HF transformers, diffusers), training (SFT, LoRA, DPO, PPO), retrieval-augmented generation, agent execution, SSH-style interactive sessions, and arbitrary container jobs.
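The parse-into-a-DAG step can be pictured generically: given tasks and their declared dependencies, build a graph and compute a valid dispatch order. A minimal sketch using the standard library (not FlowMesh's actual parser; the task names and dependency mapping are illustrative):

```python
from graphlib import TopologicalSorter

def dispatch_order(tasks: dict[str, list[str]]) -> list[str]:
    """Return a valid execution order for tasks given their dependencies.

    `tasks` maps each task name to the names it depends on; graphlib
    raises CycleError if the declared dependencies form a cycle.
    """
    return list(TopologicalSorter(tasks).static_order())

# Example: a three-stage workflow (retrieve -> generate -> evaluate).
order = dispatch_order({
    "retrieve": [],
    "generate": ["retrieve"],
    "evaluate": ["generate"],
})
print(order)  # ['retrieve', 'generate', 'evaluate']
```

Tasks whose dependencies have all completed can be dispatched concurrently; a topological order is only the simplest serialisation of that schedule.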

Architecture

Client (CLI / SDK / HTTP)
    │
    ▼  HTTP (default 8000)
┌─────────────────────────────────────────────────┐
│ Server  (FastAPI orchestrator)                  │
│   • workflow parsing, DAG resolution            │
│   • task scheduling and dispatch                │
│   • result and artifact collection              │
│   • REST API + SSE log streaming                │
└──────────────┬──────────────────────────────────┘
               │  Redis pub/sub  (control + telemetry)
               ▼
┌─────────────────────────────────────────────────┐
│ Supervisor  (per-node agent)                    │
│   • registers node, manages worker containers   │
│   • relays tasks/events via gRPC streams        │
└──────────────┬──────────────────────────────────┘
               │  gRPC (default 50051)
               ▼
┌─────────────────────────────────────────────────┐
│ Worker  (executor process)                      │
│   • vllm, transformers, diffusers, training,    │
│     RAG, agent, SSH, echo, data profiling       │
│   • streams logs and events back via gRPC       │
└─────────────────────────────────────────────────┘

Server and Worker are the two top-level processes. The Supervisor is a subsystem under src/server/ that the server spawns as a child process (multiprocessing.Process). Single-node deployments run one supervisor child alongside the server; multi-node deployments run a root server plus one supervisor-only server process per worker node.
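This process layout can be sketched roughly as follows (the function names are hypothetical; the real supervisor entry point lives under src/server/):

```python
import multiprocessing

def supervisor_main(node_id: str) -> None:
    # Hypothetical stand-in for the supervisor entry point: register the
    # node, manage worker containers, relay tasks/events via gRPC streams.
    print(f"supervisor up for node {node_id}")

def start_supervisor(node_id: str) -> multiprocessing.Process:
    """Spawn the supervisor as a child of the server process."""
    proc = multiprocessing.Process(
        target=supervisor_main, args=(node_id,), daemon=True
    )
    proc.start()
    return proc

if __name__ == "__main__":
    # Single-node deployment: one supervisor child alongside the server.
    sup = start_supervisor("node-0")
    sup.join()
```

In a multi-node deployment the same child is spawned by a supervisor-only server process on each worker node instead.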

Quick start

Requires Docker, Docker Compose, and Python 3.12+. For GPU workers, also install the NVIDIA Container Toolkit.

# 1. Install
git clone https://github.com/mlsys-io/FlowMesh.git
cd FlowMesh
pip install uv
uv sync --all-packages --group ci

# 2. Bring up the local stack (Server + Redis + Supervisor)
uv run flowmesh stack up

# 3. Start one CPU worker
uv run flowmesh stack worker up cpu 1

# 4. Submit a workflow
uv run flowmesh workflow submit examples/templates/echo_local.yaml

# 5. Watch it run
uv run flowmesh workflow list
uv run flowmesh workflow watch <workflow_id>

For a GPU worker:

# Pin to specific GPUs (or 'all')
uv run flowmesh stack worker up gpu --targets 0

For inference templates:

uv run flowmesh workflow submit examples/templates/inference_vllm_chat.yaml
uv run flowmesh workflow submit examples/templates/inference_hf_chat.yaml

Tear down:

uv run flowmesh stack worker down all
uv run flowmesh stack down
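Besides the CLI, the server is reachable over plain HTTP on its port (default 8000). A minimal sketch of building a submission request with the standard library; the /workflows path and payload shape here are assumptions for illustration, not the documented API:

```python
import json
import urllib.request

def build_submission(server: str, workflow: dict) -> urllib.request.Request:
    """Build a POST request carrying a workflow definition as JSON."""
    return urllib.request.Request(
        url=f"{server}/workflows",          # hypothetical endpoint path
        data=json.dumps(workflow).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_submission(
    "http://localhost:8000",
    {"apiVersion": "flowmesh/v1", "kind": "InferenceTask"},
)
print(req.method, req.full_url)  # POST http://localhost:8000/workflows
# urllib.request.urlopen(req) would send it against a running stack.
```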

Workflow format

A minimal single-task workflow:

apiVersion: flowmesh/v1
kind: InferenceTask
metadata:
  name: hello-inference
spec:
  taskType: inference
  resources:
    hardware: { gpu: { type: any, count: 1 } }
  model:
    source: { type: huggingface, identifier: TinyLlama/TinyLlama-1.1B-Chat-v1.0 }
    vllm: { gpu_memory_utilization: 0.5 }
  data:
    type: list
    items:
      - - role: user
          content: What is the capital of France?
  inference: { max_tokens: 64, temperature: 0.0 }
  output:
    destination: { type: http }

Multi-stage DAGs, conditional execution, graph-template prompts, task merging, and SSH sessions are all supported. See examples/templates/ for end-to-end examples and AGENTS.md for the full schema reference.
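As a rough illustration only, a multi-stage workflow might chain tasks by name along these lines (the Workflow kind, tasks list, and dependsOn field below are guesses at the shape, not the documented schema; consult examples/templates/ and AGENTS.md for the real format):

```yaml
# Hypothetical shape only; see examples/templates/ for real multi-stage files.
apiVersion: flowmesh/v1
kind: Workflow
metadata:
  name: retrieve-then-generate
spec:
  tasks:
    - name: retrieve
      taskType: rag
    - name: generate
      taskType: inference
      dependsOn: [retrieve]   # assumed dependency field
```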

Extending FlowMesh

FlowMesh exposes plugin hooks for organisations that want to layer additional behaviour on top of the core server: authentication, submission policy, usage tracking, authorisation, supplier attribution, or resource lifecycle management. Install the standalone hook contract with:

pip install "flowmesh[hook]"

A plugin is any Python module that exposes an install() function returning flowmesh_hook.HookBindings. Plugins are loaded by setting FLOWMESH_PLUGINS to a comma-separated list of importable module names. Plugins can ship as in-tree modules, sibling-mounted packages, or pip-installable wheels; the core never references plugin names.
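The loading mechanism can be pictured with a plain importlib sketch (a simplification, not FlowMesh's actual loader; the stand-in module and its return value are fabricated for the demonstration):

```python
import importlib
import os
import sys
import types

def load_plugins(env: str = "FLOWMESH_PLUGINS") -> list:
    """Import each module named in the env var and collect its install() result."""
    bindings = []
    for name in filter(None, os.environ.get(env, "").split(",")):
        module = importlib.import_module(name.strip())
        bindings.append(module.install())  # each plugin exposes install()
    return bindings

# Demonstration with a stand-in module registered in sys.modules.
fake = types.ModuleType("acme_hooks")
fake.install = lambda: "acme bindings"
sys.modules["acme_hooks"] = fake
os.environ["FLOWMESH_PLUGINS"] = "acme_hooks"
print(load_plugins())  # ['acme bindings']
```

Because loading goes through the import system, any importable distribution mechanism (in-tree module, sibling package, or wheel) works without the core naming the plugin.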

See docs/PLUGINS.md for the full plugin contract.

Development

# Install dev tooling
uv sync --all-packages --group ci

# Format / lint / type-check
uv run pre-commit run --all-files

# Tests — skip the multiprocessing GPU-cleanup test because it requires a
# real CUDA device and isolated processes; CI also skips it.
uv run pytest tests/ --ignore=tests/worker/test_mp_executor_cleanup_gpu.py

Detailed contributor docs (project layout, env vars, dispatch internals, executor registry, commit-message conventions) live in AGENTS.md.

Contributing

We welcome bug fixes, new features, documentation improvements, and feedback. Please read CONTRIBUTING.md for the contributor setup, code style, testing, dependency-pin, and DCO sign-off conventions, and AGENTS.md for a deeper architecture and source-layout tour.

Citation

If you use FlowMesh in your research, please cite:

@misc{shen2025flowmesh,
      title={FlowMesh: A Service Fabric for Composable LLM Workflows}, 
      author={Junyi Shen and Noppanat Wadlom and Lingfeng Zhou and Dequan Wang and Xu Miao and Lei Fang and Yao Lu},
      year={2025},
      eprint={2510.26913},
      archivePrefix={arXiv},
      primaryClass={cs.DC},
      url={https://arxiv.org/abs/2510.26913}, 
}

License

Apache License 2.0. See LICENSE.
