model-compose: Declarative AI Workflow Orchestrator
Project description
model-compose
Compose AI Systems, Deploy Anywhere.
Define workflows, agents, models, and comprehensive AI services in YAML. Run them locally, scale them in production, and deploy across any environment without rewriting your stack.
Inspired by docker-compose — one YAML file defines your entire AI system.
Highlights
- Any model, anywhere — run models locally via HuggingFace, vLLM, and llama.cpp, or connect to OpenAI, Anthropic, Google, and more
- 20+ components ready — models, agents, HTTP clients, vector/graph stores, shell commands, and more
- Built-in data stores — Chroma, FAISS, Milvus, Qdrant, Neo4j, ArangoDB, Redis
- Deploy as container — Docker, native containers, or standalone process with one config
- Serve any protocol — HTTP REST, WebSocket, or MCP with one line change
- Distributed execution — scale across machines with Redis-backed queue dispatch
- Instant Web UI — add a Gradio-powered interface with 2 lines of YAML
Installation
pip install model-compose
Or install from source:
git clone https://github.com/hanyeol/model-compose.git
cd model-compose
pip install -e .
Requires: Python 3.10 or higher
Quick Start
Define your AI runtime in a model-compose.yml:
controller:
adapter:
type: http-server
port: 8080
webui:
port: 8081
workflows:
- id: chat
default: true
jobs:
- component: chatgpt
components:
- id: chatgpt
type: http-client
base_url: https://api.openai.com/v1
path: /chat/completions
method: POST
headers:
Authorization: Bearer ${env.OPENAI_API_KEY}
body:
model: gpt-4o
messages:
- role: user
content: ${input.prompt}
Create a .env file:
OPENAI_API_KEY=your-key
Run it:
model-compose up
Your AI runtime is now serving at http://localhost:8080 with Web UI at http://localhost:8081.
Explore examples for more workflows or read the Documentation.
Core Capabilities
Declarative YAML Configuration
Define your entire AI system in a single YAML file. Workflows, agents, models, APIs, vector/graph stores, and runtimes — all composed and deployed together without custom code.
controller:
adapter:
type: http-server
port: 8080
workflows:
- id: chat
default: true
jobs:
- component: chatgpt
components:
- id: chatgpt
type: http-client
base_url: https://api.openai.com/v1
action:
path: /chat/completions
method: POST
Flexible Component System
20+ reusable component types. Mix HTTP clients, local models, vector stores, shell commands, and workflows in any combination. Define once, use everywhere.
components:
- id: chatgpt
type: http-client
- id: local-llm
type: model
- id: assistant
type: agent
- id: knowledge
type: vector-store
- id: cache
type: key-value-store
- id: runner
type: shell
Advanced Workflow Composition
Chain jobs with conditional logic, parallel execution, and data transformation. Pass data between jobs with variable binding — ${input}, ${response}, ${env} — with type conversion and defaults.
workflows:
- id: rag-pipeline
jobs:
- id: embed
component: embedder
input:
text: ${input.query}
- id: search
component: vector-store
action: search
input:
vector: ${jobs.embed.output}
depends_on: [embed]
- id: answer
component: chatgpt
input:
context: ${jobs.search.output}
question: ${input.query}
depends_on: [search]
AI Agent Components
Build autonomous AI agents that use workflows as tools. Agents reason, plan, and execute multi-step tasks by dynamically invoking other workflows — all defined declaratively in YAML.
components:
- id: research-agent
type: agent
tools:
- search-web
- fetch-page
max_iteration_count: 10
action:
model:
component: chatgpt
input:
messages: ${messages}
tools: ${tools}
system_prompt: You are a web research assistant.
user_prompt: ${input.question}
Human-in-the-Loop
Add approval gates and user input steps to any workflow. Workflows pause, prompt for human input via CLI, Web UI, or API, and resume seamlessly.
workflows:
- id: write-with-approval
jobs:
- id: write-file
component: file-writer
input:
path: ${input.path}
content: ${input.content}
interrupt:
before:
message: "Approve file write to ${job.input.path}?"
Local Model Execution
Run models from HuggingFace and other sources locally with native support for transformers, vLLM, and PyTorch. Fine-tune models with LoRA/PEFT through YAML configuration.
components:
- id: local-llm
type: model
task: chat-completion
model: HuggingFaceTB/SmolLM3-3B
action:
messages:
- role: user
content: ${input.prompt}
Universal AI Service Integration
Connect to OpenAI, Anthropic, Google, xAI, ElevenLabs, and any custom HTTP API. Mix and match providers in a single workflow.
components:
- id: claude
type: http-client
base_url: https://api.anthropic.com/v1
action:
path: /messages
method: POST
headers:
x-api-key: ${env.ANTHROPIC_API_KEY}
anthropic-version: "2023-06-01"
body:
model: claude-opus-4-20250514
max_tokens: 1024
messages:
- role: user
content: ${input.prompt}
Real-Time Streaming
Built-in SSE (Server-Sent Events) streaming for real-time AI responses. Stream from any provider or local model with automatic chunking and connection management.
workflows:
- id: chat
jobs:
- component: chatgpt
output: ${output as sse-text}
components:
- id: chatgpt
type: http-client
base_url: https://api.openai.com/v1
action:
path: /chat/completions
method: POST
body:
model: gpt-4o
messages: ${input.messages}
stream: true
stream_format: json
output: ${response[].choices[0].delta.content}
Built-in Data Store Integration
Native integration with Chroma, FAISS, Milvus, Qdrant for vector search. Neo4j and ArangoDB for graph stores. Redis for key-value storage. Build RAG systems with embedding search and semantic retrieval.
components:
- id: knowledge
type: vector-store
driver: chroma
actions:
- id: insert
collection: docs
method: insert
vector: ${input.vector}
metadata:
text: ${input.text}
- id: search
collection: docs
method: search
query: ${input.vector}
Deploy in Any Runtime
Run in native, process, Docker, or native container mode. The same configuration works across all runtimes — switch with one line.
controller:
runtime:
type: docker
image: my-ai-service:latest
ports:
- "8080:8080"
adapter:
type: http-server
port: 8080
Protocol Adapters
Serve over HTTP REST, WebSocket, or MCP (Model Context Protocol) by changing a single line. Includes concurrency control, health checks, and automatic API documentation.
# HTTP REST
controller:
adapter:
type: http-server
port: 8080
# MCP (Model Context Protocol)
controller:
adapter:
type: mcp-server
port: 8080
Distributed Workflow Execution
Scale AI workloads across multiple machines using Redis-backed queue dispatch. Add workers to scale horizontally without shared filesystem or code changes.
controller:
adapter:
type: http-server
port: 8080
queue:
driver: redis
host: localhost
port: 6379
name: my-queue
Webhook and Callback Listeners
HTTP callback listeners for async workflows and HTTP trigger listeners for webhooks. Build reactive AI systems that respond to real-world events.
listener:
type: http-trigger
port: 8091
triggers:
- path: /webhook
method: POST
workflow: handle-message
input:
text: ${body.message.text}
Gateway and Tunnel Support
Expose local services to the internet with ngrok, Cloudflare, or SSH tunnels. Integrate webhooks and deploy public APIs without complex networking.
gateway:
type: http-tunnel
driver: ngrok
port:
- 8090
Instant Web UI
Add a visual interface with 2 lines of YAML. Get a Gradio-powered chat UI or serve custom static frontends for testing and debugging.
controller:
webui:
driver: gradio
port: 8081
Architecture
Protocol adapters → Composition engine → Runtime executors
Contributing
We welcome all contributions! Whether it's fixing bugs, improving docs, or adding examples — every bit helps.
# Setup for development
git clone https://github.com/hanyeol/model-compose.git
cd model-compose
pip install -e .[dev]
License
MIT License © 2025-2026 Hanyeol Cho.
Contact
Have questions, ideas, or feedback? Open an issue or start a discussion on GitHub Discussions.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file model_compose-0.4.64.tar.gz.
File metadata
- Download URL: model_compose-0.4.64.tar.gz
- Upload date:
- Size: 229.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
3784d29f5f649c61cc3ff74ac8fc66af251df87939c20bd11a99874565c331ce
|
|
| MD5 |
fc9bc454b316bb22b853753b2a057303
|
|
| BLAKE2b-256 |
6231d9d0b8bd668f061148f52740157435d9cd15ef1b22b82be46ede1502f0b1
|
File details
Details for the file model_compose-0.4.64-py3-none-any.whl.
File metadata
- Download URL: model_compose-0.4.64-py3-none-any.whl
- Upload date:
- Size: 444.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.11.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
df42d202153e19644442e70555c8eca25aa147f31eba25724d08902043089bbf
|
|
| MD5 |
7aac0c772fd93722387d8a1fcfa65551
|
|
| BLAKE2b-256 |
405fcbc2399da61e873bfd4ee42282b738ed616bc0f40bc9c52630de0d9edc25
|