Dynamic multi-model deliberation gateway
Chimera — Dynamic Multi-Model Deliberation Gateway
One API call. A team of models. One answer.
Chimera takes your prompt and dispatches it to a hand-picked team of LLMs, giving each model a custom subtask scoped to its strengths. An aggregator then merges their outputs by following instructions written by the dispatcher. A single dispatcher model call designs the entire deliberation up front.
Quickstart
# Install
pip install "chimera-deliberation[full]"
# Or for just the API server:
pip install "chimera-deliberation[server]"
# Configure
cp chimera.yaml.example chimera.yaml
# Add your API keys (at minimum: DEEPSEEK_API_KEY)
# Run
chimera "What is the capital of France?" # CLI
chimera serve # REST API + web UI
chimera-mcp # MCP tools for agents
Open http://localhost:8765/web/ for the web UI with live DAG visualization.
Python:
import asyncio
from chimera import Engine, LiteLLMGateway, load_config  # LiteLLMGateway import path assumed

async def main():
    config = load_config()
    engine = Engine(config, LiteLLMGateway(config))
    result = await engine.deliberate("Explain quantum computing.")
    print(result.answer)  # merged output from multiple models

asyncio.run(main())
OpenAI-compatible:
curl -X POST http://localhost:8765/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "auto", "messages": [{"role": "user", "content": "Hello"}]}'
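The same request can be built from Python using only the standard library. This is a minimal sketch, assuming the server started by `chimera serve` listens on `localhost:8765` as in the curl example. The helper names `build_chat_request` and `ask` are illustrative and not part of Chimera, and the response parsing assumes the usual OpenAI layout (`choices[0].message.content`).

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765"  # assumed: same host and port as the curl example


def build_chat_request(prompt: str, model: str = "auto") -> urllib.request.Request:
    """Build the same POST request as the curl example (hypothetical helper)."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the request and return the answer text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("Hello"))` should print the merged answer. Building the request separately from sending it keeps the payload easy to inspect without a live server.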
Architecture
flowchart TB
subgraph Client
A[User Prompt]
end
subgraph Dispatcher["Dispatcher (1 call)"]
B[Designs DAG<br/>Picks models by category<br/>Writes custom prompts<br/>Writes merge instructions]
end
subgraph Workers["Workers (parallel)"]
C1[Worker A<br/>domain-scoped task]
C2[Worker B<br/>domain-scoped task]
C3[Worker C<br/>domain-scoped task]
end
subgraph Aggregation
D[Aggregator<br/>merges with dispatcher instructions]
end
A --> B
B --> C1
B --> C2
B --> C3
C1 --> D
C2 --> D
C3 --> D
D --> E[Final Answer]
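The flow in the diagram can be sketched with `asyncio`. This is not Chimera's implementation: `Plan`, `call_model`, `dispatch`, and `deliberate` are hypothetical names, and `call_model` is a stub that echoes its input instead of calling an LLM. The point is the shape: one dispatcher call produces every worker prompt plus the merge instructions, the workers run concurrently, and the aggregator receives all drafts.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Plan:
    """What the single dispatcher call produces in this sketch."""
    worker_prompts: dict[str, str]  # worker id -> custom prompt
    merge_instructions: str


async def call_model(role: str, prompt: str) -> str:
    """Stand-in for a real LLM call; echoes its input so the flow is visible."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"[{role}] {prompt}"


async def dispatch(user_prompt: str) -> Plan:
    """One dispatcher call designs the whole deliberation (stubbed here)."""
    return Plan(
        worker_prompts={
            "worker_a": f"Outline the key facts for: {user_prompt}",
            "worker_b": f"List counterarguments for: {user_prompt}",
            "worker_c": f"Check the claims in: {user_prompt}",
        },
        merge_instructions="Combine the drafts into one answer.",
    )


async def deliberate(user_prompt: str) -> str:
    plan = await dispatch(user_prompt)
    # Workers run concurrently; gather returns results in argument order.
    drafts = await asyncio.gather(
        *(call_model(wid, p) for wid, p in plan.worker_prompts.items())
    )
    # The aggregator sees every draft plus the dispatcher's merge instructions.
    merged_input = plan.merge_instructions + "\n" + "\n".join(drafts)
    return await call_model("aggregator", merged_input)
```

Running `asyncio.run(deliberate("Is Rust memory safe?"))` returns the aggregator's echo of the merge instructions followed by the three worker drafts.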
Formation Types
flowchart LR
subgraph Simple["Simple (2 workers)"]
S1[W1] --> SA[Aggregator]
S2[W2] --> SA
end
subgraph Debate["Debate (3 workers + merge)"]
D1[W1] --> DA1[Agg 1]
D2[W2] --> DA1
D2 --> DA2[Agg 2]
D3[W3] --> DA2
DA1 --> DM[Merge]
DA2 --> DM
end
subgraph Custom["Custom DAG (client-defined)"]
C1[Researcher] --> C2[Critic]
C2 --> C3[Polisher]
C3 --> C4[Final]
end
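Each formation is a dependency graph, so the stages that can run in parallel fall out of a topological sort. The sketch below encodes the Debate formation from the diagram and groups it into execution waves using the standard-library `graphlib`. The `DEBATE` mapping and `execution_waves` are illustrative names, and how Chimera schedules stages internally is not documented here.

```python
from graphlib import TopologicalSorter

# Debate formation from the diagram: stage -> stages it depends on.
DEBATE = {
    "w1": set(),
    "w2": set(),
    "w3": set(),
    "agg1": {"w1", "w2"},
    "agg2": {"w2", "w3"},
    "merge": {"agg1", "agg2"},
}


def execution_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group stages into waves; every stage in a wave can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the Debate formation this yields three waves: the three workers, then the two intermediate aggregators, then the final merge.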
Flow: Request to Answer
sequenceDiagram
participant Client
participant Engine
participant Dispatcher
participant Workers
participant Aggregator
Client->>Engine: POST /v1/chat/completions
Engine->>Dispatcher: Design formation
Dispatcher->>Dispatcher: Pick models by category weights
Dispatcher->>Dispatcher: Write per-worker prompts + merge instructions
Dispatcher-->>Engine: DAG + prompts + instructions
par Workers (parallel)
Engine->>Workers: Worker A (custom prompt)
Engine->>Workers: Worker B (custom prompt)
Engine->>Workers: Worker C (custom prompt)
Workers-->>Engine: responses
end
Engine->>Aggregator: Merge with dispatcher instructions
Aggregator-->>Engine: Final answer
Engine-->>Client: JSON response + trace
Quick Start
# Install
pipx install "chimera-deliberation[full]"
# Configure
cp chimera.yaml.example chimera.yaml
# Edit chimera.yaml with your API keys
# Run
chimera deliberate "Compare React, Vue, and Svelte for a real-time dashboard"
# Or as an API server
chimera serve
# → http://localhost:8765/v1/chat/completions (OpenAI-compatible)
# → http://localhost:8765/docs (OpenAPI docs)
# Or as an MCP server for Hermes/AI agents
chimera-mcp
Model Selection
The dispatcher picks models using category-weighted scoring:
| Category | What it measures |
|---|---|
| code | Programming, debugging, software engineering |
| analysis | Data analysis, research, evaluation |
| reasoning | Logic, math, complex problem-solving |
| design | Creative work, UX, content generation |
| audit | Fact-checking, safety, correctness review |
Each model in the catalog has a score (0.0–1.0) per category. The dispatcher matches task domains to model strengths. You can override any model choice per request:
{
"model": "auto",
"messages": [{"role": "user", "content": "..."}],
"dispatcher_model": "deepseek/deepseek-v4-flash",
"allowed_models": ["deepseek/deepseek-v4-pro", "z-ai/glm-5.2"],
"stage_models": {"worker_1": "openrouter/anthropic/claude-sonnet-4"}
}
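One way to picture category-weighted selection is a weighted sum over the per-category scores, restricted to `allowed_models` when that field is present. The exact formula Chimera uses is not documented here, so treat the weighted sum as an assumption. The catalog entries (`model-a`, `model-b`, `model-c`), their scores, and `pick_model` are all made up for illustration.

```python
# Hypothetical catalog: model id -> per-category score in [0.0, 1.0].
CATALOG = {
    "model-a": {"code": 0.9, "analysis": 0.6, "reasoning": 0.7, "design": 0.4, "audit": 0.5},
    "model-b": {"code": 0.5, "analysis": 0.8, "reasoning": 0.9, "design": 0.6, "audit": 0.7},
    "model-c": {"code": 0.4, "analysis": 0.5, "reasoning": 0.5, "design": 0.9, "audit": 0.6},
}


def pick_model(task_weights, catalog=CATALOG, allowed_models=None):
    """Return the model with the highest weighted score for a subtask.

    task_weights maps category -> importance for this subtask.
    allowed_models, if given, restricts the candidates (like the request field).
    """
    candidates = {
        name: scores
        for name, scores in catalog.items()
        if allowed_models is None or name in allowed_models
    }
    if not candidates:
        raise ValueError("no candidate models left after filtering")

    def weighted(scores):
        # Assumed scoring rule: sum of weight * score over the task's categories.
        return sum(w * scores.get(cat, 0.0) for cat, w in task_weights.items())

    return max(candidates, key=lambda name: weighted(candidates[name]))
```

For example, `pick_model({"code": 1.0})` selects `model-a`, while adding `allowed_models=["model-b", "model-c"]` narrows the choice to the best of the remaining two.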
Custom DAGs
Send your own formation structure at request time:
{
"model": "custom",
"allow_custom_dag": true,
"dag": {
"stages": [
{"id": "researcher", "kind": "worker", "model": "openrouter/anthropic/claude-sonnet-4"},
{"id": "critic", "kind": "aggregator", "model": "z-ai/glm-5.2", "depends_on": ["researcher"]},
{"id": "polisher", "kind": "worker", "model": "deepseek/deepseek-v4-pro", "depends_on": ["critic"]},
{"id": "final", "kind": "aggregator", "model": "openrouter/anthropic/claude-sonnet-4", "depends_on": ["polisher"]}
],
"edges": [["researcher","critic"], ["critic","polisher"], ["polisher","final"]]
},
"messages": [{"role": "user", "content": "..."}]
}
The dispatcher writes custom prompts for each stage but uses YOUR structure exactly.
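Since the structure is used exactly as sent, it can help to check a `dag` payload on the client before submitting it. The sketch below is a hypothetical client-side check, not part of Chimera. It assumes that `edges`, when present, should mirror the `depends_on` lists (as they do in the example above), and it uses the standard-library `graphlib` to reject cycles.

```python
from graphlib import CycleError, TopologicalSorter


def validate_dag(dag: dict) -> list[str]:
    """Check a custom `dag` payload and return its stages in execution order.

    Raises ValueError on duplicate or unknown stage ids, on edges that
    disagree with `depends_on`, or on cycles.
    """
    ids = [stage["id"] for stage in dag["stages"]]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate stage id")
    deps = {s["id"]: set(s.get("depends_on", [])) for s in dag["stages"]}

    unknown = {d for ds in deps.values() for d in ds} - set(ids)
    if unknown:
        raise ValueError(f"depends_on references unknown stages: {sorted(unknown)}")

    # Assumption: each edge [src, dst] mirrors "dst depends_on src".
    if "edges" in dag:
        from_edges = {(src, dst) for src, dst in dag["edges"]}
        from_deps = {(src, dst) for dst, srcs in deps.items() for src in srcs}
        if from_edges != from_deps:
            raise ValueError("edges and depends_on describe different graphs")

    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        raise ValueError(f"cycle in DAG: {exc.args[1]}") from exc
```

For the researcher, critic, polisher, final chain above, the function returns the four stage ids in that order.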
Interfaces
| Interface | Endpoint | Use |
|---|---|---|
| REST API | POST /v1/chat/completions | OpenAI-compatible drop-in |
| REST API | POST /v1/deliberate | Full control (DAG, overrides, trace) |
| REST API | GET /v1/models | Model catalog with weights |
| REST API | GET /v1/formations | Available formation presets |
| CLI | chimera deliberate | Command-line usage |
| MCP | chimera_deliberate | Hermes / AI agent integration |
Response Trace
Every deliberation returns a full trace:
request_id: a71b3f2c...
formation: auto
source: auto   ← not "fallback": the dispatcher designed it
total_tokens: 12345
total_cost: $0.012
total_duration_ms: 15234
dispatch: V4 Flash (1,234ms, 800+420 tok)
workers:
  worker_rust: Claude Sonnet 4 (4,500ms, 300+1,200 tok)
  worker_go: Kimi K2.7 (8,200ms, 250+800 tok)
aggregator: V4 Flash (2,100ms, 3,000+500 tok)
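The per-stage figures can be rolled up to see where tokens and time go. The sketch below assumes that `800+420 tok` means prompt plus completion tokens, and that workers run in parallel so only the slowest one counts toward wall-clock time. `StageTrace` and `summarize` are hypothetical names, not Chimera's trace API. The headline totals in the sample trace appear to be illustrative, so the computed values are not expected to match them.

```python
from dataclasses import dataclass


@dataclass
class StageTrace:
    name: str
    duration_ms: int
    prompt_tokens: int
    completion_tokens: int

    @property
    def tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


def summarize(dispatch, workers, aggregator):
    """Return (total_tokens, critical_path_ms) for one deliberation.

    Tokens add up across every stage. Wall-clock time counts only the slowest
    worker, because workers run in parallel between dispatch and aggregation.
    """
    stages = [dispatch, *workers, aggregator]
    total_tokens = sum(s.tokens for s in stages)
    slowest_worker = max((w.duration_ms for w in workers), default=0)
    critical_path_ms = dispatch.duration_ms + slowest_worker + aggregator.duration_ms
    return total_tokens, critical_path_ms


# Per-stage figures copied from the sample trace.
dispatch = StageTrace("dispatch", 1234, 800, 420)
workers = [
    StageTrace("worker_rust", 4500, 300, 1200),
    StageTrace("worker_go", 8200, 250, 800),
]
aggregator = StageTrace("aggregator", 2100, 3000, 500)
```

Calling `summarize(dispatch, workers, aggregator)` on these figures gives 7,270 tokens and a critical path of 11,534 ms, with the slower worker (`worker_go`) setting the pace for the parallel stage.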
Providers
Chimera uses LiteLLM under the hood. Supported providers:
| Provider | Direct API | Via OpenRouter | Models |
|---|---|---|---|
| DeepSeek | ✅ | ✅ | v4-flash, v4-pro, r1 |
| Anthropic | — | ✅ | Sonnet 4, Opus 4.7/4.8, Haiku 4.5 |
| OpenAI | — | ✅ | GPT-5.5, GPT-5.1 |
| xAI | — | ✅ | Grok 4.20 |
| Google | — | ✅ | Gemini 3.5 Flash, 3.1 Pro, 2.5 Flash |
| Z.AI | ✅ | ✅ | GLM-5.2 |
| MoonshotAI | — | ✅ | Kimi K2.7 Code, K2.6 |
| MiniMax | — | ✅ | M3 |
| Meta | — | ✅ | Llama 4 Maverick |