agentji
Universal configuration and execution layer for AI agents.
Run any agent skill on any model. One YAML file.
Anthropic's official skills, ClawHub skills — docx, brand-guidelines, data-analysis — work here unchanged, on Qwen, Kimi, MiniMax, or a local Ollama model. Swap the model with one config line. No code changes.
```yaml
agents:
  orchestrator:
    model: moonshot/kimi-k2.5            # change this line to switch providers
    agents: [analyst, reporter]
  analyst:
    model: qwen/MiniMax/MiniMax-M2.7
    skills: [sql-query, data-analysis]
  reporter:
    model: qwen/glm-5
    skills: [docx-template]
    builtins: [bash, write_file]
max_iterations: 20
```
Orchestrated by Kimi K2.5 · Analysed by MiniMax M2.7 · Reported by GLM-5 · Zero Claude.
```bash
pip install agentji
```
Quickstart
Three paths. Pick the one that fits.
Path A — free, offline, no API keys. Uses a local Ollama model; you get a working weather agent in a browser UI.
```bash
pip install "agentji[serve]" mcp-weather-server
ollama pull qwen3:4b
cd examples/weather-reporter
agentji serve --studio
```
Open http://localhost:8000 → ask: "Weather in Seoul, Tokyo, London?"
Path B — cloud models, multi-agent pipeline. Three providers, one pipeline; you get a Word document with a full market analysis.
```bash
pip install "agentji[serve]" python-docx matplotlib
export MOONSHOT_API_KEY=your_key
export DASHSCOPE_API_KEY=your_key
cd examples/data-analyst && python data/download_chinook.py
agentji serve --studio
```
Open http://localhost:8000 → ask: "Which markets should we prioritise for growth? Full report."
→ output/growth_strategy.docx is written to disk when the run completes.
Path C — CLI, no server. No browser, no daemon; pipe the output into a script or run it headless.
```bash
agentji run --config examples/data-analyst/agentji.yaml \
            --agent orchestrator \
            --prompt "Which genres are high-margin but low-volume?"
```
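If the final answer is printed to stdout — an assumption here, not documented above — the run composes with ordinary shell redirection:

```bash
# Illustrative only — assumes the agent's final answer goes to stdout
agentji run --config examples/data-analyst/agentji.yaml \
            --agent orchestrator \
            --prompt "Which genres are high-margin but low-volume?" > margins.md
```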
Skills
A skill is a directory with a SKILL.md. Skills from any registry work without modification:
| Skill | Source | Type |
|---|---|---|
| `sql-query` | Bundled (agentji) | Tool skill |
| `data-analysis` | ClawHub — ivangdavila | Prompt skill |
| Any Claude Code skill | Anthropic official | Prompt skill |
Claude Code's Anthropic-format skills work here unchanged. The model is a config line.
Two skill types
Prompt skills — the SKILL.md body is injected into the agent's system prompt. Anthropic's official skills (brand-guidelines, docx, data-analysis) are all prompt skills. They work on any model because they're instructions, not code.
Tool skills — a skill.yaml sidecar alongside SKILL.md adds the tool config: script path, parameters, timeout. SKILL.md stays in pure Anthropic format; skill.yaml is the agentji extension.
```
skills/sql-query/
├── SKILL.md     ← pure Anthropic format: name + description + body
├── skill.yaml   ← agentji tool config: scripts.execute + parameters
└── scripts/
    └── run_query.py
```
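For orientation, a sketch of what the sidecar might contain. This README documents only scripts.execute, parameters, and a timeout — every other key and spelling below is an assumption, not the real schema:

```yaml
# skill.yaml — hypothetical sketch; only scripts.execute, parameters,
# and timeout are named in this README, the rest is assumed
scripts:
  execute: scripts/run_query.py   # script path the tool call runs
parameters:                       # surfaced to the model as tool arguments
  query:
    type: string
    description: SQL statement to run against the configured database
timeout: 60                       # seconds before the tool call is aborted
```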
Skill converter
If a skill has callable scripts but no skill.yaml, agentji detects it and offers to auto-generate one using the active agent's model. No separate setup.
Multi-agent orchestration
Set agents: on any agent to make it an orchestrator. agentji injects a call_agent(agent, prompt) tool whose enum constraint limits delegation to declared sub-agents — no hallucinated agent names.
```yaml
agents:
  orchestrator:
    model: moonshot/kimi-k2.5
    agents: [analyst, reporter]   # call_agent tool added automatically
  analyst:
    model: qwen/MiniMax/MiniMax-M2.7
    skills: [sql-query, data-analysis]
  reporter:
    model: qwen/glm-5
    skills: [docx-template]
    builtins: [bash, write_file]
```
Sub-agent calls appear in the same log file — the entire pipeline in one JSONL, linked by a shared pipeline_id.
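As an illustration, two adjacent lines of that shared log might look like this. Only pipeline_id is documented above; the other field names are assumed for the sketch:

```json
{"pipeline_id": "p-7f3a", "agent": "orchestrator", "event": "tool_call", "tool": "call_agent", "args": {"agent": "analyst", "prompt": "Revenue by market"}}
{"pipeline_id": "p-7f3a", "agent": "analyst", "event": "tool_call", "tool": "sql-query", "args": {"query": "SELECT BillingCountry, SUM(Total) FROM Invoice GROUP BY 1"}}
```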
Model parameters
Pass any litellm-compatible parameter directly to the model — per agent, no code changes:
```yaml
agents:
  writer:
    model: qwen/qwen-max
    model_params:
      temperature: 0.7        # creativity — 0.0 = deterministic, 1.0 = expressive
      top_p: 0.9
      max_tokens: 4000
      seed: 42                # reproducibility (where supported by the model)
  analyst:
    model: openai/gpt-4o
    model_params:
      temperature: 0.0        # fully deterministic for numbers and SQL
      presence_penalty: 0.1
```
Parameters the target model doesn't support are dropped instead of raising errors (drop_params=True), and agentji logs a warning listing what was dropped, so you can verify intent without config failures. This means you can freely set seed, top_k, or any provider-specific param — if the model doesn't support it, it's ignored.
Multimodal I/O
Declare what each agent accepts and produces:
```yaml
agents:
  vision-analyst:
    model: qwen/qwen-vl-max
    accepted_inputs: [text, image]   # agent accepts images alongside text
    output_format: text
  image-generator:
    model: qwen/wanx2.1-t2i-plus
    accepted_inputs: [text]
    output_format: image             # final response is a path to the generated image
```
Sending images as input
Via Studio — click 📎 to attach images before sending. Files are uploaded to .agentji/uploads/, shown as thumbnail chips, and sent as base64 image content. When an agent returns an image path, Studio renders it inline.
Via API — upload first, then include the returned path in the message:
```bash
# 1. Upload the file
curl -X POST http://localhost:8000/v1/files/upload \
  -F "file=@photo.png"
# → {"path": ".agentji/uploads/a1b2c3d4.png", "filename": "a1b2c3d4.png"}

# 2. Send as a multimodal message
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What colour is dominant in this image?"},
        {"type": "image_url", "image_url": {"url": "http://localhost:8000/v1/media/.agentji/uploads/a1b2c3d4.png"}}
      ]
    }]
  }'
```
Passing images between agents
Orchestrators pass images to sub-agents explicitly via attachments. The LLM decides when to include them; agentji base64-encodes the files and injects them as image content blocks into the sub-agent's first message:
```yaml
# The orchestrator LLM will emit:
#   call_agent(agent="vision", prompt="Describe this", attachments=["/path/to/img.png"])
agents:
  orchestrator:
    model: qwen/qwen-max
    agents: [vision, image-gen]
  vision:
    model: qwen/qwen-vl-max
    accepted_inputs: [text, image]   # receives the image from the orchestrator
    output_format: text
  image-gen:
    model: qwen/wanx2.1-t2i-plus
    accepted_inputs: [text]
    output_format: image             # returns a local path; Studio renders it inline
```
Paths in attachments should be in the run scratch directory or .agentji/uploads/. The choice is explicit — the orchestrator decides which files each sub-agent receives.
MCP servers
Declare an MCP server in YAML; agentji connects via FastMCP and exposes its tools to the agent automatically.
```yaml
mcps:
  - name: weather
    command: python
    args: [-m, mcp_weather_server]   # launched as a subprocess, stdio transport
agents:
  weather-reporter:
    model: ollama/qwen3:4b
    mcps: [weather]                  # tools discovered at runtime
```
agentji serve
```bash
pip install "agentji[serve]"

# API only (default) — suitable for production, CI, headless deployments
agentji serve --config agentji.yaml --port 8000

# API + Studio browser UI
agentji serve --config agentji.yaml --port 8000 --studio
```
| Endpoint | Description |
|---|---|
| `POST /v1/chat/completions` | OpenAI-compatible, streaming, returns `X-Agentji-Run-Id` header |
| `GET /v1/events/{run_id}` | SSE stream of all agent events (tool calls, sub-agent delegations) |
| `GET /v1/pipeline` | Pipeline topology JSON (includes `accepted_inputs`, `output_format` per agent) |
| `POST /v1/files/upload` | Upload a file; returns `{"path": "...", "filename": "..."}` for use in messages |
| `GET /v1/media/{path}` | Serve a local file inline — used by Studio for image/audio/video rendering |
| `GET /v1/files/{path}` | Download a file produced by the agent (attachment disposition) |
| `POST /v1/sessions/{id}/end` | End a session and trigger skill improvement extraction |
| `GET /` | agentji Studio (only when the `--studio` flag is set) |
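Chaining two of these from the shell — a minimal sketch that starts a run, captures the documented X-Agentji-Run-Id header, and tails the event stream (the grep/awk plumbing is incidental):

```bash
# Start a run and capture the X-Agentji-Run-Id response header
RUN_ID=$(curl -s -i -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Weather in Seoul?"}]}' \
  | grep -i '^x-agentji-run-id:' | awk '{print $2}' | tr -d '\r')

# Follow the run's tool calls and sub-agent delegations live (SSE)
curl -N "http://localhost:8000/v1/events/$RUN_ID"
```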
Behind a reverse proxy
If your infrastructure routes traffic through a non-stripping reverse proxy (e.g. RUN:AI, Kubernetes ingress, or any proxy that forwards the full path including the prefix), use --root-path:
```bash
agentji serve --studio --root-path /tenant/job123
```
This makes every endpoint reachable under the prefix:
```
GET  /tenant/job123/                      → Studio UI
POST /tenant/job123/v1/chat/completions
GET  /tenant/job123/v1/events/{run_id}
GET  /tenant/job123/v1/pipeline
...
```
With --root-path set, requests without the prefix return 404 — the server is only reachable through the declared mount point. Leave --root-path empty (the default) for localhost and stripping-proxy deployments; behaviour is then identical to running without the flag.
Example: Kubernetes ingress with path forwarding
```yaml
# ingress.yaml — nginx does NOT strip the prefix:
# leave nginx.ingress.kubernetes.io/rewrite-target unset (no rewrite)
spec:
  rules:
    - http:
        paths:
          - path: /myteam/jobid
            pathType: Prefix
            backend:
              service: { name: agentji, port: { number: 8000 } }
```
```bash
agentji serve --studio --root-path /myteam/jobid
```
Sessions
Pass X-Agentji-Session-Id to track a conversation across turns. Control history per request:
```json
{ "messages": [...], "stateful": true, "improve": true }
```
Or configure defaults in YAML:
```yaml
studio:
  stateful: true    # carry conversation history across turns
  max_turns: 20
```
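A concrete sketch of the per-request form, using the documented header and flag (prompts are placeholders):

```bash
SESSION=$(uuidgen)

# Turn 1 — opens the session
curl -s -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Agentji-Session-Id: $SESSION" \
  -d '{"messages": [{"role": "user", "content": "List our top three markets."}], "stateful": true}'

# Turn 2 — same header, so the agent sees turn 1 in its history
curl -s -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Agentji-Session-Id: $SESSION" \
  -d '{"messages": [{"role": "user", "content": "Drill into the first one."}], "stateful": true}'
```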
agentji Studio
```
┌──────────────┬────────────────────────┬─────────────┐
│ agent graph  │ chat + thinking cards  │ live log    │
│ skill badges │ streaming response     │ SSE events  │
│ status dots  │ file download links    │ stats bar   │
└──────────────┴────────────────────────┴─────────────┘
```
Custom Studio UI
Replace the built-in Studio with your own single-file HTML app:
```yaml
studio:
  custom_ui: ./my-ui/dist/index.html   # served at GET / instead of the built-in Studio
```
The path is relative to the directory where agentji serve is launched. The entire /v1/ API remains unchanged — your UI talks to the same endpoints.
API your UI can use:
| Endpoint | Description |
|---|---|
| `POST /v1/chat/completions` | Send a message (text or multimodal); streaming or JSON |
| `GET /v1/events/{run_id}` | SSE stream of live agent events for a run |
| `GET /v1/pipeline` | Pipeline topology — agents, skills, `accepted_inputs`, `output_format` |
| `POST /v1/files/upload` | Upload a file; returns a path for use in multimodal messages |
| `GET /v1/media/{path}` | Serve a file inline (image/audio/video rendering) |
| `GET /v1/files/{path}` | Download a file produced by the agent |
| `POST /v1/sessions/{id}/end` | End a session, trigger improvement extraction |
Minimal vanilla HTML example:
```html
<!DOCTYPE html>
<html>
<body>
  <input id="msg" placeholder="Ask something…" style="width:400px" />
  <button onclick="send()">Send</button>
  <pre id="out"></pre>
  <script>
    const SESSION = crypto.randomUUID();
    async function send() {
      const msg = document.getElementById('msg').value;
      const out = document.getElementById('out');
      out.textContent = '';
      const res = await fetch('/v1/chat/completions', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'X-Agentji-Session-Id': SESSION,
        },
        body: JSON.stringify({
          messages: [{ role: 'user', content: msg }],
          stream: true,
          stateful: true,
        }),
      });
      const reader = res.body.getReader();
      const dec = new TextDecoder();
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        // Each chunk is a Server-Sent Event line: "data: <token>\n\n"
        const lines = dec.decode(value).split('\n');
        for (const line of lines) {
          if (line.startsWith('data: ')) out.textContent += line.slice(6);
        }
      }
    }
  </script>
</body>
</html>
```
Using a framework (React, Vue, Svelte):
Build to a single inlined HTML file using vite-plugin-singlefile:
```bash
npm install -D vite-plugin-singlefile
# add to vite.config.ts: plugins: [viteSingleFile()]
vite build
# output: dist/index.html — a self-contained file, no separate JS/CSS assets
```
Tips:
- Read `X-Agentji-Run-Id` from each response header to subscribe to `GET /v1/events/{run_id}` for live tool-call visibility
- `GET /v1/pipeline` returns the full agent graph — useful for building a sidebar or status display
- Use `X-Agentji-Session-Id` on every request and `stateful: true` to maintain conversation history

For reference, the built-in Studio implements the following on top of the same endpoints:
- Parallel tool calls grouped with a left border
- `context_write` / `context_read` events in amber — file handoffs between agents
- Orchestrator step tracker — live phase list with pending → running → done status
- Iteration limit banner with Continue button — never lose work at `max_iterations`
- ■ Stop button — cancel a run at the next iteration boundary
- 📎 File upload — attach images before sending; thumbnails shown as chips; images included as multimodal content
- Inline media rendering — image paths (`.png`, `.jpg`, `.gif`, `.webp`) render as embedded images inline in chat; agent responses returning image file paths auto-render
- File download links — `.docx`, `.csv`, `.md` paths become clickable
- Stateful toggle — switch between stateful and stateless sessions in the header
- Skill improvement checkbox — opt individual sessions in/out of improvement extraction
Skill improvement
At session end, agentji uses the configured model to review the conversation and extract three types of learning signals — corrections, affirmations, and hints — then appends them to each skill's improvements.jsonl:
```yaml
improvement:
  enabled: true
  model: null    # null = inherit the default agent model
  skills: []     # empty = all loaded skills
```
Signal types written to skills/sql-query/improvements.jsonl:
```json
{"type": "correction", "skill": "sql-query", "learning": "Use InvoiceLine.UnitPrice * Quantity for revenue, not Invoice.Total.", "context": "User corrected a query that used Invoice.Total which includes tax adjustments."}
{"type": "hint", "skill": "sql-query", "learning": "The Chinook database covers 2009–2013 only; scope date filters to this range.", "context": "User noted this mid-conversation."}
```
Session end is triggered automatically on tab close, via POST /v1/sessions/{id}/end, or after 30 seconds of inactivity. The Studio checkbox lets users opt sessions in/out individually.
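From a script, the explicit trigger is a single call to the documented endpoint:

```bash
# $SESSION is the id you sent as X-Agentji-Session-Id during the conversation
curl -X POST "http://localhost:8000/v1/sessions/$SESSION/end"
```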
Built-in tools
| Builtin | What it does |
|---|---|
| `bash` | Execute shell commands |
| `read_file` | Read a file from disk |
| `write_file` | Write a file to disk |
These replicate the native tools Claude Code provides, enabling prompt skills that rely on file I/O to run on any model.
Provider support
| Provider | Model string | Notes |
|---|---|---|
| Qwen (DashScope) | `qwen/qwen-max` | |
| MiniMax (DashScope) | `qwen/MiniMax/MiniMax-M2.7` | Via DashScope routing |
| GLM (DashScope) | `qwen/glm-5` | |
| Kimi (Moonshot) | `moonshot/kimi-k2.5` | `fallback_base_url` for China/global auto-detect |
| Anthropic | `anthropic/claude-haiku-4-5` | No `base_url` needed |
| OpenAI | `openai/gpt-4o` | |
| Google Vertex AI | `vertex_ai/gemini-1.5-pro` | Service-account JSON auth via `vertex_credentials_file` |
| Ollama (local) | `ollama/qwen3:4b` | Free, runs offline, no API key |
| Any litellm provider | — | full list → |
Dual-endpoint auto-detection — set fallback_base_url for providers with regional endpoints (e.g. Moonshot global vs China). agentji probes both on first use and caches the result.
```yaml
providers:
  moonshot:
    api_key: ${MOONSHOT_API_KEY}
    base_url: https://api.moonshot.ai/v1
    fallback_base_url: https://api.moonshot.cn/v1   # auto-probed on first use
```
Google Cloud / Vertex AI — authenticate with a service-account JSON file instead of an API key:
```yaml
providers:
  vertex_ai:
    vertex_credentials_file: ./vertex_sa.json   # path to GCP service-account JSON
    # api_key can be omitted for service-account auth
agents:
  gemini:
    model: vertex_ai/gemini-1.5-pro
    system_prompt: "You are helpful."
```
The JSON file is read at runtime and passed to litellm as vertex_credentials. Use ${VERTEX_SA_JSON_PATH} to interpolate the path from an environment variable.
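For example, interpolating the path exactly as described (the env var name is the one given above):

```yaml
providers:
  vertex_ai:
    vertex_credentials_file: ${VERTEX_SA_JSON_PATH}   # resolved from the environment at load time
```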
Roadmap
Shipped
- Skill translation (SKILL.md → OpenAI tool schema)
- `skill.yaml` sidecar (tool config separate from Anthropic format)
- Skill converter (auto-generate `skill.yaml` from scripts via LLM)
- Prompt skills (Anthropic format, body injected into system prompt)
- Multi-provider routing via litellm
- Agentic loop via LangGraph
- MCP server integration via FastMCP
- Built-in tools (`bash`, `read_file`, `write_file`)
- Multi-agent orchestration (`call_agent` with enum constraint)
- Per-run RunContext (file-based context handoff between agents)
- Conversation logging (JSONL, `pipeline_id`, `session_id`, daily rotation)
- Provider endpoint auto-detection + caching
- `agentji serve` (OpenAI-compatible HTTP endpoint)
- agentji Studio (chat UI, pipeline tree, event log)
- Studio flag (`--studio`; API-only by default)
- Stateful / stateless session toggle (per-config and per-request)
- Skill improvement extraction (post-session, per-skill `improvements.jsonl`)
- Consecutive error intervention (stuck detection)
- Iteration limit banner with Continue / Stop
- Per-agent tool timeout (`tool_timeout` in agentji.yaml)
- Run cancellation (`POST /v1/cancel/{run_id}`)
- Parallel sub-agent dispatch (concurrent `call_agent` fan-out)
- In-session sliding window compression (token-based, auto/aggressive presets)
- Long-term memory — LTM injection + fact extraction across runs
- Custom Studio UI (single-file HTML override via `studio.custom_ui`)
- Google Cloud Vertex AI service-account JSON authentication (`vertex_credentials_file`)
- Agent `output_format` declaration (text / image / audio / video)
- Studio inline media rendering — images, audio, video embedded directly in chat
- Flexible `model_params` — per-agent litellm params (temperature, top_p, seed, …); unsupported params dropped with a warning
- Multimodal I/O — `accepted_inputs` per agent; vision input via Studio upload or API; `call_agent` attachments for explicit image handoff between agents; `/v1/files/upload` + `/v1/media/` endpoints
Coming
- Skill improvement injection (auto-apply corrections to future system prompts)
- Persistent memory (mem0 / Zep)
- Plugin system for community skill registries
Why agentji
Built for developers working across the global AI ecosystem — for teams where Qwen, Kimi, and local models are first-class requirements, not afterthoughts. If you're locked to one provider because your skills won't port, agentji is the unlock.
机 (jī) — machine, engine. The runtime. 集 (jí) — assemble. Skills, models, tools, agents. 极 (jí) — ultimate. Any skill on any model.
Contributing
Issues and PRs welcome. Adding a skill or a provider integration is the best first PR.
```bash
pytest                  # unit tests
pytest -m integration   # requires API keys in .env
pytest -m local         # requires Ollama running locally
```
License
MIT