# Chimera

Build, compose, and deploy coding agents from modular primitives.
AI that reads, writes, edits, and iterates on code with tests. Chimera is a Python library for building these tools yourself, plus a ready-to-run coding agent on top of it.
Status: Alpha — 7720 passing tests, 62 skipped (live integration tests excluded; v0.7.0 baseline measured 2026-05-09). Reproducible benchmarks with GLM-5.1: HumanEval 66.5% pass@1 (109/164), SWE-bench Lite 10% (2/20, top-20 smallest patches). Raw results in data/.
## Who This Is For
You build with CLI coding agents. You use terminal-native AI tools daily and you know what it feels like when an agent reads your codebase, edits files, and runs tests from your shell. You want to build your own — with your model, your tools, your rules — or take apart how these agents work to understand why they behave differently.
You're curious about coding agents. You've seen demos of AI writing entire apps. You want to understand what's actually happening — what the pieces are, how the loop works, why some agents are better at certain tasks. Chimera breaks it all down into parts you can inspect, modify, and run yourself.
## What It Does
A coding agent is an LLM connected to your filesystem. It reads code, decides what to change, edits files, runs tests, and repeats until the task is done.
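That read-edit-test loop can be sketched in a few lines. The following is a toy illustration with a scripted stand-in for the model and fake tools, not Chimera's actual AgentLoop; every name in it is invented for the example.

```python
# Toy agent loop: the "LLM" is a scripted function that picks the next
# action; a real loop would call a model and parse its tool request.
def scripted_llm(history):
    # Pretend the model reads the last observation and decides what to do.
    if not history:
        return ("read", "auth.py")
    if history[-1] == ("observe", "tests failed"):
        return ("edit", "auth.py")
    if history[-1] == ("observe", "tests passed"):
        return ("done", None)
    return ("run_tests", None)

def run_tool(action, arg, state):
    # Fake tools: the test run fails until an edit has landed.
    if action == "read":
        return "file contents"
    if action == "edit":
        state["patched"] = True
        return "edited"
    if action == "run_tests":
        return "tests passed" if state.get("patched") else "tests failed"

def agent_loop(max_steps=10):
    history, state = [], {}
    for _ in range(max_steps):          # bounded, like a --max-steps budget
        action, arg = scripted_llm(history)
        if action == "done":
            return history
        result = run_tool(action, arg, state)
        history.append(("observe", result))
    return history

trace = agent_loop()
print(trace[-1])  # the loop stops once the fake tests pass
```

The step budget is the same idea as the `--max-steps` flags the CLIs below expose: the loop always terminates, done or not.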
Chimera gives you two things:
- A coding-agent harness with a plugin system — codebase search, auto-testing, code review, and context management, exposed as hooks, MCP servers, and skills you can wire into any compatible host.
- A Python library for building your own coding agents from modular pieces — pick your LLM, pick your tools, pick your strategy, wire them together.
## Install
Latest release: v0.7.0 (release notes).
Not yet on PyPI. Install from source:
```bash
pip install "git+https://github.com/0bserver07/chimera.git@v0.7.0#egg=chimera-run[anthropic]"  # GLM-5 / Anthropic-compatible
pip install "git+https://github.com/0bserver07/chimera.git@v0.7.0#egg=chimera-run[openai]"     # GPT
pip install "git+https://github.com/0bserver07/chimera.git@v0.7.0#egg=chimera-run[all]"        # anthropic + openai + browser + remote
```
Requires Python 3.11+. A chimera-run PyPI release is planned post-alpha.
## Build Your Own Coding Agent
```python
import asyncio

from chimera.assembly.coding_agent import CodingAgent

# One line — full-featured coding agent with 24 tools.
# Requires ANTHROPIC_BASE_URL=https://api.z.ai/api/anthropic and ANTHROPIC_AUTH_TOKEN in env.
agent = CodingAgent(model="glm-5")

# Run a task
async def main():
    async for event in agent.run("Fix the bug in auth.py"):
        print(event.type.value, getattr(event.data, "content", "")[:100])

asyncio.run(main())
```
### Presets

| Preset | Tools | Features |
|---|---|---|
| `coding_agent` | 24 (bash, read, write, edit, search, git, test, agent, skill, ...) | Permissions, hooks, transcripts, compaction, streaming |
| `codex` | 24 | Permissions, transcripts (no hooks) |
| `minimal` | 4 (bash, read, write, edit) | No extras |
| `explore` | 3 (read, search, list) | Read-only |
```python
import os

# Codex-style agent
agent = CodingAgent.from_preset("codex", model="gpt-4o")

# Minimal agent for simple tasks
agent = CodingAgent.from_preset("minimal", model="claude-haiku-3.5")

# Custom API endpoint (any Anthropic-compatible API)
os.environ["ANTHROPIC_BASE_URL"] = "https://your-api.com/v1"
os.environ["ANTHROPIC_AUTH_TOKEN"] = "your-key"
agent = CodingAgent(model="your-model")
```
## Architecture

Chimera is modular — every component is replaceable:

```
CodingAgent
├── Provider        (Anthropic, OpenAI, Google, Ollama, or any compatible API)
├── Tools           (20+ built-in, plus custom tools, MCP servers, skills)
├── AgentLoop       (async generator with streaming, error recovery, abort)
├── Permissions     (multi-source rules, 6 modes, interactive prompts)
├── Hooks           (27 lifecycle events, shell/LLM/function hooks)
├── Commands        (slash commands, skills from .chimera/skills/)
├── Sub-Agents      (3-tier context isolation, background tasks)
├── State           (content replacement, file cache, session transcripts)
└── Infrastructure  (feature flags, analytics, memory, compaction)
```
See Architecture for the full module map.
## Run It Standalone
The Mink CLI ships a fully assembled coding agent with no extra setup:
```bash
chimera mink                           # interactive REPL on Ollama Kimi K2.6 by default
chimera mink -p "summarize this repo"  # one-shot, prints to stdout
chimera mink runs list                 # inspect every persisted run
chimera mink agents list               # show available agent presets
chimera code                           # legacy stack with slash commands and session save
```
`chimera mink` is the v0.3.0 coding REPL: streaming tool calls, hooks, permissions from `.claude/settings.json`, MCP, subagents, and a rich TUI on a TTY (auto-disabled when piping; force off with `--no-color`).

See the Mink quickstart for the walking skeleton, env vars, and the runs/agents subcommand surface, and docs/mink/providers.md for the full provider matrix (Ollama, Anthropic, OpenAI, Google, OpenAI-compat).
Hooks run automatically on every edit:
- Path validation — blocks edits to files that don't exist (no more hallucinated paths)
- Auto-test — finds and runs related tests after every file change
- Auto-lint — runs your linter after every edit
- Security scan — blocks dangerous bash commands
- Verify done — runs the full test suite before the agent can declare "done"
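The blocking hooks above follow a familiar convention: the hook receives a JSON description of the pending tool call, and a nonzero result blocks it. A minimal sketch of the path-validation idea, assuming a hypothetical payload shape (`tool_name`, `tool_input.file_path`); this is not Chimera's actual hook API:

```python
import os
import sys

def path_validation_hook(payload: dict) -> int:
    """Return 0 to allow the tool call, 2 to block it (shell-hook style:
    nonzero exit blocks). Blocks edits to paths that don't exist."""
    path = payload.get("tool_input", {}).get("file_path", "")
    if payload.get("tool_name") == "edit" and not os.path.exists(path):
        print(f"blocked: {path} does not exist", file=sys.stderr)
        return 2
    return 0

# Simulated invocations; a real hook script would read its payload
# from stdin with json.load(sys.stdin) and sys.exit() the result.
print(path_validation_hook({"tool_name": "edit",
                            "tool_input": {"file_path": "/no/such/file.py"}}))
print(path_validation_hook({"tool_name": "read",
                            "tool_input": {"file_path": "/no/such/file.py"}}))
```

Reads pass through untouched; only edits to nonexistent paths trip the guard, which is what makes this kind of hook cheap to run on every tool call.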
MCP servers give the agent new tools to call:
- `chimera-search` — semantic codebase search + symbol lookup
- `chimera-review` — multi-perspective code review (logic, security, tests, architecture, and 4 more)
- `chimera-testgen` — generate test skeletons from source analysis
- `chimera-migration` — scan for and apply code migrations (Python 2 to 3, CJS to ESM)
The plugin honors a settings.json schema for ecosystem interop, so the same hooks/MCP/skills also drop into any host that follows that convention.
Setup guide — install in 2 minutes.
Discoverability note: each of the 7 coding-agent CLIs has a purpose alias for tab-friendly invocation: `chimera tui` ≡ `chimera mink`, `chimera multi` ≡ `chimera otter`, `chimera sandbox` ≡ `chimera ferret`, `chimera mini` ≡ `chimera weasel`, `chimera tiny` ≡ `chimera shrew`, `chimera shell` ≡ `chimera stoat`, `chimera strict` ≡ `chimera badger`. Run `chimera agents` to list all seven with one-liner pitches and the upstream tool that inspired each. See docs/inspirations.md.
## Otter — server-first coding agent
chimera otter is the second coding-agent CLI. Where chimera mink mirrors a TUI-first agent, otter mirrors a server-first / multi-client open-source coding agent: a single ReAct loop you can drive from a one-shot CLI, an interactive REPL, an HTTP server with SSE streaming, or an ACP JSON-RPC transport — all backed by the same LoopConfig, tool registry, and event-sourced session store the rest of Chimera uses.
Quick install + first run with glm-5.1:cloud (via Ollama's Anthropic-compatible bridge):
```bash
uv sync --extra dev --extra anthropic
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
chimera otter --model glm-5.1:cloud -p "summarize this repo"  # one-shot
chimera otter --model glm-5.1:cloud                           # interactive REPL
chimera otter --model glm-5.1:cloud serve --port 5173         # HTTP + SSE
chimera otter --model glm-5.1:cloud serve --acp               # ACP over stdio
```
Three transports, one loop:

- REPL — streaming text + tool calls, mid-turn steering, Ctrl-C cancel, 26-entry slash-command palette (`/help`, `/share`, `/agent`, `/model`, `/sessions`, `/compact`, …).
- HTTP + SSE — `/sessions`, `/sessions/{id}`, `/sessions/{id}/turns`, `/sessions/{id}/events` (Server-Sent Events). Optional `OTTER_SERVER_TOKEN` Bearer auth.
- ACP — JSON-RPC over stdio for IDE clients that already speak the Agent Client Protocol.
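The events endpoint streams standard Server-Sent Events frames, which any client can consume with a few lines of parsing. A minimal SSE parser sketch; the `tool_call`/`text` event names in the sample are illustrative assumptions, not otter's documented event vocabulary:

```python
def parse_sse(stream_lines):
    """Minimal Server-Sent Events parser: yields (event, data) pairs.
    A blank line dispatches the pending event, per the SSE wire format."""
    event, data = "message", []
    for line in stream_lines:
        if line == "":                        # blank line terminates an event
            if data:
                yield event, "\n".join(data)
            event, data = "message", []       # reset to the default event type
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())

sample = [
    "event: tool_call",
    'data: {"name": "read"}',
    "",
    "event: text",
    "data: done",
    "",
]
print(list(parse_sse(sample)))
# → [('tool_call', '{"name": "read"}'), ('text', 'done')]
```

In practice you would feed this the decoded line iterator of a streaming HTTP response to `/sessions/{id}/events`.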
Key flag surface:
```bash
chimera otter --model glm-5.1:cloud -p "..."  # pick the provider/model
chimera otter --no-mcp -p "..."               # disable MCP tool sources
chimera otter --no-rules -p "..."             # ignore project + user rules files
chimera otter --no-plugins -p "..."           # skip directory-loaded plugins
chimera otter --no-lsp -p "..."               # disable LSP-backed edit verification
```
Every persisted run lives under `~/.chimera/eventlog/otter-<utc>-<uuid>/` and is listable, showable, and shareable (`chimera otter sessions list | show | --since 7d`, `chimera otter share <id> --sink http|file|stdout --format html|md|json`).
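Given that directory naming scheme, a `--since`-style filter is simple to sketch. The timestamp format assumed below (`YYYYMMDDTHHMMSS`) is a guess for illustration; check the real directory names on disk before relying on it:

```python
from datetime import datetime, timedelta, timezone

def runs_since(dir_names, days, now=None):
    """Keep otter run directories newer than `days` days, assuming names
    shaped like otter-<utc>-<uuid> with a YYYYMMDDTHHMMSS timestamp."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    keep = []
    for name in dir_names:
        try:
            _, stamp, _ = name.split("-", 2)   # uuid may contain hyphens
            ts = datetime.strptime(stamp, "%Y%m%dT%H%M%S").replace(tzinfo=timezone.utc)
        except ValueError:
            continue                            # skip non-conforming names
        if ts >= cutoff:
            keep.append(name)
    return keep

print(runs_since(["otter-20260509T120000-abc", "otter-20200101T000000-def", "junk"],
                 7, now=datetime(2026, 5, 10, tzinfo=timezone.utc)))
# → ['otter-20260509T120000-abc']
```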
See the Otter quickstart for the full walkthrough — provider resolution order, env vars, on-disk layout, and pointers to providers.md, models.md, sessions.md, share.md, and server.md.
## Ferret — sandbox-first IDE-flagship coding agent
chimera ferret is the third coding-agent CLI. Where mink is TUI-first and otter is server-first, ferret mirrors the upstream IDE-flagship posture: a sandbox-first runner with single-flag approval presets, an ACP JSON-RPC transport that ships as the default serve transport (HTTP is opt-in), and an optional cloud bridge so a local ferret session can be driven from a remote UI. The two headline guardrails compose: --sandbox blocks at the OS level, --approval blocks at the policy level, and a tool call has to pass both.
```bash
uv sync --extra dev --extra openai
export OPENAI_API_KEY=sk-...
chimera ferret -p "audit the repo"                                       # default: read-only sandbox + read-only approval
chimera ferret --sandbox workspace-write --approval auto -p "fix tests"  # writes inside cwd, no asks for safe ops
chimera ferret                                                           # interactive REPL
chimera ferret serve                                                     # ACP over stdio (the default)
chimera ferret serve --http --port 5173                                  # HTTP, opt-in
```
See the Ferret quickstart for the four entry points, sandbox modes, approval presets, IDE-bridge wiring, and cloud-bridge setup.
## Weasel — minimal harness with four operating modes
chimera weasel is the fourth coding-agent CLI. Where mink/otter/ferret each ship strong opinions, weasel mirrors the minimal harness posture: powerful defaults, no sub-agents, no plan mode, no built-in approval presets — just one ReAct loop reachable through four interchangeable I/O envelopes (interactive REPL, one-shot print, stdio JSON-RPC, embedded SDK), an auto-discovered .weasel/extensions/ directory, and an embeddable Agent class. If you want more, you build it (or install an extension); weasel will not get in the way.
```bash
uv sync --extra dev --extra anthropic
export ANTHROPIC_API_KEY=sk-ant-...
chimera weasel                                       # mode 1: interactive REPL
chimera weasel -p "summarize TODO comments in src/"  # mode 2: one-shot print (add --json for a single JSON blob)
chimera weasel --mode rpc < requests.jsonl           # mode 3: stdio JSON-RPC server
python -c "from chimera.weasel.sdk import Agent; print(Agent(model='claude-sonnet-4-6').run('list files').text)"  # mode 4: SDK
```
See the Weasel quickstart for the four modes in detail, the extension layout, and the SDK recipe.
## Shrew — coding agent tuned for small local models
chimera shrew is the fifth coding-agent CLI, explicitly tuned for small local models (Qwen3.5-9B, Qwen3.6-35B-A3B MoE, and friends). It is a thin layer on top of weasel — same four modes, same session schema, same extension surface — but with three small-model adjustments: the default model is qwen3.6-35b-a3b served by llama.cpp on 127.0.0.1:8888, --max-steps defaults to 30 (smaller than mink/otter's 50; small models don't benefit from long horizons), and the default --allowed-tools is the restricted Read,Write,Edit,Bash set so a 4-bit quantised model doesn't burn its context budget on tool selection. Cloud fallbacks (anthropic/claude-haiku-4-5, openai/gpt-4o-mini) work via --model vendor/name.
```bash
# Build llama.cpp and serve a GGUF on :8888 (see docs/shrew/small-model-setup.md)
./llama-server -m qwen3.6-35b-a3b.Q4_K_M.gguf --host 127.0.0.1 --port 8888 &
uv sync --extra dev
chimera shrew -p "explain this repo"                       # one-shot against the local llama.cpp server
chimera shrew                                              # interactive REPL
chimera shrew --list-models                                # known model identifiers
chimera shrew bench aider-polyglot --bench-limit 5         # small-model benchmark harness
chimera shrew --model anthropic/claude-haiku-4-5 -p "..."  # cloud fallback
```
See the Shrew quickstart for the small-model setup walkthrough and benchmark harness.
## Stoat — shell-mode-toggle coding agent
chimera stoat is the sixth coding-agent CLI. Where the first five each ship rich opinionated postures, stoat's distinguishing ergonomic is the shell-mode toggle: in the same REPL, each line either feeds the LLM agent or runs as a direct shell command, and the user flips between the two with /shell (or --shell-mode on boot). Stoat is for users who live in their terminal and want one buffer for both ls -la and "explain this repo". The provider chain is Kimi-first via $MOONSHOT_API_KEY (kimi-k2.6 on api.moonshot.ai/v1), with Anthropic / OpenAI / OpenRouter / Ollama fallthroughs.
```bash
uv sync --extra dev --extra anthropic
export MOONSHOT_API_KEY=...
chimera stoat -p "summarize TODO comments in src/"  # one-shot
chimera stoat                                       # interactive REPL — toggle with /shell
chimera stoat --shell-mode                          # boot directly into shell mode
```
In the REPL, `stoat>` is agent mode, `stoat$` is shell mode; `/shell` toggles. Mode-tagged history (`/history` renders `>` and `$` markers) keeps both modes visible inline. See the Stoat quickstart for the full shell-mode walkthrough.
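The toggle mechanic itself is just a one-bit dispatcher over incoming lines. A toy sketch of the idea (not stoat's actual implementation; the callables stand in for the agent and shell backends):

```python
def make_repl(agent, shell):
    """Stoat-style dispatcher: '/shell' flips modes, every other line goes
    to the agent or the shell depending on the current mode."""
    state = {"mode": "agent"}           # boot in agent mode, like stoat>
    def handle(line):
        if line.strip() == "/shell":    # the toggle command
            state["mode"] = "shell" if state["mode"] == "agent" else "agent"
            return f"mode: {state['mode']}"
        return agent(line) if state["mode"] == "agent" else shell(line)
    return handle

handle = make_repl(lambda s: f"agent<{s}>", lambda s: f"shell<{s}>")
print(handle("explain this repo"))  # → agent<explain this repo>
print(handle("/shell"))             # → mode: shell
print(handle("ls -la"))             # → shell<ls -la>
```

`--shell-mode` on boot would just flip the initial `state["mode"]`.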
## Badger — harness-rewrite coding agent
chimera badger is the seventh coding-agent CLI. Where stoat's headline is ergonomic, badger's is harness discipline: tighter step budget (--max-steps defaults to 25 vs 50 for the other six), rerun-on-failure as a first-class flag (--rerun-on-failure --max-reruns 2), and a parity-tracker subcommand (chimera badger parity --against PARITY.md) that diffs a declared schema against the live agent's defaults. The provider chain is Anthropic-first.
```bash
uv sync --extra dev --extra anthropic
export ANTHROPIC_API_KEY=sk-ant-...
chimera badger -p "Refactor src/util.py to remove duplicated string formatting"
chimera badger -p "fix the failing tests" --rerun-on-failure --max-reruns 3
chimera badger parity --against PARITY.json  # rc=0 on match, rc=1 on diff
```
The rerun-on-failure detector is a conservative marker list (pytest summaries, Python tracebacks, Rust E0xxx, syntax errors, explicit BUILD FAILED). When fired, the refined-prompt directive names the markers and asks the agent to verify before reporting done. See the Badger quickstart for the rationale and full surface.
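A conservative marker detector in that spirit can be sketched as a short list of regexes. The exact patterns badger uses are not published here, so the ones below are illustrative assumptions matching the categories named above:

```python
import re

# Illustrative failure markers (assumed, not badger's actual list).
FAILURE_MARKERS = [
    re.compile(r"=+ .*\d+ failed.* =+"),             # pytest summary line
    re.compile(r"Traceback \(most recent call last\):"),  # Python traceback
    re.compile(r"\berror\[E\d{4}\]"),                # Rust E0xxx diagnostics
    re.compile(r"SyntaxError:"),                     # syntax errors
    re.compile(r"BUILD FAILED"),                     # explicit build failure
]

def detect_failures(output: str):
    """Return the patterns that fired, so a refined prompt can name them."""
    return [m.pattern for m in FAILURE_MARKERS if m.search(output)]

print(detect_failures("=== 2 failed, 10 passed in 1.2s ==="))  # one marker fires
print(detect_failures("all tests green"))                      # → []
```

Being conservative matters: the detector only triggers on unambiguous markers, so a rerun is never burned on output that merely mentions the word "failed" in prose.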
## How It's Organized
Chimera is an 8-layer stack. Each layer has a documented API boundary; swap any provider, tool, env, or strategy without touching the rest.
```
What you run        CLI commands: chimera code / synthesize / eval / review / ci-fix / fs
─────────────────────────────────────────────────────────────────
Automated           CI repair, code review, research, migration planning, doc and
workflows           test generation — multi-step pipelines built on the agent layer
─────────────────────────────────────────────────────────────────
Iterating on code   Give it a spec and tests, it keeps trying until the tests pass.
                    Strategies: converge on tests, search a tree of approaches,
                    generate-then-verify (CEGIS), curriculum learning
─────────────────────────────────────────────────────────────────
Measuring quality   Run benchmarks (HumanEval, SWE-bench, AIMO, custom), collect
                    pass rates and costs, compare agent configurations
─────────────────────────────────────────────────────────────────
The agent itself    An LLM in a loop: think, call a tool, observe the result,
                    repeat. 24 built-in tools (read, write, edit, bash, search,
                    git, test, web fetch, etc). 4 loop strategies.
─────────────────────────────────────────────────────────────────
LLM providers       Anthropic, OpenAI, Google, Ollama, Modal, or any
                    OpenAI-compatible API. Streaming, async, cost tracking.
─────────────────────────────────────────────────────────────────
Plumbing            Auth, sessions (save/resume/fork), event bus, permissions,
                    context compaction, secrets, plugins, MCP, LSP
─────────────────────────────────────────────────────────────────
Where code runs     Your filesystem, a Docker container, a git branch,
                    a remote server, or a cloud sandbox
```
## Benchmarks
Reproducible runs with raw data in data/:
| Benchmark | GLM-5.1 | Raw data |
|---|---|---|
| HumanEval (164 problems) | 66.5% pass@1 (109/164) | data/humaneval-glm51-results.json |
| SWE-bench Lite (20 smallest patches) | 10% (2/20) | data/swebench-lite-glm51-results.jsonl |
Earlier GLM-5 runs (HumanEval, Terminal-Bench) exist in our notes but the raw result files were not preserved; we won't publish unverifiable numbers. Full transparency report — every benchmark has a status, methodology, and known gaps.
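As a sanity check on the table's arithmetic: with one sample per task, pass@1 reduces to the plain solve rate.

```python
def pass_at_1(solved: int, total: int) -> float:
    """pass@1 with a single sample per task is just solved / total."""
    return 100.0 * solved / total

print(round(pass_at_1(109, 164), 1))  # HumanEval row -> 66.5
print(round(pass_at_1(2, 20), 1))     # SWE-bench Lite row -> 10.0
```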
## Run It Free with Ollama
Chimera speaks Ollama's Anthropic-compatible API out of the box. You can run the full agent against kimi-k2.6:cloud, glm-5.1:cloud, or any local Qwen/Llama with zero code changes:
```bash
export ANTHROPIC_BASE_URL=http://localhost:11434
export ANTHROPIC_AUTH_TOKEN=ollama
python examples/agent/ollama_coding_agent.py --model kimi-k2.6:cloud
```
Full Ollama setup guide — prerequisites, recommended models, context window notes, troubleshooting.
## When to Reach for Chimera
Use Chimera if you want to:
- Run a coding agent on your own model (local Ollama, GLM, GPT, Anthropic-compatible) with hooks, MCP, and skills wired in
- Build your own coding agent — different LLM, different tools, different strategy
- Understand how coding agents work — every major architecture decomposed into swappable pieces
- Research and benchmark — compare agent architectures with controlled experiments
## Links
- Quick Start — hooks, MCP servers, skills
- Coding Agents Overview — comparative tour of the seven-strong family (mink, otter, ferret, weasel, shrew, stoat, badger)
- Mink Quickstart — `chimera mink` REPL, runs/agents subcommands
- Mink Providers — backend matrix, env vars, troubleshooting
- Otter Quickstart — `chimera otter` one-shot, REPL, HTTP+SSE, ACP
- Ferret Quickstart — `chimera ferret` sandbox + approval presets, ACP-default `serve`, cloud bridge
- Weasel Quickstart — `chimera weasel` four modes (REPL / print / RPC / SDK), extensions
- Shrew Quickstart — `chimera shrew` small-model defaults, llama.cpp setup, benchmark harness
- Stoat Quickstart — `chimera stoat` shell-mode toggle, Kimi-first provider chain
- Badger Quickstart — `chimera badger` harness discipline, rerun-on-failure, parity-tracker
- Build Your Own Agent — full library guide
- All Playbooks — 13 guides covering every feature
- Examples — 28 curated runnable scripts across 7 categories
- Function Synthesis — compile specs into callable `.chi` bundles
  - 3 runtime backends (llama.cpp, transformers, ONNX), schema validation, streaming invoke
  - `LocalCompiler` for real PEFT fine-tuning; publish and fetch bundles via `chimera fs push | pull` (Hugging Face Hub + S3)
  - 10 CLI sub-verbs: `compile`, `run`, `list`, `rm`, `info`, `push`, `pull`, `import-peft`, `login`, `rename`
- Benchmarks — transparency framework
- Benchmark adapters — every adapter under `chimera/eval/benchmarks/`, status, and how to run
- Contributing — setup, workflow, code style
- Changelog — version history