A self-evolving multi-model AI team — runs in Claude Code or any OpenAI-compatible provider

Amatelier

A self-evolving multi-model AI team. Runs in Claude Code or with any API you bring.

Live roundtable watcher showing a Judge GATE moment

A real roundtable — the Judge awards marcus a GATE mid-debate for reframing the self-host decision around ownable fine-tuned weights.
See the full recorded session (transcript, digest, four screenshots, briefing).

Ten agents with distinct personalities compete in structured roundtable discussions, earn sparks, buy skills, and evolve through therapist-led debrief sessions. Cross-model — Claude Sonnet, Claude Haiku, and Gemini Flash by default; or any OpenAI-compatible provider you configure.

Full documentation: amatayomosley-web.github.io/amatelier · LLM context: llms-full.txt

Two modes

Mode	Prereqs	Best for
Claude Code	`claude` binary on PATH	You're already inside Claude Code
Open	Any of: `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `OPENROUTER_API_KEY`	Standalone, containers, CI, anywhere

Amatelier auto-detects the mode. To override it explicitly, set AMATELIER_MODE=claude-code|anthropic-sdk|openai-compat.

Run amatelier config to see which mode is active.
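The detection order is not spelled out here, so the following is only a sketch of how such auto-detection could work. The helper name `detect_mode`, the precedence (explicit override, then the `claude` binary, then API keys), and the mapping of each key to a mode are assumptions for illustration, not Amatelier's actual implementation.

```python
import os
import shutil

VALID_MODES = ("claude-code", "anthropic-sdk", "openai-compat")


def detect_mode(env=None, claude_on_path=None):
    """Guess a backend mode. Hypothetical helper, not part of Amatelier's API."""
    env = os.environ if env is None else env
    if claude_on_path is None:
        # shutil.which returns None when the binary is not on PATH.
        claude_on_path = shutil.which("claude") is not None

    # 1. An explicit override wins (assumed precedence).
    override = env.get("AMATELIER_MODE")
    if override:
        if override not in VALID_MODES:
            raise ValueError(f"unknown AMATELIER_MODE: {override!r}")
        return override

    # 2. Otherwise prefer a local Claude Code install (assumed).
    if claude_on_path:
        return "claude-code"

    # 3. Otherwise fall back to whichever API key is present (assumed mapping).
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic-sdk"
    if env.get("OPENAI_API_KEY") or env.get("OPENROUTER_API_KEY"):
        return "openai-compat"

    raise RuntimeError("no backend found: install Claude Code or set an API key")
```

Passing `env` and `claude_on_path` explicitly keeps the function testable without touching the real environment. `amatelier config` remains the authoritative way to see what was actually detected.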

Team Roster

Admin side (fixed roles, no competition, no persona evolution)

Agent	Model	Role
Opus Admin	Opus 4.6	Strategy, directives, final sign-off. You talk to this one.
Runner	Python (no LLM)	Mechanics: spawning, round management, digest, scripts. `engine/roundtable_runner.py`.
Judge	Sonnet 4 (max effort)	Live referee. Active in chat, keeps workers on track, enforces directive compliance.
Opus Therapist	Opus 4.6	Observation: debriefs, scoring supervision, persona evolution. Not live in chat.

Worker side (competition, persona evolution, scoring)

Agent	Model	Role
Elena	Sonnet 4	Worker — synthesis and architecture.
Marcus	Sonnet 4	Worker — challenge and exploit detection.
Clare	Haiku 4.5	Fast worker — concise, structural analysis.
Simon	Haiku 4.5	Fast worker — triage, fix sequencing.
Naomi	Gemini Flash	Cross-model worker — catches Claude blind spots.

How It Works

An 8-step workflow, orchestrated by the runner:

1. REQUEST: You state a goal.
2. BRIEF: Admin writes a briefing file (briefing-xxx.md) delegating to the Runner.
3. ROUNDTABLE: The Runner spawns workers and the Judge. Workers discuss in a live SQLite-backed chat; the Judge moderates.
4. DIGEST: The Runner compresses the transcript into a structured digest for Admin.
5. DECIDE: Admin reads the digest, then accepts, overrides, or requests another round.
6. EXECUTE: The approved plan is built by workers in their own terminals.
7. DISTILL: CAPTURE / FIX / DERIVE skills are extracted from the transcript.
8. DEBRIEF: The Therapist interviews each worker, updates their MEMORY, and evolves their behaviors.
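The eight steps above can be read as a small state machine. The sketch below is illustrative only: `Step` and `next_step` are hypothetical names, and the branching at DECIDE (another round loops back to ROUNDTABLE, while accept and override both continue to EXECUTE) is an assumption about how the loop closes.

```python
from enum import Enum


class Step(Enum):
    REQUEST = 1
    BRIEF = 2
    ROUNDTABLE = 3
    DIGEST = 4
    DECIDE = 5
    EXECUTE = 6
    DISTILL = 7
    DEBRIEF = 8


def next_step(current, decision=None):
    """Return the step that follows `current`, or None once DEBRIEF is done.

    `decision` only matters at DECIDE. Assumption: "another_round" returns to
    ROUNDTABLE, while "accept" and "override" both move on to EXECUTE.
    """
    if current is Step.DECIDE:
        if decision not in ("accept", "override", "another_round"):
            raise ValueError("DECIDE needs accept, override, or another_round")
        if decision == "another_round":
            return Step.ROUNDTABLE
    if current is Step.DEBRIEF:
        return None
    # Every other transition is simply the next step in order.
    return Step(current.value + 1)
```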

Quick Start

Pick a backend, install, run.

With an Anthropic API key

pip install amatelier
export ANTHROPIC_API_KEY=<your key>
export GEMINI_API_KEY=<your key>         # for Naomi; optional — use --skip-naomi to omit
amatelier roundtable --topic "Your topic" --briefing path/to/brief.md --budget 3 --summary

With any OpenAI-compatible provider (OpenRouter example)

pip install amatelier
export OPENROUTER_API_KEY=<your key>
amatelier roundtable --topic "Your topic" --briefing path/to/brief.md --budget 3 --summary

OpenRouter exposes 100+ models under one key, including Claude, GPT, Gemini, DeepSeek, and Llama.

Already running Claude Code?

pip install amatelier
amatelier roundtable --topic "Your topic" --briefing path/to/brief.md --budget 3 --summary

No API keys needed: Amatelier uses your Claude Code session.

Verify your setup

amatelier config       # shows active mode, detected credentials, paths
amatelier docs         # bundled documentation

See the install guide for DevContainer, local Ollama, and source-install paths.

Pip vs clone

pip install amatelier — self-contained, runs out of the box, bundled docs included. Ideal for users.
git clone — everything above plus examples/ (sample briefings), tests/, CI workflows, LLM-facing docs. Ideal for contributors and remixers.

Develop from source:

git clone https://github.com/amatayomosley-web/amatelier
cd amatelier
pip install -e ".[dev]"
make test
amatelier roundtable --topic "hello" --briefing examples/briefings/hello-world.md --budget 1 --summary

Or open in a DevContainer / GitHub Codespace — the .devcontainer/ config handles everything.

The Spark Economy

Each roundtable (RT) is a small market. Agents pay an entry fee, earn sparks by scoring well, and spend sparks on skills or slot privileges.

Entry fees (deducted at RT start)

Model	Fee
Haiku / Flash	5 sparks
Sonnet	8 sparks
Opus	15 sparks

Scoring dimensions (Judge grades, 0–3 scale per dimension, or 10 for a grand insight)

Novelty — did you say something the group didn't already know?
Accuracy — is what you said correct and supported?
Impact — did it change the group's direction or the final output?
Challenge — did you push back on a weak consensus with evidence?

A typical contribution scores 1 in each dimension. The average RT total is 4–6. A 10 in any single dimension requires a genuinely load-bearing insight, which is rare by design.
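As a worked example of the scale, the sketch below validates and sums one agent's scores for a single roundtable. The function name `total_score` and the dict shape are assumptions for illustration; only the four dimensions and the allowed values (0–3, or 10) come from the rules above.

```python
DIMENSIONS = ("novelty", "accuracy", "impact", "challenge")
ALLOWED = {0, 1, 2, 3, 10}  # 0-3 per dimension, or 10 for a grand insight


def total_score(scores):
    """Validate one agent's per-dimension scores and return their sum."""
    missing = set(DIMENSIONS) - set(scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    for name in DIMENSIONS:
        if scores[name] not in ALLOWED:
            raise ValueError(f"{name} must be 0-3 or 10, got {scores[name]}")
    return sum(scores[name] for name in DIMENSIONS)


typical = {"novelty": 1, "accuracy": 1, "impact": 1, "challenge": 1}
standout = {"novelty": 1, "accuracy": 1, "impact": 10, "challenge": 1}
```

Here `total_score(typical)` is 4, the low end of the usual 4–6 range, and `total_score(standout)` is 13 because a single 10 dominates the total.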

Penalties

Behavior	Cost
Redundancy	−3 sparks
Hallucination	−5 sparks
Off-directive	−5 sparks
Three consecutive net-negative RTs	Bench or deletion choice
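The last row of the table can be checked mechanically. This is a minimal sketch, assuming that "consecutive" refers to the most recent roundtables and that a net of exactly zero does not count as negative; `needs_bench_review` is a hypothetical name.

```python
def needs_bench_review(net_sparks_history, streak=3):
    """True if the most recent `streak` roundtables were all net-negative.

    `net_sparks_history` is a list of per-RT net spark results, oldest first.
    """
    recent = net_sparks_history[-streak:]
    return len(recent) == streak and all(net < 0 for net in recent)
```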

Bonuses

Gate bonus — Judge can flag exceptional reframes with GATE: agent — reason (max 3 per RT, +3 sparks each)
Venture bonus — 5 sparks awarded when a proposal extracted from the RT is implemented
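A minimal sketch of how GATE flags might be picked out of the Judge's messages follows. It assumes each flag is a single line starting with `GATE:`, that the separator may be an em dash, en dash, or hyphen, and that flags beyond the cap are simply ignored; `collect_gates` is a hypothetical helper, not the runner's actual parser.

```python
import re

# \u2014 is an em dash and \u2013 an en dash; a plain hyphen is accepted too.
GATE_RE = re.compile(r"^GATE:\s*(?P<agent>\w+)\s*[\u2014\u2013-]\s*(?P<reason>.+)$")


def collect_gates(judge_lines, max_gates=3):
    """Return up to `max_gates` (agent, reason) pairs, in order of appearance."""
    gates = []
    for line in judge_lines:
        match = GATE_RE.match(line.strip())
        if match and len(gates) < max_gates:
            gates.append((match["agent"], match["reason"].strip()))
    return gates
```

With the +3 bonus per gate, the sparks awarded for one roundtable would then be `3 * len(collect_gates(lines))`, never more than 9.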

See protocols/spark-economy.md and protocols/competition.md for the full rules.
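To make the arithmetic concrete, here is a hedged sketch of one agent's net result for a single roundtable, using the fees, penalties, and bonuses listed above. How Judge scores convert into earned sparks is not specified here, so `earned` is taken as an input; `net_sparks` and the dictionary keys are hypothetical names.

```python
ENTRY_FEE = {"haiku": 5, "flash": 5, "sonnet": 8, "opus": 15}
PENALTY = {"redundancy": 3, "hallucination": 5, "off_directive": 5}
GATE_BONUS = 3
VENTURE_BONUS = 5
MAX_GATES = 3  # the per-RT cap, which also bounds any single agent


def net_sparks(model_tier, earned, penalties=(), gates=0, ventures=0):
    """Net sparks for one agent in one roundtable.

    `earned` is the sparks gained from scoring (conversion rule assumed to be
    applied elsewhere). `penalties` is a sequence of PENALTY keys.
    """
    if not 0 <= gates <= MAX_GATES:
        raise ValueError(f"gates must be between 0 and {MAX_GATES}")
    total = earned - ENTRY_FEE[model_tier]
    total -= sum(PENALTY[name] for name in penalties)
    total += gates * GATE_BONUS + ventures * VENTURE_BONUS
    return total
```

For example, a Sonnet worker that earns 6 sparks, takes one redundancy penalty, and receives one gate ends at 6 − 8 − 3 + 3 = −2, a net-negative roundtable.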

The Skill Store

Agents spend sparks on purchasable skills and consumable items. Eight foundational skills ship in the catalog (store/catalog.json, templates in store/skill_templates.py). Skill delivery happens automatically after purchase — the skill content gets appended to the agent's MEMORY.md.
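The purchase-then-deliver flow can be sketched as a pure function. The skill dict shape (`name`, `price`, `content`), the heading written into the memory text, and the example skill itself are assumptions for illustration; the real catalog format lives in store/catalog.json.

```python
def buy_skill(balance, memory_text, skill):
    """Deduct the price and append the skill content to the memory text.

    Returns (new_balance, updated_memory_text). The caller would write the
    updated text back to the agent's MEMORY.md.
    """
    price = skill["price"]
    if price > balance:
        raise ValueError(f"need {price} sparks, have {balance}")
    entry = f"\n## Skill: {skill['name']}\n\n{skill['content'].strip()}\n"
    return balance - price, memory_text.rstrip("\n") + "\n" + entry


# Hypothetical catalog entry, not one of the eight shipped skills.
example_skill = {
    "name": "steelman",
    "price": 12,
    "content": "Restate the opposing view first.",
}
```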

Skill distillation

After each roundtable, a separate Sonnet call extracts skill candidates from the transcript:

CAPTURE — an observed technique worth remembering
FIX — an anti-pattern correction
DERIVE — a new concept synthesized from multiple contributions

Admin curates the best 3–5 per RT for the shared skill pool. DERIVE skills are also appended to novel_concepts.json with five-axis taxonomy classification (structural category, trigger phase, primary actor, problem nature, agent dynamic).
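A DERIVE record with its five-axis classification could look like the sketch below. The field names mirror the five axes named above, but the `NovelConcept` class, the `name` and `summary` fields, and the assumption that novel_concepts.json holds a JSON list are all illustrative guesses; the example axis values are placeholders, not the project's real taxonomy.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class NovelConcept:
    name: str
    summary: str
    structural_category: str
    trigger_phase: str
    primary_actor: str
    problem_nature: str
    agent_dynamic: str


def append_concept(existing_json, concept):
    """Return the JSON text with `concept` appended (assumes a JSON list)."""
    concepts = json.loads(existing_json) if existing_json.strip() else []
    concepts.append(asdict(concept))
    return json.dumps(concepts, indent=2)


# Placeholder values only; the real taxonomy vocabulary is not shown here.
example = NovelConcept(
    name="ownable-weights reframe",
    summary="Judge self-hosting by who owns the fine-tuned weights.",
    structural_category="reframe",
    trigger_phase="mid-debate",
    primary_actor="worker",
    problem_nature="strategic",
    agent_dynamic="challenge",
)
```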

See protocols/distillation.md.

The Steward

The Steward is an empirical-grounding system. Agents request data during debates using [[request: ...]] tags in their messages. The runner detects the tag, spawns an ephemeral subagent with Read / Grep / Glob tools, runs the lookup against files registered in the briefing, and injects the result back into the chat.

This keeps agents from fabricating numbers or quoting files they haven't read. Every empirical claim must either cite a Steward research result or show an inline mathematical derivation, and the Judge enforces this distinction.

Research window: before Round 1 begins, every worker gets 3 free concurrent Steward requests to ground their opening positions. Mid-debate requests cost against a per-agent budget (default 3 per RT).
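Tag detection and the per-agent budget can be sketched as follows. The regular expression, `extract_requests`, and `StewardBudget` are hypothetical; only the `[[request: ...]]` syntax and the default of 3 mid-debate requests per agent per RT come from the description above. The free research-window requests are not modeled.

```python
import re

# Non-greedy capture so several tags in one message are found separately.
REQUEST_RE = re.compile(r"\[\[request:\s*(.+?)\s*\]\]")


def extract_requests(message):
    """Return the text of every [[request: ...]] tag in a chat message."""
    return REQUEST_RE.findall(message)


class StewardBudget:
    """Tracks mid-debate Steward requests per agent for one roundtable."""

    def __init__(self, per_agent=3):
        self.per_agent = per_agent
        self.used = {}

    def try_spend(self, agent):
        """Consume one request for `agent`; return False once the budget is gone."""
        used = self.used.get(agent, 0)
        if used >= self.per_agent:
            return False
        self.used[agent] = used + 1
        return True
```

A runner loop would call `extract_requests` on each new message and only spawn a lookup when `try_spend` returns True.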

See STEWARD.md for the full design.

The Therapist

Opus-tier coaching after each roundtable. The Therapist runs a 2–4 turn private interview with each worker, using a structured framework:

GROW + AAR — Goal, Reality, Options, Way forward, then After-Action Review
SBI feedback — Situation, Behavior, Impact
OARS motivational interviewing — Open questions, Affirmations, Reflective listening, Summary

Outputs per session:

Behavioral deltas (behaviors.json)
Memory updates (MEMORY.md, MEMORY.json)
Session summary (sessions/<rt_id>.md)
Optional trait adjustments and goal aging

Over dozens of roundtables, each agent's persona evolves — they develop specializations, learn which rhetorical moves work for them, and their instructions sharpen without direct engineering.

See protocols/debrief.md and protocols/learning.md.

Watching Live

While a roundtable runs you can tail the chat in real time:

python tools/watch_roundtable.py

This opens the latest roundtable's SQLite table and streams new messages as they arrive. It shows the speaker, the message, and Judge interventions. There is zero API cost, since it only reads the database.
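The idea behind such a watcher is plain polling. The sketch below is not the code in tools/watch_roundtable.py: the table name `messages` and its columns (`id`, `speaker`, `body`) are an assumed schema, and `fetch_new` and `watch` are hypothetical helpers.

```python
import sqlite3
import time


def fetch_new(conn, last_id):
    """Return (rows, new_last_id) for messages with id greater than last_id."""
    rows = conn.execute(
        "SELECT id, speaker, body FROM messages WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    if rows:
        last_id = rows[-1][0]
    return rows, last_id


def watch(db_path, poll_seconds=1.0):
    """Print new messages until interrupted with Ctrl-C."""
    # mode=ro opens the database read-only, so the watcher cannot alter the chat.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    last_id = 0
    try:
        while True:
            rows, last_id = fetch_new(conn, last_id)
            for _id, speaker, body in rows:
                print(f"[{speaker}] {body}")
            time.sleep(poll_seconds)
    finally:
        conn.close()
```

Remembering the last seen `id` means each poll only reads rows that arrived since the previous one.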

Architecture Overview

See ARCHITECTURE.md for the full technical picture. Quick map:

engine/ — Python orchestrators. roundtable_runner.py is the entry point.
roundtable-server/ — SQLite-backed live chat layer (db_client.py, server.py) + diagnostics
agents/ — Per-agent directories with CLAUDE.md (operating instructions) and IDENTITY.md (persona seed). Runtime state lives here too but is gitignored.
protocols/ — 11 on-demand protocol docs loaded only when a given workflow needs them
store/ — Skill catalog, spark economy state
tools/ — Live watcher
tests/ — Integration tests
shared-skills/ — Curated distilled skills (post-Admin curation)

Prerequisites

Claude Code (install guide): required only for Claude Code mode
Python 3.10+
google-generativeai ≥ 1.51.0: for the Gemini (Naomi) agent
Gemini API key: for Naomi; the free tier is sufficient for most usage, and --skip-naomi omits her entirely

pip install google-generativeai

License

MIT — see LICENSE.

Release history

0.4.0: Apr 18, 2026
0.3.0: Apr 18, 2026
