
Multi-model agent framework with plan-mode orchestration, integrated TUI, and team-state working memory.


Modulatio

A multi-model agent framework for running long, high-stakes projects with real quality control.

Modulatio orchestrates teams of LLM agents — each on its own model and provider — through plan-mode execution with a real quality gate. Designed for work that takes more than one prompt: long-form drafting, small-business loops, multi-step research, codebase work, anything where output quality matters.

[!WARNING] v0.9.7 (Beta). Project management: work across many projects from one install.

One Modulatio install can hold many projects; this release lets you switch between them, create new ones, and delete old ones from the command line (modulatio project list / project use <code>) and from a new PROJECTS tab (under CONFIG, or /project), without editing config or reinstalling.

  • Switch. Switching is a live, in-place change: the header and every data view (memory, tickets, artifacts) re-bind to the new project on the spot. Because the team is install-level, your agents and models never change, only the work you're looking at. Switching is disabled while a job is running, so a live run can't be re-pointed underneath itself.
  • New. The PROJECTS tab's New button creates a project folder and seeds your install team into it in one click (rolled back if setup fails).
  • Delete. Delete removes a project after a confirmation. The project is backed up first to a shareable .modulatio file, and the action is guarded so you can't delete the active project, delete while a job runs, or remove a stray folder that isn't a real project.

Security: path-safety hardening. Backups no longer follow symlinks out of a project, and every place an agent id becomes a file path (save/add/remove/seed) now validates it, so a malformed roster or team-template entry can't read, write, or delete outside the project's agents/ directory.

Every change cleared a multi-lens cadre review (security, hull/terminal-state, contract, coherence). Builds on v0.9.6 (the team always finishes the job + lifecycle tooling) and v0.9.4 (the two-lane Leader). 4757 tests pass. See the CHANGELOG for the full delta and the roadmap for what's next (1.0: a web UI + remote access). Read the Beta calibration page before serious work. Bug reports and discussions are welcome on the issues tab and discussions.
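
The path-safety rule in the release note above (an agent id must never resolve outside the project's agents/ directory) can be illustrated with a minimal sketch. This is not Modulatio's code: the function name agent_path, the id allow-list, and the .yaml file extension are all assumptions made for the example.

```python
import re
from pathlib import Path

# Hypothetical allow-list: letters, digits, underscore, hyphen.
# Modulatio's real validation rule is not documented here.
_AGENT_ID = re.compile(r"[A-Za-z0-9_-]+")


def agent_path(project_dir: Path, agent_id: str) -> Path:
    """Map an agent id to a file path, refusing anything outside agents/."""
    # First gate: reject separators, dots, and empty ids outright.
    if not _AGENT_ID.fullmatch(agent_id):
        raise ValueError(f"invalid agent id: {agent_id!r}")

    agents_dir = (project_dir / "agents").resolve()
    # The ".yaml" suffix is an assumption for illustration only.
    candidate = (agents_dir / f"{agent_id}.yaml").resolve()

    # Second gate: after resolving symlinks, the path must still sit
    # inside agents/.
    if not candidate.is_relative_to(agents_dir):
        raise ValueError(f"agent id escapes agents/: {agent_id!r}")
    return candidate
```

Checking twice is deliberate in this sketch: the pattern rejects obviously malformed ids early, and the resolved-path check catches a well-formed id whose file is a symlink pointing elsewhere.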

Requires Python 3.12+.


Quick install

git clone https://github.com/ModulatioAI/modulatio.git ~/modulatio
cd ~/modulatio
uv venv && uv pip install -e ".[dev]"
modulatio setup

Full install with troubleshooting at https://modulatio.ai/getting-started/install/.

Linux clipboard. The TUI's copy/paste (Ctrl+C / Ctrl+V) reaches the OS clipboard through a system backend: xclip (X11) or wl-clipboard (Wayland). modulatio setup detects the backend and offers to install it; to install one yourself on Debian/Ubuntu, run sudo apt install xclip or sudo apt install wl-clipboard. macOS and Windows work out of the box. Without a backend, Ctrl+C still copies via OSC 52 (terminal-dependent) and Ctrl+V paste is unavailable.
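
To see which backend a Linux machine already has, a check along these lines may help. It is a sketch, not what modulatio setup actually runs: the function name and the preference for Wayland over X11 are assumptions. The executable names are the ones the packages ship (xclip provides xclip; wl-clipboard provides wl-copy and wl-paste).

```python
import shutil
from typing import Callable, Optional


def detect_clipboard_backend(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the name of an installed clipboard backend, or None."""
    # wl-clipboard needs both tools: wl-copy to write, wl-paste to read.
    if which("wl-copy") and which("wl-paste"):
        return "wl-clipboard"
    if which("xclip"):
        return "xclip"
    # No backend found: per the note above, copy falls back to OSC 52
    # and paste is unavailable.
    return None


if __name__ == "__main__":
    print(detect_clipboard_backend() or "no clipboard backend found")
```

The which parameter exists only so the lookup can be replaced in tests; by default it uses shutil.which against the real PATH.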

Why 3.12+? One of the dependencies (lancedb, fastembed, or litellm depending on platform) hasn't published a wheel for older Pythons and falls back to a source build that often fails. If your python3 is 3.11 or older, point the venv at /usr/bin/python3.12 explicitly.
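
Before creating the venv, it can be worth confirming which interpreter you are about to use. The snippet below is a small sketch of that check; the names MINIMUM and meets_minimum are made up for the example and are not part of Modulatio.

```python
import sys

# The minimum version stated in this README.
MINIMUM = (3, 12)


def meets_minimum(version=None, minimum=MINIMUM) -> bool:
    """True when the (major, minor) version is at least the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {found} is new enough")
    else:
        print(f"Python {found} is too old; use a 3.12+ interpreter for the venv")
```

Run it with the same python3 you plan to hand to the venv; if it reports a version that is too old, point the venv at a 3.12 interpreter as described above.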


Documentation

Full documentation lives at https://modulatio.ai.

  • Overview — what Modulatio is, who it's for, the orchestration model.
  • v0.1.0 Beta calibration — what the engine does well, what it does NOT do yet. Read before serious work.
  • Getting Started — install, run the setup wizard, ship your first plan.
  • Concepts — the mental model: vault, project, plan, agent, skill, standard.
  • Architecture deep-dives — five-layer working memory, skill system, assembly + review-ledger, sandbox, audit trails.
  • CLI reference — every command, every flag.
  • Roadmap — what's shipping next, what's planned beyond.

What Modulatio does

  • Multi-model routing per agent. Each agent (Leader, QC, every producer) runs on its own configured provider + model. Pick a fast/cheap model for routine work and a stronger one for the gate-class seats — native to the architecture, not an afterthought. Each seat can also carry an ordered list of fallback models: if its model is unavailable (rate limit, auth, 5xx), the engine warns and restarts the whole task on the next backup — never a mid-task switch.
  • Subscription seats — bring your own Claude or GPT-5.5. Beyond API keys, a seat can run on a subscription you already pay for. Clay runs any seat through your Claude Code subscription (claude -p, the official harness — your subscription, never a metered key); GPT-5.5 runs through your OpenAI Codex subscription (the ChatGPT backend). Both reach the model through the vendor's own harness, are confined like any other seat, and are additive to the existing API-key paths. A confined kickoff seat (producer/QC) is held to a fixed set of non-process tools with customizations disabled — it can't spawn a hidden crew or escape its sandbox — while the interactive Leader lane keeps its full loadout.
  • Configure everything from the TUI. A Configuration tab wires up providers, models, API keys, and agents without touching a config file: pick a provider and a model and base_url / auth / model-id auto-fill — you type only a key. Catalogs thirteen providers (OpenRouter, Ollama Cloud, xAI, Anthropic, OpenAI, NVIDIA, Google, three locals, custom — plus two subscription seats: GPT-5.5 via OpenAI Codex, and Claude via Clay) with free-tier models honestly caveated. A Providers & keys manager lists each provider's keys (by label, never the value) to add or remove; add and remove models and agents. Only the Leader is required — a QC verifier and producers are optional, so you can run a solo Leader or a full swarm. Subscription models that reason (GPT-5.5/Codex) expose a reasoning-effort picker (xhigh/high/medium/low).
  • Key-pool — your own keys, pooled by default. A provider's keys form one shared floating pool: every model on it rotates across the keys (so a swarm of producers spreads load instead of hammering one rate-limited key) and a 429 fails over to the next. Need a budget? Pin a key to a model — it then serves only that model and leaves the pool, so its spend stays isolated (the provider meters per key; distinct keys become your accounting buckets). Modulatio meters by key, not in the router. Pool your own legit keys; never throwaway accounts. See the key-pool doc.
  • Talk to the Leader. Beyond batch kickoffs, the Leader is an agent you converse with: ask it to answer, analyze, fetch the web, author a skill, or run a job and command the producer swarm, all in one lane, streaming back live. Drive the thread with commands: /new archives the conversation aside (kept, never deleted) and starts fresh, /editor composes a message in your $EDITOR, and /models opens the picker. ESC interrupts it mid-thought, cleanly, at the next step.
  • The Leader works solo, too. Beyond commanding the swarm, the Leader can pair with you directly as a standalone coding agent — read, edit, and run files in a folder you point it at with /work <path> (pytest, builds, git). It's confined by default to its own per-project workspace — a structural cheat-guard, it physically can't touch the team's deliverables — and widening it to a real folder is an explicit, scoped approval (once / session / always / deny; /rp revokes everything; sandbox-required, fail-closed for anything it runs). Turn it loose within bounds with autonomy modes — /yolo (auto-grant capabilities), /goal (delegate judgment), /yolo-goal (both) — while one fence holds through every mode: crossing into a new folder always needs your /work approval.
  • Diagnostics you can send in one step. Crashes, handled failures, and a doctor read are captured to local logs (each kind named in its file), browsable in a LOGS tab or via modulatio logs — review one and send it to the team as a GitHub issue. Capture-always, submit-on-consent: nothing is auto-filed, and every log is auto-redacted (secrets, tokens, Authorization headers) and shown to you before it's sent.
  • Many projects, one install. A single install runs more than one line of work. Switch the active project from the CLI (modulatio project list / project use <code>) or a PROJECTS tab (browse, switch, create, delete); switching is live and in-place, disabled while a job runs. Your team carries install-wide — the same agents and models everywhere — while each project keeps its own memory, deliverables, tickets, and history. Creating a project seeds your team into it in one click; deleting one backs it up first and is guarded against the obvious accidents.
  • Clean install lifecycle — modulatio repair and modulatio uninstall. Repair a broken setup (rebuild missing presets/agents, recreate a missing vault or project, clear configuration in tiers) or remove Modulatio entirely with named choices (settings, project folders, deliverables, pandoc) and a --pristine full reset. Your own data is backed up before removal, and a vault Modulatio didn't create — your own notes folder — is never auto-deleted, even under --pristine.
  • Quality control as a first-class subsystem. Three-layer TQM (universal axes × per-artifact-kind standards × per-team overrides). QC reviews every artifact; rejects route back to producers in GENERATE / EDIT / DIFF mode.
  • QC-as-fixer (on by default). Cheap, fast producers generate the bulk of the work; the smarter QC reviews it and patches only the errors — the cost of a cheap model with the quality of a strong one (speculative decoding, applied to agents). When a producer can't clear the bar, QC authors the fix from its own findings and the task completes. Bundled default standards give QC a real bar from a cold start. Opt out with MODULATIO_QC_FIXER=0.
  • Product Quality Report. Every run ships an advisory note (.docx) in the project lead's own voice — what it stands behind and what it recommends you double-check. Honest caveats, never a gate: reservations the swarm can't resolve are surfaced here, never block the work or open a ticket.
  • Finished products, delivered — one folder per job. Producers write Markdown; the lead's tagged deliverables render to .docx, human-named from the document title, into a per-job folder under ~/Documents/Modulatio/<project>/ (named from the job and date, with a hex tiebreaker only on collision) so each run keeps its own products instead of overwriting the last. The Product Quality Report ships inside the same folder. When a renderer isn't installed, products ship as Markdown with a note rather than failing.
  • Assemble the product, not just the document. A multi-piece deliverable is joined by a family of assemblers chosen by the artifact's kind — document (ordered text), code (preserve the file tree + generate a wiring index), data (a real JSON/CSV merge), media (image/audio/video/bundle via a local compositor). The producer emits a small plan (a manifest); the engine owns the join, so unit bytes never round-trip through the model and a large deliverable can't truncate. Underneath, a content-addressed review-ledger lets QC verify a finished deliverable by its marks (each unit passed, bytes unchanged, the set matches the dependency graph) instead of re-reading the whole thing into a blown budget. Every family now has a deterministic containment oracle — a provably-correct assembly passes QC cheaply, without the bytes ever re-entering the model: document/data structurally, code by static wiring checks, a media bundle by exact byte equality; a lossy video/audio/image composite honestly falls back to the full review rather than claim a proof it can't back. See Assembly + the review-ledger.
  • Verify the whole deliverable, not just the parts. A declared DeliverableSpec (per-part floor, required structure, title) is carried from the job template into the run, and the engine checks the assembled whole against it — giving the verifier real eyes (an engine-extracted structural digest + a readable text twin, never binary bytes the model can't read), binding the per-part floor at produce-time (on the assembler's real part set, never a front-matter page), generating the framing (title + table of contents), and normalizing part numbering to a clean 1..N. Every move is a per-family dispatch — document-first, every other family a graceful no-op — so it stays product- and agent-agnostic. See Deliverable fidelity.
  • Metered tools, gated before they spend. Every built-in tool is free-local and unmetered; a tool can opt into a paid tier (cost_class), and the Comptroller gates each metered call before it spends — fail-closed on missing budget, per-task + daily caps, idempotent, narrow params, only ever on QC-passed pinned inputs. No SaaS lock-in: the tier ships as a mechanism with no provider or key.
  • Job Templates — setup that sticks. For work you do more than once, the Leader can codify a Job Template: its own interview, parameter schema, and output contract for that class of job — domain-agnostic (a single report, an N-piece anthology, a per-competitor brief are all the same primitive over a generic output cardinality). Bind it to a concrete answer set and it runs headless on a schedule — every cron job is a bound template, validated when you add it, never failing at 3am. The team notices when you keep running the same kind of job (or redo one) and offers to template it: the setup-side mirror of skill self-codification.
  • A producer is a model endpoint that learns. No fixed roles and no skills to assign — give a producer an LLM and tag what it's good at; the team composes the skills each task needs from a shared, git-versioned library at run-time, and routing never blocks on a capability gap. When the same defect keeps recurring, the team codifies the correction into durable skill guidance that cheap producers load next time — and it learns the other direction too: when the smart QC keeps rescuing a producer by writing the fix it couldn't, the team codifies that recurring technique (project-local, flagged as a non-independent fix worth a spot-check), so the cheap producer learns to do it itself. It gets quietly better at the work you give it.
  • Plan-mode end-to-end. The Leader is a conversational partner and the plan is the unit of execution. Runs are asynchronous and daemon-driven, with Telegram approvals and a full audit trail.
  • Open architecture. Your data, your vault, your providers, your models. No SaaS, no per-instance subscription.
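
The key-pool behavior described in the list above (shared rotation, 429 failover, and pinning a key to one model) can be modeled in a few lines. This is a toy sketch, not Modulatio's router: the KeyPool class, its method names, the round-robin order, and the rule that a pinned model uses only its pinned key are all assumptions for illustration. Keys are referred to by label, never by value.

```python
class KeyPool:
    """Toy model of one provider's keys: a floating pool plus pinned keys."""

    def __init__(self, keys):
        self._floating = list(keys)  # key labels shared by unpinned models
        self._pinned = {}            # model name -> key label
        self._cursor = 0             # rotation position in the floating pool

    def pin(self, key, model):
        """Take a key out of the pool so it serves only one model."""
        self._floating.remove(key)
        self._pinned[model] = key

    def candidates(self, model):
        """Keys to try for one call, in order of preference."""
        if model in self._pinned:
            return [self._pinned[model]]
        if not self._floating:
            raise RuntimeError("no floating keys left in the pool")
        # Rotate the starting key on every call so load spreads out.
        start = self._cursor % len(self._floating)
        self._cursor += 1
        return self._floating[start:] + self._floating[:start]

    def call(self, model, send):
        """send(key) returns an HTTP status; a 429 fails over to the next key."""
        for key in self.candidates(model):
            status = send(key)
            if status != 429:
                return key, status
        raise RuntimeError("every candidate key is rate-limited")


if __name__ == "__main__":
    pool = KeyPool(["key-a", "key-b", "key-c"])
    pool.pin("key-c", "strong-model")
    print(pool.candidates("cheap-model"))
    print(pool.candidates("strong-model"))
```

Because the pinned key never appears in another model's candidate list, whatever the provider meters against it belongs to that one model, which is the accounting-bucket idea from the key-pool bullet.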

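The fallback rule from the multi-model routing bullet (an unavailable model causes the whole task to restart on the next backup, never a mid-task switch) can be sketched as follows. The exception ModelUnavailable, the function run_with_fallbacks, and the run_task callback are hypothetical names chosen for this example; they are not Modulatio's API.

```python
class ModelUnavailable(Exception):
    """Hypothetical error standing in for a rate limit, auth failure, or 5xx."""


def run_with_fallbacks(task, models, run_task, warn=print):
    """Run the whole task on the first model in the list that completes it."""
    last_error = None
    for model in models:
        try:
            # Each attempt starts from the original task. Nothing produced
            # by a failed attempt is handed to the next model.
            return model, run_task(task, model)
        except ModelUnavailable as exc:
            last_error = exc
            warn(f"{model} unavailable ({exc}); trying the next fallback")
    raise RuntimeError("all configured models were unavailable") from last_error


if __name__ == "__main__":
    def demo_run(task, model):
        # Simulate a rate-limited primary model.
        if model == "primary":
            raise ModelUnavailable("429")
        return f"{task} completed by {model}"

    print(run_with_fallbacks("draft chapter", ["primary", "backup"], demo_run))
```

Restarting from the original task keeps each result attributable to a single model, at the cost of repeating any work the failed attempt had already done.
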
Project structure

modulatio/
├── src/modulatio/    # Source — agents, runners, daemon, TUI, CLI
├── tests/            # Pytest suite (4757 tests as of v0.9.7)
├── scripts/          # Build / release scripts
└── pyproject.toml    # Package metadata + deps

Documentation lives in its own repo (the Modulatio docs site) so it can be deployed to https://modulatio.ai independently of code releases.


License

Apache License, Version 2.0. Relicensed from AGPL-3.0-or-later prior to v0.1.0.


Contributing

Issues and pull requests welcome at https://github.com/ModulatioAI/modulatio. See CONTRIBUTING.md for the contribution guide. Three GitHub issue templates are wired (Bug report / Regression / Feature request) plus a labelset for severity / component / status / regression — file issues using the templates so the labels apply correctly.

Contributions are accepted under the project's Apache-2.0 license (see LICENSE). By submitting a contribution, you affirm you have the right to do so under those terms.
