
Multi-model agent framework with plan-mode orchestration, integrated TUI, and team-state working memory.


Modulatio

A multi-model agent framework for running long, high-stakes projects with real quality control.

Modulatio orchestrates teams of LLM agents — each on its own model and provider — through plan-mode execution with a real quality gate. Designed for work that takes more than one prompt: long-form drafting, small-business loops, multi-step research, codebase work, anything where output quality matters.

[!WARNING] v0.9.8 (Beta): the Feng-Tui interface, finished. The phosphor theme shipped in v0.9.3; this release lands the layouts across every screen. There are no engine changes: this is the interface catching up to the design.

  • List tabs. TICKETS, LOGS, JT LIBRARY, SKILLS, and ARTIFACTS share a controls row with live / search + counts.
  • Configurators. CONFIG·MODELS, CONFIG·AGENTS, and PROJECTS each pair a persistent registry with a companion pane for the add/edit steps.
  • MEMORY. One unified layered list with add/edit/delete and Markdown export.
  • New screens. JOBS (a run-folder browser) and DOCS (an offline reader).
  • CONSOLE. Rebuilt into a two-column command floor: a live run-telemetry rail beside the workers' stream on the MOD SQUAD view, a full-width conversation with the Leader on the LEADER view (F4 flips between them), and an app-level status-lamp row whose leader/tickets lamps blink for attention while you watch the floor.
  • Launching jobs. Jobs launch from the chat by bracketing them: /kickoff <objective> /end (transactional: kept-on-refuse).
  • Attachments. Paste-to-attach: press Ctrl+V to paste a screenshot/image or a copied file path into the composer, which is focused and ready to type the moment it opens.
  • Also. The TUI runs over an SSH login (no local-display dependency), and switching projects refreshes the per-project agent roster.

Builds on v0.9.7 (project management) and v0.9.6 (the team always finishes the job). 4837 tests pass. See the CHANGELOG for the full delta and the roadmap for what's next (1.0: a web UI + remote access). Read the Beta calibration page before serious work. Bug reports and discussions are welcome on the issues tab and the discussions tab.

Requires Python 3.12+.


Quick install

git clone https://github.com/ModulatioAI/modulatio.git ~/modulatio
cd ~/modulatio
uv venv && uv pip install -e ".[dev]"
modulatio setup

Full install with troubleshooting at https://modulatio.ai/getting-started/install/.

Linux clipboard. The TUI's copy/paste (Ctrl+C / Ctrl+V) reaches the OS clipboard through a system backend: xclip or wl-clipboard. modulatio setup checks for a backend and offers to install one if it is missing; to install one yourself, run sudo apt install xclip (Debian/Ubuntu) or sudo apt install wl-clipboard (Wayland). macOS and Windows work out of the box. Without a backend, Ctrl+C still copies via OSC 52 (terminal-dependent) and Ctrl+V paste is unavailable.
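
To check by hand whether a backend is present, a minimal sketch along these lines may help. It is not Modulatio's own detection logic: the function name, the BACKENDS table, and the preference order (Wayland first) are assumptions made for illustration. The only external facts it relies on are that the wl-clipboard package ships the wl-copy and wl-paste binaries and that xclip ships a binary of the same name.

```python
import shutil

# Hypothetical table for this sketch: backend name -> binaries it must provide.
# wl-copy / wl-paste are the binaries shipped by the wl-clipboard package.
BACKENDS = {
    "wl-clipboard": ("wl-copy", "wl-paste"),
    "xclip": ("xclip",),
}


def detect_clipboard_backend(which=shutil.which):
    """Return the first backend whose binaries are all on PATH, else None.

    `which` is injectable so the lookup can be exercised without touching
    the real PATH.
    """
    for name, binaries in BACKENDS.items():
        if all(which(binary) for binary in binaries):
            return name
    return None


if __name__ == "__main__":
    backend = detect_clipboard_backend()
    if backend is None:
        print("No clipboard backend found: copy falls back to OSC 52, paste is unavailable.")
    else:
        print(f"Clipboard backend: {backend}")
```

If the script reports no backend, installing xclip or wl-clipboard as described above should make Ctrl+V paste available.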

Why 3.12+? One of the dependencies (lancedb, fastembed, or litellm depending on platform) hasn't published a wheel for older Pythons and falls back to a source build that often fails. If your python3 is 3.11 or older, point the venv at /usr/bin/python3.12 explicitly.
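
A quick way to confirm which interpreter a venv would pick up is to run a small check with the python3 on your PATH. This is an illustrative sketch, not part of Modulatio; MIN_VERSION and meets_minimum are names invented here.

```python
import sys

# Minimum interpreter version stated in this README.
MIN_VERSION = (3, 12)


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION):
    """True when the interpreter's (major, minor) is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {found} is new enough.")
    else:
        print(f"Python {found} is too old; create the venv with a 3.12+ interpreter.")
```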


Documentation

Full documentation lives at https://modulatio.ai.

  • Overview — what Modulatio is, who it's for, the orchestration model.
  • v0.1.0 Beta calibration — what the engine does well, what it does NOT do yet. Read before serious work.
  • Getting Started — install, run the setup wizard, ship your first plan.
  • Concepts — the mental model: vault, project, plan, agent, skill, standard.
  • Architecture deep-dives — five-layer working memory, skill system, assembly + review-ledger, sandbox, audit trails.
  • CLI reference — every command, every flag.
  • Roadmap — what's shipping next, what's planned beyond.

What Modulatio does

  • Multi-model routing per agent. Each agent (Leader, QC, every producer) runs on its own configured provider + model. Pick a fast/cheap model for routine work and a stronger one for the gate-class seats — native to the architecture, not an afterthought. Each seat can also carry an ordered list of fallback models: if its model is unavailable (rate limit, auth, 5xx), the engine warns and restarts the whole task on the next backup — never a mid-task switch.
  • Subscription seats — bring your own Claude or GPT-5.5. Beyond API keys, a seat can run on a subscription you already pay for. Clay runs any seat through your Claude Code subscription (claude -p, the official harness — your subscription, never a metered key); GPT-5.5 runs through your OpenAI Codex subscription (the ChatGPT backend). Both reach the model through the vendor's own harness, are confined like any other seat, and are additive to the existing API-key paths. A confined kickoff seat (producer/QC) is held to a fixed set of non-process tools with customizations disabled — it can't spawn a hidden crew or escape its sandbox — while the interactive Leader lane keeps its full loadout.
  • Configure everything from the TUI. A Configuration tab wires up providers, models, API keys, and agents without touching a config file: pick a provider and a model and base_url / auth / model-id auto-fill — you type only a key. Catalogs thirteen providers (OpenRouter, Ollama Cloud, xAI, Anthropic, OpenAI, NVIDIA, Google, three locals, custom — plus two subscription seats: GPT-5.5 via OpenAI Codex, and Claude via Clay) with free-tier models honestly caveated. A Providers & keys manager lists each provider's keys (by label, never the value) to add or remove; add and remove models and agents. Only the Leader is required — a QC verifier and producers are optional, so you can run a solo Leader or a full swarm. Subscription models that reason (GPT-5.5/Codex) expose a reasoning-effort picker (xhigh/high/medium/low).
  • Key-pool — your own keys, pooled by default. A provider's keys form one shared floating pool: every model on it rotates across the keys (so a swarm of producers spreads load instead of hammering one rate-limited key) and a 429 fails over to the next. Need a budget? Pin a key to a model — it then serves only that model and leaves the pool, so its spend stays isolated (the provider meters per key; distinct keys become your accounting buckets). Modulatio meters by key, not in the router. Pool your own legit keys; never throwaway accounts. See the key-pool doc.
  • Talk to the Leader. Beyond batch kickoffs, the Leader is an agent you converse with — ask him to answer, analyze, fetch the web, author a skill, or run a job and command the producer swarm, all in one lane, streaming back live. Drive the thread with commands — /new archives the conversation aside (kept, never deleted) and starts fresh, /editor composes a message in your $EDITOR, /models opens the picker — and ESC interrupts him mid-thought, cleanly, at the next step.
  • The Leader works solo, too. Beyond commanding the swarm, the Leader can pair with you directly as a standalone coding agent — read, edit, and run files in a folder you point it at with /work <path> (pytest, builds, git). It's confined by default to its own per-project workspace — a structural cheat-guard, it physically can't touch the team's deliverables — and widening it to a real folder is an explicit, scoped approval (once / session / always / deny; /rp revokes everything; sandbox-required, fail-closed for anything it runs). Turn it loose within bounds with autonomy modes — /yolo (auto-grant capabilities), /goal (delegate judgment), /yolo-goal (both) — while one fence holds through every mode: crossing into a new folder always needs your /work approval.
  • Diagnostics you can send in one step. Crashes, handled failures, and a doctor read are captured to local logs (each kind named in its file), browsable in a LOGS tab or via modulatio logs — review one and send it to the team as a GitHub issue. Capture-always, submit-on-consent: nothing is auto-filed, and every log is auto-redacted (secrets, tokens, Authorization headers) and shown to you before it's sent.
  • Many projects, one install. A single install runs more than one line of work. Switch the active project from the CLI (modulatio project list / project use <code>) or a PROJECTS tab (browse, switch, create, delete); switching is live and in-place, disabled while a job runs. Your team carries install-wide — the same agents and models everywhere — while each project keeps its own memory, deliverables, tickets, and history. Creating a project seeds your team into it in one click; deleting one backs it up first and is guarded against the obvious accidents.
  • Clean install lifecycle — modulatio repair and modulatio uninstall. Repair a broken setup (rebuild missing presets/agents, recreate a missing vault or project, clear configuration in tiers) or remove Modulatio entirely with named choices (settings, project folders, deliverables, pandoc) and a --pristine full reset. Your own data is backed up before removal, and a vault Modulatio didn't create — your own notes folder — is never auto-deleted, even under --pristine.
  • Quality control as a first-class subsystem. Three-layer TQM (universal axes × per-artifact-kind standards × per-team overrides). QC reviews every artifact; rejects route back to producers in GENERATE / EDIT / DIFF mode.
  • QC-as-fixer (on by default). Cheap, fast producers generate the bulk of the work; the smarter QC reviews it and patches only the errors — the cost of a cheap model with the quality of a strong one (speculative decoding, applied to agents). When a producer can't clear the bar, QC authors the fix from its own findings and the task completes. Bundled default standards give QC a real bar from a cold start. Opt out with MODULATIO_QC_FIXER=0.
  • Product Quality Report. Every run ships an advisory note (.docx) in the project lead's own voice — what it stands behind and what it recommends you double-check. Honest caveats, never a gate: reservations the swarm can't resolve are surfaced here, never block the work or open a ticket.
  • Finished products, delivered — one folder per job. Producers write Markdown; the lead's tagged deliverables render to .docx, human-named from the document title, into a per-job folder under ~/Documents/Modulatio/<project>/ (named from the job and date, with a hex tiebreaker only on collision) so each run keeps its own products instead of overwriting the last. The Product Quality Report ships inside the same folder. When a renderer isn't installed, products ship as Markdown with a note rather than failing.
  • Assemble the product, not just the document. A multi-piece deliverable is joined by a family of assemblers chosen by the artifact's kind — document (ordered text), code (preserve the file tree + generate a wiring index), data (a real JSON/CSV merge), media (image/audio/video/bundle via a local compositor). The producer emits a small plan (a manifest); the engine owns the join, so unit bytes never round-trip through the model and a large deliverable can't truncate. Underneath, a content-addressed review-ledger lets QC verify a finished deliverable by its marks (each unit passed, bytes unchanged, the set matches the dependency graph) instead of re-reading the whole thing into a blown budget. Every family now has a deterministic containment oracle — a provably-correct assembly passes QC cheaply, without the bytes ever re-entering the model: document/data structurally, code by static wiring checks, a media bundle by exact byte equality; a lossy video/audio/image composite honestly falls back to the full review rather than claim a proof it can't back. See Assembly + the review-ledger.
  • Verify the whole deliverable, not just the parts. A declared DeliverableSpec (per-part floor, required structure, title) is carried from the job template into the run, and the engine checks the assembled whole against it — giving the verifier real eyes (an engine-extracted structural digest + a readable text twin, never binary bytes the model can't read), binding the per-part floor at produce-time (on the assembler's real part set, never a front-matter page), generating the framing (title + table of contents), and normalizing part numbering to a clean 1..N. Every move is a per-family dispatch — document-first, every other family a graceful no-op — so it stays product- and agent-agnostic. See Deliverable fidelity.
  • Metered tools, gated before they spend. Every built-in tool is free-local and unmetered; a tool can opt into a paid tier (cost_class), and the Comptroller gates each metered call before it spends — fail-closed on missing budget, per-task + daily caps, idempotent, narrow params, only ever on QC-passed pinned inputs. No SaaS lock-in: the tier ships as a mechanism with no provider or key.
  • Job Templates — setup that sticks. For work you do more than once, the Leader can codify a Job Template: its own interview, parameter schema, and output contract for that class of job — domain-agnostic (a single report, an N-piece anthology, a per-competitor brief are all the same primitive over a generic output cardinality). Bind it to a concrete answer set and it runs headless on a schedule — every cron job is a bound template, validated when you add it, never failing at 3am. The team notices when you keep running the same kind of job (or redo one) and offers to template it: the setup-side mirror of skill self-codification.
  • A producer is a model endpoint that learns. No fixed roles and no skills to assign — give a producer an LLM and tag what it's good at; the team composes the skills each task needs from a shared, git-versioned library at run-time, and routing never blocks on a capability gap. When the same defect keeps recurring, the team codifies the correction into durable skill guidance that cheap producers load next time — and it learns the other direction too: when the smart QC keeps rescuing a producer by writing the fix it couldn't, the team codifies that recurring technique (project-local, flagged as a non-independent fix worth a spot-check), so the cheap producer learns to do it itself. It gets quietly better at the work you give it.
  • Plan-mode end-to-end. The Leader is a conversational partner and the plan is the unit of execution; runs are daemon-driven and asynchronous, with Telegram approvals and a full audit trail.
  • Open architecture. Your data, your vault, your providers, your models. No SaaS, no per-instance subscription.
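
The key-pool behavior described in the list above (keys rotating across a shared pool, a 429 failing over to the next key, and a pinned key leaving the pool) can be sketched roughly as follows. This is a toy model under stated assumptions, not Modulatio's implementation: KeyPool, RateLimited, and the send callback are hypothetical names, and the sketch assumes a model with a pinned key uses only that key.

```python
class RateLimited(Exception):
    """Stands in for a provider's HTTP 429 response (hypothetical)."""


class KeyPool:
    """Toy key pool: shared keys rotate; a pinned key serves one model only."""

    def __init__(self, labels):
        self._shared = list(labels)   # key labels, never the key values
        self._pinned = {}             # model -> pinned key label
        self._next = 0                # rotation cursor

    def pin(self, label, model):
        """Move a key out of the shared pool and reserve it for one model."""
        self._shared.remove(label)
        self._pinned[model] = label

    def candidates(self, model):
        """Keys to try for this call, in order."""
        if model in self._pinned:
            # Assumption: a pinned model never borrows from the shared pool.
            return [self._pinned[model]]
        if not self._shared:
            return []
        start = self._next % len(self._shared)
        self._next += 1  # the next call starts one key later
        return self._shared[start:] + self._shared[:start]

    def call(self, model, send):
        """Try each candidate key; move on to the next when one is rate limited."""
        for label in self.candidates(model):
            try:
                return send(label, model)
            except RateLimited:
                continue
        raise RateLimited(f"all keys for {model} are rate limited")


if __name__ == "__main__":
    pool = KeyPool(["key-a", "key-b", "key-c"])
    pool.pin("key-c", "budgeted-model")
    for model in ("producer-1", "producer-2", "budgeted-model"):
        used = pool.call(model, lambda label, m: label)
        print(f"{model} served by {used}")
```

Because the pinned key only ever serves its one model, the provider's per-key metering doubles as that model's spend record, which is the accounting idea the key-pool bullet describes.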

Project structure

modulatio/
├── src/modulatio/    # Source — agents, runners, daemon, TUI, CLI
├── tests/            # Pytest suite (4837 tests as of v0.9.8)
├── scripts/          # Build / release scripts
└── pyproject.toml    # Package metadata + deps

Documentation lives in its own repo (the Modulatio docs site) so it can be deployed to https://modulatio.ai independently of code releases.


License

Apache License, Version 2.0. Relicensed from AGPL-3.0-or-later prior to v0.1.0.


Contributing

Issues and pull requests welcome at https://github.com/ModulatioAI/modulatio. See CONTRIBUTING.md for the contribution guide. Three GitHub issue templates are wired (Bug report / Regression / Feature request) plus a labelset for severity / component / status / regression — file issues using the templates so the labels apply correctly.

Contributions are accepted under the project's Apache-2.0 license (see LICENSE). By submitting a contribution, you affirm you have the right to do so under those terms.
