crtx · PyPI

Multi-model AI development intelligence — learns your codebase, gets smarter every session

These details have not been verified by PyPI

Project description

CRTX

Multi-model AI that learns your codebase.

Python 3.12+ • Apache 2.0 license • Any LLM

The Problem

You paste code into an AI model. It looks right. You ship it. Then you find the hallucinated import, the missed edge case, the pattern violation that cascades through your codebase.

Single-model code generation has a blind-spot problem. Every model has biases, gaps, and failure modes, and the same model that wrote the bug can't reliably find it.

The Fix

CRTX routes your coding task through multiple AI models in specialized roles, with an independent referee that catches mistakes before they reach your codebase.

Task -> [Architect] -> [Implementer] -> [Refactor] -> [Verify] -> Production Code
             |              |              |             |
         Arbiter         Arbiter        Arbiter       Arbiter
      (different model reviews each stage)

Each model does what it's best at. A different model checks the work. The code that survives is production-ready.

Quick Start

pip install crtx
crtx setup          # Interactive API key configuration
crtx                # Launch interactive session

That's it. crtx setup walks you through API key configuration with live validation. crtx launches an interactive session with a branded terminal UI, real-time pipeline status, and a persistent REPL.

Getting Started

First-Time Setup

crtx setup

Interactive wizard that prompts for API keys (Anthropic, OpenAI, Google, xAI), validates each key against its provider, and saves them to ~/.crtx/keys.env. You need at least one provider configured. For parallel and debate modes, you'll need at least two.

crtx setup --check    # Validate existing keys without re-prompting
crtx setup --reset    # Clear saved keys and reconfigure

Interactive Session (REPL)

crtx

Launches the interactive REPL. The REPL maintains session state — set your mode, routing strategy, and arbiter depth once, then run multiple tasks without repeating flags.

crtx ▸ mode parallel
  Mode set to parallel

crtx ▸ route quality_first
  Route set to quality_first

crtx ▸ Build a REST API with JWT authentication and rate limiting
  # → Interactive config screen → real-time pipeline display → completion summary

crtx ▸ status
  Mode:    parallel
  Route:   quality_first
  Arbiter: bookend

Type help for all commands, exit or Ctrl+C to quit.

Direct Execution

# Run with interactive config screen (choose mode/route/arbiter before launch)
crtx run "Build a REST API with JWT authentication and rate limiting"

# Run with explicit flags (skips config screen)
crtx run "Build a REST API" --mode sequential --route hybrid --arbiter bookend

When you run without explicit flags, CRTX shows an interactive config screen where you can cycle through modes, routing strategies, and arbiter settings with single keypresses before confirming. With explicit flags, the pipeline starts immediately.

How It Works

In the default sequential mode, CRTX runs a pipeline with four stages. Each stage is assigned to a model based on its fitness score for that role and the active routing strategy:

Stage	Role	What It Does
Architect	Design the solution	Produces a technical scaffold: file structure, interfaces, data models, dependency map.
Implement	Write the code	Takes the scaffold and produces complete, working implementation with error handling.
Refactor	Improve and test	Restructures for clarity, adds edge case handling, writes comprehensive test suite.
Verify	Validate everything	Reviews the complete output for correctness, security, and pattern compliance.

Models don't just hand off and move on — any model can suggest improvements outside its assigned role. The Architect can flag an implementation concern. The Implementer can propose a structural change. Suggestions are tracked, evaluated, and either accepted or escalated to consensus.

The Arbiter

The Arbiter is what makes CRTX fundamentally different from running the same prompt through multiple models.

It's an independent referee. The Arbiter never writes code. It never proposes architecture. Its only job is to find what's wrong with other models' work.

It's always a different model. If Claude wrote the code, GPT-4o or Grok arbitrates. If Gemini designed the architecture, Claude checks it. The system enforces this automatically: the same model never grades its own work.

It assumes there are bugs. The Arbiter's prompt starts from skepticism: "Assume there are errors until proven otherwise." This inverts the typical AI review pattern where models default to "looks good" and hedge with minor suggestions.

It can stop the pipeline. Four verdicts:

Verdict	Action
APPROVE	Continue. Output is sound.
FLAG	Continue, but inject warnings for the next stage to address.
REJECT	Re-run this stage with structured feedback. Max 2 retries.
HALT	Stop everything. Present analysis for human decision.

When the Arbiter rejects, it doesn't just say "this is wrong." It provides structured feedback with severity, category, exact location, evidence, and a suggested fix — all injected into the retry prompt so the generating model knows exactly what to address.
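The verdict handling described above can be sketched roughly as follows. This is an illustration, not CRTX's actual code: `Verdict`, `Finding`, `Review`, `PipelineHalted`, and `run_stage_with_arbiter` are hypothetical names, and the `generate` and `review` callables stand in for real model calls. Only the four verdicts, the feedback fields, and the limit of 2 retries come from the text; what happens once retries run out is not stated, so this sketch assumes the pipeline halts.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Verdict(Enum):
    APPROVE = "approve"
    FLAG = "flag"
    REJECT = "reject"
    HALT = "halt"


@dataclass
class Finding:
    # Fields mirror the structured feedback listed above.
    severity: str
    category: str
    location: str
    evidence: str
    suggested_fix: str


@dataclass
class Review:
    verdict: Verdict
    findings: list[Finding] = field(default_factory=list)


class PipelineHalted(Exception):
    """Raised when a human decision is needed."""


MAX_RETRIES = 2  # from the verdict table: "Max 2 retries"


def run_stage_with_arbiter(
    generate: Callable[[list[Finding]], str],
    review: Callable[[str], Review],
) -> tuple[str, list[Finding]]:
    """Return the stage output plus any warnings for the next stage."""
    feedback: list[Finding] = []
    for _ in range(MAX_RETRIES + 1):  # first attempt plus up to 2 retries
        output = generate(feedback)
        result = review(output)
        if result.verdict is Verdict.APPROVE:
            return output, []
        if result.verdict is Verdict.FLAG:
            return output, result.findings
        if result.verdict is Verdict.HALT:
            raise PipelineHalted(f"{len(result.findings)} finding(s) need review")
        # REJECT: inject the findings into the next attempt's prompt.
        feedback = result.findings
    # Assumption: exhausting the retries escalates to a human.
    raise PipelineHalted("retries exhausted")
```

Passing the previous findings back into `generate` is what lets the retry address specific problems instead of regenerating blindly.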

Configurable Review Depth

Not every task needs full review. Choose your safety level:

crtx run "..." --arbiter full       # Review every stage (critical features)
crtx run "..." --arbiter bookend    # Review architecture + final output (default)
crtx run "..." --arbiter final      # Review final output only (prototypes)
crtx run "..." --arbiter off        # No review (rapid iteration)

Or in the REPL: arbiter full sets the depth for all subsequent tasks in the session.
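One way to picture the four depths is as a mapping from depth to the stages that get reviewed. The sketch below is hypothetical (`STAGES` and `reviewed_stages` are not part of CRTX), and it assumes that "architecture + final output" means the first and last stages of the sequential pipeline.

```python
STAGES = ["architect", "implement", "refactor", "verify"]


def reviewed_stages(depth: str) -> list[str]:
    """Return the stages the Arbiter would review at a given depth."""
    if depth == "full":
        return list(STAGES)             # every stage
    if depth == "bookend":
        return [STAGES[0], STAGES[-1]]  # assumed: first and last stage
    if depth == "final":
        return [STAGES[-1]]             # final output only
    if depth == "off":
        return []                       # no review
    raise ValueError(f"unknown arbiter depth: {depth!r}")


for depth in ("full", "bookend", "final", "off"):
    print(depth, reviewed_stages(depth))
```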

Supported Models

CRTX is model-agnostic. Any LLM that supports chat completions works. Add a new model by adding a TOML entry — no code changes required.

Pre-Configured Providers

Provider	Models	Best At
Anthropic	Claude Opus, Sonnet, Haiku	Refactoring, verification, nuanced review
OpenAI	GPT-4o, o3-mini	Fast implementation, broad language support
Google	Gemini 2.5 Pro, Flash	Architecture, large context reasoning
xAI	Grok 4, Grok 3	Independent analysis, alternative perspectives

Adding Models

# config/models.toml
[models.deepseek-v3]
provider = "deepseek"
model = "deepseek-chat"
roles = ["implement", "refactor"]
cost_per_1k_input = 0.0001
cost_per_1k_output = 0.0002

DeepSeek, Llama, Mistral, Ollama (local), vLLM (self-hosted) — if LiteLLM supports it, CRTX supports it.

Presets

Instead of specifying --mode, --route, and --arbiter on every command, use a preset:

Preset	Mode	Route	Arbiter	Use Case
balanced (default)	sequential	hybrid	bookend	Standard development. Best cost/quality balance.
fast	sequential	speed-first	off	Rapid iteration. Cheapest models, no review.
cheap	sequential	cost-optimized	off	Budget-conscious. Lowest cost above fitness threshold.
thorough	sequential	quality-first	full	Maximum quality. Best models, every stage reviewed.
explore	parallel	hybrid	bookend	Fan out to 3+ models, cross-review, synthesize the best.
debate	debate	quality-first	full	Structured debate. Best for architecture decisions and tradeoffs.

crtx run "Build a REST API" --preset explore
crtx run "Build a REST API" --preset fast

# Override any part of a preset
crtx run "Build a REST API" --preset explore --arbiter full

In the REPL:

crtx [balanced] ▸ preset explore
  Mode set to parallel, route hybrid, arbiter bookend

crtx [explore] ▸ preset fast
  Mode set to sequential, route speed-first, arbiter off

Without a --preset flag, CRTX uses balanced. If you manually change mode, route, or arbiter after selecting a preset, the prompt shows the current settings instead of a preset name.

Mode	How It Works	Best For
Sequential (default)	Architect → Implement → Refactor → Verify, each building on the last	Standard development, most tasks
Parallel	All models solve independently, cross-review, score, merge best approach	Complex problems with multiple valid solutions
Debate	Position papers → rebuttals → final arguments → judgment	Architectural decisions, tradeoff analysis

crtx run "..." --mode sequential   # Default
crtx run "..." --mode parallel     # Fan-out + consensus
crtx run "..." --mode debate       # Structured debate

Or in the REPL: mode parallel sets the mode for all subsequent tasks in the session.
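As a rough illustration of the cross-review step in parallel mode, the sketch below scores each candidate using only the other models' ratings and picks the highest average. `pick_best` and the `score` callable are hypothetical; how CRTX actually scores and merges candidates is not described here, and the merge step is left out.

```python
from statistics import mean
from typing import Callable


def pick_best(
    candidates: dict[str, str],
    score: Callable[[str, str], float],
) -> str:
    """Return the author whose solution gets the best peer average.

    candidates maps model name -> solution text.
    score(reviewer, solution) returns that reviewer's rating.
    """
    if len(candidates) < 2:
        raise ValueError("cross-review needs at least two models")
    averages: dict[str, float] = {}
    for author, solution in candidates.items():
        # A model never rates its own solution.
        peer_scores = [
            score(reviewer, solution)
            for reviewer in candidates
            if reviewer != author
        ]
        averages[author] = mean(peer_scores)
    return max(averages, key=averages.get)
```

Excluding the author from the reviewers follows the same principle as the Arbiter: a model does not grade its own work.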

Smart Routing

CRTX assigns models to pipeline roles based on fitness benchmarks — each model is scored on how well it performs as Architect, Implementer, Refactorer, and Verifier. Four routing strategies let you optimize for what matters:

Strategy	Behavior
quality-first	Best model per role regardless of cost
cost-optimized	Cheapest model above fitness threshold
speed-first	Lowest-latency models preferred
hybrid (default)	Quality for critical stages, cost-optimized for early stages

crtx run "..." --route hybrid          # Default
crtx run "..." --route quality-first   # Max quality
crtx estimate "..." --compare-routes   # Compare costs

Or in the REPL: route quality_first sets the strategy for all subsequent tasks.
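The four strategies in the table can be sketched as a single selection function. Everything here is illustrative: `Model`, `pick`, the 0.7 `FITNESS_THRESHOLD`, and the choice of which roles count as "early" are assumptions, since the text does not give the threshold or say which stages are critical.

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    fitness: dict[str, float]  # role -> benchmark score
    cost: float                # relative cost per task
    latency: float             # typical response time in seconds


FITNESS_THRESHOLD = 0.7                  # assumed value
EARLY_ROLES = {"architect", "implement"}  # assumed split of the pipeline


def pick(models: list[Model], role: str, strategy: str) -> Model:
    """Choose a model for a role under the given routing strategy."""
    if strategy == "quality-first":
        return max(models, key=lambda m: m.fitness[role])
    if strategy == "cost-optimized":
        eligible = [m for m in models if m.fitness[role] >= FITNESS_THRESHOLD]
        # Fall back to all models if none clears the threshold.
        return min(eligible or models, key=lambda m: m.cost)
    if strategy == "speed-first":
        return min(models, key=lambda m: m.latency)
    if strategy == "hybrid":
        inner = "cost-optimized" if role in EARLY_ROLES else "quality-first"
        return pick(models, role, inner)
    raise ValueError(f"unknown routing strategy: {strategy!r}")
```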

Configuration

API Keys

The recommended way to configure API keys is crtx setup, which validates keys and saves them for future sessions. Keys are loaded in this order (highest priority first):

1. Environment variables: ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY, XAI_API_KEY
2. ~/.crtx/keys.env: User-level keys saved by crtx setup
3. .env in current directory: Project-level overrides

You only need keys for the providers you want to use. At least one provider must be configured.

# Recommended: interactive setup with validation
crtx setup

# Or set environment variables directly
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
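The lookup order listed above can be sketched as a merge from lowest to highest priority. This is not CRTX's loader: `parse_env_file` and `resolve_keys` are hypothetical, and the simple `KEY=value` file format is an assumption. File contents are passed in as text so the sketch does not touch the filesystem.

```python
from collections.abc import Mapping

KEY_NAMES = (
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
    "XAI_API_KEY",
)


def parse_env_file(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def resolve_keys(
    environ: Mapping[str, str],
    user_file_text: str,
    project_file_text: str,
) -> dict[str, str]:
    """Merge the three sources; environment variables win."""
    sources = (
        parse_env_file(project_file_text),  # lowest priority
        parse_env_file(user_file_text),
        environ,                            # highest priority
    )
    merged: dict[str, str] = {}
    for source in sources:
        for name in KEY_NAMES:
            if source.get(name):
                merged[name] = source[name]  # later sources overwrite
    return merged
```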

Pipeline Defaults

Pipeline defaults (mode, routing strategy, arbiter depth, timeout) are configured in config/defaults.toml. These can be overridden per-run via CLI flags or the interactive config screen.

CLI Commands

Command	Description
`crtx`	Launch interactive session (REPL mode)
`crtx setup`	Configure API keys interactively
`crtx setup --check`	Validate existing API keys
`crtx setup --reset`	Clear keys and reconfigure
`crtx run`	Run a full pipeline on a task
`crtx plan`	Expand a rough idea into a structured task spec
`crtx estimate`	Estimate cost before running
`crtx review`	Multi-model PR review (CI/CD integration)
`crtx review-code`	Multi-model review of existing code files
`crtx improve`	Multi-model improvement of existing code
`crtx models list`	Show registered models with fitness scores
`crtx models check`	Verify API key connectivity
`crtx config show`	Display current pipeline configuration
`crtx sessions list`	Browse past pipeline runs
`crtx sessions show`	View full session details
`crtx dashboard`	Launch real-time browser visualization

# Interactive session — persistent state, branded UI
crtx

# Run a task with interactive config screen
crtx run "Add WebSocket support to the existing Express server"

# Run with explicit flags (skips config screen)
crtx run "Add WebSocket support" --mode sequential --route hybrid --arbiter bookend

# Plan first, then run
crtx plan "Build a data processing pipeline" --run

# Review a PR diff
crtx review --diff changes.patch --fail-on critical

# Review existing code with multiple models
crtx review-code src/middleware.py --preset thorough

# Improve existing code
crtx improve src/rate_limiter/ --focus "error handling, type safety"

# Launch the real-time dashboard
crtx dashboard --port 8420

The CLI uses Rich for a premium terminal experience — branded ASCII art, interactive config screens, real-time pipeline status with stage-by-stage progress, color-coded Arbiter verdicts, and a post-completion summary with export actions.

Review & Improve Existing Code

CRTX doesn't just generate code — it can review and improve code you've already written.

Multi-Model Review

Have 3+ models independently review your code, cross-check each other's findings, and produce a ranked report:

crtx review-code src/middleware.py
crtx review-code src/rate_limiter/ --preset thorough

Each model finds bugs, security issues, and design problems independently. Then they review each other's findings — agreeing, disagreeing, and catching what others missed. Issues found by multiple models rank highest. Single-source findings are flagged as lower confidence.
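The ranking rule described above (agreement ranks highest, single-source findings are lower confidence) could look something like this. `rank_findings` is a hypothetical helper, and it assumes findings have already been normalized so that two models reporting the same issue produce the same string; real matching would need to be fuzzier.

```python
from collections import defaultdict


def rank_findings(reports: dict[str, list[str]]) -> list[tuple[str, int, str]]:
    """Rank findings by how many models reported them.

    reports maps model name -> list of normalized finding descriptions.
    Returns (finding, number of reporting models, confidence) tuples.
    """
    reporters: dict[str, set[str]] = defaultdict(set)
    for model, findings in reports.items():
        for finding in findings:
            reporters[finding].add(model)
    # Most-reported first; ties broken alphabetically for stable output.
    ranked = sorted(reporters.items(), key=lambda item: (-len(item[1]), item[0]))
    return [
        (finding, len(models), "low" if len(models) == 1 else "high")
        for finding, models in ranked
    ]
```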

Multi-Model Improve

Have 3+ models each produce an improved version of your code, vote on the best, and synthesize:

crtx improve src/middleware.py
crtx improve src/rate_limiter/ --focus "error handling, type safety"

Like parallel mode, but starting from your existing code instead of a task description. The Arbiter reviews the final improvement against your original. You see a diff before anything is written.

CRTX supports domain-specific verification rules that the Arbiter checks in addition to general code quality:

# config/domain/my_rules.toml
[rules.schema_consistency]
description = "All database models must use integer primary keys"
severity = "critical"
pattern = "UUIDField|uuid4"
action = "reject"

[rules.test_coverage]
description = "Every new service must have corresponding test file"
severity = "warning"

We use CRTX to build a financial services platform — our custom rules enforce schema patterns, threading conventions, and audit trail requirements specific to our domain. You can do the same for yours.

How We Use It

We built CRTX because we needed it. Our team uses CRTX as the primary development workflow for a financial services operating system with 2,900+ tests. Every new feature, every module, every refactor goes through the pipeline. The Arbiter has caught schema mismatches, hallucinated dependencies, over-engineered abstractions, and integration failures — all before code review.

CRTX isn't a research project. It's a production tool that we bet our own codebase on every day.

Cost

CRTX adds model calls, which cost tokens. Here's what a typical task looks like:

Configuration	Est. Cost per Task	Use Case
No Arbiter	~$4.30	Rapid iteration
Final Only	~$5.10	Prototyping
Bookend (default)	~$5.80	Standard development
Full Arbiter	~$7.30	Critical features

At the default Bookend depth and ~15 tasks/week, the Arbiter adds about $90/month. One production bug it catches pays for a year of reviews.
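The $90 figure follows from the table: Bookend costs about $1.50 more per task than running with no Arbiter, and 15 tasks a week over a four-week month is 60 tasks. The snippet below reproduces that arithmetic. The per-task costs are taken from the table; the function name and the four-weeks-per-month simplification are assumptions.

```python
# Estimated cost per task by arbiter depth, from the table above (USD).
COST_PER_TASK = {"off": 4.30, "final": 5.10, "bookend": 5.80, "full": 7.30}


def arbiter_overhead_per_month(
    depth: str,
    tasks_per_week: int,
    weeks_per_month: int = 4,  # assumed simplification
) -> float:
    """Extra monthly spend compared with running no Arbiter at all."""
    per_task = COST_PER_TASK[depth] - COST_PER_TASK["off"]
    return per_task * tasks_per_week * weeks_per_month


print(round(arbiter_overhead_per_month("bookend", 15), 2))  # 90.0
print(round(arbiter_overhead_per_month("full", 15), 2))     # 180.0
```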

Documentation

Document	Description
Architecture	Core pipeline design, consensus protocol, technology stack
Model-Agnostic System	Plugin architecture, LiteLLM adapter, dynamic role assignment
Arbiter Layer	Independent review system, verdicts, feedback injection
Build Spec	MVP scope, day-by-day build plan, technical decisions

Architecture

triad/
├── cli.py                  # Typer + Rich terminal interface
├── cli_display.py          # Branded UI: logos, config screen, live display, summary
├── repl.py                 # Interactive REPL with session state
├── orchestrator.py         # Pipeline engine (sequential, parallel, debate)
├── planner.py              # Task planner (crtx plan)
├── providers/              # LiteLLM adapter + model registry
├── routing/                # Fitness-based model-to-role assignment
├── arbiter/                # Independent adversarial review engine
├── consensus/              # Cross-domain suggestions + voting protocol
├── context/                # AST-aware codebase scanner + context builder
├── persistence/            # SQLite session storage + export
├── ci/                     # Multi-model PR review for CI/CD
├── dashboard/              # Real-time WebSocket visualization (optional)
├── schemas/                # Pydantic v2 models (all data contracts)
├── prompts/                # Jinja2 role prompt templates
├── output/                 # File writer + Markdown report renderer
└── config/                 # TOML configuration (models, defaults, routing)

Contributing

We welcome contributions! Please read CONTRIBUTING.md before submitting a PR.

Important: All contributors must sign our Contributor License Agreement before their first PR can be merged. This is handled automatically via CLA Assistant — you'll be prompted when you open your first PR.

License

Apache 2.0 — see LICENSE for details.

Built by TriadAI. Every session smarter than the last.

Release history

0.2.1 (Feb 21, 2026)
0.2.0 (Feb 21, 2026)
0.1.1 (Feb 20, 2026)
0.1.0 (Feb 18, 2026, this version)

Download files

Source Distribution

crtx-0.1.0.tar.gz (682.0 kB), uploaded Feb 18, 2026, Source

Built Distribution

crtx-0.1.0-py3-none-any.whl (232.7 kB), uploaded Feb 18, 2026, Python 3

File details

Details for the file crtx-0.1.0.tar.gz.

File metadata

Download URL: crtx-0.1.0.tar.gz
Upload date: Feb 18, 2026
Size: 682.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for crtx-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`84a050de4cdc77580e35afe4db76ff65bc6c0f98cb6ebdb684185393d36ff1f6`
MD5	`c67dc96e22937a359cb11cb6b7f86fd0`
BLAKE2b-256	`b5f3debf3af69da25cd6573b716e6a35c0894accd69f88ee8955e5ab6d484647`

File details

Details for the file crtx-0.1.0-py3-none-any.whl.

File metadata

Download URL: crtx-0.1.0-py3-none-any.whl
Upload date: Feb 18, 2026
Size: 232.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.0

File hashes

Hashes for crtx-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a5c6882f171833cea54236d10956b3683690acc7b765983326dff704ff998b03`
MD5	`66778ca164b062130e4669902d3e2f7d`
BLAKE2b-256	`b92e65cc9c01d9b87dea484d07bc5055316f0af7357b2176dda597a44f8d2e2c`
