
Multi-model AI orchestration platform. Plug in any LLM. Ship better code.


CRTX

Generate. Test. Fix. Review. One command, verified output.

Quick Start · The Problem · The Loop · Benchmarks · How It Works · Commands · Supported Models

Python 3.12+ · Apache 2.0 license · PyPI


What is CRTX?

CRTX is an AI development intelligence tool that generates, tests, fixes, and reviews code automatically. One command in, verified output out.

It works with any model — Claude, GPT, Gemini, Grok, DeepSeek — and picks the right one for each task. You don't configure pipelines or choose models. You describe what you want and CRTX handles the rest.

crtx loop "Build a REST API with FastAPI, SQLite, search and pagination"

The Problem

Single AI models generate code that looks correct but often has failing tests, broken imports, and missed edge cases. Developers spend 10–30 minutes per generation debugging and fixing AI output before it actually works.

Multi-model pipelines cost 10–15x more without meaningfully improving quality. Four models reviewing each other's prose doesn't catch a broken import statement.

The issue isn't the model. It's the lack of verification. Nobody runs the code before handing it to you.

The Loop

CRTX solves this with the Loop: Generate → Test → Fix → Review.

  1. Generate — The best model for the task writes the code
  2. Test — CRTX runs the code locally: AST parse, import check, pyflakes, pytest, entry point execution
  3. Fix — Failures feed back to the model with structured error context for targeted fixes
  4. Review — An independent Arbiter (always a different model) reviews the final output

Every output is tested before you see it. If tests fail, CRTX fixes them. If the fix cycle stalls, three escalation tiers activate before giving up. If the Arbiter rejects the code, one more fix cycle runs.

The result: code that passes its own tests, has been reviewed by a second model, and comes with a verification report.
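In code, the Loop's control flow looks roughly like this. The functions below are toy stand-ins for the model calls and the quality gate, not CRTX internals — they exist only to show the Generate → Test → Fix → Review sequence:

```python
# Illustrative control flow of the Loop (toy stand-ins, not CRTX internals).

def generate(task):
    # Stand-in generator: returns code with a deliberate bug.
    return "def add(a, b): return a - b"

def run_checks(code):
    # Stand-in quality gate: reports a failure until the bug is fixed.
    return [] if "a + b" in code else ["test_add failed: expected a + b"]

def apply_fix(code, failures):
    # Stand-in fixer: applies the targeted fix suggested by the failure.
    return code.replace("a - b", "a + b")

def arbiter_review(code):
    # Stand-in Arbiter: approve anything that passes the gate.
    return "APPROVE" if not run_checks(code) else "REJECT"

def loop(task, max_fix_iterations=3):
    code = generate(task)                    # 1. Generate
    for _ in range(max_fix_iterations):
        failures = run_checks(code)          # 2. Test
        if not failures:
            break
        code = apply_fix(code, failures)     # 3. Fix, with error context
    return {"code": code, "verdict": arbiter_review(code)}  # 4. Review

result = loop("add two numbers")
print(result["verdict"])  # APPROVE
```

The real Loop adds escalation tiers and a post-rejection fix cycle on top of this skeleton, but the shape — generate once, then iterate test/fix until the gate is clean, then hand off to an independent reviewer — is the same.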

Benchmarks

Same 12 prompts, same scoring rubric. CRTX Loop vs. single models vs. multi-model debate:

Condition            Avg Score   Min   Spread   Avg Dev Time   Cost
Single Sonnet        94%         92%   4 pts    10 min         $0.36
Single o3            81%         54%   41 pts   4 min          $0.44
Multi-model Debate   88%         75%   25 pts   9 min          $5.59
CRTX Loop            99%         98%   2 pts    2 min          $1.80

Dev Time = estimated developer minutes to get the output to production (based on test failures, import errors, and entry point issues). Spread = max score minus min score across all prompts.

The Loop scores higher, more consistently, with less post-generation work than any other condition — at a fraction of the cost of multi-model pipelines.

Run the benchmark yourself:

crtx benchmark --quick

How It Works

  ┌─────────┐    ┌──────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐
  │  Route  │ ─→ │ Generate │ ─→ │  Test   │ ─→ │   Fix   │ ─→ │ Review  │ ─→ │ Present │
  └─────────┘    └──────────┘    └─────────┘    └─────────┘    └─────────┘    └─────────┘
       │                              │              │
       │                              └──────────────┘
       │                               ↑ loop until pass
       │
       ├── simple  → fast model, 2 fix iterations
       ├── medium  → balanced model, 3 fix iterations
       └── complex → best model, 5 fix iterations + architecture debate

Route — Classifies your prompt by complexity (simple/medium/complex) and selects the model, fix budget, and timeout tier.
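As a rough illustration, routing can be pictured as a classifier plus a tier table. The heuristic and timeout values below are made up for the sketch (CRTX's actual classifier is not documented here); the fix budgets match the diagram above:

```python
# Toy complexity router mirroring the tiers in the diagram above.
# The classifier heuristic and timeout values are illustrative only.

TIERS = {
    "simple":  {"model": "fast",     "fix_iterations": 2, "timeout_s": 120},
    "medium":  {"model": "balanced", "fix_iterations": 3, "timeout_s": 300},
    "complex": {"model": "best",     "fix_iterations": 5, "timeout_s": 900},
}

def classify(prompt: str) -> str:
    # Crude proxy: longer, multi-requirement prompts are harder.
    requirements = prompt.count(",") + prompt.count(" and ") + 1
    if requirements <= 1 and len(prompt) < 60:
        return "simple"
    return "complex" if requirements >= 4 else "medium"

def route(prompt: str) -> dict:
    tier = classify(prompt)
    return {"tier": tier, **TIERS[tier]}

print(route("Build a REST API with FastAPI, SQLite, search and pagination"))
```

With this toy heuristic, the REST API prompt from the top of the page lands in the complex tier (best model, 5 fix iterations), while "Reverse a string" would route to the fast model with a 2-fix budget.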

Generate — Produces source files and test files. If no tests are generated, a second call creates comprehensive pytest tests so the fix cycle always has something to verify against.

Test — Five-stage local quality gate: AST parse → import check → pyflakes → pytest → entry point execution. Per-file pytest fallback on collection failures.
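A minimal sketch of the gate's first two stages, using only the standard library. The pyflakes, pytest, and entry-point stages are left as comments because they shell out to external tools; the function names here are illustrative, not CRTX's:

```python
# Sketch of a staged local quality gate in the spirit of the five stages above.

import ast

def check_syntax(source: str) -> list[str]:
    # Stage 1: AST parse -- does the file even parse?
    try:
        ast.parse(source)
        return []
    except SyntaxError as exc:
        return [f"line {exc.lineno}: {exc.msg}"]

def check_imports(source: str) -> list[str]:
    # Stage 2: execute top-level code in a scratch namespace so a broken
    # import surfaces immediately (a real gate would sandbox this).
    try:
        exec(compile(source, "<generated>", "exec"), {})
        return []
    except Exception as exc:
        return [f"{type(exc).__name__}: {exc}"]

def quality_gate(source: str) -> list[str]:
    # Stages run in order; stop at the first stage that fails.
    for stage in (check_syntax, check_imports):
        failures = stage(source)
        if failures:
            return failures
    # Stages 3-5 (pyflakes, pytest, entry-point run) would shell out here.
    return []

print(quality_gate("import definitely_missing_module_xyz"))  # reports the broken import
```

Running the stages in order matters: there is no point invoking pytest on a file that does not parse, and a broken import explains most downstream test collection failures.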

Fix — Feeds structured test failures back to the model for targeted fixes. Detects phantom API references (tests importing functions that don't exist in source) and pytest collection failures.
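Phantom-API detection is a static check: find names the tests import from the generated module that the module never defines. A hedged sketch of the idea (not CRTX's implementation):

```python
# Sketch of phantom-API detection: names imported by the tests that the
# generated source never defines. Illustrative only.

import ast

def defined_names(source: str) -> set[str]:
    # Collect top-level function, class, and variable names from the source.
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            names.update(t.id for t in node.targets if isinstance(t, ast.Name))
    return names

def phantom_imports(test_source: str, module: str, source: str) -> set[str]:
    # Every `from <module> import X` in the tests where X is not defined.
    available = defined_names(source)
    phantoms = set()
    for node in ast.walk(ast.parse(test_source)):
        if isinstance(node, ast.ImportFrom) and node.module == module:
            phantoms.update(a.name for a in node.names if a.name not in available)
    return phantoms

src = "def search(q): ...\ndef paginate(items, page): ..."
tests = "from api import search, paginate, delete_all"
print(phantom_imports(tests, "api", src))  # {'delete_all'}
```

Catching this before pytest runs turns an opaque collection-time ImportError into a precise instruction for the fix cycle: either implement `delete_all` or drop the phantom test.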

Three-tier gap closing — When the normal fix cycle can't resolve failures:

  • Tier 1 — Diagnose then fix: "analyze the root cause without writing code," then feed the diagnosis back for a targeted fix
  • Tier 2 — Minimal context retry: strip context to only the failing test and its source file, fresh perspective
  • Tier 3 — Second opinion: escalate to a different model with the primary model's diagnosis

Review — An independent Arbiter (always a different model than the generator) reviews for logic errors, security issues, and design problems. On REJECT, triggers one more fix cycle and retests.
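The "always a different model" invariant is simple to state in code. A sketch, with a made-up preference list (the model names are placeholders, not CRTX's actual ranking):

```python
# Sketch of Arbiter selection: the reviewer must differ from the generator.
# The preference order is a made-up placeholder.

PREFERENCE = ["claude-sonnet", "gpt-4o", "gemini-2.5-pro", "deepseek-r1"]

def pick_arbiter(generator: str) -> str:
    # First model in the preference order that is not the generator.
    for model in PREFERENCE:
        if model != generator:
            return model
    raise RuntimeError("need at least two configured models for Arbiter review")

print(pick_arbiter("claude-sonnet"))  # gpt-4o
print(pick_arbiter("gpt-4o"))         # claude-sonnet
```

This is also why configuring more than one provider matters: with a single model available, independent review is impossible.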

Present — Final results with verification report, file list, and cost breakdown.

Key Features

Smart routing — Classifies prompts by complexity and picks the right model, fix budget, and timeout for each task. Simple tasks get fast models. Complex tasks get the best model plus an architecture debate.

Three-tier gap closing — When fixes stall, CRTX escalates: root cause diagnosis, minimal context retry, then a second opinion from a different model. Most stuck cases resolve at tier 1 or 2.

Independent Arbiter review — Every run gets reviewed by a model that didn't write the code. Cross-model review catches errors that self-review misses. Skip with --no-arbiter.

Verified scoring — Every output is tested locally before you see it. The verification report shows exactly which checks passed, how many tests ran, and estimated developer time to production.

Auto-fallback — If a provider goes down mid-run (rate limit, timeout, outage), CRTX substitutes the next best model and keeps going. A 5-minute cooldown prevents hammering a struggling provider.

Apply mode — Write generated code directly to your project with --apply. Interactive diff preview, git branch protection, conflict detection, AST-aware patching, and automatic rollback if post-apply tests fail.

Context injection — Scan your project and inject relevant code into the generation prompt with --context . (the dot is the directory to scan). AST-aware Python analysis extracts class signatures, function definitions, and import graphs within a configurable token budget.
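A sketch of what AST-aware extraction under a token budget can look like, using the standard library's ast module. The whitespace-token budget and function names are simplifications for illustration, not CRTX's extractor:

```python
# Sketch of AST-aware context extraction: pull top-level signatures from a
# source file, stopping when a crude whitespace-token budget is exhausted.

import ast

def signatures(source: str) -> list[str]:
    # Reduce each top-level definition to a one-line signature stub.
    sigs = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            sigs.append(f"def {node.name}({args}): ...")
        elif isinstance(node, ast.ClassDef):
            sigs.append(f"class {node.name}: ...")
    return sigs

def build_context(source: str, token_budget: int = 50) -> str:
    # Greedily pack signatures until the budget would be exceeded.
    picked, used = [], 0
    for sig in signatures(source):
        cost = len(sig.split())  # crude token proxy
        if used + cost > token_budget:
            break
        picked.append(sig)
        used += cost
    return "\n".join(picked)

src = "class Store: ...\ndef search(db, q, page): ...\ndef paginate(rows, size): ..."
print(build_context(src))
```

Signatures rather than full bodies are the point: the model sees what exists and how to call it, at a fraction of the token cost of the raw files.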

Quick Start

pip install crtx
crtx setup        # configure your API keys

Then run:

crtx loop "Build a CLI password generator with strength validation and clipboard support"

Commands

Command                What it does
crtx loop "task"       Generate, test, fix, and review code (default)
crtx run "task"        Run a multi-model pipeline (sequential/parallel/debate)
crtx benchmark         Run the built-in benchmark suite
crtx repl              Interactive shell with session history
crtx review-code       Multi-model code review on files or git diffs
crtx improve           Review → improve pipeline with cross-model consensus
crtx setup             API key configuration
crtx models            List available models with fitness scores
crtx estimate "task"   Cost estimate before running
crtx sessions          Browse past runs
crtx replay <id>       Re-display a previous session
crtx dashboard         Real-time web dashboard

Supported Models

CRTX works with any model supported by LiteLLM — that's 100+ providers. Out of the box, it's configured for:

Provider    Models
Anthropic   Claude Opus 4, Sonnet 4
OpenAI      GPT-4o, o3
Google      Gemini 2.5 Pro, Flash
xAI         Grok
DeepSeek    DeepSeek R1

Add any LiteLLM-compatible model in ~/.crtx/config.toml.

API Key Setup

Run crtx setup to configure your keys interactively, or set them as environment variables:

export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=...
export XAI_API_KEY=xai-...
export DEEPSEEK_API_KEY=sk-...

CRTX only needs one provider to work. More providers means more model diversity for routing and Arbiter review.

Contributing

Contributions are welcome. Fork the repo, create a branch, and submit a PR.

The test suite has 1,096 tests — run them with pytest. Lint with ruff check . (the trailing dot is the path argument).

License

Apache 2.0. See LICENSE for details.


Built by TriadAI
