

Brevix

Compress LLM output safely. Save tokens without breaking your code.


Install  ·  Usage  ·  Cost Routing  ·  Accuracy Guard  ·  Benchmarks  ·  Roadmap  ·  Contributing

Cuts response tokens 40–75% with a deterministic rule engine. Routes each task to the cheapest capable Claude tier — saving up to ~80% of API spend. Verifies meaning is preserved before emit. Works across 20+ AI coding tools.

40–75% savings · ~80% API cost cut · Accuracy Guard · 20+ Platforms · 100% Local



Why Brevix

Brevix is two cost-saving layers in one tool:

  1. Output compression — cuts response tokens 40–75% with a rule engine that's verified by a local Accuracy Guard. No silent meaning loss on dense technical prose.
  2. Cost management & smart model routing — picks the cheapest Claude tier (Haiku / Sonnet / Opus) per task, escalates only on low-confidence answers, and enforces token + USD budget caps. Typical savings: ~80% of API spend vs always running Opus.

Works with Claude Code · Cursor · Windsurf · OpenAI Codex CLI · Google Antigravity · Gemini CLI · GitHub Copilot Chat · Aider · Continue.dev · Cline · Roo Code · Zed AI · Augment · Kilo · OpenHands · Tabnine · Warp · Replit · Sourcegraph Amp — plus any tool reading AGENTS.md.



💰 Cost Management & Model Routing

What it does: Stops you from paying Opus prices for tasks Haiku can solve in a single shot.

The problem

Most LLM coding tools call the most expensive model for every prompt. A task like "classify this support ticket" hits Opus at ~$0.05 per call, while Haiku could answer it for ~$0.0002 — 250× cheaper, with an identical answer.

The fix

Brevix Route is a thin layer in front of the Anthropic SDK that:

  1. Classifies the task (regex/keyword-based, ~6μs, no API call).
  2. Picks the cheapest capable model from a task → model rule table you control.
  3. Optionally scores the response for confidence (hedge phrases, validity, length, optional semantic check).
  4. Escalates to the next tier (Haiku → Sonnet → Opus) only when confidence is below threshold.
  5. Records every call to a local JSONL log so you can audit exactly what was saved.
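The classify-then-escalate core of those steps can be sketched in a few lines. The rule table, keywords, and function names below are illustrative only — Brevix's real classifier and defaults live in the package and in `~/.brevix/route.json`:

```python
import re

# Illustrative task -> model rule table (Brevix's real rules are configurable).
RULES = [
    (re.compile(r"\b(classify|parse|format|rename)\b", re.I), "claude-haiku-4-5"),
    (re.compile(r"\b(review|refactor|debug|explain)\b", re.I), "claude-sonnet-4-6"),
    (re.compile(r"\b(architect|design|migrate)\b", re.I), "claude-opus-4-7"),
]
TIERS = ["claude-haiku-4-5", "claude-sonnet-4-6", "claude-opus-4-7"]

def pick_model(prompt):
    """Keyword classification: first matching rule wins; default to mid tier."""
    for pattern, model in RULES:
        if pattern.search(prompt):
            return model
    return "claude-sonnet-4-6"

def escalate(model):
    """Next tier up the Haiku -> Sonnet -> Opus chain, or None at the top."""
    i = TIERS.index(model)
    return TIERS[i + 1] if i + 1 < len(TIERS) else None

print(pick_model("classify this ticket"))   # cheap tier
print(escalate("claude-haiku-4-5"))         # where a low-confidence retry goes
```

Because classification is regex over the prompt, no API call happens before routing — which is what keeps the decision in the microsecond range.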

How to use it

Option A — From your code (Python)

from brevix import RoutedClient

client = RoutedClient(log_enabled=True)

# Cheap task -> auto-routed to Haiku.
r = client.call("Classify this ticket: app crashes on login")
print(r.text, r.model, r.cost_usd)
# > "bug"  claude-haiku-4-5  0.000028

# Hard task -> auto-routed to Opus.
r = client.call("Architect a multi-region payment system with sub-100ms p99 latency")
print(r.model)
# > claude-opus-4-7

# Confidence-guarded mode: retries on next tier only if the first answer hedges.
r = client.call("Review this auth middleware for token expiry bugs", confidence_check=True)
print(r.model, r.escalations, r.confidence)
# > claude-opus-4-7  1  0.94
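One plausible shape for the hedge-phrase part of that confidence score — the phrase list and 0.2-per-hit weighting here are illustrative, not Brevix's actual scorer:

```python
# Illustrative hedge-phrase scorer: more hedging -> lower confidence.
HEDGES = ("i'm not sure", "it might", "possibly", "i think", "it depends",
          "hard to say", "could be")

def hedge_confidence(answer):
    """Score 1.0 for a direct answer, docked 0.2 per distinct hedge phrase."""
    text = answer.lower()
    hits = sum(1 for phrase in HEDGES if phrase in text)
    return max(0.0, 1.0 - 0.2 * hits)

hedge_confidence("bug")                         # direct answer, full confidence
hedge_confidence("I think it might be a bug")   # hedged, scores lower
```

A score below the configured threshold is what triggers the retry on the next tier.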

Option B — From the CLI

brevix route --init                                    # one-time: write default config
brevix route "classify ticket: app crashes" --explain  # dry-run, show decision + cost
brevix route "..." --call                              # actually call the model
brevix route "..." --call --confidence                 # call + escalate on low conf
brevix route --budget-show                             # current spend
brevix stats --routing --since 7d                      # weekly cost-saved report
brevix route --learn-suggest                           # recommend rule changes
brevix route --learn-apply                             # apply them

Option C — From Claude Code (slash commands)

After /plugin install brevix@brevix:

/brevix-route classify this support ticket: "app crashes on login"
        ↓
[brevix] task=classify tier=haiku escalated=0
bug
/brevix-route architect a multi-region payment system with sub-100ms p99 latency
        ↓
[brevix] task=architecture tier=opus escalated=0
<full Opus response>
/brevix-route-stats --since 7d         # cost saved vs Opus-only baseline
/brevix-learn                          # suggested rule changes
/brevix-learn apply                    # apply them

Hard budget cap

Edit ~/.brevix/route.json:

{
  "budget": { "tokens": 0, "cost_usd": 50.00 }
}

When the cap is hit, the next call raises BudgetExceededError instead of silently overspending.

Or per-run from the CLI:

brevix route --budget-cost 5.00 --call "your prompt"
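The enforcement semantics might look roughly like this sketch — `BudgetExceededError` is the real exception name, but the tracker class and its internals are assumptions for illustration:

```python
class BudgetExceededError(RuntimeError):
    """Raised before a call that would push spend past the configured cap."""

class BudgetTracker:
    # Hypothetical sketch; Brevix reads its caps from ~/.brevix/route.json
    # ("budget": {"tokens": ..., "cost_usd": ...}), where 0 means no cap.
    def __init__(self, cost_usd_cap=0.0, tokens_cap=0):
        self.cost_usd_cap = cost_usd_cap
        self.tokens_cap = tokens_cap
        self.spent_usd = 0.0
        self.spent_tokens = 0

    def charge(self, cost_usd, tokens):
        """Record a call; raise instead of silently exceeding either cap."""
        if self.cost_usd_cap and self.spent_usd + cost_usd > self.cost_usd_cap:
            raise BudgetExceededError(f"cost cap ${self.cost_usd_cap} reached")
        if self.tokens_cap and self.spent_tokens + tokens > self.tokens_cap:
            raise BudgetExceededError(f"token cap {self.tokens_cap} reached")
        self.spent_usd += cost_usd
        self.spent_tokens += tokens

tracker = BudgetTracker(cost_usd_cap=5.00)
tracker.charge(cost_usd=0.03, tokens=1200)    # within budget, recorded
# tracker.charge(cost_usd=6.00, tokens=900)   # would raise BudgetExceededError
```

The key design point: the check runs before the API call, so the cap bounds what you spend, not what you notice afterwards.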

Real-world savings

| Workload | Naive (Opus only) | Brevix routed | Saved |
|---|---|---|---|
| Solo dev, 50 calls/day, mostly easy tasks | $0.40/day | $0.06/day | 85% |
| Team of 5, 500 calls/day, mixed | $4.00/day | $0.80/day | 80% |
| Customer-support bot, 10k tickets/day, 90% classify | $80/day | $1.50/day | 98% |

Same answer quality — confidence guard ensures hard cases still hit Opus.

See the full Routing tutorial in the ⚡ Usage section for advanced flags, force-tier overrides, and the auto-tuning learn loop.



Features

Compression Engine

  • Three modes — Lite · Full · Ultra
  • Adaptive Auto mode — picks safest aggressive level per response
  • Protected regions — code blocks, URLs, error quotes never touched
  • File compression — CLAUDE.md, AGENTS.md, project notes with .original backup

Safety & Verification

  • Accuracy Guard — semantic similarity check before emit
  • Strict mode — auto-fallback to original when meaning would be lost
  • Local-first — no telemetry, no API calls in the engine
  • Reproducible benchmarks — three-arm A/B eval harness with tiktoken o200k_base

Integrations

  • MCP middleware (brevix-shrink) — compresses tool/prompt/resource descriptions
  • 20 platform install targets — idempotent BREVIX:BEGIN/END markers
  • Statusline badge — [BREVIX] ⛏ X.Xk saved
  • Subagents — investigator · builder · reviewer with terse output

Insights

  • Real session-log token counts — stats --real --since 7d
  • Shareable savings — stats --share produces tweet-ready output
  • Per-mode breakdown — see exactly what each level saves
  • Free + MIT licensed

💰 Cost Management & Model Routing — new

  • Task-aware tiering — classify/parse → Haiku, code review/debug → Sonnet, architecture → Opus
  • Confidence-driven escalation — hedge / validity / length scorers retry on the next tier only when needed
  • Hard budget caps — BudgetExceededError blocks runaway spend (per-token or per-USD limits)
  • Routing dashboard — brevix stats --routing shows cost saved vs Opus-only baseline, escalation rate, per-model + per-task breakdown
  • Self-tuning — brevix route --learn-suggest|--learn-apply patches your config from observed escalation patterns
  • Claude Code subagents — brevix-haiku · brevix-sonnet · brevix-opus plus /brevix-route slash dispatcher


🚀 Install

One-liner — macOS / Linux / WSL

curl -fsSL https://raw.githubusercontent.com/Yash-Koladiya30/brevix/main/install.sh | bash -s -- --all

Windows — PowerShell

irm https://raw.githubusercontent.com/Yash-Koladiya30/brevix/main/install.ps1 | iex

skills CLI — one command, 9 tools at once

Auto-installs Brevix skills into Antigravity, Claude Code, Cline, Codex, Cursor, Gemini CLI, GitHub Copilot, Kiro CLI, and Qoder simultaneously.

npx skills add https://github.com/Yash-Koladiya30/brevix

Pick a specific skill:

npx skills add https://github.com/Yash-Koladiya30/brevix --skill brevix
npx skills add https://github.com/Yash-Koladiya30/brevix --skill brevix-commit
npx skills add https://github.com/Yash-Koladiya30/brevix --skill brevix-stats

Listing → skills.sh/Yash-Koladiya30/brevix

Manual install

pip install brevix                  # core
pip install 'brevix[guard]'         # + semantic Accuracy Guard
pip install 'brevix[tokens]'        # + accurate tiktoken counts
pip install 'brevix[all]'           # everything

Plug into your LLM coding tool

brevix install --list                # show all 20 targets
brevix install claude-code           # Claude Code plugin layout
brevix install cursor                # .cursor/rules/brevix.mdc
brevix install codex                 # AGENTS.md + .codex/hooks.json
brevix install gemini                # gemini-extension.json + GEMINI.md
brevix install all                   # write rule files for every tool

Idempotent — re-running updates the Brevix block, leaves your other content alone.
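The idempotent behavior amounts to an upsert between the markers. A minimal sketch, assuming HTML-comment marker syntax and a hypothetical `upsert_block` helper (the installer's actual marker format adapts per target file):

```python
# Sketch of idempotent marker handling: replace an existing
# BREVIX:BEGIN/END block in place, or append one on first install.
BEGIN, END = "<!-- BREVIX:BEGIN -->", "<!-- BREVIX:END -->"

def upsert_block(doc, rules):
    block = f"{BEGIN}\n{rules}\n{END}"
    if BEGIN in doc and END in doc:
        head = doc.split(BEGIN, 1)[0]
        tail = doc.split(END, 1)[1]
        return head + block + tail                        # update in place
    return doc.rstrip("\n") + "\n\n" + block + "\n"       # first install

doc = upsert_block("# My AGENTS.md\n", "brevix rules v1")
doc = upsert_block(doc, "brevix rules v2")   # re-run replaces, never duplicates
```

Everything outside the marker pair is left byte-for-byte untouched, which is why re-running the installer is safe on files you also edit by hand.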

Claude Code marketplace

/plugin marketplace add Yash-Koladiya30/brevix
/plugin install brevix@brevix

MCP middleware

Compress upstream MCP server descriptions:

npm install -g brevix-shrink

Wrap any MCP server in your Claude config:

{
  "mcpServers": {
    "fs-shrunk": {
      "command": "npx",
      "args": ["brevix-shrink", "npx", "-y",
               "@modelcontextprotocol/server-filesystem", "/tmp"]
    }
  }
}


⚡ Usage

Slash commands — Claude Code, Cursor, etc.

# Output compression
/brevix                  # toggle on (full mode)
/brevix lite             # gentle compression
/brevix ultra            # max compression
/brevix auto             # pick best mode per response
/brevix off              # disable
/brevix-commit           # terse Conventional Commit message
/brevix-check            # run Accuracy Guard on a snippet
/brevix-stats            # show compression savings

# Smart model routing (saves API cost)
/brevix-route <task>     # route to cheapest capable tier (Haiku/Sonnet/Opus)
/brevix-route-stats      # cost saved vs Opus-only baseline
/brevix-learn            # suggested rule changes from observed escalations
/brevix-learn apply      # apply suggestions to ~/.brevix/route.json

For Codex CLI (no slash commands), use $brevix lite|full|ultra|auto|off.

CLI

# Output compression
brevix compress "Your verbose text here" --mode full
brevix compress -                      # stdin
brevix compress . --mode auto -v       # adaptive picks best
brevix compress . --guard --strict --threshold 0.85

# File compression (CLAUDE.md, AGENTS.md, project notes)
brevix compress-file CLAUDE.md         # writes .original.md backup
brevix compress-file CLAUDE.md --dry-run

# Stats
brevix stats                           # estimated, in-process
brevix stats --real --since 7d         # parsed from Claude Code session logs
brevix stats --share                   # tweet-ready one-liner
brevix stats --reset

# Verification
brevix check "original" "compressed"
brevix count "how many tokens?"

# Install rules into a project
brevix install cursor
brevix install --list

# Smart model routing
brevix route --init                                # write default config to ~/.brevix/route.json
brevix route "classify ticket: app crashes"        # print suggested model
brevix route "..." --explain                       # task, model, est cost, reason
brevix route "..." --call                          # actually call the chosen model
brevix route "..." --call --confidence             # also escalate on low-confidence answers
brevix route --budget-show                         # current spend vs cap
brevix route --budget-tokens 1000000 --budget-cost 50.00 "..." --call
brevix stats --routing                             # cost saved, escalation rate, by model/task
brevix stats --routing --since 7d
brevix route --learn-suggest                       # recommend rule changes
brevix route --learn-apply                         # apply them

Subagents — Claude Code

agents/ ships six focused subagents that emit ~60% smaller tool results than vanilla agents:

| Agent | Purpose | Output format |
|---|---|---|
| brevix-investigator | Read-only code locator | path:line — symbol — note |
| brevix-builder | Surgical 1–2 file edits with verification | Diff + verify status |
| brevix-reviewer | Bug-focused diff review | path:line: 🔴 bug: …. fix. |
| brevix-haiku | Cheap tier — classify, parse, format, rename | Brief, exact answer |
| brevix-sonnet | Mid tier — review, refactor, debug, explain | Direct, file:line cited |
| brevix-opus | Heavy tier — architecture, multi-agent, hard bugs | Decision + reasoning chain |

Smart Model Routing — /brevix-route

Save ~80% on Claude API cost without manually picking a model per task.

Quick start (5 steps, ~1 minute)

1. Install Brevix and the Anthropic SDK

pip install brevix anthropic
export ANTHROPIC_API_KEY=sk-ant-...

2. Install the Claude Code plugin

/plugin marketplace add Yash-Koladiya30/brevix
/plugin install brevix@brevix

This adds the routing-tier subagents (brevix-haiku / brevix-sonnet / brevix-opus) and the /brevix-route, /brevix-route-stats, /brevix-learn slash commands.

3. Initialize the routing config

brevix route --init

Writes ~/.brevix/route.json with sensible defaults: classify→Haiku, code review→Sonnet, architecture→Opus, escalation chain Haiku→Sonnet→Opus, no budget cap.

4. Use it from Claude Code

/brevix-route classify this support ticket: "app crashes on login"
        ↓
[brevix] task=classify tier=haiku escalated=0
bug
/brevix-route architect a multi-region payment system with sub-100ms p99 latency
        ↓
[brevix] task=architecture tier=opus escalated=0
<full Opus response>

5. Check what you saved

/brevix-route-stats --since 7d
How it works under the hood
  1. /brevix-route <task> runs brevix route ... --explain under the hood.
  2. Reads the suggested model from CLI output.
  3. Spawns the matching subagent (brevix-haiku / brevix-sonnet / brevix-opus) via the Task tool.
  4. If the subagent returns escalate: <reason>, retries on the next tier (capped at 1 retry per tier).
  5. Records the call to ~/.brevix/routing_log.jsonl for stats.
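Since the log is plain JSONL, the stats are just an aggregation over it. A sketch of the cost-saved computation — the field names below are assumptions about the log schema, not its documented format:

```python
import io
import json

# Hypothetical routing_log.jsonl schema: one JSON object per call, with the
# actual cost and what the same call would have cost on Opus.
log = io.StringIO(
    '{"task": "classify", "model": "claude-haiku-4-5", '
    '"cost_usd": 0.0002, "opus_cost_usd": 0.05}\n'
    '{"task": "review", "model": "claude-sonnet-4-6", '
    '"cost_usd": 0.004, "opus_cost_usd": 0.05}\n'
)

spent = baseline = 0.0
for line in log:
    rec = json.loads(line)
    spent += rec["cost_usd"]          # what routing actually cost
    baseline += rec["opus_cost_usd"]  # what Opus-only would have cost

saved = baseline - spent
print(f"saved ${saved:.4f} ({saved / baseline:.0%} vs Opus-only)")
```

In real use you would read `~/.brevix/routing_log.jsonl` instead of the inline `StringIO` sample.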

Force a tier when you already know the right one:

/brevix-route --force-tier=sonnet refactor this 200-line module

Track your savings — /brevix-route-stats

/brevix-route-stats             # all-time
/brevix-route-stats --since 7d  # last week

Sample output:

Brevix Routing Stats
--------------------
Window:             7d
Calls:              412
Total cost:         $1.84
Opus-only baseline: $9.67
Saved:              $7.83 (80.9%)
Escalations:        18 (4.4% of calls)

By model:
  claude-haiku-4-5    289 (70.1%)  $0.08
  claude-sonnet-4-6   108 (26.2%)  $0.65
  claude-opus-4-7      15 ( 3.7%)  $1.11

Auto-tune from your usage — /brevix-learn

After a few days, Brevix recommends rule changes from observed escalation patterns:

/brevix-learn

  refactor: claude-sonnet-4-6 -> claude-opus-4-7
    samples=54  escalation_rate=68.5%
    reason: 37/54 escalated (69%); 32 of those landed on claude-opus-4-7

Apply with /brevix-learn apply to update ~/.brevix/route.json. User customizations and budget caps are preserved.
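The learn heuristic behind a suggestion like the one above could be as simple as this sketch — the thresholds, record fields, and promotion rule are all assumptions, not Brevix's actual algorithm:

```python
from collections import defaultdict

# Hypothetical heuristic: if a task type escalates more than half the time,
# suggest promoting its rule to the tier those escalated calls landed on.
calls = [
    {"task": "refactor", "escalated": True,  "final_model": "claude-opus-4-7"},
    {"task": "refactor", "escalated": True,  "final_model": "claude-opus-4-7"},
    {"task": "refactor", "escalated": False, "final_model": "claude-sonnet-4-6"},
    {"task": "classify", "escalated": False, "final_model": "claude-haiku-4-5"},
]

stats = defaultdict(lambda: {"n": 0, "esc": 0, "landed": defaultdict(int)})
for c in calls:
    s = stats[c["task"]]
    s["n"] += 1
    if c["escalated"]:
        s["esc"] += 1
        s["landed"][c["final_model"]] += 1

for task, s in stats.items():
    if s["n"] >= 3 and s["esc"] / s["n"] > 0.5:   # assumed sample/rate cutoffs
        target = max(s["landed"], key=s["landed"].get)
        print(f"{task}: suggest routing to {target} "
              f"(escalation_rate={s['esc'] / s['n']:.0%})")
```

The minimum-sample cutoff matters: without it, a single unlucky escalation would flip a rule.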

Hard budget cap

Edit ~/.brevix/route.json:

{
  "budget": { "tokens": 0, "cost_usd": 50.00 }
}

Or per-run from the CLI:

brevix route --budget-cost 5.00 --call "your prompt"

When the cap is hit, the next call raises BudgetExceededError instead of silently overspending.



🛡 How Accuracy Guard works

  1. Compress output via the rule engine.
  2. Score the original vs compressed text with a local sentence-transformer (no API cost).
  3. If similarity ≥ threshold (default 0.85) → emit compressed. Otherwise warn, or in --strict mode fall back to original.
  4. Without sentence-transformers installed → falls back to content-word containment (drops stopwords without penalty, fair to compression).

Result: compression you can trust on production code, specs, and contracts.
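Without the ML dependency, step 4's content-word containment check is roughly this shape — stopword list abbreviated and tokenization simplified for the sketch; the real guard's word list may differ:

```python
# Content-word containment fallback (no-ML path): what fraction of the
# original's non-stopword vocabulary survives in the compressed text.
# Stopword list abbreviated for illustration.
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of", "in", "on", "that",
             "this", "your", "you", "it", "and", "so", "every", "even", "if"}

def containment(original, compressed):
    orig = {w for w in original.lower().split() if w not in STOPWORDS}
    comp = set(compressed.lower().split())
    return len(orig & comp) / len(orig) if orig else 1.0

score = containment(
    "wrap the object in useMemo so the reference stays stable",
    "wrap object in useMemo reference stays stable",
)
# Emit the compressed text only if score >= threshold (default 0.85);
# in --strict mode, fall back to the original otherwise.
```

Dropping stopwords from the original's vocabulary is what makes the metric fair to compression: removing "the" and "so" costs nothing, while losing "useMemo" or "stable" lowers the score.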



💡 Compression example

Before

The reason your React component is re-rendering on every parent update is that you are passing an inline object as a prop. In JavaScript, every render creates a new object reference, even if the contents are identical. To fix this, wrap the object in useMemo so the reference stays stable across renders.

After (full mode)

Inline object prop = new ref each render = re-render. Wrap in useMemo.

Tokens saved: ~75%  ·  Meaning preserved: ✅ similarity 0.91



📊 Benchmarks

Reproducible three-arm A/B harness in evals/. Compares no-system-prompt vs "be terse" control vs Brevix on 10 developer prompts.

| arm | n | median | mean | total | vs baseline | vs control |
|---|---|---|---|---|---|---|
| baseline | 10 | 221 | 247.3 | 2473 | | |
| control | 10 | 178 | 191.6 | 1916 | 22.5% | |
| brevix | 10 | 119 | 128.4 | 1284 | 48.1% | 33.0% |

Run yourself:

pip install 'brevix[all]' anthropic
export ANTHROPIC_API_KEY=...
python evals/llm_run.py --model claude-sonnet-4-6
python evals/measure.py

The vs control column is the honest savings — what Brevix adds beyond "just be brief."



🗺 Roadmap

  • Core compression engine (lite / full / ultra)
  • Adaptive (auto) mode
  • Accuracy Guard (semantic + content-word fallback)
  • Local stats counter
  • Multi-platform installer (20 targets)
  • File-level compression (brevix compress-file)
  • MCP middleware (brevix-shrink)
  • Statusline badge + Claude Code hooks
  • Subagents (investigator / builder / reviewer)
  • Three-arm eval harness
  • PowerShell installer + uninstaller
  • Model routing engine (Haiku / Sonnet / Opus tiering)
  • Confidence-driven escalation (hedge / validity / length scorers)
  • Token + cost budget enforcement (BudgetExceededError)
  • Routing stats dashboard (brevix stats --routing)
  • Learn loop (brevix route --learn-suggest|--learn-apply)
  • Claude Code routing tier subagents + slash commands
  • VSCode extension UI
  • Browser extension (claude.ai, chatgpt.com web)
  • Two-way compression (compress prompts before send)
  • Custom user-defined rule packs
  • Web dashboard (team tier)


📜 License & Contributing

MIT — free for personal and commercial use. Issues and PRs welcome — see docs/CONTRIBUTING.md.


Built with care by Yash Koladiya · If Brevix saves you tokens, ⭐ the repo
