Local proxy that optimizes LLM context by 55-97%. Same quality, fraction of the cost.

These details have not been verified by PyPI

Project description

Brevia

Save 55-93% on LLM token costs. Same quality, fraction of the cost.

Brevia is a local proxy that sits between your tools and the Anthropic API. It algorithmically optimizes context before it reaches Claude, cutting tokens by 55-97% on large contexts. In the benchmarks below, recall was unchanged and answers on the largest contexts were more precise.

Works with Claude Code, Cursor, Continue, aider, and any tool that uses the Anthropic SDK.

Quick Start

pip install breviadev
brevia login          # Opens browser — sign in with GitHub or Google
brevia serve          # Starts proxy on localhost:8420

Then add this line to your shell profile (~/.zshrc or ~/.bashrc):

export ANTHROPIC_BASE_URL=http://localhost:8420

That's it. Your tools work exactly as before, but requests cost less and, on large contexts, answers are often more precise.
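Before pointing tools at the proxy, it can help to confirm that something is actually listening on the port. The following is a minimal sketch using only the Python standard library; the function name `proxy_is_listening` is made up for illustration and is not part of Brevia. It only checks that a TCP connection succeeds, not that the listener is Brevia.

```python
import socket


def proxy_is_listening(host: str = "localhost", port: int = 8420, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper: the default host and port mirror the Quick Start above.
    """
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If `proxy_is_listening()` returns False, tools that have ANTHROPIC_BASE_URL set to the proxy address will fail to connect until `brevia serve` is running again.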

How It Works

Your Tool (Claude Code, Cursor, etc.)
    │
    │  ANTHROPIC_BASE_URL=http://localhost:8420
    ▼
┌─────────────────────────────────┐
│         Brevia Proxy            │  ← Runs locally
│                                 │     Zero LLM cost
│  1. Analyze context structure   │     ~5ms latency
│  2. Score relevance by query    │
│  3. Extract key sections        │
│  4. Inject liberation prompt    │
└─────────────────────────────────┘
    │
    │  Optimized payload (55-97% smaller)
    ▼
┌─────────────────────────────────┐
│      api.anthropic.com          │  ← Your API key
│                                 │     Your account
│  Claude processes focused       │     You pay less
│  context = better answers       │
└─────────────────────────────────┘
    │
    │  Response streams back
    ▼
Your Tool (unchanged behavior)

Key insight: Less noise = better answers. When Claude sees 2.4k tokens of the RIGHT code instead of 95k tokens of everything, it produces more precise diagnoses.
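To make steps 1 to 3 of the diagram concrete, here is a toy sketch of query-driven extraction. It is not Brevia's algorithm, which this README does not document. It assumes sections are separated by blank lines, scores them by plain keyword overlap, and estimates tokens at roughly four characters each. All names (`estimate_tokens`, `score`, `extract_relevant`) are hypothetical.

```python
import re

PASSTHROUGH_TOKENS = 2_000  # the README says contexts under 2k tokens pass through


def estimate_tokens(text: str) -> int:
    """Crude token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def score(section: str, query: str) -> int:
    """Count the distinct query words that also appear in the section."""
    section_words = set(re.findall(r"\w+", section.lower()))
    query_words = set(re.findall(r"\w+", query.lower()))
    return len(section_words & query_words)


def extract_relevant(context: str, query: str, budget_tokens: int) -> str:
    """Keep the best-scoring sections that fit the budget, in original order."""
    if estimate_tokens(context) < PASSTHROUGH_TOKENS:
        return context  # small contexts are left untouched
    sections = [s for s in context.split("\n\n") if s.strip()]
    ranked = sorted(range(len(sections)),
                    key=lambda i: score(sections[i], query), reverse=True)
    kept, used = [], 0
    for i in ranked:
        cost = estimate_tokens(sections[i])
        # Skip sections that share no words with the query or do not fit the budget
        if score(sections[i], query) > 0 and used + cost <= budget_tokens:
            kept.append(i)
            used += cost
    return "\n\n".join(sections[i] for i in sorted(kept))
```

Keyword overlap is the simplest possible relevance signal; a real optimizer would likely also use code structure (imports, call graphs, symbol definitions) to decide what to keep.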

Benchmarks

Tested against Claude Opus 4.6/4.7 on real codebases (Django, FastAPI, psf/requests):

Context Size	Cost Savings	Quality Impact
< 2k tokens	0% (passthrough)	None
10k tokens	~55%	None
50k tokens	76-93%	None
95k tokens	76%	Improved (more precise)

Real Code Analysis (4-Path Comparison)

Path	Total Cost	vs Direct
Direct Opus (full context)	$0.836	baseline
Brevia + Opus	$0.278	67% cheaper
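The "67% cheaper" figure follows directly from the two totals in the table. A small sketch of the arithmetic (the function name is illustrative only):

```python
def savings_percent(baseline_cost: float, optimized_cost: float) -> float:
    """Percentage saved relative to the baseline cost."""
    if baseline_cost <= 0:
        raise ValueError("baseline_cost must be positive")
    return (1 - optimized_cost / baseline_cost) * 100


# Totals taken from the table above: $0.836 direct vs $0.278 through Brevia
print(f"{savings_percent(0.836, 0.278):.1f}% cheaper")  # prints "66.7% cheaper"
```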

Structured Data (50k token billing report)

Metric	Value
Token reduction	97.4%
Cost savings	93.3%
Recall	1.0/1.0 (perfect)
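Token reduction (97.4%) and cost savings (93.3%) differ in this table. One plausible reading, which is an assumption and not something the benchmark states, is that only input tokens shrink while output tokens cost the same as before. Under that assumption, cost savings equal token reduction multiplied by the share of the baseline cost that came from input tokens, and that share can be backed out:

```python
def implied_input_cost_share(token_reduction: float, cost_savings: float) -> float:
    """Share of baseline cost from input tokens, assuming only input shrinks.

    Assumption (not from the benchmark): cost_savings = token_reduction * input_share.
    """
    if not 0 < token_reduction <= 1:
        raise ValueError("token_reduction must be in (0, 1]")
    return cost_savings / token_reduction


# Figures from the table above
share = implied_input_cost_share(0.974, 0.933)
print(f"input tokens were about {share:.1%} of the baseline cost")
```

With the table's figures this works out to roughly 96%, which would leave about 4% of the baseline cost for output tokens.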

Full benchmark methodology and raw data: benchmarks/BENCHMARKS.md

Commands

Command	Description
`brevia login`	Authenticate (opens browser)
`brevia serve`	Start the proxy
`brevia serve -p 9000`	Start on custom port
`brevia stats`	Show your savings stats
`brevia stats -d 30`	Show last 30 days
`brevia logout`	Remove credentials

What You'll See

When brevia serve is running:

╭─ 🏛️  Brevia ──────────────────────────────────╮
│ Brevia is running                              │
│                                                │
│   Proxy:    http://127.0.0.1:8420              │
│   User:     @yourname                          │
│   Status:   Optimizing all Anthropic API calls │
│                                                │
│   Set this in your shell:                      │
│   export ANTHROPIC_BASE_URL=http://127.0.0.1:8420 │
╰────────────────────────────────────────────────╯

Run brevia stats anytime:

╭─ 📊 Brevia Stats ─────────────────────────────╮
│ All-time savings                               │
│                                                │
│   Days active:    12                           │
│   Total requests: 847                          │
│   Tokens saved:   4,230,000                    │
│   Avg reduction:  71%                          │
│   Est. $ saved:   $63.45                       │
╰────────────────────────────────────────────────╯
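The dollar estimate in the sample output is consistent with a flat per-token rate. The sketch below backs out that rate from the sample figures; the roughly $15 per million tokens it yields is inferred from this example only, and how Brevia actually prices saved tokens is not documented here. Function names are illustrative.

```python
def implied_usd_per_million(tokens_saved: int, dollars_saved: float) -> float:
    """Back out the flat rate implied by a tokens-saved and dollars-saved pair."""
    if tokens_saved <= 0:
        raise ValueError("tokens_saved must be positive")
    return dollars_saved / tokens_saved * 1_000_000


def estimated_dollars_saved(tokens_saved: int, usd_per_million: float) -> float:
    """Estimate dollars saved at a flat rate per million tokens."""
    return tokens_saved / 1_000_000 * usd_per_million


# Sample figures from the stats panel above
rate = implied_usd_per_million(4_230_000, 63.45)
print(f"implied rate: ${rate:.2f} per million tokens")  # prints "implied rate: $15.00 per million tokens"
```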

Where Brevia Helps Most

Large contexts (50k+ tokens): 76-93% cost savings in the benchmarks above (up to 97% fewer tokens), with equal or better quality
Noisy contexts: When relevant info is buried in boilerplate, Brevia extracts what matters
Multi-file contexts: Only sends relevant files to Claude

Where Brevia Does NOT Help

Tiny contexts (< 2k tokens): Passed through unchanged (no overhead)
Already-focused queries: If you're already sending only relevant code, nothing to cut
Full-file reasoning tasks: Some tasks need the entire file in context to follow its control flow, so there is little that can safely be cut

Privacy & Security

Your API key is passed through to api.anthropic.com; Brevia never stores it
Optimization happens locally; your code is sent only to the Anthropic API, as it was before
Telemetry is aggregated stats only: token counts, savings, request count
No prompt or response content is ever sent to Brevia servers
Credentials stored in ~/.brevia/ with restricted permissions
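For readers curious what "restricted permissions" usually means on macOS and Linux, here is a minimal sketch of writing an owner-only file. It is not Brevia's code: the helper name and the file name `credentials.json` in the usage note are assumptions. On Windows, POSIX mode bits are mostly ignored, so this pattern applies to macOS and Linux only.

```python
import os
from pathlib import Path


def write_private_file(directory: Path, name: str, data: str) -> Path:
    """Write data to directory/name, accessible by the owner only (POSIX)."""
    directory.mkdir(mode=0o700, parents=True, exist_ok=True)
    os.chmod(directory, 0o700)  # enforce even if the directory already existed
    path = directory / name
    # Create the file with mode 0o600 from the start, so it is never world-readable
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(data)
    os.chmod(path, 0o600)  # enforce even if the file already existed
    return path
```

A call such as `write_private_file(Path.home() / ".brevia", "credentials.json", token_json)` would leave the directory at mode 700 and the file at mode 600, so `ls -l ~/.brevia` is a quick way to check what an installed tool actually did.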

Platform Support

macOS (Intel + Apple Silicon)
Linux (x86_64 + ARM64)
Windows 10+

Requires Python 3.10+.

Enterprise

Need team-wide deployment, custom optimization rules, or priority support?

Contact us: enterprise@brevia.dev

How It's Different

	Brevia	Prompt caching	Summarization
Approach	Algorithmic extraction	Cache repeated prefixes	LLM summarizes context
Cost	Zero (local CPU)	Reduced on cache hit	Adds an LLM call
Added latency	~5ms	None on hit	+1-3s per call
Quality	Equal or better	Same	Often degrades
Works with	Any Anthropic tool	SDK only	Custom code only

Brevia stacks with prompt caching — use both for maximum savings.

License

MIT

Built by engineers who got tired of paying for noise.

Project details

Release history

0.1.1

May 27, 2026

This version

0.1.0

May 27, 2026

Download files

Source Distribution

breviadev-0.1.0.tar.gz (20.1 kB)

Uploaded May 27, 2026 Source

Built Distribution

breviadev-0.1.0-py3-none-any.whl (18.5 kB)

Uploaded May 27, 2026 Python 3

File details

Details for the file breviadev-0.1.0.tar.gz.

File metadata

Download URL: breviadev-0.1.0.tar.gz
Upload date: May 27, 2026
Size: 20.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for breviadev-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`01aaeeac72d69f253e09cac02ad62ff64fd0ce78beb1e357b31dd9f9c4ad6a4e`
MD5	`c1b258563686ea95ce3f74f8e3696d78`
BLAKE2b-256	`9708241c6407dfabc11ca0b37823f0548cb6210c355f0f49607334cd22542c98`

File details

Details for the file breviadev-0.1.0-py3-none-any.whl.

File metadata

Download URL: breviadev-0.1.0-py3-none-any.whl
Upload date: May 27, 2026
Size: 18.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for breviadev-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e6cbe239d3afd9f90216b9a08d0c1a1d9be65f6a61073078c1ca8cec98e446ef`
MD5	`c79207067303bb8df08942c7be0c2b89`
BLAKE2b-256	`2b8a3761314c5888ee13b610f6a6988439c528b793c98ad6e8080f5724ca337c`
