Save 55-93% on Claude API tokens. Same quality, fraction of the cost.

These details have not been verified by PyPI

Project links

Project description

Brevia

Use Claude smarter. Save 55-93% on tokens without losing quality.

Brevia is a local proxy that makes your Claude API calls cheaper and often better. It runs between your tools and Anthropic, automatically trimming unnecessary context so Claude focuses on what actually matters.

Works with Claude Code, Cursor, Continue, aider, and anything that uses the Anthropic SDK. No code changes needed.

Quick Start

pip install breviadev
brevia login          # Opens browser — sign in with GitHub or Google
brevia serve          # Starts on localhost:8420

Then add to your shell profile (~/.zshrc, ~/.bashrc):

export ANTHROPIC_BASE_URL=http://localhost:8420

That's it. Everything works exactly as before — but cheaper and often better.

How It Works

Your Tool (Claude Code, Cursor, etc.)
    │
    │  ANTHROPIC_BASE_URL=http://localhost:8420
    ▼
┌─────────────────────────────────┐
│           Brevia                │  ← Runs on your machine
│                                 │
│  Reads your request, figures    │
│  out what's relevant, removes   │
│  the noise, and enhances the    │
│  prompt for better results.     │
└─────────────────────────────────┘
    │
    │  Smaller, focused payload
    ▼
┌─────────────────────────────────┐
│      api.anthropic.com          │  ← Your API key
│                                 │     Your account
│  Claude gets less noise,        │     You pay less
│  gives better answers.          │
└─────────────────────────────────┘
    │
    │  Response streams back
    ▼
Your Tool (unchanged behavior)

Why it works: When Claude gets 2k tokens of the right code instead of 95k tokens of everything, it gives more precise answers. Less noise in, better signal out.

Benchmarks

Tested on real codebases (Django, FastAPI, psf/requests) with Claude Opus:

Context Size	Cost Savings	Quality
< 2k tokens	0% (passthrough)	Same
10k tokens	~55%	Same
50k tokens	76-93%	Same
95k tokens	76%	Better (more precise)

Real-World Example

Setup	Total Cost	Compared to Direct
Direct Claude (full context)	$0.836	—
With Brevia	$0.278	67% cheaper

Large Data (50k token input)

Metric	Value
Token reduction	97%
Cost savings	93%
Accuracy	Perfect (found all issues)

Full benchmark details: benchmarks/BENCHMARKS.md

Commands

Command	What it does
`brevia login`	Sign in (opens browser)
`brevia serve`	Start Brevia
`brevia serve -p 9000`	Start on a different port
`brevia stats`	See how much you've saved
`brevia stats -d 30`	See last 30 days
`brevia logout`	Sign out

What You'll See

When brevia serve is running:

╭─ Brevia ─────────────────────────────────────╮
│ Brevia is running                             │
│                                               │
│   Address:  http://127.0.0.1:8420             │
│   User:     @yourname                         │
│   Status:   Active                            │
│                                               │
│   Add to your shell:                          │
│   export ANTHROPIC_BASE_URL=http://127.0.0.1:8420 │
╰───────────────────────────────────────────────╯

Check your savings anytime with brevia stats:

╭─ Brevia Stats ────────────────────────────────╮
│                                               │
│   Days active:    12                          │
│   Total requests: 847                         │
│   Tokens saved:   4,230,000                   │
│   Avg reduction:  71%                         │
│   Est. $ saved:   $63.45                      │
╰───────────────────────────────────────────────╯

Where It Helps Most

Big contexts (50k+ tokens): The more noise, the more Brevia saves
Multi-file projects: Keeps only the files that matter for your question
Repetitive code: Strips boilerplate so Claude focuses on the real problem

Where It Doesn't Help

Short prompts (< 2k tokens): Already small — Brevia passes these through unchanged
Already focused: If you're manually sending only relevant code, there's nothing to trim

Privacy & Security

Your API key stays yours — Brevia passes it through, never stores it
Your code stays local — nothing leaves your machine
We only collect usage stats — token counts and savings, never content
Credentials are stored locally in ~/.brevia/ with restricted file permissions

Platforms

macOS (Intel + Apple Silicon)
Linux (x86_64 + ARM64)
Windows 10+

Requires Python 3.10+.

Enterprise

Need team-wide deployment, custom rules, or dedicated support?

Contact us: enterprise@brevia.dev

How It Compares

	Brevia	Prompt Caching	Manual Trimming
Setup	One command	Built into SDK	You do it yourself
Effort	Zero — automatic	Zero — automatic	High — manual work
Savings	55-93%	Varies (cache hits only)	Depends on you
Quality	Same or better	Same	Risk of cutting too much
Works with	Any Anthropic tool	SDK only	Your code only

Brevia works alongside prompt caching — use both for maximum savings.

License

Proprietary. Free for individual use. See LICENSE for details.

Built for developers who'd rather spend money on building, not on sending noise to an API.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

This version

0.1.1

May 27, 2026

0.1.0

May 27, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

breviadev-0.1.1.tar.gz (19.2 kB view details)

Uploaded May 27, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

breviadev-0.1.1-py3-none-any.whl (17.6 kB view details)

Uploaded May 27, 2026 Python 3

File details

Details for the file breviadev-0.1.1.tar.gz.

File metadata

Download URL: breviadev-0.1.1.tar.gz
Upload date: May 27, 2026
Size: 19.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for breviadev-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`9525de966ea4851a3d559eff0881f56a106c8d54cb51d23b864ea7df5b1d797c`
MD5	`1d82b6b4323f22c3cc30aa1c02480932`
BLAKE2b-256	`224c7e05fffc4e6fa0a1f53f5aad1c4a0eab9c288918e1038fe138c5c7c8e604`

See more details on using hashes here.

File details

Details for the file breviadev-0.1.1-py3-none-any.whl.

File metadata

Download URL: breviadev-0.1.1-py3-none-any.whl
Upload date: May 27, 2026
Size: 17.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for breviadev-0.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ec05949d1be936d17b3035acbfbd683c7baca676468908ee6af8b491b3c08d6f`
MD5	`f1d8d34316c78635f5e87c756852bbde`
BLAKE2b-256	`a354708c6a609d0d453b8141610f3157243e8d8095117ff2b5b49bb3c2f26c73`

See more details on using hashes here.

breviadev 0.1.1

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Project description

Brevia

Quick Start

How It Works

Benchmarks

Real-World Example

Large Data (50k token input)

Commands

What You'll See

Where It Helps Most

Where It Doesn't Help

Privacy & Security

Platforms

Enterprise

How It Compares

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes