Save 55-93% on Claude API tokens. Same quality, fraction of the cost.

Brevia

Use Claude smarter. Save 55-93% on tokens without losing quality.

Brevia is a local proxy that makes your Claude API calls cheaper and often better. It runs between your tools and Anthropic, automatically trimming unnecessary context so Claude focuses on what actually matters.

Works with Claude Code, Cursor, Continue, aider, and anything that uses the Anthropic SDK. No code changes needed.


Quick Start

pip install breviadev
brevia login          # Opens browser — sign in with GitHub or Google
brevia serve          # Starts on localhost:8420

Then add to your shell profile (~/.zshrc, ~/.bashrc):

export ANTHROPIC_BASE_URL=http://localhost:8420

That's it. Everything works exactly as before — but cheaper and often better.
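
To make the effect of that variable concrete, here is a minimal Python sketch of how a client could resolve its API endpoint from the environment. The function names are hypothetical and this is not Brevia's or the Anthropic SDK's actual code; it only assumes that the tool honors ANTHROPIC_BASE_URL and calls the /v1/messages path.

```python
import os

DEFAULT_BASE_URL = "https://api.anthropic.com"  # Anthropic's public API host


def resolve_base_url(env=None):
    """Return the base URL a client would use, preferring ANTHROPIC_BASE_URL."""
    env = os.environ if env is None else env
    return env.get("ANTHROPIC_BASE_URL", DEFAULT_BASE_URL).rstrip("/")


def messages_endpoint(env=None):
    """Build the Messages API URL from the resolved base URL."""
    return resolve_base_url(env) + "/v1/messages"


if __name__ == "__main__":
    # With the export from the Quick Start in place, this prints the local proxy URL.
    print(messages_endpoint())
```

With the variable unset the request goes straight to Anthropic; with it set, the same request is sent to the local proxy first.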


How It Works

Your Tool (Claude Code, Cursor, etc.)
    │
    │  ANTHROPIC_BASE_URL=http://localhost:8420
    ▼
┌─────────────────────────────────┐
│           Brevia                │  ← Runs on your machine
│                                 │
│  Reads your request, figures    │
│  out what's relevant, removes   │
│  the noise, and enhances the    │
│  prompt for better results.     │
└─────────────────────────────────┘
    │
    │  Smaller, focused payload
    ▼
┌─────────────────────────────────┐
│      api.anthropic.com          │  ← Your API key
│                                 │     Your account
│  Claude gets less noise,        │     You pay less
│  gives better answers.          │
└─────────────────────────────────┘
    │
    │  Response streams back
    ▼
Your Tool (unchanged behavior)

Why it works: When Claude gets 2k tokens of the right code instead of 95k tokens of everything, it gives more precise answers. Less noise in, better signal out.
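
The general idea of relevance trimming can be illustrated with a toy sketch. This is not Brevia's algorithm, which is not described here: the word-overlap scoring, the 4-characters-per-token estimate, and every function name below are assumptions made purely for illustration.

```python
import re


def estimate_tokens(text):
    """Very rough token estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def words(text):
    """Lowercased identifier-like words found in a piece of text."""
    return set(re.findall(r"[a-z_][a-z0-9_]*", text.lower()))


def trim_context(question, files, budget_tokens=2000):
    """Keep the files most related to the question within a token budget.

    files maps a file name to its contents. Small contexts pass through unchanged.
    """
    total = sum(estimate_tokens(body) for body in files.values())
    if total <= budget_tokens:
        return dict(files)  # passthrough: nothing to trim

    q = words(question)
    # Rank files by how many words they share with the question.
    ranked = sorted(files, key=lambda name: len(q & words(files[name])), reverse=True)

    kept, used = {}, 0
    for name in ranked:
        if not q & words(files[name]):
            break  # the remaining files share no words with the question
        cost = estimate_tokens(files[name])
        if used + cost <= budget_tokens:
            kept[name] = files[name]
            used += cost
    return kept
```

A real system would need a far better relevance signal than shared words, but the shape is the same: score, rank, and keep what fits the budget.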


Benchmarks

Tested on real codebases (Django, FastAPI, psf/requests) with Claude Opus:

Context Size    Cost Savings        Quality
< 2k tokens     0% (passthrough)    Same
10k tokens      ~55%                Same
50k tokens      76-93%              Same
95k tokens      76%                 Better (more precise)

Real-World Example

Setup                           Total Cost    Compared to Direct
Direct Claude (full context)    $0.836
With Brevia                     $0.278        67% cheaper
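
The "67% cheaper" figure follows directly from the two totals. A small Python check (the function name is just for illustration):

```python
def percent_saved(direct_cost, proxied_cost):
    """Percentage saved relative to the direct (full context) cost."""
    if direct_cost <= 0:
        raise ValueError("direct_cost must be positive")
    return (1 - proxied_cost / direct_cost) * 100


if __name__ == "__main__":
    # Totals taken from the table above.
    print(round(percent_saved(0.836, 0.278)))  # 67
```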

Large Data (50k token input)

Metric             Value
Token reduction    97%
Cost savings       93%
Accuracy           Perfect (found all issues)

Full benchmark details: benchmarks/BENCHMARKS.md


Commands

Command                 What it does
brevia login            Sign in (opens browser)
brevia serve            Start Brevia
brevia serve -p 9000    Start on a different port
brevia stats            See how much you've saved
brevia stats -d 30      See last 30 days
brevia logout           Sign out

What You'll See

When brevia serve is running:

╭─ Brevia ───────────────────────────────────────────╮
│ Brevia is running                                  │
│                                                    │
│   Address:  http://127.0.0.1:8420                  │
│   User:     @yourname                              │
│   Status:   Active                                 │
│                                                    │
│   Add to your shell:                               │
│   export ANTHROPIC_BASE_URL=http://127.0.0.1:8420  │
╰────────────────────────────────────────────────────╯
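
If a tool cannot connect, it can help to confirm that something is listening on the address shown above. The following is a minimal sketch using only the Python standard library; the function name is hypothetical, and it checks only that the TCP port accepts connections, not that the listener is Brevia.

```python
import socket


def proxy_is_listening(host="127.0.0.1", port=8420, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("listening" if proxy_is_listening() else "not listening")
```

Pass a different port if you started Brevia with brevia serve -p.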

Check your savings anytime with brevia stats:

╭─ Brevia Stats ────────────────────────────────╮
│                                               │
│   Days active:    12                          │
│   Total requests: 847                         │
│   Tokens saved:   4,230,000                   │
│   Avg reduction:  71%                         │
│   Est. $ saved:   $63.45                      │
╰───────────────────────────────────────────────╯
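
The sample figures above are consistent with a simple per-token estimate: $63.45 for 4,230,000 tokens works out to $15 per million tokens. That rate is inferred from the sample output, not stated anywhere in this document, and how Brevia actually computes its estimate is an assumption here. A short sketch of the arithmetic:

```python
def estimated_dollars_saved(tokens_saved, usd_per_million_tokens):
    """Dollar estimate for a number of saved tokens at a flat per-million rate."""
    return round(tokens_saved * usd_per_million_tokens / 1_000_000, 2)


def implied_rate(tokens_saved, dollars_saved):
    """Per-million-token rate implied by a tokens-saved / dollars-saved pair."""
    return dollars_saved / tokens_saved * 1_000_000


if __name__ == "__main__":
    print(round(implied_rate(4_230_000, 63.45), 2))  # 15.0
    print(estimated_dollars_saved(4_230_000, 15))    # 63.45
```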

Where It Helps Most

  • Big contexts (50k+ tokens): The more noise, the more Brevia saves
  • Multi-file projects: Keeps only the files that matter for your question
  • Repetitive code: Strips boilerplate so Claude focuses on the real problem

Where It Doesn't Help

  • Short prompts (< 2k tokens): Already small — Brevia passes these through unchanged
  • Already focused: If you're manually sending only relevant code, there's nothing to trim

Privacy & Security

  • Your API key stays yours — Brevia passes it through, never stores it
  • Your code is processed locally: Brevia trims requests on your machine, and request content is sent only to api.anthropic.com
  • We only collect usage stats — token counts and savings, never content
  • Credentials are stored locally in ~/.brevia/ with restricted file permissions
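
To verify the last point yourself on macOS or Linux, you can inspect the permission bits of the credentials directory. This is a minimal sketch with a hypothetical helper name; it assumes POSIX permission semantics (the check is not meaningful on Windows) and makes no assumption about which files Brevia keeps inside ~/.brevia/.

```python
import os
import stat


def is_private(path):
    """True if neither group nor others have any permission bits on path (POSIX)."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    target = os.path.expanduser("~/.brevia")
    if os.path.exists(target):
        print(target, "is private" if is_private(target) else "is readable by others")
    else:
        print(target, "does not exist (run brevia login first)")
```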

Platforms

  • macOS (Intel + Apple Silicon)
  • Linux (x86_64 + ARM64)
  • Windows 10+

Requires Python 3.10+.


Enterprise

Need team-wide deployment, custom rules, or dedicated support?

Contact us: enterprise@brevia.dev


How It Compares

              Brevia                Prompt Caching              Manual Trimming
Setup         One command           Built into SDK              You do it yourself
Effort        Zero (automatic)      Zero (automatic)            High (manual work)
Savings       55-93%                Varies (cache hits only)    Depends on you
Quality       Same or better        Same                        Risk of cutting too much
Works with    Any Anthropic tool    SDK only                    Your code only

Brevia works alongside prompt caching — use both for maximum savings.


License

Proprietary. Free for individual use. See LICENSE for details.


Built for developers who'd rather spend money on building, not on sending noise to an API.
