Strain your LLM context. Cut costs 70-95% by routing to cheaper models.
Project description
Colador
Spend less. Send less. Get smarter results.
Colador is a local proxy that sits between your coding tools and your LLM backends. It strains context down to what matters, routes each task to the cheapest model that can handle it, and escalates to frontier models only when justified.
The name comes from Spanish: colador means strainer. It lets only the important parts through.
The Problem
AI coding tools are expensive because they're wasteful. Cursor, Claude Code, Copilot, Aider — they all send bloated context to expensive models for every task. A function rename gets the same $75/M-token model as a concurrency bug. The result: $80–300/month in API costs, most of it on tasks a local model could handle in seconds.
How Colador Fixes It
┌──────────┐ ┌──────────────┐ ┌──────────┐
│ Your │ │ │ │ Local │
│ Coding │────▶│ Colador │────▶│ Models │
│ Tool │ │ :5757 │ │ (free) │
│ │ │ │ └──────────┘
└──────────┘ │ │
│ Strains │ ┌──────────┐
│ Routes │────▶│ Cloud │
│ Escalates │ │ APIs │
│ │ │ (paid) │
└──────────────┘ └──────────┘
Strain. Before any request leaves your machine, Colador compresses the context — extracting only relevant files, diffs, error output, and summaries. A typical 20K-token request becomes 2K tokens.
Route. A classifier assigns each task to a tier. Simple work (rename, find, test writing) stays local. Complex work (architecture, debugging, design) goes to frontier models. Most tasks are simple.
Escalate. For medium tasks, the local model does the work and a frontier model reviews a compressed diff. You pay frontier prices only for the review, not the execution.
Quick Start
pip install colador
colador init # generates config with detected local models
colador start # launches proxy on localhost:5757
Point your tool at http://localhost:5757/v1 and use it normally. No plugins, no IDE changes.
How It Works With Your Tools
Colador implements the OpenAI-compatible API. Any tool that can set a custom base URL works out of the box:
# Aider
aider --openai-api-base http://localhost:5757/v1 --openai-api-key colador
# Claude Code
export ANTHROPIC_BASE_URL="http://localhost:5757"
# Continue.dev / Cursor / Copilot
# Set base URL to http://localhost:5757/v1 in settings
Routing Policy
You can override routing with prefixes, or let Colador decide automatically:
@local rename getUserName to fetchUserName → stays local
@review add pagination to /users endpoint → local + supervisor review
@hard we're seeing race conditions in the queue → frontier model plans
Without a prefix, the classifier picks the tier based on the prompt.
| Task | Tier | What Happens |
|---|---|---|
| Find where X is implemented | Local | Local model only |
| Rename this method | Local | Local model only |
| Write tests for this module | Local | Local model only |
| Add a feature with clear spec | Review | Local writes, frontier reviews |
| Refactor across a few modules | Review | Local writes, frontier reviews |
| Architecture decision | Hard | Frontier plans, local executes |
| Debug a concurrency issue | Hard | Frontier plans, local executes |
Configuration
# ~/.colador/config.yaml
backends:
local:
url: "http://localhost:11434/v1"
model: "qwen3-coder-next"
api_key: "ollama"
supervisor:
provider: "anthropic"
model: "claude-sonnet-4-20250514"
api_key_env: "ANTHROPIC_API_KEY"
routing:
default_tier: "auto"
classifier: "hybrid"
tier1:
backend: "local"
tier2:
worker: "local"
reviewer: "supervisor"
max_review_tokens: 3000
tier3:
planner: "supervisor"
executor: "local"
No Local Models? Still Saves Money
You don't need Ollama or local hardware. Colador works with any combination of backends:
backends:
cheap:
provider: "openrouter"
model: "google/gemma-4" # ~$0.10/M tokens
smart:
provider: "anthropic"
model: "claude-sonnet-4" # ~$15/M tokens
Context compression alone cuts your spend significantly, even when both backends are cloud APIs.
Transparency
Every routing decision is logged locally. See what was sent, why it was sent, and what it cost:
colador logs # recent routing decisions
colador stats # aggregate savings
{
"timestamp": "2026-04-12T14:30:00Z",
"tier": "TIER_1",
"reason": "prompt matched rule: 'rename'",
"backend": "local",
"tokens_in": 342,
"tokens_out": 128,
"estimated_cost_usd": 0.0,
"latency_ms": 890
}
Project Docs
| Document | Purpose |
|---|---|
| product.md | Vision, positioning, differentiators |
| architecture.md | System design, modules, schemas, data flow |
| plan.md | Phased build plan with deliverables |
| agents.md | Rules for contributors (human and AI) |
Contributing
Colador is open source. Before contributing, read agents.md — it covers the file header rule, module boundaries, and coding conventions.
Every file must start with a comment answering: WHY does this file exist, WHAT does it do, HOW is it used. If you can't write the header, the file probably shouldn't exist.
License
TBD
Status
Pre-release. Under active development. See plan.md for the current phase.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file colador-0.0.1.tar.gz.
File metadata
- Download URL: colador-0.0.1.tar.gz
- Upload date:
- Size: 4.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
fb894f82344e6a57f3a7c7857d587f3490e7c497bbdc6e989511db58c72488a4
|
|
| MD5 |
46fe44bbd87535e98a7d92ebce531932
|
|
| BLAKE2b-256 |
6c99bae2437d2b72049bc3d576f36727cf2872c20aade0b12143539a03276d2a
|
File details
Details for the file colador-0.0.1-py3-none-any.whl.
File metadata
- Download URL: colador-0.0.1-py3-none-any.whl
- Upload date:
- Size: 4.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
1e48b33f99511f8c264c0e1cb1f1a4a66c7e05baa3fc3b966525dafc5079fc2f
|
|
| MD5 |
8a6901ca14390679161baf6882c2c03e
|
|
| BLAKE2b-256 |
6a5ab7489fd569cfb46bbfeecc98cbd5fd0fed327a3445e6e2d9490db5d21b9c
|