File-based model router for LLM cost optimization. Zero dependencies.

Antaris Router

Deterministic model routing for 50-70% LLM cost reduction. Zero dependencies.

File-based prompt classification that routes to the cheapest capable model. Same input always produces the same routing decision. No API calls for classification, no vector databases, no infrastructure overhead.

Cost Impact

Real savings from production usage:

GPT-4o for everything:     $847.20/month
With antaris-router:       $251.15/month  
Savings:                   $596.05 (70.4%)

Most applications waste money by sending simple tasks to expensive models. This tool picks the cheapest model that can handle each prompt's complexity level and returns that decision; your code still makes the actual API call.

How It Works

Classify prompts using deterministic keyword matching + structural analysis
Route to cheapest model in each capability tier (trivial → simple → moderate → complex → expert)
Track actual usage costs and compare against premium-only baseline
Optimize spending while maintaining output quality

All routing decisions happen offline using plain text rules stored in JSON files.
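As a rough illustration of the routing step and of rules kept in plain JSON, the sketch below parses a model list and picks the cheapest entry for a tier. The schema (`name`, `tier`, `cost_per_mtok`), the model names, and the helper `cheapest_for_tier` are assumptions made for this example, not the package's actual config format or API.

```python
import json

# Hypothetical model registry; schema, names, and prices are illustrative only.
MODELS_JSON = """
[
  {"name": "model-a", "tier": "simple",  "cost_per_mtok": 0.15},
  {"name": "model-b", "tier": "simple",  "cost_per_mtok": 0.50},
  {"name": "model-c", "tier": "complex", "cost_per_mtok": 3.00},
  {"name": "model-d", "tier": "complex", "cost_per_mtok": 15.00}
]
"""

def cheapest_for_tier(models, tier):
    """Return the lowest-cost model in a tier; ties break on name so the result is deterministic."""
    candidates = [m for m in models if m["tier"] == tier]
    if not candidates:
        raise ValueError(f"no model configured for tier {tier!r}")
    return min(candidates, key=lambda m: (m["cost_per_mtok"], m["name"]))

models = json.loads(MODELS_JSON)
print(cheapest_for_tier(models, "simple")["name"])   # model-a
print(cheapest_for_tier(models, "complex")["name"])  # model-c
```

Because the selection depends only on the parsed rules and the tier, the same config and the same tier always yield the same model.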

What It Does

Prompt complexity classification (5 tiers: trivial → expert)
Cost-optimized model selection within each tier
Usage tracking with savings estimates vs. premium models
Provider preferences and capability-based routing
Deterministic decisions — same prompt always routes the same way

What It Doesn't Do

API proxy — Returns routing decisions only, you make the actual calls
Semantic analysis — Uses keyword matching, not embeddings or model inference
Learning system — Rules are static, doesn't adapt based on outcomes
Rate limiting — Handles routing logic only, not request management
Quality assessment — Assumes all models in a tier produce equivalent results
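Since the router only returns a decision, the calling code has to map the chosen model to a real client. One possible shape for that glue is sketched below. The client functions, the model names, and `dispatch` are hypothetical, and how the model name is read from the routing decision is not shown here, so the sketch takes it as a plain string.

```python
from typing import Callable, Dict

# Hypothetical provider clients; in real code these would wrap your SDK calls.
def call_small_model(prompt: str) -> str:
    return f"[small] {prompt}"

def call_large_model(prompt: str) -> str:
    return f"[large] {prompt}"

# Map the model name from a routing decision to the function that calls it.
CLIENTS: Dict[str, Callable[[str], str]] = {
    "small-model": call_small_model,
    "large-model": call_large_model,
}

def dispatch(model_name: str, prompt: str) -> str:
    """Send the prompt to whichever client is registered for the routed model."""
    try:
        client = CLIENTS[model_name]
    except KeyError:
        raise ValueError(f"no client registered for {model_name!r}") from None
    return client(prompt)

print(dispatch("small-model", "What is Python?"))  # [small] What is Python?
```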

Technical Approach

Same principles as antaris-memory:

Principle	Implementation
File-based	JSON config files. No databases, no external services.
Deterministic	Identical inputs produce identical routing decisions.
Offline-first	Classification runs locally using keyword matching.
Zero dependencies	Pure Python stdlib. No vendor lock-in.
Transparent	Inspect routing rules with any text editor.

Install

pip install antaris-router

Usage

from antaris_router import Router

# Initialize with default config  
router = Router()

# Route prompts to appropriate models
simple_q = router.route("What is Python?")
# → gpt-4o-mini ($0.15/MTok) instead of gpt-4o ($2.50/MTok)

architecture = router.route("""
Design a microservices architecture for handling 
100k concurrent users with Redis caching...
""")  
# → claude-sonnet ($3/MTok) instead of opus ($15/MTok)

# Log actual usage for cost tracking
router.log_usage(simple_q, input_tokens=12, output_tokens=150, actual_cost=0.0024)

# View savings report
savings = router.savings_estimate()
print(f"This month: ${savings['period_cost']:.2f}")
print(f"Without router: ${savings['baseline_cost']:.2f}")  
print(f"Saved: ${savings['total_savings']:.2f} ({savings['savings_percent']:.1f}%)")

Classification System

Five tiers, ordered from cheapest to most expensive:

Tier	Cost Range	Use Cases
Trivial	$0.10-0.20/MTok	Greetings, confirmations, simple Q&A
Simple	$0.15-0.50/MTok	Factual lookup, basic explanations
Moderate	$1.00-3.00/MTok	Analysis, summarization, structured data
Complex	$2.50-15.00/MTok	Code generation, technical design
Expert	$15.00-75.00/MTok	Novel research, creative problem solving

Classification signals:

Presence of technical keywords (API, algorithm, architecture)
Prompt length and structural complexity (code blocks, numbered lists)
Explicit complexity markers (explain in detail, comprehensive analysis)

This is not semantic understanding: classification relies on pattern matching, not on model inference.
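The signals listed above could be combined along the following lines. This is a minimal sketch under stated assumptions: the keyword list, the 50-word length threshold, the one-point-per-signal scoring, and the score-to-tier mapping are all invented for illustration, and the package's real rules live in its JSON config.

```python
import re

# Illustrative rules only; not the package's actual keywords or thresholds.
TECH_KEYWORDS = {"api", "algorithm", "architecture"}
COMPLEXITY_MARKERS = ("explain in detail", "comprehensive analysis")
TIERS = ["trivial", "simple", "moderate", "complex", "expert"]
FENCE = "`" * 3  # marker that opens a fenced code block

def classify(prompt: str) -> str:
    """Score a prompt with fixed rules and map the score to a tier."""
    text = prompt.lower()
    words = set(re.findall(r"[a-z0-9]+", text))
    score = len(TECH_KEYWORDS & words)                        # technical keywords
    score += sum(marker in text for marker in COMPLEXITY_MARKERS)  # explicit markers
    score += 1 if len(text.split()) > 50 else 0               # prompt length
    score += 1 if FENCE in prompt else 0                      # contains a code block
    score += 1 if re.search(r"^\s*\d+\.\s", prompt, re.M) else 0   # numbered list
    return TIERS[min(score, len(TIERS) - 1)]

print(classify("Hi there"))                                        # trivial
print(classify("Explain in detail the architecture of this API"))  # complex
```

No randomness or external state is involved, so calling `classify` twice on the same prompt returns the same tier.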

When This Works

Good fit:

High-volume applications with mixed complexity (customer support, content generation)
Budget-conscious teams that need predictable routing decisions
Workflows where 80% of prompts are routine, 20% need premium models
Integration into existing codebases without infrastructure changes
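To get a feel for the 80/20 case, the arithmetic below blends two per-token prices. It is a simplified sketch: it uses only the input prices quoted in the Usage example, assumes routine and premium prompts carry the same token volume, and ignores output-token pricing, so real savings will differ.

```python
# Per-million-token input prices quoted in the Usage example.
CHEAP_PRICE = 0.15    # gpt-4o-mini
PREMIUM_PRICE = 2.50  # gpt-4o

def blended_cost(routine_share: float, cheap: float, premium: float) -> float:
    """Average price per MTok when routine_share of tokens go to the cheap model."""
    return routine_share * cheap + (1 - routine_share) * premium

blended = blended_cost(0.80, CHEAP_PRICE, PREMIUM_PRICE)
savings_pct = (1 - blended / PREMIUM_PRICE) * 100
print(f"Blended: ${blended:.2f}/MTok, savings vs premium-only: {savings_pct:.1f}%")
# Blended: $0.62/MTok, savings vs premium-only: 75.2%
```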

Not a good fit:

Single-model applications (no cost optimization opportunity)
Highly specialized domains where complexity classification fails
Real-time applications needing sub-10ms routing decisions
Teams that prefer semantic similarity over keyword matching

Limitations

Pattern-based only — Misclassifies prompts that don't match keyword patterns
No quality feedback — Doesn't learn if cheaper models produce poor results
Static rules — Classification logic doesn't adapt to your specific use case
English-optimized — Keyword matching may not work well for other languages
No model performance tracking — Assumes all models in a tier are equivalent
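The English-optimized limitation is easy to reproduce with any whole-word keyword matcher. The sketch below uses an invented three-word keyword list and a hypothetical `keyword_hits` helper; it is not the package's matcher, but it shows how an equivalent German prompt yields no hits.

```python
import re

# Illustrative English keyword list, not the package's actual rules.
TECH_KEYWORDS = {"api", "algorithm", "architecture"}

def keyword_hits(prompt: str) -> set:
    """Return the technical keywords that appear as whole words in the prompt."""
    words = set(re.findall(r"\w+", prompt.lower()))
    return TECH_KEYWORDS & words

print(keyword_hits("Design an architecture for a payments service"))      # {'architecture'}
print(keyword_hits("Entwirf eine Architektur für einen Zahlungsdienst"))  # set()
```

A prompt that scores no keyword hits would fall into a lower tier than its English equivalent unless the rules are extended for that language.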

If you need semantic classification or quality-based routing, this tool isn't suitable.

Configuration

The router uses JSON files for all configuration. Defaults work for most use cases.

Customize model costs:

# Edit config/models.json to add new models or update pricing
vim config/models.json

Adjust classification rules:

# Modify config/classification.json to tune keyword matching
vim config/classification.json

Track usage:

from antaris_router import Router

router = Router()

# Cost tracking happens automatically
report = router.cost_report()
print(f"Monthly cost: ${report['total_cost']:.2f}")
print(f"Requests routed: {report['total_requests']:,}")

All configuration files use plain JSON: no proprietary formats or complex schemas.


Storage Format

Router state and cost tracking data are stored in JSON:

```json
{
  "version": "1.0.0",
  "saved_at": "2026-02-15T14:30:00",
  "usage_history": [
    {
      "timestamp": "2026-02-15T10:00:00",
      "model_name": "gpt-4o-mini",
      "tier": "simple",
      "input_tokens": 50,
      "output_tokens": 30,
      "actual_cost": 0.0000825,
      "routing_confidence": 0.87
    }
  ]
}
```

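Because the state is plain JSON, it can be inspected with the standard library alone. The sketch below parses the record shown in the storage example and totals `actual_cost` per model. The helper name `cost_by_model` is illustrative, and the state is embedded as a string because the file's location on disk is not specified here.

```python
import json
from collections import defaultdict

# Record copied from the storage example; the helper below is illustrative.
STATE_JSON = """
{
  "version": "1.0.0",
  "saved_at": "2026-02-15T14:30:00",
  "usage_history": [
    {
      "timestamp": "2026-02-15T10:00:00",
      "model_name": "gpt-4o-mini",
      "tier": "simple",
      "input_tokens": 50,
      "output_tokens": 30,
      "actual_cost": 0.0000825,
      "routing_confidence": 0.87
    }
  ]
}
"""

def cost_by_model(state: dict) -> dict:
    """Sum actual_cost per model_name across usage_history."""
    totals = defaultdict(float)
    for record in state.get("usage_history", []):
        totals[record["model_name"]] += record["actual_cost"]
    return dict(totals)

state = json.loads(STATE_JSON)
for model, cost in cost_by_model(state).items():
    print(f"{model}: ${cost:.7f}")
```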
Architecture

Simple 4-component design:

TaskClassifier — Prompt → complexity tier
ModelRegistry — Model definitions and costs
CostTracker — Usage logging and savings calculation
Router — Combines everything, returns routing decisions

Data flow: prompt → classify → find cheapest model for tier → return decision
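The data flow can be pictured as four small objects wired together. The classes below borrow the component names from the list above but are stand-ins written for this sketch, not the package's classes: the classifier is a word-count stub, the registry prices and model names are made up, and the method names (`classify`, `cheapest`, `log`, `savings_percent`) are assumptions.

```python
from dataclasses import dataclass, field

class TaskClassifier:
    """Stand-in classifier: a word-count threshold instead of real keyword rules."""
    def classify(self, prompt: str) -> str:
        return "complex" if len(prompt.split()) > 20 else "simple"

@dataclass
class ModelRegistry:
    # tier -> {model name: price per million tokens}; values are made up.
    models: dict

    def cheapest(self, tier: str) -> str:
        by_price = self.models[tier]
        return min(by_price, key=lambda name: (by_price[name], name))

@dataclass
class CostTracker:
    actual: float = 0.0
    baseline: float = 0.0

    def log(self, actual_cost: float, baseline_cost: float) -> None:
        """Accumulate what was spent and what a premium-only setup would have cost."""
        self.actual += actual_cost
        self.baseline += baseline_cost

    def savings_percent(self) -> float:
        if self.baseline == 0:
            return 0.0
        return (self.baseline - self.actual) / self.baseline * 100

@dataclass
class Router:
    classifier: TaskClassifier
    registry: ModelRegistry
    tracker: CostTracker = field(default_factory=CostTracker)

    def route(self, prompt: str) -> dict:
        # prompt -> classify -> cheapest model for tier -> decision
        tier = self.classifier.classify(prompt)
        return {"tier": tier, "model": self.registry.cheapest(tier)}

registry = ModelRegistry({
    "simple": {"small-model": 0.15, "mid-model": 0.50},
    "complex": {"large-model": 3.00, "premium-model": 15.00},
})
router = Router(TaskClassifier(), registry)
print(router.route("What is Python?"))  # {'tier': 'simple', 'model': 'small-model'}

# Monthly totals from the Cost Impact section, logged as one aggregate entry.
router.tracker.log(actual_cost=251.15, baseline_cost=847.20)
print(f"Savings: {router.tracker.savings_percent():.1f}%")
```

Keeping the tracker separate from the routing path means a decision never depends on past usage, which is what keeps routing deterministic.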

Related Tools

antaris-memory — File-based persistent memory for AI agents
OpenRouter, LiteLLM — Full model proxies (require API keys, network calls)
LangChain — Agent framework (uses model inference for routing)

Development

# Install development dependencies (quotes keep shells such as zsh from expanding the brackets)
pip install -e ".[dev]"

# Run tests
python -m pytest tests/ -v

# Type checking
mypy antaris_router/

License

Apache License 2.0. See LICENSE for details.

Part of Antaris Analytics — File-based tools for deterministic AI applications.


