Budget-Aware Agentic Routing — route LLM calls intelligently between cheap and powerful models with a hard budget cap.

baar-core (BAAR-Algo)

PyPI version | License: MIT | Python 3.10+ | Scientific Validation

Route LLM calls intelligently between cheap and powerful models — with a hard financial kill-switch that never breaks.


🚀 Why BAAR?

Every agent developer using GPT-4o has seen this:

  • Simple task → sent to GPT-4o anyway → 15× more expensive than necessary.
  • Budget set to $0.10 → agent burns $0.40 → surprise invoice.
  • No visibility into which agent step cost what, or why.

BAAR (Budget-Aware Agentic Routing) solves this at the protocol level.


🧠 How it Works

BAAR acts as a semantic gateway between your application and the LLM providers.

graph TD
    A[Your Task] --> B{Semantic Router}
    B -- Complexity < 0.65 --> C[gpt-4o-mini]
    B -- Complexity >= 0.65 --> D[Budget Kill-Switch]
    
    D -- "Affordable?" --> E[gpt-4o]
    D -- "Too Expensive" --> F[Force Downgrade to Mini]
    
    E --> G[Audit & Spend Tracking]
    F --> G
    C --> G
    
    G --> H[Final Response]

  1. Semantic Scoring: Uses a cheap model to score task complexity (0.0–1.0).
  2. BCD (Budget-Constrained Decoding): If the powerful model is too expensive for your remaining budget, BAAR automatically downgrades to a cheaper one to ensure the task completes without an overage.
  3. Local Rejection: If even the cheapest model exceeds the budget, the request is rejected locally with zero network cost.
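
The three steps above can be sketched as a single decision function. This is a minimal illustration, not BAAR's actual implementation: the `route` function, the `Decision` type, and the flat per-call cost estimates are all hypothetical, and the cost of the scoring call itself is left out.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical per-call cost estimates in USD (assumptions for illustration only).
SMALL_COST = 0.001
BIG_COST = 0.015


@dataclass
class Decision:
    model: Optional[str]  # None means the request was rejected locally
    reason: str


def route(complexity: float, remaining_budget: float,
          threshold: float = 0.65,
          small_cost: float = SMALL_COST,
          big_cost: float = BIG_COST) -> Decision:
    """Pick a model from a complexity score and the remaining budget."""
    # Local rejection: even the cheapest model does not fit, so no call is made.
    if small_cost > remaining_budget:
        return Decision(None, "rejected: budget exhausted")
    # Low-complexity tasks go straight to the small model.
    if complexity < threshold:
        return Decision("gpt-4o-mini", "below threshold")
    # Complex tasks use the big model only if it fits the remaining budget.
    if big_cost <= remaining_budget:
        return Decision("gpt-4o", "complex and affordable")
    # Otherwise downgrade so the task still completes without an overage.
    return Decision("gpt-4o-mini", "downgraded: big model too expensive")


print(route(0.9, remaining_budget=0.10))
print(route(0.9, remaining_budget=0.005))
print(route(0.9, remaining_budget=0.0005))
```

The affordability check for the small model comes first, so an exhausted budget is caught before any other decision is made.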

🔬 Benchmarking Results

BAAR-Algo is benchmarked on three standard datasets (MMLU, GSM8K, and HumanEval) against an ALWAYS-BIG baseline that sends every task to the big model.

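| Dataset | Strategy | Accuracy | Cost (USD) | Savings vs ALWAYS-BIG |
|---|---|---|---|---|
| MMLU (Knowledge) | ALWAYS-BIG | 100.0% | $0.0905 | - |
| MMLU (Knowledge) | BAAR-Algo | 70.0% | $0.0050 | 93.3% |
| GSM8K (Math) | ALWAYS-BIG | 100.0% | $0.0905 | - |
| GSM8K (Math) | BAAR-Algo | 80.0% | $0.0050 | 93.3% |
| HumanEval (Coding) | ALWAYS-BIG | 100.0% | $0.0105 | - |
| HumanEval (Coding) | BAAR-Algo | 100.0% | $0.0105 | 0.0%* |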

*On HumanEval, BAAR rates the tasks as high-complexity and routes them to the big model, so accuracy matches ALWAYS-BIG and there are no cost savings.

Run the Benchmark Yourself (Free)

baar-bench --dataset all --mock

📦 Installation

pip install baar-core

⚡ Quick Start

from baar import BAARRouter

# Set a hard $0.10 budget cap
router = BAARRouter(budget=0.10)

# This will be routed to gpt-4o-mini (Complexity ~0.1)
response = router.chat("What is the capital of France?")

# This will be routed to gpt-4o (Complexity ~0.9)
code = router.chat("Write a complex matrix multiplication in CUDA.")

🛡️ Resilience & Security

BAAR is designed for Financial Safety (Anti-Denial of Wallet).

| Attack Vector | BAAR Response | Proof |
|---|---|---|
| Unbounded Consumption | Zero-Call Rejection | Blocks the request locally with zero network calls. |
| Complexity Inflation | Semantic Scoring | Ignores gibberish/padding intended to drain the budget. |
| Sensitivity Toggling | Tunable Threshold | Adjust complexity_threshold to match your quality needs. |

Verify resilience locally:

baar-stress
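
To make the zero-call rejection idea concrete, here is a small sketch of a pre-flight budget guard. It is an illustration under stated assumptions rather than BAAR's internals: `BudgetGuard`, `BudgetExceeded`, and the `send` callback are hypothetical names, and the guard charges the estimated cost instead of reconciling with a provider invoice.

```python
class BudgetExceeded(Exception):
    """Raised locally, before any request is sent."""


class BudgetGuard:
    """Hypothetical guard that refuses calls which would exceed a hard cap."""

    def __init__(self, budget_usd: float) -> None:
        self.budget_usd = budget_usd
        self.spent_usd = 0.0
        self.network_calls = 0

    @property
    def remaining(self) -> float:
        return self.budget_usd - self.spent_usd

    def call(self, estimated_cost_usd: float, send):
        # Reject before touching the network if the estimate does not fit.
        if estimated_cost_usd > self.remaining:
            raise BudgetExceeded(
                f"need ${estimated_cost_usd:.4f}, have ${self.remaining:.4f}"
            )
        self.network_calls += 1
        result = send()
        # Charge the estimate; a real system would reconcile with the billed amount.
        self.spent_usd += estimated_cost_usd
        return result


guard = BudgetGuard(budget_usd=0.10)
guard.call(0.04, lambda: "first response")
guard.call(0.04, lambda: "second response")
try:
    guard.call(0.04, lambda: "never sent")
except BudgetExceeded as exc:
    print(f"Rejected locally: {exc}")
print(f"Network calls made: {guard.network_calls}")
```

Because the check runs before `send` is invoked, a rejected request never reaches the provider and adds nothing to the spend.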

🛠️ Configuration

router = BAARRouter(
    budget=0.10,                    # Hard cap in USD
    small_model="gpt-4o-mini",      # Cheap model
    big_model="gpt-4o",             # Powerful model
    complexity_threshold=0.65,      # 0–1: scores at or above this → use big model
)
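
As a rough way to reason about `complexity_threshold`, the sketch below shows how the share of tasks sent to the big model shifts as the threshold moves. The `big_model_share` helper and the sample scores are hypothetical and are not part of the package; real scores would come from the router's semantic scoring step.

```python
def big_model_share(scores, threshold):
    """Fraction of tasks whose complexity score meets or exceeds the threshold."""
    if not scores:
        return 0.0
    return sum(s >= threshold for s in scores) / len(scores)


# Hypothetical complexity scores for a batch of five tasks.
scores = [0.10, 0.30, 0.55, 0.70, 0.90]

for threshold in (0.50, 0.65, 0.80):
    share = big_model_share(scores, threshold)
    print(f"threshold={threshold:.2f}: {share:.0%} of tasks go to the big model")
```

A lower threshold trades cost for quality by sending more tasks to the big model; a higher one does the reverse.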

📄 License & Research

Distributed under the MIT License. See LICENSE for more information.

For architectural details and mapping to the OWASP LLM10 security framework, see RESEARCH.md.
