LLM API cost interceptor and budget enforcer for AI agents

agentcents

LLM API cost tracking proxy and budget enforcement for AI agents.

Drop agentcents between your agent and any LLM provider. It tracks every call, enforces budgets, caches responses, and tells you exactly where your money is going — across cloud APIs and local models.

Your Agent  →  agentcents proxy (localhost:8082)  →  OpenAI / Anthropic / Ollama

No changes to your agent's logic are required. Just point your LLM client at the proxy.

Install

pip install agentcents

Pro features require a license key from labhamfounder.gumroad.com/l/agentcents-pro.

What to expect

Zero configuration to get started. Install, start the proxy, point your LLM client at it — that's it.

Step 1: pip install agentcents          (one time)
Step 2: agentcents start                (once per session)
Step 3: point your LLM client at it     (set base_url and one header)
Step 4: agentcents usage                (see your costs)

The free tier requires no agentcents account, license key, or signup; your LLM client still uses its own provider API key.

Configuration is optional. Add settings to ~/.agentcents.toml (or a request header) only when you need them:

Goal	What to add
Hard budget limits	`[budgets] daily = 5.00`
Routing warnings when budget runs low	`[routing] threshold_pct = 80`
Track local Ollama power costs	`[local] gpu_watts = 40`
Separate costs per agent	`X-Agentcents-Tag` header on each call

Pricing data syncs automatically on proxy startup whenever the local copy is more than 24 hours old, so you only need to run agentcents sync manually to force a refresh, for example after a provider announces new models.

Pro license — activate once per machine:

agentcents activate <your-key>

Pro features are then available immediately. No restart needed.

Quick Start

1. Start the proxy

agentcents start

2. Point your LLM client at the proxy

import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:8082",
    default_headers={"X-Agentcents-Target": "https://api.anthropic.com"},
)

3. Check your costs

agentcents usage
agentcents recent

That's it. Every call is now tracked.

Configuration

Create ~/.agentcents.toml to configure budgets, routing, and local models.

# ~/.agentcents.toml

# ── Budgets ────────────────────────────────────────────────────────────────
[budgets]
daily   = 5.00    # hard block at $5/day across all calls
monthly = 50.00   # used by `agentcents rolling` reporting

# Per-tag daily budgets (optional)
[budgets.tags.my-agent]
daily = 1.00

[budgets.tags.research]
daily = 2.00

# ── Auto-routing ───────────────────────────────────────────────────────────
[routing]
mode           = "warn"   # "warn" — log suggestion only
                          # "swap" — silently swap model (Pro)
                          # "off"  — disable routing
threshold_pct  = 80       # trigger when X% of daily budget is used
skip_tool_use  = true     # never swap requests that use tools

# ── Local Models (Ollama) ──────────────────────────────────────────────────
[local]
gpu_watts        = 40     # your GPU/chip TDP in watts
                          # M1 Max ≈ 40W, M2 Ultra ≈ 60W, RTX 4090 ≈ 450W
electricity_rate = 0.12   # $/kWh — check your electricity bill
ollama_base_url  = "http://localhost:11434"

# ── Advisor ────────────────────────────────────────────────────────────────
[advisor]
min_saving_pct = 20       # only suggest swaps that save ≥ 20%

Budget behavior

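Spend vs daily budget	Action
Below 80%	Normal
80% to below 100%	⚠ ROUTING WARN logged, `X-Agentcents-Suggest` header added
100% or more	429 `budget_exceeded` returned, call blocked

The 80% boundary is the `threshold_pct` value from the [routing] section of the example config.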
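
The thresholds in the table can be read as a simple three-way classification. The function below is a minimal sketch of that reading, not agentcents' actual implementation: budget_action is a hypothetical name, and treating the boundaries as inclusive (exactly 80% warns, exactly 100% blocks) is an assumption.

```python
def budget_action(spent: float, daily_budget: float, threshold_pct: float = 80) -> str:
    """Classify spend against a daily budget as "normal", "warn", or "block".

    Hypothetical helper mirroring the budget table; agentcents' internal
    logic may differ, especially at the exact boundaries.
    """
    if daily_budget <= 0:
        raise ValueError("daily_budget must be positive")
    used_pct = spent / daily_budget * 100
    if used_pct >= 100:
        return "block"   # the proxy would answer 429 budget_exceeded
    if used_pct >= threshold_pct:
        return "warn"    # routing warning plus X-Agentcents-Suggest header
    return "normal"


for spent in (1.00, 4.25, 5.00):
    print(f"${spent:.2f} of $5.00 -> {budget_action(spent, 5.00)}")
```

With a $5.00 daily budget, $1.00 is classified as normal, $4.25 (85%) as a warning, and $5.00 as blocked.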

Request Headers

Add these headers to your LLM client requests to control agentcents behavior.

Header	Required	Example	Description
`X-Agentcents-Target`	Yes (except on the `/ollama` route)	`https://api.anthropic.com`	Provider base URL to forward to
`X-Agentcents-Tag`	No	`my-agent`	Group calls for cost reporting
`X-Agentcents-Session`	No	`agent-run-42`	Track individual agent sessions
`X-Agentcents-Cache`	No	`off`	Disable cache for this request
`X-Agentcents-Cache`	No	`exact`	Exact-match cache only, skip semantic

Examples

# Tag calls by project
client = anthropic.Anthropic(
    base_url="http://localhost:8082",
    default_headers={
        "X-Agentcents-Target":  "https://api.anthropic.com",
        "X-Agentcents-Tag":     "research-agent",
        "X-Agentcents-Session": "run-001",
    },
)

# Disable cache for a specific call
response = client.messages.create(
    model="claude-3-haiku-20240307",
    max_tokens=100,
    messages=[{"role": "user", "content": "hello"}],
    extra_headers={"X-Agentcents-Cache": "off"},
)

Response headers

Header	Description
`X-Agentcents-Cache: exact-hit`	Response served from exact-match cache
`X-Agentcents-Cache: semantic-hit`	Response served from semantic cache (Pro)
`X-Agentcents-Suggest: <model>`	Cheaper model suggested (routing warn)
`X-Agentcents-Routed: <model>`	Model was swapped to this (routing swap, Pro)
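
If you want to act on these headers in your own code, one option is a small helper that turns a header mapping into a summary. The sketch below is an illustration only: summarize_agentcents_headers is a hypothetical name, and the model ID in the example is a placeholder rather than a value agentcents is known to return.

```python
from typing import Mapping, Optional

# Cache header values that indicate a cache hit, per the table above.
CACHE_HITS = {"exact-hit": "exact", "semantic-hit": "semantic"}


def summarize_agentcents_headers(headers: Mapping[str, str]) -> dict:
    """Summarize the agentcents response headers from a header mapping.

    Header names are matched case-insensitively, since HTTP header names
    are case-insensitive. Hypothetical helper, not an agentcents API.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    cache_kind: Optional[str] = CACHE_HITS.get(lowered.get("x-agentcents-cache", ""))
    return {
        "cache": cache_kind,  # "exact", "semantic", or None if not a cache hit
        "suggested_model": lowered.get("x-agentcents-suggest"),
        "routed_model": lowered.get("x-agentcents-routed"),
    }


example = {
    "X-Agentcents-Cache": "exact-hit",
    "X-Agentcents-Suggest": "cheaper-model-id",  # placeholder model ID
}
print(summarize_agentcents_headers(example))
```

Any client that exposes raw response headers (for example, the Anthropic Python SDK's with_raw_response accessor) can pass them to a helper like this.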

Local Models (Ollama)

Route Ollama calls through agentcents to track GPU power costs alongside cloud API costs.

Start Ollama normally:

ollama serve

Point your Ollama client at the proxy:

# Instead of http://localhost:11434
# Use    http://localhost:8082/ollama

curl http://localhost:8082/ollama/api/chat -d '{
  "model": "llama3:8b",
  "stream": false,
  "messages": [{"role": "user", "content": "hello"}]
}'

Or use the OpenAI-compatible endpoint:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8082/ollama/v1",
    api_key="ollama",
)

Power cost is estimated as:

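cost = (inference_seconds / 3600) × (gpu_watts / 1000) × electricity_rate

Here gpu_watts is divided by 1000 to convert watts to kilowatts, because electricity_rate is expressed in $/kWh.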

Configure gpu_watts and electricity_rate in ~/.agentcents.toml.
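
As a worked example of the power-cost estimate, the sketch below computes the cost of a single 30-second inference using the sample config values (40 W, $0.12/kWh). It assumes gpu_watts is in watts and electricity_rate is in $/kWh, so watts are converted to kilowatts before multiplying; local_power_cost is a hypothetical helper, not agentcents code.

```python
def local_power_cost(inference_seconds: float, gpu_watts: float, electricity_rate: float) -> float:
    """Estimate the electricity cost in dollars of one local inference.

    Assumes gpu_watts is in watts and electricity_rate is in $/kWh, so the
    wattage is divided by 1000 to get kilowatts.
    """
    hours = inference_seconds / 3600
    kilowatts = gpu_watts / 1000
    return hours * kilowatts * electricity_rate


cost = local_power_cost(inference_seconds=30, gpu_watts=40, electricity_rate=0.12)
print(f"${cost:.6f}")  # $0.000040
```

At these values a single call costs a small fraction of a cent, so the estimate is mostly useful when aggregated over many calls.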

CLI Reference

agentcents <command> [options]

Cost reporting

agentcents usage                    # cost summary last 24h
agentcents usage --hours 168        # last 7 days
agentcents usage --tag my-agent     # filter by tag

agentcents recent                   # last 20 individual calls
agentcents recent --n 50            # last 50 calls

agentcents rolling                  # 30-day rolling spend
agentcents rolling --days 7         # 7-day rolling spend

agentcents agents                   # per-agent/session breakdown
agentcents agents --hours 48        # last 48h

agentcents local                    # local vs cloud cost comparison

Live monitoring

agentcents watch                    # live tail of calls (Pro)
agentcents watch --poll 1           # refresh every 1 second
agentcents dashboard                # full TUI dashboard (Pro)

Budget alerts

agentcents alerts                   # recent budget alerts
agentcents alerts --n 50            # last 50 alerts

Catalog & models

agentcents models                   # list all models with pricing
agentcents sync                     # force sync pricing + chains

Intelligence (Pro)

agentcents suggest                  # model swap suggestions based on usage
agentcents suggest --hours 168      # based on last 7 days
agentcents train                    # train XGBoost cost predictor

License

agentcents activate <key>           # activate Pro license
agentcents deactivate               # remove Pro license
agentcents features                 # show available features

Pro Features

Feature	Free	Pro
Proxy + cost logging	✓	✓
Exact-match cache	✓	✓
Budget alerts + hard block	✓	✓
CLI reporting	✓	✓
Web dashboard	✓	✓
Local Ollama tracking	✓	✓
Semantic similarity cache	—	✓
Multi-agent TUI dashboard	—	✓
Live watch	—	✓
Model swap advisor	—	✓
Auto-routing (swap mode)	—	✓
XGBoost cost predictor	—	✓

Get Pro at labhamfounder.gumroad.com/l/agentcents-pro.

Supported Providers

The providers below are supported, along with any other provider that speaks the OpenAI API format:

Provider	Target URL
Anthropic	`https://api.anthropic.com`
OpenAI	`https://api.openai.com`
Google Gemini	`https://generativelanguage.googleapis.com`
OpenRouter	`https://openrouter.ai/api`
Groq	`https://api.groq.com/openai`
Ollama	via `/ollama` route (no header needed)

Sync

agentcents keeps two files updated in ~/.agentcents/:

File	Contents	Source
`models.json`	Model pricing ($/M tokens)	OpenRouter + LiteLLM
`chains.json`	Downgrade chains for routing	labham.com

These files update in two ways:

Proxy startup: if the files are older than 24h, the proxy fetches fresh data automatically when you run agentcents start.
Manual: run agentcents sync at any time to force an update.

agentcents sync
# Syncing pricing catalog...
# Chains updated to v1.0.1
# Done.

Why this matters: Anthropic and OpenAI release new models frequently. Without syncing, agentcents may not recognize new model IDs or have accurate pricing. Run agentcents sync after any major provider announcement.

If sync fails (no internet, server down), agentcents falls back to the bundled data/chains.json and data/fallback.json that shipped with the package.
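
The 24-hour staleness rule described above can be approximated with a file-modification-time check. This is a minimal sketch under assumptions: needs_sync is a hypothetical helper, and using the file's mtime as the "age" is a guess at how freshness is measured, not a description of agentcents' internals.

```python
import os
import time
from typing import Optional

MAX_AGE_SECONDS = 24 * 3600  # the 24h threshold described above


def needs_sync(path: str, now: Optional[float] = None) -> bool:
    """Return True if the file is missing or was last modified more than 24h ago.

    Hypothetical helper; assumes freshness is judged by modification time.
    """
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(path)
    except OSError:
        return True  # a missing or unreadable file also needs a sync
    return age > MAX_AGE_SECONDS


catalog = os.path.expanduser("~/.agentcents/models.json")
print("sync needed:", needs_sync(catalog))
```

A check like this could be used in a wrapper script to decide whether to run agentcents sync before a long agent run.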

Architecture

~/.agentcents.toml          — budgets, routing, local config
~/.agentcents/models.json   — pricing catalog (auto-updated)
~/.agentcents/chains.json   — downgrade chains (auto-updated)
~/.agentcents/ledger.db     — all call records (SQLite)

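The proxy runs entirely on your machine, and call records are stored only in the local ledger. Requests are forwarded only to the provider named in X-Agentcents-Target or to your local Ollama server. Pricing data syncs from the OpenRouter and LiteLLM APIs, and downgrade chains sync from labham.com. License validation calls agentcents-license.labham.workers.dev.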

License

Release history

Version	Released
0.1.22 (this version)	Mar 5, 2026
0.1.21	Mar 2, 2026
0.1.18	Mar 2, 2026
0.1.17	Mar 2, 2026
0.1.16	Mar 1, 2026
0.1.15	Mar 1, 2026
0.1.14	Mar 1, 2026
0.1.13	Mar 1, 2026
0.1.12	Mar 1, 2026
0.1.11	Mar 1, 2026
0.1.10	Mar 1, 2026
0.1.9	Mar 1, 2026
0.1.8	Mar 1, 2026
0.1.7	Mar 1, 2026
0.1.6	Mar 1, 2026
0.1.5	Mar 1, 2026
0.1.4	Mar 1, 2026
0.1.3	Mar 1, 2026
0.1.2	Feb 28, 2026
0.1.1	Feb 28, 2026
0.1.0	Feb 28, 2026
0.0.1	Feb 23, 2026

Download files

Source Distributions

No source distribution files are available for this release.

Built Distribution

agentcents-0.1.22-py3-none-any.whl (49.5 kB)

Uploaded Mar 5, 2026 Python 3

File details

Details for the file agentcents-0.1.22-py3-none-any.whl.

File metadata

Download URL: agentcents-0.1.22-py3-none-any.whl
Upload date: Mar 5, 2026
Size: 49.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.9.16

File hashes

Hashes for agentcents-0.1.22-py3-none-any.whl
Algorithm	Hash digest
SHA256	`366a7a4c5af3a6345e484423baa6bb35e614e77d667cd40b02252f9938d114bc`
MD5	`6c2b2bef36a13eeee8092d602fd91036`
BLAKE2b-256	`6757855ac5b5d40e9c0a07d9748d9f9b089bf189c32d76595df078586c3e162c`
