CostSentinel
AI cost governance middleware — budget enforcement, attribution, and reporting for LLM API calls.
CostSentinel sits between your application and LLM providers, tracking every token spent, enforcing budget policies, and attributing costs to teams, users, and endpoints. Zero external dependencies beyond PyYAML.
Features
- Token Cost Tracking — Automatic cost calculation for Claude, Titan, and custom models
- Budget Enforcement — Daily/monthly limits with configurable actions (block, downgrade, alert)
- Cost Attribution — Track spending by team, user, endpoint, and model
- Middleware Pattern — Decorator-based interception or manual tracking
- Reporting — CLI and programmatic cost breakdowns
- Zero Infrastructure — JSON file storage for development; DynamoDB-ready for production
Installation
pip install substrai-costsentinel
For AWS integration (DynamoDB state backend):
pip install "substrai-costsentinel[aws]"
For development:
pip install "substrai-costsentinel[dev]"
Quickstart
1. Initialize Configuration
costsentinel init
This creates costsentinel.yaml with default policies:
project_name: my-project
pricing:
  claude-3.5-sonnet:
    input: 0.003
    output: 0.015
  claude-3-haiku:
    input: 0.00025
    output: 0.00125
policies:
  global:
    limit_daily: 100.0
    limit_monthly: 2000.0
    on_exceed: block
  team:
    limit_daily: 25.0
    on_exceed: downgrade
  user:
    limit_daily: 5.0
    on_exceed: block
    max_cost_per_request: 0.50
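To make the pricing table concrete, here is a minimal sketch of how a per-call cost could be derived from it. The README does not state the unit, so the sketch assumes the `input` and `output` values are USD per 1,000 tokens. `PRICING` and `estimate_cost` are hypothetical names for illustration, not part of the CostSentinel API.

```python
# Hypothetical helper; not part of the CostSentinel API.
# Assumption: prices are USD per 1,000 tokens (the unit is not stated in the config).
PRICING = {
    "claude-3.5-sonnet": {"input": 0.003, "output": 0.015},
    "claude-3-haiku": {"input": 0.00025, "output": 0.00125},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one call under the assumed unit."""
    try:
        rates = PRICING[model]
    except KeyError:
        raise ValueError(f"no pricing configured for model {model!r}") from None
    input_cost = (input_tokens / 1000) * rates["input"]
    output_cost = (output_tokens / 1000) * rates["output"]
    return input_cost + output_cost


cost = estimate_cost("claude-3.5-sonnet", input_tokens=1500, output_tokens=800)
print(f"${cost:.4f}")  # prints $0.0165
```

Under that assumption, a single call of this size stays well below the sample `max_cost_per_request` of 0.50.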
2. Add Middleware to Your Code
from costsentinel import CostMiddleware

middleware = CostMiddleware("costsentinel.yaml")

@middleware.intercept(model="claude-3-haiku", user_id="user-1", team_id="engineering")
def call_llm(prompt):
    # Your LLM call here
    response = my_llm_client.complete(prompt)
    return response.text, response.input_tokens, response.output_tokens

# Returns CallResult with cost info
result = call_llm("Summarize this document...")
print(f"Cost: ${result.cost:.4f}, Remaining: ${result.budget_remaining:.2f}")
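For readers curious how decorator-based interception can work in general, the following is a simplified, self-contained sketch and not CostSentinel's actual implementation. It assumes the wrapped function returns a `(text, input_tokens, output_tokens)` tuple as in the example above, and that prices are USD per 1,000 tokens. `intercept`, `SketchResult`, `RATES`, and `fake_llm` are hypothetical names.

```python
import functools
from dataclasses import dataclass

# Assumption: USD per 1,000 tokens, mirroring the sample configuration.
RATES = {"claude-3-haiku": {"input": 0.00025, "output": 0.00125}}


@dataclass
class SketchResult:
    """Hypothetical stand-in for the library's CallResult."""
    text: str
    cost: float


def intercept(model: str):
    """Wrap a function that returns (text, input_tokens, output_tokens)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            text, input_tokens, output_tokens = func(*args, **kwargs)
            rates = RATES[model]
            cost = (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1000
            return SketchResult(text=text, cost=cost)
        return wrapper
    return decorator


@intercept(model="claude-3-haiku")
def fake_llm(prompt):
    # Stand-in for a real provider call: returns text plus made-up token counts.
    return prompt.upper(), 1000, 2000


result = fake_llm("hello")
print(result.text, f"${result.cost:.5f}")
```

A real middleware would also check budgets before invoking the wrapped function and persist the spend afterwards; this sketch covers only the cost calculation step.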
3. Manual Tracking
from costsentinel import CostMiddleware

middleware = CostMiddleware("costsentinel.yaml")

# After an LLM call completes
result = middleware.track_call(
    model="claude-3.5-sonnet",
    input_tokens=1500,
    output_tokens=800,
    metadata={"user_id": "user-1", "team_id": "engineering", "endpoint": "/api/chat"}
)
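The metadata keys passed to `track_call` are what make attribution possible. As an illustration only, the sketch below groups a few made-up call records by one of those keys. The record layout and the `totals_by` helper are assumptions, not the format CostSentinel stores internally.

```python
from collections import defaultdict

# Hypothetical call records; each mirrors the metadata keys shown above.
records = [
    {"cost": 0.0165, "user_id": "user-1", "team_id": "engineering", "endpoint": "/api/chat"},
    {"cost": 0.0040, "user_id": "user-2", "team_id": "engineering", "endpoint": "/api/chat"},
    {"cost": 0.0100, "user_id": "user-1", "team_id": "engineering", "endpoint": "/api/summarize"},
]


def totals_by(records, key):
    """Sum the cost of all records, grouped by one metadata key."""
    totals = defaultdict(float)
    for record in records:
        # Records missing the key are grouped under "unknown".
        totals[record.get(key, "unknown")] += record["cost"]
    return dict(totals)


for key in ("user_id", "team_id", "endpoint"):
    print(key, totals_by(records, key))
```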
4. Check Reports
costsentinel report --today
costsentinel budget status
Architecture
┌─────────────────────────────────────────────────┐
│ Your Application │
├─────────────────────────────────────────────────┤
│ CostSentinel Middleware │
│ ┌──────────┐ ┌──────────┐ ┌──────────────┐ │
│ │ Pricing │ │ Budget │ │ Attribution │ │
│ │ Engine │ │ Enforcer │ │ Store │ │
│ └──────────┘ └──────────┘ └──────────────┘ │
│ ┌──────────────────────────────────────────┐ │
│ │ State Management │ │
│ │ (JSON file / DynamoDB backend) │ │
│ └──────────────────────────────────────────┘ │
├─────────────────────────────────────────────────┤
│ LLM Provider (Bedrock) │
└─────────────────────────────────────────────────┘
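The state management layer in the diagram can be pictured with a small file-backed store. The sketch below is one possible shape for a JSON backend, not the library's actual storage format: the `JsonSpendStore` class, its methods, and the `{"YYYY-MM-DD": {"scope:id": usd}}` layout are all assumptions made for illustration.

```python
import json
import os
import tempfile
from datetime import date


class JsonSpendStore:
    """Hypothetical file-backed store: {"YYYY-MM-DD": {"scope:id": usd}}."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        try:
            with open(self.path, encoding="utf-8") as fh:
                return json.load(fh)
        except FileNotFoundError:
            return {}

    def add(self, key, cost, day=None):
        """Add cost to a counter and return the new total for that day."""
        day = day or date.today().isoformat()
        data = self._load()
        bucket = data.setdefault(day, {})
        bucket[key] = bucket.get(key, 0.0) + cost
        # Write to a temp file in the same directory, then rename it into place
        # so a crash mid-write does not leave a truncated state file.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
        os.replace(tmp_path, self.path)
        return bucket[key]

    def spent(self, key, day=None):
        """Return the recorded spend for a counter, or 0.0 if none exists."""
        day = day or date.today().isoformat()
        return self._load().get(day, {}).get(key, 0.0)
```

This approach has no locking, so it only suits single-process development use; concurrent writers are the kind of case a DynamoDB backend would handle.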
CLI Commands
| Command | Description |
|---|---|
| costsentinel init | Create default configuration |
| costsentinel report --today | Show today's cost breakdown |
| costsentinel budget status | Display budget utilization |
| costsentinel budget reset --scope user --id user-1 | Reset a budget counter |
| costsentinel validate | Validate configuration file |
| costsentinel status | Show overall status |
Budget Policies
Policies are evaluated from most specific to least specific:
- User — Per-user daily/monthly limits
- Endpoint — Per-API-endpoint limits
- Team — Per-team limits
- Global — Organization-wide limits
Actions when budget is exceeded:
- block: Reject the request with BudgetExceededError
- downgrade: Signal to use a cheaper model
- alert: Allow but emit a warning
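The evaluation order and actions above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's enforcement code: it assumes the first exceeded policy decides the outcome, checks daily limits only, and uses hypothetical names (`Policy`, `evaluate`, and `BudgetExceeded` as a stand-in for `BudgetExceededError`).

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Hypothetical stand-in for the library's BudgetExceededError."""


@dataclass
class Policy:
    limit_daily: float  # USD
    on_exceed: str      # "block", "downgrade", or "alert"


# Most specific scope first, as described above.
ORDER = ["user", "endpoint", "team", "global"]


def evaluate(policies, spent_today, request_cost):
    """Return "allow", "downgrade", or "alert"; raise BudgetExceeded on "block".

    policies maps scope -> Policy; spent_today maps scope -> USD spent so far.
    Assumption: the first exceeded policy in ORDER decides the outcome.
    """
    for scope in ORDER:
        policy = policies.get(scope)
        if policy is None:
            continue
        projected = spent_today.get(scope, 0.0) + request_cost
        if projected > policy.limit_daily:
            if policy.on_exceed == "block":
                raise BudgetExceeded(f"{scope} daily limit of {policy.limit_daily} exceeded")
            return policy.on_exceed
    return "allow"


policies = {
    "user": Policy(5.0, "block"),
    "team": Policy(25.0, "downgrade"),
    "global": Policy(100.0, "block"),
}
decision = evaluate(policies, {"user": 1.0, "team": 24.9, "global": 50.0}, request_cost=0.2)
print(decision)  # prints downgrade
```

Here the user is within budget but the team counter would pass 25.0, so the team policy's `downgrade` action applies before the global policy is consulted.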
Development
git clone https://github.com/substrai/costsentinel.git
cd costsentinel
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest
License
MIT — Copyright (c) 2024 Gaurav Kumar Sinha