CostSentinel

AI cost governance middleware — budget enforcement, attribution, and reporting for LLM API calls.

CostSentinel sits between your application and LLM providers. It tracks every token spent, enforces budget policies, and attributes costs to teams, users, and endpoints. PyYAML is its only required dependency.

Features

  • Token Cost Tracking — Automatic cost calculation for Claude, Titan, and custom models
  • Budget Enforcement — Daily/monthly limits with configurable actions (block, downgrade, alert)
  • Cost Attribution — Track spending by team, user, endpoint, and model
  • Middleware Pattern — Decorator-based interception or manual tracking
  • Reporting — CLI and programmatic cost breakdowns
  • Zero Infrastructure — JSON file storage for development; DynamoDB-ready for production

Installation

pip install substrai-costsentinel

For AWS integration (DynamoDB state backend):

pip install substrai-costsentinel[aws]

For development:

pip install substrai-costsentinel[dev]

Quickstart

1. Initialize Configuration

costsentinel init

This creates costsentinel.yaml with default policies:

project_name: my-project

pricing:
  claude-3.5-sonnet:
    input: 0.003
    output: 0.015
  claude-3-haiku:
    input: 0.00025
    output: 0.00125

policies:
  global:
    limit_daily: 100.0
    limit_monthly: 2000.0
    on_exceed: block
  team:
    limit_daily: 25.0
    on_exceed: downgrade
  user:
    limit_daily: 5.0
    on_exceed: block
    max_cost_per_request: 0.50

2. Add Middleware to Your Code

from costsentinel import CostMiddleware

middleware = CostMiddleware("costsentinel.yaml")

@middleware.intercept(model="claude-3-haiku", user_id="user-1", team_id="engineering")
def call_llm(prompt):
    # Your LLM call here; my_llm_client is a placeholder for your own provider client
    response = my_llm_client.complete(prompt)
    return response.text, response.input_tokens, response.output_tokens

# The decorated function returns a CallResult with cost info instead of the raw tuple
result = call_llm("Summarize this document...")
print(f"Cost: ${result.cost:.4f}, Remaining: ${result.budget_remaining:.2f}")
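
To make the decorator pattern concrete, here is a minimal, self-contained sketch of how an intercepting decorator of this kind could work. It is not CostSentinel's implementation: `intercept`, `SketchResult`, and `fake_llm` are hypothetical stand-ins, and the prices are assumed to be USD per 1,000 tokens, which the sample config does not state.

```python
from dataclasses import dataclass
from functools import wraps

# Assumed unit: USD per 1,000 tokens (values copied from the sample config).
PRICING = {"claude-3-haiku": {"input": 0.00025, "output": 0.00125}}


@dataclass
class SketchResult:
    """Hypothetical stand-in for the CallResult mentioned above."""
    text: str
    cost: float


def intercept(model):
    """Wrap a function that returns (text, input_tokens, output_tokens)."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            text, input_tokens, output_tokens = func(*args, **kwargs)
            price = PRICING[model]
            cost = (input_tokens * price["input"]
                    + output_tokens * price["output"]) / 1000
            return SketchResult(text=text, cost=cost)
        return wrapper
    return decorator


@intercept(model="claude-3-haiku")
def fake_llm(prompt):
    # Stand-in for a real provider call; the token counts are made up.
    return prompt.upper(), 1000, 2000


sketch = fake_llm("hello")
print(f"{sketch.text}: ${sketch.cost:.5f}")
```

The wrapper replaces the function's tuple return value with a result object, which is why the caller in the example above reads `result.cost` rather than unpacking three values.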

3. Manual Tracking

from costsentinel import CostMiddleware

middleware = CostMiddleware("costsentinel.yaml")

# After an LLM call completes
result = middleware.track_call(
    model="claude-3.5-sonnet",
    input_tokens=1500,
    output_tokens=800,
    metadata={"user_id": "user-1", "team_id": "engineering", "endpoint": "/api/chat"}
)
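
As a rough check on what this call would be charged, the arithmetic can be worked by hand. The sketch assumes the `pricing` values in the sample config are USD per 1,000 tokens, which the config itself does not state; `estimate_cost` is a hypothetical helper, not part of the CostSentinel API.

```python
def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Return the USD cost, assuming prices are quoted per 1,000 tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1000


# claude-3.5-sonnet prices from the sample config: input 0.003, output 0.015.
estimated = estimate_cost(1500, 800, input_price=0.003, output_price=0.015)
print(f"Estimated cost: ${estimated:.4f}")
```

Under that assumption the call costs about $0.0165, well below the 0.50 `max_cost_per_request` cap in the sample user policy.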

4. Check Reports

costsentinel report --today
costsentinel budget status

Architecture

┌─────────────────────────────────────────────────┐
│                  Your Application                │
├─────────────────────────────────────────────────┤
│              CostSentinel Middleware             │
│  ┌──────────┐  ┌──────────┐  ┌──────────────┐  │
│  │ Pricing  │  │  Budget  │  │ Attribution  │  │
│  │ Engine   │  │ Enforcer │  │    Store     │  │
│  └──────────┘  └──────────┘  └──────────────┘  │
│  ┌──────────────────────────────────────────┐   │
│  │            State Management              │   │
│  │     (JSON file / DynamoDB backend)       │   │
│  └──────────────────────────────────────────┘   │
├─────────────────────────────────────────────────┤
│              LLM Provider (Bedrock)              │
└─────────────────────────────────────────────────┘
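
The state management layer in the diagram can be pictured with a small file-backed store. The sketch below is one possible shape for a JSON backend, not the library's actual implementation: `JsonSpendStore`, its methods, and the `scope:id` key format are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


class JsonSpendStore:
    """Hypothetical file-backed store mapping 'scope:id' keys to spend in USD."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self):
        # A missing file is treated as "nothing spent yet".
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def add(self, key, cost):
        state = self.load()
        state[key] = state.get(key, 0.0) + cost
        # Write to a sibling temp file, then rename it over the original,
        # so a crash mid-write does not leave a truncated state file.
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
        tmp.replace(self.path)
        return state[key]


with tempfile.TemporaryDirectory() as workdir:
    store = JsonSpendStore(Path(workdir) / "state.json")
    store.add("user:user-1", 0.25)
    total = store.add("user:user-1", 0.5)
    reloaded = store.load()
    print(total, reloaded)
```

A single JSON file like this has no locking, so it suits one process on a development machine; concurrent writers are the usual reason to move to a backend such as DynamoDB.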

CLI Commands

Command                                             Description
costsentinel init                                   Create default configuration
costsentinel report --today                         Show today's cost breakdown
costsentinel budget status                          Display budget utilization
costsentinel budget reset --scope user --id user-1  Reset a budget counter
costsentinel validate                               Validate configuration file
costsentinel status                                 Show overall status

Budget Policies

Policies are evaluated from most specific to least specific:

  1. User — Per-user daily/monthly limits
  2. Endpoint — Per-API-endpoint limits
  3. Team — Per-team limits
  4. Global — Organization-wide limits

Actions when budget is exceeded:

  • block — Reject the request with BudgetExceededError
  • downgrade — Signal to use a cheaper model
  • alert — Allow but emit a warning
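
The evaluation order and actions above can be sketched as a short decision function. This is a hedged illustration rather than the library's enforcement code: `Policy`, `BudgetExceeded`, and `decide` are hypothetical names, only daily limits are modeled, and the sketch assumes the first exceeded policy determines the outcome, which the text above does not confirm.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    scope: str          # "user", "endpoint", "team", or "global"
    limit_daily: float  # USD
    on_exceed: str      # "block", "downgrade", or "alert"


class BudgetExceeded(Exception):
    """Hypothetical stand-in for the BudgetExceededError named above."""


# Most specific scope first, matching the order listed above.
SCOPE_ORDER = ["user", "endpoint", "team", "global"]


def decide(policies, spent_today, request_cost):
    """Return "allow", "downgrade", or "alert"; raise BudgetExceeded on block."""
    by_scope = {p.scope: p for p in policies}
    for scope in SCOPE_ORDER:
        policy = by_scope.get(scope)
        if policy is None:
            continue
        if spent_today.get(scope, 0.0) + request_cost > policy.limit_daily:
            if policy.on_exceed == "block":
                raise BudgetExceeded(f"{scope} daily limit exceeded")
            return policy.on_exceed
    return "allow"


# Limits and actions taken from the sample configuration.
policies = [
    Policy("global", 100.0, "block"),
    Policy("team", 25.0, "downgrade"),
    Policy("user", 5.0, "block"),
]
spent = {"user": 1.0, "team": 24.75, "global": 60.0}
decision = decide(policies, spent, request_cost=0.5)
print(decision)
```

Here the user is within budget, no endpoint policy is defined, and the team would go over its 25.0 daily limit, so the sketch returns `downgrade` without ever reaching the global policy.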

Development

git clone https://github.com/substrai/costsentinel.git
cd costsentinel
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest

License

MIT — Copyright (c) 2024 Gaurav Kumar Sinha
