AI cost governance middleware — budget enforcement, attribution, and reporting for LLM API calls

CostSentinel

CostSentinel sits between your application and LLM providers, tracking every token spent, enforcing budget policies, and attributing costs to teams, users, and endpoints. Its only external dependency is PyYAML.

Features

  • Token Cost Tracking — Automatic cost calculation for Claude, Titan, and custom models
  • Budget Enforcement — Daily/monthly limits with configurable actions (block, downgrade, alert)
  • Cost Attribution — Track spending by team, user, endpoint, and model
  • Middleware Pattern — Decorator-based interception or manual tracking
  • Reporting — CLI and programmatic cost breakdowns
  • Zero Infrastructure — JSON file storage for development; DynamoDB-ready for production

Installation

pip install substrai-costsentinel

For AWS integration (DynamoDB state backend):

pip install "substrai-costsentinel[aws]"

For development:

pip install "substrai-costsentinel[dev]"

Quickstart

1. Initialize Configuration

costsentinel init

This creates costsentinel.yaml with default policies:

project_name: my-project

pricing:
  claude-3.5-sonnet:
    input: 0.003
    output: 0.015
  claude-3-haiku:
    input: 0.00025
    output: 0.00125

policies:
  global:
    limit_daily: 100.0
    limit_monthly: 2000.0
    on_exceed: block
  team:
    limit_daily: 25.0
    on_exceed: downgrade
  user:
    limit_daily: 5.0
    on_exceed: block
    max_cost_per_request: 0.50

2. Add Middleware to Your Code

from costsentinel import CostMiddleware

middleware = CostMiddleware("costsentinel.yaml")

@middleware.intercept(model="claude-3-haiku", user_id="user-1", team_id="engineering")
def call_llm(prompt):
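    # Your LLM call here; my_llm_client is a placeholder for your provider client.
    response = my_llm_client.complete(prompt)
    # Return the text plus token counts so the call can be priced.
    return response.text, response.input_tokens, response.output_tokens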

# Returns CallResult with cost info
result = call_llm("Summarize this document...")
print(f"Cost: ${result.cost:.4f}, Remaining: ${result.budget_remaining:.2f}")
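
To show the general shape of decorator-based interception, here is a minimal, self-contained sketch. It is not CostSentinel's implementation: `SketchMiddleware`, `SketchResult`, and `fake_llm` are hypothetical names, the budget is a single running total, and prices are assumed to be USD per 1,000 tokens.

```python
import functools
from dataclasses import dataclass


@dataclass
class SketchResult:
    """Illustrative stand-in for a call result; not CostSentinel's CallResult."""
    text: str
    cost: float
    budget_remaining: float


class SketchMiddleware:
    """Toy cost-tracking middleware (hypothetical, for illustration only)."""

    def __init__(self, price_per_1k_in: float, price_per_1k_out: float, daily_limit: float):
        self.price_in = price_per_1k_in
        self.price_out = price_per_1k_out
        self.daily_limit = daily_limit
        self.spent = 0.0

    def intercept(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The wrapped function must return (text, input_tokens, output_tokens).
            text, tokens_in, tokens_out = func(*args, **kwargs)
            cost = tokens_in / 1000 * self.price_in + tokens_out / 1000 * self.price_out
            self.spent += cost
            return SketchResult(text, cost, self.daily_limit - self.spent)
        return wrapper


mw = SketchMiddleware(price_per_1k_in=0.00025, price_per_1k_out=0.00125, daily_limit=5.0)


@mw.intercept
def fake_llm(prompt):
    # Stand-in for a real provider call: returns fixed token counts.
    return f"echo: {prompt}", 1000, 2000


result = fake_llm("hello")
print(f"Cost: ${result.cost:.4f}, Remaining: ${result.budget_remaining:.2f}")
```

The wrapper replaces the function's tuple return value with a result object, which is why the caller reads `result.cost` rather than unpacking three values.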

3. Manual Tracking

from costsentinel import CostMiddleware

middleware = CostMiddleware("costsentinel.yaml")

# After an LLM call completes
result = middleware.track_call(
    model="claude-3.5-sonnet",
    input_tokens=1500,
    output_tokens=800,
    metadata={"user_id": "user-1", "team_id": "engineering", "endpoint": "/api/chat"}
)
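
To make the arithmetic behind a tracked call concrete, the sketch below prices the call above by hand. It assumes the `pricing` values in `costsentinel.yaml` are USD per 1,000 tokens (the README does not state the unit), and `estimate_cost` is a hypothetical helper, not part of the CostSentinel API.

```python
# Hypothetical pricing table mirroring the default costsentinel.yaml.
# ASSUMPTION: prices are USD per 1,000 tokens (unit not stated in the README).
PRICING = {
    "claude-3.5-sonnet": {"input": 0.003, "output": 0.015},
    "claude-3-haiku": {"input": 0.00025, "output": 0.00125},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one call (illustrative helper)."""
    rates = PRICING[model]
    return (input_tokens / 1000) * rates["input"] + (output_tokens / 1000) * rates["output"]


cost = estimate_cost("claude-3.5-sonnet", input_tokens=1500, output_tokens=800)
print(f"${cost:.4f}")
```

Under that assumption the call works out to 1.5 × 0.003 + 0.8 × 0.015 = 0.0165 USD.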

4. Check Reports

costsentinel report --today
costsentinel budget status

Architecture

┌─────────────────────────────────────────────────┐
│                  Your Application                │
├─────────────────────────────────────────────────┤
│              CostSentinel Middleware             │
│  ┌──────────┐  ┌──────────┐  ┌──────────────┐  │
│  │ Pricing  │  │  Budget  │  │ Attribution  │  │
│  │ Engine   │  │ Enforcer │  │    Store     │  │
│  └──────────┘  └──────────┘  └──────────────┘  │
│  ┌──────────────────────────────────────────┐   │
│  │            State Management              │   │
│  │     (JSON file / DynamoDB backend)       │   │
│  └──────────────────────────────────────────┘   │
├─────────────────────────────────────────────────┤
│              LLM Provider (Bedrock)              │
└─────────────────────────────────────────────────┘
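
As a rough illustration of how a JSON-file state backend and the attribution store could fit together, the sketch below appends call records to a file and sums spend by an attribution key. The record layout, file name, and the helpers `append_record` and `spend_by` are assumptions made for this example; they do not describe CostSentinel's actual storage format.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def append_record(path: Path, record: dict) -> None:
    """Append one call record to a JSON file holding a list of records."""
    records = json.loads(path.read_text()) if path.exists() else []
    records.append(record)
    path.write_text(json.dumps(records, indent=2))


def spend_by(path: Path, key: str) -> dict:
    """Sum recorded cost grouped by an attribution key such as 'team_id'."""
    totals = defaultdict(float)
    for record in json.loads(path.read_text()):
        totals[record[key]] += record["cost"]
    return dict(totals)


# Use a throwaway directory so the example leaves no state behind in the project.
state_file = Path(tempfile.mkdtemp()) / "state.json"
append_record(state_file, {"team_id": "engineering", "user_id": "user-1", "cost": 0.25})
append_record(state_file, {"team_id": "engineering", "user_id": "user-2", "cost": 0.50})
append_record(state_file, {"team_id": "research", "user_id": "user-3", "cost": 1.00})

print(spend_by(state_file, "team_id"))  # {'engineering': 0.75, 'research': 1.0}
```

Rewriting the whole file on every call is fine for development, but it is not safe under concurrent writers, which is one reason a production deployment would want a database backend.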

CLI Commands

Command                                             Description
costsentinel init                                   Create default configuration
costsentinel report --today                         Show today's cost breakdown
costsentinel budget status                          Display budget utilization
costsentinel budget reset --scope user --id user-1  Reset a budget counter
costsentinel validate                               Validate configuration file
costsentinel status                                 Show overall status

Budget Policies

Policies are evaluated from most specific to least specific:

  1. User — Per-user daily/monthly limits
  2. Endpoint — Per-API-endpoint limits
  3. Team — Per-team limits
  4. Global — Organization-wide limits

Actions when a budget is exceeded:

  • block — Reject the request with BudgetExceededError
  • downgrade — Signal to use a cheaper model
  • alert — Allow but emit a warning
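
The evaluation order and actions above can be sketched as a short function. This is an illustration only: `first_exceeded_action` is a hypothetical helper, it checks daily limits only, and it assumes the first exceeded policy wins, which the README does not confirm.

```python
from typing import Optional

# Scopes ordered from most specific to least specific, as listed above.
SCOPE_ORDER = ["user", "endpoint", "team", "global"]


def first_exceeded_action(policies: dict, spent_today: dict, cost: float) -> Optional[str]:
    """Return the action of the most specific policy this call would exceed, or None.

    ASSUMPTION: the first exceeded policy wins; the README does not say how
    CostSentinel resolves several exceeded scopes at once.
    """
    for scope in SCOPE_ORDER:
        policy = policies.get(scope)
        if policy is None:
            continue  # no policy configured for this scope
        if spent_today.get(scope, 0.0) + cost > policy["limit_daily"]:
            return policy["on_exceed"]
    return None


# Daily limits taken from the default configuration shown in the Quickstart.
policies = {
    "global": {"limit_daily": 100.0, "on_exceed": "block"},
    "team": {"limit_daily": 25.0, "on_exceed": "downgrade"},
    "user": {"limit_daily": 5.0, "on_exceed": "block"},
}
spent_today = {"user": 1.00, "team": 24.90, "global": 60.0}

print(first_exceeded_action(policies, spent_today, cost=0.20))  # downgrade
```

Here the user is well under the 5.00 limit, no endpoint policy is configured, and the team total would reach 25.10 against a 25.00 limit, so the team policy's `downgrade` action is returned.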

Development

git clone https://github.com/substrai/costsentinel.git
cd costsentinel
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest

License

MIT — Copyright (c) 2024 Gaurav Kumar Sinha
