CostSentinel

License: MIT | Python 3.9+

AI cost governance middleware — budget enforcement, attribution, and reporting for LLM API calls.

CostSentinel sits between your application and LLM providers. It tracks the tokens each call consumes, enforces budget policies, and attributes costs to teams, users, and endpoints. PyYAML is its only required external dependency.

Features

  • Token Cost Tracking — Automatic cost calculation for Claude, Titan, and custom models
  • Budget Enforcement — Daily/monthly limits with configurable actions (block, downgrade, alert)
  • Cost Attribution — Track spending by team, user, endpoint, and model
  • Middleware Pattern — Decorator-based interception or manual tracking
  • Reporting — CLI and programmatic cost breakdowns
  • Zero Infrastructure — JSON file storage for development; DynamoDB-ready for production

Installation

pip install substrai-costsentinel

For AWS integration (DynamoDB state backend):

pip install substrai-costsentinel[aws]

For development:

pip install substrai-costsentinel[dev]

Quickstart

1. Initialize Configuration

costsentinel init

This creates costsentinel.yaml with default pricing and budget policies:

project_name: my-project

pricing:
  claude-3.5-sonnet:
    input: 0.003
    output: 0.015
  claude-3-haiku:
    input: 0.00025
    output: 0.00125

policies:
  global:
    limit_daily: 100.0
    limit_monthly: 2000.0
    on_exceed: block
  team:
    limit_daily: 25.0
    on_exceed: downgrade
  user:
    limit_daily: 5.0
    on_exceed: block
    max_cost_per_request: 0.50
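
The pricing entries in this file are enough to estimate what a single call costs. The sketch below assumes the values are USD per 1,000 tokens (the configuration does not state the unit) and uses a hypothetical helper, `estimate_cost`, which is not part of CostSentinel's API.

```python
from decimal import Decimal

# Prices copied from the default costsentinel.yaml.
# ASSUMPTION: values are USD per 1,000 tokens; the config does not state the unit.
PRICING = {
    "claude-3.5-sonnet": {"input": Decimal("0.003"), "output": Decimal("0.015")},
    "claude-3-haiku": {"input": Decimal("0.00025"), "output": Decimal("0.00125")},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: price a single call from its token counts."""
    rates = PRICING[model]
    per_thousand = Decimal(1000)
    return (
        Decimal(input_tokens) / per_thousand * rates["input"]
        + Decimal(output_tokens) / per_thousand * rates["output"]
    )


cost = estimate_cost("claude-3.5-sonnet", input_tokens=1500, output_tokens=800)
print(f"${cost:.4f}")  # $0.0165
```

Under that assumption, a request like this sits well below the sample `max_cost_per_request` of 0.50.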

2. Add Middleware to Your Code

from costsentinel import CostMiddleware

middleware = CostMiddleware("costsentinel.yaml")

@middleware.intercept(model="claude-3-haiku", user_id="user-1", team_id="engineering")
def call_llm(prompt):
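    # my_llm_client is a placeholder; substitute your provider's client
    response = my_llm_client.complete(prompt)
    # The decorated function returns (text, input_tokens, output_tokens)
    return response.text, response.input_tokens, response.output_tokens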

# Returns CallResult with cost info
result = call_llm("Summarize this document...")
print(f"Cost: ${result.cost:.4f}, Remaining: ${result.budget_remaining:.2f}")
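
To make the decorator pattern concrete, here is a minimal, self-contained sketch of how an intercepting decorator could turn the `(text, input_tokens, output_tokens)` tuple into a cost. It is not CostSentinel's implementation: `intercept`, `SketchResult`, and `RATES` are hypothetical names, and the rates are assumed to be USD per 1,000 tokens.

```python
import functools
from dataclasses import dataclass

# ASSUMPTION: USD per 1,000 tokens, mirroring the sample configuration.
RATES = {"claude-3-haiku": {"input": 0.00025, "output": 0.00125}}


@dataclass
class SketchResult:
    """Hypothetical stand-in for the library's CallResult."""
    text: str
    cost: float


def intercept(model: str):
    """Hypothetical decorator factory, not CostSentinel's own code."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The wrapped function reports its own token usage.
            text, input_tokens, output_tokens = func(*args, **kwargs)
            rates = RATES[model]
            cost = (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1000
            return SketchResult(text=text, cost=cost)
        return wrapper
    return decorator


@intercept(model="claude-3-haiku")
def fake_llm(prompt: str):
    # Stand-in for a real provider call: returns text plus token counts.
    return prompt.upper(), 2000, 400


result = fake_llm("hello")
print(result.text, f"${result.cost:.4f}")  # HELLO $0.0010
```

A real middleware would also check budgets before the call and record attribution afterwards; the sketch only shows the wrapping step.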

3. Manual Tracking

from costsentinel import CostMiddleware

middleware = CostMiddleware("costsentinel.yaml")

# After an LLM call completes
result = middleware.track_call(
    model="claude-3.5-sonnet",
    input_tokens=1500,
    output_tokens=800,
    metadata={"user_id": "user-1", "team_id": "engineering", "endpoint": "/api/chat"}
)
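
The metadata keys passed to `track_call` are what make attribution possible. As an illustration only, the sketch below groups a few made-up call records by one metadata key; the record layout and the `totals_by` helper are assumptions, not CostSentinel's storage format or API.

```python
from collections import defaultdict

# Hypothetical call records: a cost plus the metadata supplied at tracking time.
records = [
    {"cost": 0.0165, "metadata": {"user_id": "user-1", "team_id": "engineering", "endpoint": "/api/chat"}},
    {"cost": 0.0040, "metadata": {"user_id": "user-2", "team_id": "engineering", "endpoint": "/api/search"}},
    {"cost": 0.0100, "metadata": {"user_id": "user-3", "team_id": "support", "endpoint": "/api/chat"}},
]


def totals_by(records, key):
    """Sum cost per distinct value of one metadata key."""
    totals = defaultdict(float)
    for record in records:
        totals[record["metadata"].get(key, "unknown")] += record["cost"]
    return dict(totals)


by_team = totals_by(records, "team_id")
by_endpoint = totals_by(records, "endpoint")
for team, total in sorted(by_team.items()):
    print(f"{team}: ${total:.4f}")
```

Calls that omit a key fall into an "unknown" bucket here, which keeps unattributed spend visible instead of dropping it.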

4. Check Reports

costsentinel report --today
costsentinel budget status

Architecture

┌─────────────────────────────────────────────────┐
│                  Your Application                │
├─────────────────────────────────────────────────┤
│              CostSentinel Middleware             │
│  ┌──────────┐  ┌──────────┐  ┌──────────────┐  │
│  │ Pricing  │  │  Budget  │  │ Attribution  │  │
│  │ Engine   │  │ Enforcer │  │    Store     │  │
│  └──────────┘  └──────────┘  └──────────────┘  │
│  ┌──────────────────────────────────────────┐   │
│  │            State Management              │   │
│  │     (JSON file / DynamoDB backend)       │   │
│  └──────────────────────────────────────────┘   │
├─────────────────────────────────────────────────┤
│              LLM Provider (Bedrock)              │
└─────────────────────────────────────────────────┘

CLI Commands

Command                                             Description
costsentinel init                                   Create default configuration
costsentinel report --today                         Show today's cost breakdown
costsentinel budget status                          Display budget utilization
costsentinel budget reset --scope user --id user-1  Reset a budget counter
costsentinel validate                               Validate configuration file
costsentinel status                                 Show overall status

Budget Policies

Policies are evaluated from most specific to least specific:

  1. User — Per-user daily/monthly limits
  2. Endpoint — Per-API-endpoint limits
  3. Team — Per-team limits
  4. Global — Organization-wide limits
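
The evaluation order can be sketched as a simple loop over scopes. This is an illustration under stated assumptions, not the library's logic: it assumes the first policy that the projected spend would exceed wins, that scopes without a configured policy are skipped, and it uses a hypothetical `first_exceeded` helper with the daily limits from the sample configuration.

```python
# Scopes ordered from most specific to least specific, as in the list above.
SCOPE_ORDER = ["user", "endpoint", "team", "global"]

# Daily limits and actions from the sample configuration (no endpoint policy there).
POLICIES = {
    "user": {"limit_daily": 5.0, "on_exceed": "block"},
    "team": {"limit_daily": 25.0, "on_exceed": "downgrade"},
    "global": {"limit_daily": 100.0, "on_exceed": "block"},
}


def first_exceeded(spent_today: dict, request_cost: float):
    """Return (scope, action) for the first policy the request would exceed, else None."""
    for scope in SCOPE_ORDER:
        policy = POLICIES.get(scope)
        if policy is None:
            continue  # no policy configured for this scope
        # ASSUMPTION: "exceeded" means spend so far plus this request is over the limit.
        if spent_today.get(scope, 0.0) + request_cost > policy["limit_daily"]:
            return scope, policy["on_exceed"]
    return None


print(first_exceeded({"user": 1.0, "team": 24.9, "global": 60.0}, 0.5))  # ('team', 'downgrade')
print(first_exceeded({"user": 4.8, "team": 24.9, "global": 60.0}, 0.5))  # ('user', 'block')
```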

Actions when budget is exceeded:

  • block — Reject the request with BudgetExceededError
  • downgrade — Signal to use a cheaper model
  • alert — Allow but emit a warning
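
One way an application might react to each action is sketched below. The exception class is defined locally as a stand-in because the library's import path for `BudgetExceededError` is not shown here, and both `apply_action` and the `CHEAPER_MODEL` mapping are hypothetical.

```python
import warnings


class BudgetExceededError(Exception):
    """Local stand-in; the library's own exception class and import path may differ."""


# Hypothetical mapping from a model to a cheaper fallback.
CHEAPER_MODEL = {"claude-3.5-sonnet": "claude-3-haiku"}


def apply_action(action: str, model: str) -> str:
    """Return the model to call, or raise if the request must be rejected."""
    if action == "block":
        raise BudgetExceededError(f"budget exceeded; request for {model} rejected")
    if action == "downgrade":
        # Fall back to the cheaper model if one is known, else keep the original.
        return CHEAPER_MODEL.get(model, model)
    if action == "alert":
        warnings.warn(f"budget exceeded for {model}; allowing request")
        return model
    raise ValueError(f"unknown action: {action}")


print(apply_action("downgrade", "claude-3.5-sonnet"))  # claude-3-haiku
```

Callers that use `block` would wrap the LLM call in a `try`/`except` for the exception and return an appropriate error to the end user.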

Development

git clone https://github.com/substrai/costsentinel.git
cd costsentinel
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest

License

MIT — Copyright (c) 2024 Gaurav Kumar Sinha
