CostSentinel

License: MIT. Requires Python 3.9+.

AI cost governance middleware — budget enforcement, attribution, and reporting for LLM API calls.

CostSentinel sits between your application and your LLM providers. It tracks every token spent, enforces budget policies, and attributes costs to teams, users, and endpoints. Its only required dependency is PyYAML.

Features

  • Token Cost Tracking — Automatic cost calculation for Claude, Titan, and custom models
  • Budget Enforcement — Daily/monthly limits with configurable actions (block, downgrade, alert)
  • Cost Attribution — Track spending by team, user, endpoint, and model
  • Middleware Pattern — Decorator-based interception or manual tracking
  • Reporting — CLI and programmatic cost breakdowns
  • Zero Infrastructure — JSON file storage for development; DynamoDB-ready for production

Installation

pip install substrai-costsentinel

For AWS integration (DynamoDB state backend):

pip install "substrai-costsentinel[aws]"

For development:

pip install "substrai-costsentinel[dev]"

Quickstart

1. Initialize Configuration

costsentinel init

This creates costsentinel.yaml with default policies:

project_name: my-project

pricing:
  claude-3.5-sonnet:
    input: 0.003
    output: 0.015
  claude-3-haiku:
    input: 0.00025
    output: 0.00125

policies:
  global:
    limit_daily: 100.0
    limit_monthly: 2000.0
    on_exceed: block
  team:
    limit_daily: 25.0
    on_exceed: downgrade
  user:
    limit_daily: 5.0
    on_exceed: block
    max_cost_per_request: 0.50
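
The pricing values above appear to be USD per 1,000 tokens; the configuration does not state the unit, so treat that as an assumption. Under that assumption, a per-request cost could be derived roughly as follows. `estimate_cost` and `PRICING` are hypothetical names used for illustration, not part of the CostSentinel API.

```python
from decimal import Decimal

# Mirrors the `pricing` section of costsentinel.yaml shown above.
# ASSUMPTION: prices are USD per 1,000 tokens (the unit is not stated in the config).
PRICING = {
    "claude-3.5-sonnet": {"input": Decimal("0.003"), "output": Decimal("0.015")},
    "claude-3-haiku": {"input": Decimal("0.00025"), "output": Decimal("0.00125")},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: cost in USD of one call under the assumed unit."""
    rates = PRICING[model]
    total = rates["input"] * input_tokens + rates["output"] * output_tokens
    return total / Decimal(1000)


cost = estimate_cost("claude-3.5-sonnet", input_tokens=1500, output_tokens=800)
print(f"${cost:.4f}")  # $0.0165
```

Decimal is used here so that small per-token amounts add up without binary floating-point drift; whether CostSentinel itself does this is not documented in this README.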

2. Add Middleware to Your Code

from costsentinel import CostMiddleware

middleware = CostMiddleware("costsentinel.yaml")

@middleware.intercept(model="claude-3-haiku", user_id="user-1", team_id="engineering")
def call_llm(prompt):
    # Replace my_llm_client with your own provider client.
    # The function returns (text, input_tokens, output_tokens).
    response = my_llm_client.complete(prompt)
    return response.text, response.input_tokens, response.output_tokens

# Returns CallResult with cost info
result = call_llm("Summarize this document...")
print(f"Cost: ${result.cost:.4f}, Remaining: ${result.budget_remaining:.2f}")
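
To make the decorator pattern concrete, here is a minimal standalone sketch of how an intercepting decorator could price each call and keep a running total. It is not CostSentinel's implementation: `intercept`, `SketchResult`, the flat rates, and the in-memory counter are all assumptions made for illustration, and rates are again assumed to be USD per 1,000 tokens.

```python
import functools
from dataclasses import dataclass


@dataclass
class SketchResult:
    """Hypothetical stand-in for the CallResult mentioned above."""
    text: str
    cost: float
    budget_remaining: float


def intercept(input_rate: float, output_rate: float, daily_limit: float):
    """Build a decorator that prices each call and tracks a running total.

    ASSUMPTION: rates are USD per 1,000 tokens, and the wrapped function
    returns (text, input_tokens, output_tokens) as in the quickstart.
    """
    spent = 0.0  # running total shared by every call through this decorator

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal spent
            if spent >= daily_limit:
                # Stand-in for a "block" action.
                raise RuntimeError("budget exceeded")
            text, input_tokens, output_tokens = func(*args, **kwargs)
            cost = (input_tokens * input_rate + output_tokens * output_rate) / 1000
            spent += cost
            return SketchResult(text, cost, daily_limit - spent)
        return wrapper
    return decorator


@intercept(input_rate=0.00025, output_rate=0.00125, daily_limit=5.0)
def fake_llm(prompt):
    # Placeholder for a real provider call.
    return f"echo: {prompt}", 400, 200


result = fake_llm("Summarize this document...")
print(f"Cost: ${result.cost:.4f}, Remaining: ${result.budget_remaining:.2f}")
```

The budget check runs before the call, so a request that starts under the limit is allowed to finish even if it pushes the total over.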

3. Manual Tracking

from costsentinel import CostMiddleware

middleware = CostMiddleware("costsentinel.yaml")

# After an LLM call completes
result = middleware.track_call(
    model="claude-3.5-sonnet",
    input_tokens=1500,
    output_tokens=800,
    metadata={"user_id": "user-1", "team_id": "engineering", "endpoint": "/api/chat"}
)

4. Check Reports

costsentinel report --today
costsentinel budget status

Architecture

┌─────────────────────────────────────────────────┐
│                  Your Application                │
├─────────────────────────────────────────────────┤
│              CostSentinel Middleware             │
│  ┌──────────┐  ┌──────────┐  ┌──────────────┐  │
│  │ Pricing  │  │  Budget  │  │ Attribution  │  │
│  │ Engine   │  │ Enforcer │  │    Store     │  │
│  └──────────┘  └──────────┘  └──────────────┘  │
│  ┌──────────────────────────────────────────┐   │
│  │            State Management              │   │
│  │     (JSON file / DynamoDB backend)       │   │
│  └──────────────────────────────────────────┘   │
├─────────────────────────────────────────────────┤
│              LLM Provider (Bedrock)              │
└─────────────────────────────────────────────────┘
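
As a rough illustration of the JSON-file state layer in the diagram, the sketch below keeps daily spend counters in a single JSON file. The class name, key format, and file layout are assumptions for this example; the actual on-disk format used by CostSentinel is not described in this README.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


class JsonSpendStore:
    """Hypothetical file-backed store: spend per (day, scope, id)."""

    def __init__(self, path: Path):
        self.path = path

    def _key(self, scope: str, scope_id: str, day: date) -> str:
        return f"{day.isoformat()}:{scope}:{scope_id}"

    def _load(self) -> dict:
        if self.path.exists():
            return json.loads(self.path.read_text())
        return {}

    def add(self, scope: str, scope_id: str, cost: float, day: date) -> float:
        """Record a cost and return the new daily total for that key."""
        state = self._load()
        key = self._key(scope, scope_id, day)
        state[key] = state.get(key, 0.0) + cost
        self.path.write_text(json.dumps(state, indent=2))
        return state[key]

    def total(self, scope: str, scope_id: str, day: date) -> float:
        """Return the recorded spend for that key, or 0.0 if none."""
        return self._load().get(self._key(scope, scope_id, day), 0.0)


with tempfile.TemporaryDirectory() as tmp:
    store = JsonSpendStore(Path(tmp) / "state.json")
    today = date(2024, 1, 15)
    store.add("user", "user-1", 0.25, today)
    store.add("user", "user-1", 0.5, today)
    user_total = store.total("user", "user-1", today)
    team_total = store.total("team", "engineering", today)

print(user_total, team_total)  # 0.75 0.0
```

This read-modify-write cycle is not safe with concurrent writers, which is one reason a file-backed store suits development better than production.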

CLI Commands

Command                                              Description
costsentinel init                                    Create default configuration
costsentinel report --today                          Show today's cost breakdown
costsentinel budget status                           Display budget utilization
costsentinel budget reset --scope user --id user-1   Reset a budget counter
costsentinel validate                                Validate configuration file
costsentinel status                                  Show overall status

Budget Policies

Policies are evaluated from most specific to least specific:

  1. User — Per-user daily/monthly limits
  2. Endpoint — Per-API-endpoint limits
  3. Team — Per-team limits
  4. Global — Organization-wide limits

Actions when a budget is exceeded:

  • block — Reject the request with BudgetExceededError
  • downgrade — Signal to use a cheaper model
  • alert — Allow but emit a warning
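
The evaluation order and actions above can be sketched as a small function. This is an illustration under stated assumptions, not the library's code: it assumes the first exceeded policy (most specific first) decides the action, and it defines a local `BudgetExceededError` because the real import path is not given here. `Policy`, `evaluate`, and `SCOPE_ORDER` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Policy:
    limit_daily: float
    on_exceed: str  # "block", "downgrade", or "alert"


class BudgetExceededError(Exception):
    """Local stand-in for the error named above."""


# Most specific first, matching the order listed above.
SCOPE_ORDER = ["user", "endpoint", "team", "global"]


def evaluate(policies: Dict[str, Policy], spent_today: Dict[str, float],
             cost: float) -> Optional[str]:
    """Return None if the request fits every budget.

    Otherwise act on the first (most specific) policy it would exceed:
    raise for "block", or return "downgrade" / "alert" for the caller.
    ASSUMPTION: the first exceeded policy wins.
    """
    for scope in SCOPE_ORDER:
        policy = policies.get(scope)
        if policy is None:
            continue  # no policy configured for this scope
        if spent_today.get(scope, 0.0) + cost > policy.limit_daily:
            if policy.on_exceed == "block":
                raise BudgetExceededError(
                    f"{scope} daily limit of ${policy.limit_daily:.2f} exceeded"
                )
            return policy.on_exceed
    return None


# Daily limits taken from the default configuration in the quickstart.
POLICIES = {
    "user": Policy(limit_daily=5.0, on_exceed="block"),
    "team": Policy(limit_daily=25.0, on_exceed="downgrade"),
    "global": Policy(limit_daily=100.0, on_exceed="block"),
}

# The user is well under budget, but the team is about to cross its limit.
action = evaluate(POLICIES, {"user": 1.0, "team": 24.9, "global": 60.0}, cost=0.2)
print(action)  # downgrade

try:
    evaluate(POLICIES, {"user": 4.9}, cost=0.2)
except BudgetExceededError as exc:
    print(f"blocked: {exc}")
```

A caller receiving "downgrade" would typically retry with a cheaper model such as claude-3-haiku, while "alert" lets the request proceed and leaves logging to the caller.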

Development

git clone https://github.com/substrai/costsentinel.git
cd costsentinel
python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
pytest

License

MIT — Copyright (c) 2024 Gaurav Kumar Sinha
