
🪙 TokenLedger

Know exactly what your AI features cost, per user, per endpoint, per day.

CI codecov PyPI version License: ELv2 Python 3.11+

Note: TokenLedger is in active development (v0.x). The API is stabilizing but may have breaking changes before v1.0. Pin your version in requirements.

TokenLedger is a self-hosted LLM cost analytics solution that runs on your existing Postgres database. Zero external dependencies, complete data ownership, works with Supabase out of the box.

✨ Why TokenLedger?

Most teams building AI features lack cost attribution:

  • 📊 "Which users are costing us the most?" → No idea
  • 🎯 "What's our cost per feature?" → Can't tell you
  • 🔍 "Which endpoint is burning through tokens?" → Who knows

Existing solutions (Helicone, LangSmith, Langfuse) are either:

  • SaaS – your data leaves your infrastructure
  • Heavy – require significant setup and infrastructure
  • Expensive – per-seat pricing adds up fast

TokenLedger is different:

  • ✅ Postgres-native – works with your existing database (Supabase, Neon, RDS)
  • ✅ Self-hosted – your data never leaves your infrastructure
  • ✅ Zero overhead – 2-line integration, async batching
  • ✅ Cost-aware – automatic cost calculation with up-to-date pricing

🚀 Quick Start

Installation

pip install tokenledger

2-Line Integration

import tokenledger
import openai

# Configure once
tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_openai()

# That's it! All calls are now tracked
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

Every OpenAI call is now automatically logged to your Postgres database with:

  • Token counts (input, output, cached)
  • Cost in USD
  • Latency
  • Model used
  • User ID (if provided)
  • Full request/response metadata

Streaming Support

Streaming calls are also automatically tracked:

# Streaming works seamlessly
for chunk in openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True}  # Optional: get token counts
):
    print(chunk.choices[0].delta.content or "", end="")
# Event is logged after stream completes

Works with Anthropic too

import tokenledger
import anthropic

tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_anthropic()

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-5-latest",
    messages=[{"role": "user", "content": "Hello!"}]
)

And Google Gemini

import tokenledger
from google import genai

tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_google()

client = genai.Client(api_key="...")
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Hello!"
)

Cost Attribution

Know exactly who is spending money and which features are driving costs:

from tokenledger import attribution

# Context manager - all calls inside are attributed
with attribution(user_id="user_123", feature="summarize", team="ml"):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize this..."}]
    )

# Decorator - attribute entire functions
@attribution(feature="chat", cost_center="CC-001")
def handle_chat(user_id: str, message: str):
    with attribution(user_id=user_id):  # Contexts nest and merge
        return client.chat.completions.create(...)

Query your costs by any dimension:

SELECT feature, team, SUM(cost_usd) as cost
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '7 days'
GROUP BY feature, team
ORDER BY cost DESC;

Streaming with Attribution

When using streaming/lazy responses (common with frameworks like pydantic-ai, LangChain), the LLM API call may happen after the context manager exits. Use persistent=True mode:

from tokenledger import attribution, clear_attribution

# Problem: Context exits before stream is consumed
async with attribution(user_id="user123"):
    response = await framework.stream(...)  # Returns lazy response
# Context exits here!
async for chunk in response:  # API call happens here, context is gone!
    yield chunk

# Solution: Use persistent mode
async with attribution(user_id="user123", feature="chat", persistent=True):
    response = await framework.stream(...)

async for chunk in response:  # Context still active!
    yield chunk

clear_attribution()  # Explicitly clear when done

📊 Dashboard

TokenLedger includes a beautiful React dashboard:

# Start with Docker
docker compose up

# Open http://localhost:3000

Or run the API server standalone:

pip install tokenledger[server]
python -m tokenledger.server

🔧 Configuration Options

import tokenledger

tokenledger.configure(
    # Database connection
    database_url="postgresql://user:pass@localhost/db",
    
    # App identification
    app_name="my-app",
    environment="production",
    
    # Performance tuning
    batch_size=100,           # Events per batch write
    flush_interval_seconds=5,  # How often to flush
    async_mode=True,          # Background logging
    
    # Sampling for high-volume apps
    sample_rate=1.0,          # 1.0 = log everything
)
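
The batch_size, flush_interval_seconds, and async_mode knobs describe a background batching pipeline. A minimal sketch of that pattern (a hypothetical BatchWriter, not TokenLedger's internal class): events are queued on the hot path and a daemon thread flushes them in batches, so request handlers never block on Postgres.

```python
import queue
import threading


class BatchWriter:
    """Queue events on the caller's thread; flush in batches from a
    background thread, either when a flush interval elapses or at close."""

    def __init__(self, write_batch, batch_size=100, flush_interval=5.0):
        self._q = queue.Queue()
        self._write_batch = write_batch  # e.g. one multi-row INSERT
        self._batch_size = batch_size
        self._interval = flush_interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def track(self, event):
        # Hot path: O(1), never touches the database.
        self._q.put(event)

    def _drain(self):
        batch = []
        while len(batch) < self._batch_size:
            try:
                batch.append(self._q.get_nowait())
            except queue.Empty:
                break
        if batch:
            self._write_batch(batch)

    def _run(self):
        while not self._stop.is_set():
            self._stop.wait(self._interval)
            self._drain()

    def close(self):
        # Stop the background thread, then flush anything left over.
        self._stop.set()
        self._thread.join()
        while not self._q.empty():
            self._drain()
```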

📈 Querying Your Data

Using the Python API

from tokenledger.queries import TokenLedgerQueries

queries = TokenLedgerQueries()

# Get cost summary
summary = queries.get_cost_summary(days=30)
print(f"Last 30 days: ${summary.total_cost:.2f}")
print(f"Total requests: {summary.total_requests}")

# Cost by model
models = queries.get_costs_by_model(days=30)
for m in models:
    print(f"{m.model}: ${m.total_cost:.2f} ({m.total_requests} requests)")

# Cost by user
users = queries.get_costs_by_user(days=30)
for u in users[:5]:
    print(f"{u.user_id}: ${u.total_cost:.2f}")

# Daily trends
daily = queries.get_daily_costs(days=7)
for d in daily:
    print(f"{d.date}: ${d.total_cost:.2f}")

Direct SQL

-- Daily costs by model
SELECT 
    DATE(timestamp) as date,
    model,
    SUM(cost_usd) as total_cost,
    COUNT(*) as requests
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '30 days'
GROUP BY DATE(timestamp), model
ORDER BY date DESC, total_cost DESC;

-- Top 10 users by cost
SELECT 
    user_id,
    SUM(cost_usd) as total_cost,
    COUNT(*) as requests
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '30 days'
GROUP BY user_id
ORDER BY total_cost DESC
LIMIT 10;

-- Projected monthly cost
SELECT 
    (SUM(cost_usd) / 7) * 30 as projected_monthly
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '7 days';
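
The projection above is plain linear extrapolation; the same arithmetic in Python, with a made-up 7-day total:

```python
# Average daily cost over the last 7 days, extrapolated to 30 days.
seven_day_cost = 42.70                      # hypothetical SUM(cost_usd), last 7 days
projected_monthly = seven_day_cost / 7 * 30
print(f"${projected_monthly:.2f}")          # → $183.00
```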

🔌 Framework Integration

FastAPI

from fastapi import FastAPI
from tokenledger.middleware import FastAPIMiddleware

app = FastAPI()
app.add_middleware(FastAPIMiddleware)

# User ID from X-User-ID header is automatically tracked

Flask

from flask import Flask
from tokenledger.middleware import TokenLedger

app = Flask(__name__)
TokenLedger(app)

Manual Tracking

from tokenledger import track_cost

# Track manually if you need to
track_cost(
    input_tokens=150,
    output_tokens=500,
    model="gpt-4o",
    user_id="user_123",
)

๐Ÿ˜ Supabase Setup

TokenLedger works perfectly with Supabase:

  1. Get your connection string from Supabase Dashboard → Settings → Database

  2. Run the migrations:

DATABASE_URL="postgresql://postgres:password@db.xxx.supabase.co:5432/postgres" tokenledger db init

  3. Configure TokenLedger:

tokenledger.configure(
    database_url="postgresql://postgres:password@db.xxx.supabase.co:5432/postgres"
)

๐Ÿ“ Project Structure

tokenledger/
├── tokenledger/           # Python package
│   ├── __init__.py       # Main exports
│   ├── config.py         # Configuration
│   ├── tracker.py        # Core tracking logic
│   ├── pricing.py        # LLM pricing data
│   ├── queries.py        # Analytics queries
│   ├── decorators.py     # @track_llm decorator
│   ├── middleware.py     # FastAPI/Flask middleware
│   ├── server.py         # Dashboard API server
│   └── interceptors/     # SDK patches
│       ├── openai.py
│       ├── anthropic.py
│       └── google.py
├── dashboard/            # React dashboard
├── migrations/           # SQL migrations
└── examples/             # Usage examples

💰 Supported Models & Pricing

TokenLedger includes up-to-date pricing (January 2026) for 74+ models across 3 providers:

OpenAI (38 text models + audio/image)

| Model Family | Input/1M | Output/1M | Notes |
|---|---|---|---|
| GPT-5 (5.2, 5.1, 5, mini, nano) | $0.05–1.75 | $0.40–14.00 | Cached input support |
| GPT-5 Pro | $15.00 | $120.00 | Premium reasoning |
| GPT-4.1 (4.1, mini, nano) | $0.10–2.00 | $0.40–8.00 | 1M context window |
| GPT-4o (4o, 4o-mini) | $0.15–2.50 | $0.60–10.00 | 128K context |
| O-Series (o1, o3, o4-mini) | $1.10–20.00 | $4.40–80.00 | Reasoning models |
| Audio (Whisper, TTS) | $0.003–0.012/min | – | Per-minute billing |
| Images (DALL-E 3, GPT-Image) | $0.04–0.12/image | – | Per-image billing |

Anthropic (23 models)

| Model Family | Input/1M | Output/1M | Notes |
|---|---|---|---|
| Claude 4.5 (Opus, Sonnet, Haiku) | $1.00–5.00 | $5–25 | Latest generation |
| Claude 4 (Opus, Sonnet) | $3.00–15.00 | $15–75 | Prompt caching |
| Claude 3.7 (Sonnet) | $3.00 | $15.00 | Prompt caching |
| Claude 3.5 (Sonnet, Haiku) | $0.80–3.00 | $4–15 | Prompt caching |
| Claude 3 (Opus, Sonnet, Haiku) | $0.25–15.00 | $1.25–75 | Legacy |

Google Gemini (13 models)

| Model Family | Input/1M | Output/1M | Notes |
|---|---|---|---|
| Gemini 3 (Pro, Flash preview) | $0.50–2.00 | $4–12 | Latest preview |
| Gemini 2.5 (Pro, Flash, Lite) | $0.10–1.25 | $0.40–10 | Production ready |
| Gemini 2.0 (Flash, Lite) | $0.075–0.10 | $0.30–0.40 | Fast inference |
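
As a worked example of how these per-million-token rates become a logged cost (assuming gpt-4o at the top of its range in the OpenAI table, $2.50 input / $10.00 output per 1M tokens):

```python
# Cost of a single call: tokens / 1_000_000 * rate, summed per direction.
INPUT_RATE, OUTPUT_RATE = 2.50, 10.00       # $/1M tokens (gpt-4o, table above)
input_tokens, output_tokens = 1_500, 400
cost_usd = (input_tokens / 1e6) * INPUT_RATE + (output_tokens / 1e6) * OUTPUT_RATE
print(f"${cost_usd:.5f}")                   # → $0.00775
```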

Coming Soon

  • Mistral (pricing data included, interceptor planned)
  • Custom/self-hosted models

🛠 Development

# Clone the repo
git clone https://github.com/yourusername/tokenledger
cd tokenledger

# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Start local development
docker compose up postgres
python -m tokenledger.server

🗺 Roadmap

  • Alerts & notifications (budget thresholds)
  • Cost allocation tags (feature, team, project, cost_center)
  • Team/project grouping via attribution context
  • Google Gemini support
  • OpenAI audio/image API tracking
  • pydantic-ai framework compatibility
  • OpenAI streaming support
  • Anthropic streaming support
  • Google streaming support
  • Grafana integration
  • CLI for querying
  • More LLM providers (Mistral, Cohere)
  • TimescaleDB optimization guide

📜 License

TokenLedger is licensed under the Elastic License 2.0 (ELv2).

What this means:

  • ✅ Free to use – use TokenLedger in your projects, even commercial ones
  • ✅ Modify freely – fork it, extend it, make it yours
  • ✅ Self-host – run it on your own infrastructure
  • ❌ No SaaS – you cannot offer TokenLedger as a hosted/managed service

This license protects the project while keeping it free for the community.

๐Ÿ™ Contributing

Contributions are welcome! Please read our Contributing Guide first.


Built with โค๏ธ for the AI startup community
