LLM Cost Analytics for Postgres - Know exactly what your AI features cost
🪙 TokenLedger
Know exactly what your AI features cost, per user, per endpoint, per day.
Note: TokenLedger is in active development (v0.x). The API is stabilizing but may have breaking changes before v1.0. Pin your version in requirements.
TokenLedger is a self-hosted LLM cost analytics solution that runs on your existing Postgres database. Zero external dependencies, complete data ownership, works with Supabase out of the box.
✨ Why TokenLedger?
Every startup building AI features hits the same cost-attribution questions:
- "Which users are costing us the most?" → No idea
- "What's our cost per feature?" → Can't tell you
- "Which endpoint is burning through tokens?" → Who knows
Existing solutions (Helicone, LangSmith, Langfuse) tend to be:
- SaaS — Your data leaves your infrastructure
- Heavy — Require significant setup and infrastructure
- Expensive — Per-seat pricing adds up fast
TokenLedger is different:
- ✅ Postgres-native — Works with your existing database (Supabase, Neon, RDS)
- ✅ Self-hosted — Your data never leaves your infrastructure
- ✅ Minimal overhead — 2-line integration, async batching
- ✅ Cost-aware — Automatic cost calculation with up-to-date pricing
Quick Start
Installation
pip install tokenledger
2-Line Integration
import tokenledger
import openai
# Configure once
tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_openai()
# That's it! All calls are now tracked
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
Every OpenAI call is now automatically logged to your Postgres database with:
- Token counts (input, output, cached)
- Cost in USD
- Latency
- Model used
- User ID (if provided)
- Full request/response metadata
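To sanity-check that events are landing, you can read the table directly. A minimal sketch using psycopg2; the token_ledger_events table and columns are the ones used in the SQL examples later in this README:
import psycopg2

# Print the five most recent tracked events
conn = psycopg2.connect("postgresql://...")
cur = conn.cursor()
cur.execute(
    "SELECT timestamp, model, user_id, cost_usd "
    "FROM token_ledger_events "
    "ORDER BY timestamp DESC LIMIT 5"
)
for row in cur.fetchall():
    print(row)
cur.close()
conn.close()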
Streaming Support
Streaming calls are also automatically tracked:
# Streaming works seamlessly
for chunk in openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True}  # Optional: get token counts
):
    print(chunk.choices[0].delta.content or "", end="")
# Event is logged after stream completes
Works with Anthropic too
import tokenledger
import anthropic
tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_anthropic()
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-5-latest",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
And Google Gemini
import tokenledger
from google import genai
tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_google()
client = genai.Client(api_key="...")
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Hello!"
)
Cost Attribution
Know exactly who is spending money and which features are driving costs:
from tokenledger import attribution
# Context manager - all calls inside are attributed
with attribution(user_id="user_123", feature="summarize", team="ml"):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize this..."}]
    )
# Decorator - attribute entire functions
@attribution(feature="chat", cost_center="CC-001")
def handle_chat(user_id: str, message: str):
    with attribution(user_id=user_id):  # Contexts nest and merge
        return client.chat.completions.create(...)
Query your costs by any dimension:
SELECT feature, team, SUM(cost_usd) as cost
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '7 days'
GROUP BY feature, team
ORDER BY cost DESC;
Streaming with Attribution
When using streaming/lazy responses (common with frameworks like pydantic-ai, LangChain),
the LLM API call may happen after the context manager exits. Use persistent=True mode:
from tokenledger import attribution, clear_attribution
# Problem: Context exits before stream is consumed
async with attribution(user_id="user123"):
    response = await framework.stream(...)  # Returns lazy response
# Context exits here!
async for chunk in response:  # API call happens here, context is gone!
    yield chunk

# Solution: Use persistent mode
async with attribution(user_id="user123", feature="chat", persistent=True):
    response = await framework.stream(...)

async for chunk in response:  # Context still active!
    yield chunk

clear_attribution()  # Explicitly clear when done
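Because the attribution now outlives the with block, it is worth guaranteeing the cleanup. A sketch that wraps the pattern above in try/finally so clear_attribution() runs even if the consumer stops iterating early; the start_stream callable stands in for whatever lazy framework call you use:
from typing import AsyncIterator, Awaitable, Callable

from tokenledger import attribution, clear_attribution

async def attributed_stream(
    start_stream: Callable[[], Awaitable[AsyncIterator[str]]],
    user_id: str,
) -> AsyncIterator[str]:
    try:
        # Persistent attribution survives the `async with` exit
        async with attribution(user_id=user_id, feature="chat", persistent=True):
            response = await start_stream()  # lazy: the provider call may happen later
        async for chunk in response:
            yield chunk
    finally:
        clear_attribution()  # runs even if the client disconnects mid-stream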
Dashboard
TokenLedger includes a beautiful React dashboard:
# Start with Docker
docker compose up
# Open http://localhost:3000
Or run the API server standalone:
pip install tokenledger[server]
python -m tokenledger.server
🔧 Configuration Options
import tokenledger
tokenledger.configure(
    # Database connection
    database_url="postgresql://user:pass@localhost/db",

    # App identification
    app_name="my-app",
    environment="production",

    # Performance tuning
    batch_size=100,              # Events per batch write
    flush_interval_seconds=5,    # How often to flush
    async_mode=True,             # Background logging

    # Sampling for high-volume apps
    sample_rate=1.0,             # 1.0 = log everything
)
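As a concrete example, a high-volume service might sample requests in production while logging everything locally. This sketch only uses the configure() parameters listed above; the APP_ENV variable is an assumption about how you detect the environment:
import os

import tokenledger

env = os.getenv("APP_ENV", "development")

tokenledger.configure(
    database_url=os.environ["DATABASE_URL"],
    app_name="my-app",
    environment=env,
    # Log every call in development, sample 20% of calls in production
    sample_rate=0.2 if env == "production" else 1.0,
)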
Querying Your Data
Using the Python API
from tokenledger.queries import TokenLedgerQueries
queries = TokenLedgerQueries()
# Get cost summary
summary = queries.get_cost_summary(days=30)
print(f"Last 30 days: ${summary.total_cost:.2f}")
print(f"Total requests: {summary.total_requests}")
# Cost by model
models = queries.get_costs_by_model(days=30)
for m in models:
    print(f"{m.model}: ${m.total_cost:.2f} ({m.total_requests} requests)")
# Cost by user
users = queries.get_costs_by_user(days=30)
for u in users[:5]:
    print(f"{u.user_id}: ${u.total_cost:.2f}")
# Daily trends
daily = queries.get_daily_costs(days=7)
for d in daily:
    print(f"{d.date}: ${d.total_cost:.2f}")
Direct SQL
-- Daily costs by model
SELECT
    DATE(timestamp) as date,
    model,
    SUM(cost_usd) as total_cost,
    COUNT(*) as requests
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '30 days'
GROUP BY DATE(timestamp), model
ORDER BY date DESC, total_cost DESC;

-- Top 10 users by cost
SELECT
    user_id,
    SUM(cost_usd) as total_cost,
    COUNT(*) as requests
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '30 days'
GROUP BY user_id
ORDER BY total_cost DESC
LIMIT 10;

-- Projected monthly cost
SELECT
    (SUM(cost_usd) / 7) * 30 as projected_monthly
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '7 days';
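The same monthly projection can be reproduced in Python with the query helpers above (a sketch; it assumes get_daily_costs returns one row per day with a total_cost field, as in the earlier example):
from tokenledger.queries import TokenLedgerQueries

queries = TokenLedgerQueries()

# Mirror the SQL projection: average the last 7 days, scale to 30
daily = queries.get_daily_costs(days=7)
week_total = sum(d.total_cost for d in daily)
print(f"Projected monthly cost: ${week_total / 7 * 30:.2f}")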
Framework Integration
FastAPI
from fastapi import FastAPI
from tokenledger.middleware import FastAPIMiddleware
app = FastAPI()
app.add_middleware(FastAPIMiddleware)
# User ID from X-User-ID header is automatically tracked
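Putting the pieces together, a minimal sketch of a FastAPI service (the /summarize route is illustrative; the middleware and patch calls are the ones shown above):
from fastapi import FastAPI
import openai

import tokenledger
from tokenledger.middleware import FastAPIMiddleware

tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_openai()

app = FastAPI()
app.add_middleware(FastAPIMiddleware)

@app.post("/summarize")
def summarize(text: str):
    # Requests carrying an X-User-ID header are attributed to that user by the middleware
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize this: {text}"}],
    )
    return {"summary": response.choices[0].message.content}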
Flask
from flask import Flask
from tokenledger.middleware import TokenLedger
app = Flask(__name__)
TokenLedger(app)
Manual Tracking
from tokenledger import track_cost
# Track manually if you need to
track_cost(
    input_tokens=150,
    output_tokens=500,
    model="gpt-4o",
    user_id="user_123",
)
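Manual tracking is also the escape hatch for providers that don't have an interceptor yet (for example Mistral, see the roadmap below). A sketch, assuming you already have token counts from the provider's response; only the track_cost parameters shown above are used:
from tokenledger import track_cost

def record_unpatched_call(model: str, user_id: str, prompt_tokens: int, completion_tokens: int) -> None:
    # Record usage for a provider TokenLedger doesn't patch automatically
    track_cost(
        input_tokens=prompt_tokens,
        output_tokens=completion_tokens,
        model=model,
        user_id=user_id,
    )

# e.g. after a raw HTTP call that returned usage counts
record_unpatched_call("mistral-large-latest", "user_123", prompt_tokens=220, completion_tokens=480)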
Supabase Setup
TokenLedger works perfectly with Supabase:
1. Get your connection string from the Supabase Dashboard → Settings → Database
2. Run the migrations:
   DATABASE_URL="postgresql://postgres:password@db.xxx.supabase.co:5432/postgres" tokenledger db init
3. Configure TokenLedger:
   tokenledger.configure(
       database_url="postgresql://postgres:password@db.xxx.supabase.co:5432/postgres"
   )
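If you would rather not hard-code credentials, the same connection string can be read from an environment variable. A minimal sketch; the DATABASE_URL variable name is just a common convention, not something TokenLedger requires:
import os

import tokenledger

# Pull the Supabase connection string from the environment instead of hard-coding it
tokenledger.configure(database_url=os.environ["DATABASE_URL"])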
Project Structure
tokenledger/
├── tokenledger/           # Python package
│   ├── __init__.py        # Main exports
│   ├── config.py          # Configuration
│   ├── tracker.py         # Core tracking logic
│   ├── pricing.py         # LLM pricing data
│   ├── queries.py         # Analytics queries
│   ├── decorators.py      # @track_llm decorator
│   ├── middleware.py      # FastAPI/Flask middleware
│   ├── server.py          # Dashboard API server
│   └── interceptors/      # SDK patches
│       ├── openai.py
│       ├── anthropic.py
│       └── google.py
├── dashboard/             # React dashboard
├── migrations/            # SQL migrations
└── examples/              # Usage examples
💰 Supported Models & Pricing
TokenLedger includes up-to-date pricing (January 2026) for 74+ models across 3 providers:
OpenAI (38 text models + audio/image)
| Model Family | Input ($ / 1M tokens) | Output ($ / 1M tokens) | Notes |
|---|---|---|---|
| GPT-5 (5.2, 5.1, 5, mini, nano) | $0.05-1.75 | $0.40-14.00 | Cached input support |
| GPT-5 Pro | $15.00 | $120.00 | Premium reasoning |
| GPT-4.1 (4.1, mini, nano) | $0.10-2.00 | $0.40-8.00 | 1M context window |
| GPT-4o (4o, 4o-mini) | $0.15-2.50 | $0.60-10.00 | 128K context |
| O-Series (o1, o3, o4-mini) | $1.10-20.00 | $4.40-80.00 | Reasoning models |
| Audio (Whisper, TTS) | $0.003-0.012/min | - | Per-minute billing |
| Images (DALL-E 3, GPT-Image) | $0.04-0.12/image | - | Per-image billing |
Anthropic (23 models)
| Model Family | Input ($ / 1M tokens) | Output ($ / 1M tokens) | Notes |
|---|---|---|---|
| Claude 4.5 (Opus, Sonnet, Haiku) | $1.00-5.00 | $5-25 | Latest generation |
| Claude 4 (Opus, Sonnet) | $3.00-15.00 | $15-75 | Prompt caching |
| Claude 3.7 (Sonnet) | $3.00 | $15.00 | Prompt caching |
| Claude 3.5 (Sonnet, Haiku) | $0.80-3.00 | $4-15 | Prompt caching |
| Claude 3 (Opus, Sonnet, Haiku) | $0.25-15.00 | $1.25-75 | Legacy |
Google Gemini (13 models)
| Model Family | Input ($ / 1M tokens) | Output ($ / 1M tokens) | Notes |
|---|---|---|---|
| Gemini 3 (Pro, Flash preview) | $0.50-2.00 | $4-12 | Latest preview |
| Gemini 2.5 (Pro, Flash, Lite) | $0.10-1.25 | $0.40-10 | Production ready |
| Gemini 2.0 (Flash, Lite) | $0.075-0.10 | $0.30-0.40 | Fast inference |
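Cost is derived from these per-million-token rates. As a worked example using the gpt-4o row above ($2.50 input / $10.00 output per 1M tokens, ignoring cached-input discounts):
# Illustrative cost calculation for a single gpt-4o call
input_tokens, output_tokens = 1_200, 350
input_price, output_price = 2.50, 10.00  # $ per 1M tokens, from the table above

cost_usd = (input_tokens / 1_000_000) * input_price + (output_tokens / 1_000_000) * output_price
print(f"${cost_usd:.6f}")  # $0.006500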
Coming Soon
- Mistral (pricing data included, interceptor planned)
- Custom/self-hosted models
Development
# Clone the repo
git clone https://github.com/yourusername/tokenledger
cd tokenledger
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Start local development
docker compose up postgres
python -m tokenledger.server
Roadmap
- Alerts & notifications (budget thresholds)
- Cost allocation tags (feature, team, project, cost_center)
- Team/project grouping via attribution context
- Google Gemini support
- OpenAI audio/image API tracking
- pydantic-ai framework compatibility
- OpenAI streaming support
- Anthropic streaming support
- Google streaming support
- Grafana integration
- CLI for querying
- More LLM providers (Mistral, Cohere)
- TimescaleDB optimization guide
License
TokenLedger is licensed under the Elastic License 2.0 (ELv2).
What this means:
- ✅ Free to use — Use TokenLedger in your projects, even commercial ones
- ✅ Modify freely — Fork it, extend it, make it yours
- ✅ Self-host — Run it on your own infrastructure
- ❌ No SaaS — You cannot offer TokenLedger as a hosted/managed service
This license protects the project while keeping it free for the community.
Contributing
Contributions are welcome! Please read our Contributing Guide first.
Built with ❤️ for the AI startup community