LLM Cost Analytics for Postgres - Know exactly what your AI features cost
🪙 TokenLedger
Know exactly what your AI features cost, per user, per endpoint, per day.
Note: TokenLedger is in active development (v0.x). The API is stabilizing but may have breaking changes before v1.0. Pin your version in requirements.
TokenLedger is a self-hosted LLM cost analytics solution that runs on your existing Postgres database. Zero external dependencies, complete data ownership, works with Supabase out of the box.
✨ Why TokenLedger?
Every startup building AI features hits the same cost-attribution questions:
- "Which users are costing us the most?" → No idea
- "What's our cost per feature?" → Can't tell you
- "Which endpoint is burning through tokens?" → Who knows
Existing solutions (Helicone, LangSmith, Langfuse) tend to be:
- SaaS — Your data leaves your infrastructure
- Heavy — Require significant setup and infrastructure
- Expensive — Per-seat pricing adds up fast
TokenLedger is different:
- ✅ Postgres-native — Works with your existing database (Supabase, Neon, RDS)
- ✅ Self-hosted — Your data never leaves your infrastructure
- ✅ Minimal overhead — 2-line integration, async batching
- ✅ Cost-aware — Automatic cost calculation with up-to-date pricing
Quick Start
Installation
pip install tokenledger
2-Line Integration
import tokenledger
import openai
# Configure once
tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_openai()
# That's it! All calls are now tracked
response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
Every OpenAI call is now automatically logged to your Postgres database with:
- Token counts (input, output, cached)
- Cost in USD
- Latency
- Model used
- User ID (if provided)
- Full request/response metadata
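To sanity-check that events are landing, you can read the table directly. A minimal sketch using psycopg2; the token_ledger_events table and columns are the ones used in the SQL examples later in this README:
import psycopg2

# Print the five most recent tracked events
conn = psycopg2.connect("postgresql://...")
cur = conn.cursor()
cur.execute(
    "SELECT timestamp, model, user_id, cost_usd "
    "FROM token_ledger_events "
    "ORDER BY timestamp DESC LIMIT 5"
)
for row in cur.fetchall():
    print(row)
cur.close()
conn.close()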
Streaming Support
Streaming calls are also automatically tracked:
# Streaming works seamlessly
for chunk in openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True}  # Optional: get token counts
):
    print(chunk.choices[0].delta.content or "", end="")
# Event is logged after stream completes
Works with Anthropic too
import tokenledger
import anthropic
tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_anthropic()
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-5-latest",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}]
)
And Google Gemini
import tokenledger
from google import genai
tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_google()
client = genai.Client(api_key="...")
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Hello!"
)
Cost Attribution
Know exactly who is spending money and which features are driving costs:
from tokenledger import attribution
# Context manager - all calls inside are attributed
with attribution(user_id="user_123", feature="summarize", team="ml"):
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Summarize this..."}]
    )
# Decorator - attribute entire functions
@attribution(feature="chat", cost_center="CC-001")
def handle_chat(user_id: str, message: str):
    with attribution(user_id=user_id):  # Contexts nest and merge
        return client.chat.completions.create(...)
Query your costs by any dimension:
SELECT feature, team, SUM(cost_usd) as cost
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '7 days'
GROUP BY feature, team
ORDER BY cost DESC;
Streaming with Attribution
When using streaming/lazy responses (common with frameworks like pydantic-ai, LangChain),
the LLM API call may happen after the context manager exits. Use persistent=True mode:
from tokenledger import attribution, clear_attribution
# Problem: Context exits before stream is consumed
async with attribution(user_id="user123"):
    response = await framework.stream(...)  # Returns lazy response
# Context exits here!
async for chunk in response:  # API call happens here, context is gone!
    yield chunk

# Solution: Use persistent mode
async with attribution(user_id="user123", feature="chat", persistent=True):
    response = await framework.stream(...)

async for chunk in response:  # Context still active!
    yield chunk

clear_attribution()  # Explicitly clear when done
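Because the attribution now outlives the with block, it is worth guaranteeing the cleanup. A sketch that wraps the pattern above in try/finally so clear_attribution() runs even if the consumer stops iterating early; the start_stream callable stands in for whatever lazy framework call you use:
from typing import AsyncIterator, Awaitable, Callable

from tokenledger import attribution, clear_attribution

async def attributed_stream(
    start_stream: Callable[[], Awaitable[AsyncIterator[str]]],
    user_id: str,
) -> AsyncIterator[str]:
    try:
        # Persistent attribution survives the `async with` exit
        async with attribution(user_id=user_id, feature="chat", persistent=True):
            response = await start_stream()  # lazy: the provider call may happen later
        async for chunk in response:
            yield chunk
    finally:
        clear_attribution()  # runs even if the client disconnects mid-stream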
Dashboard
TokenLedger includes a beautiful React dashboard:
# Start with Docker
docker compose up
# Open http://localhost:3000
Or run the API server standalone:
pip install tokenledger[server]
python -m tokenledger.server
🔧 Configuration Options
import tokenledger
tokenledger.configure(
    # Database connection
    database_url="postgresql://user:pass@localhost/db",

    # App identification
    app_name="my-app",
    environment="production",

    # Performance tuning
    batch_size=100,              # Events per batch write
    flush_interval_seconds=5,    # How often to flush
    async_mode=True,             # Background logging

    # Sampling for high-volume apps
    sample_rate=1.0,             # 1.0 = log everything
)
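As a concrete example, a high-volume service might sample requests in production while logging everything locally. This sketch only uses the configure() parameters listed above; the APP_ENV variable is an assumption about how you detect the environment:
import os

import tokenledger

env = os.getenv("APP_ENV", "development")

tokenledger.configure(
    database_url=os.environ["DATABASE_URL"],
    app_name="my-app",
    environment=env,
    # Log every call in development, sample 20% of calls in production
    sample_rate=0.2 if env == "production" else 1.0,
)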
Querying Your Data
Using the Python API
from tokenledger.queries import TokenLedgerQueries
queries = TokenLedgerQueries()
# Get cost summary
summary = queries.get_cost_summary(days=30)
print(f"Last 30 days: ${summary.total_cost:.2f}")
print(f"Total requests: {summary.total_requests}")
# Cost by model
models = queries.get_costs_by_model(days=30)
for m in models:
    print(f"{m.model}: ${m.total_cost:.2f} ({m.total_requests} requests)")
# Cost by user
users = queries.get_costs_by_user(days=30)
for u in users[:5]:
    print(f"{u.user_id}: ${u.total_cost:.2f}")
# Daily trends
daily = queries.get_daily_costs(days=7)
for d in daily:
    print(f"{d.date}: ${d.total_cost:.2f}")
Direct SQL
-- Daily costs by model
SELECT
    DATE(timestamp) as date,
    model,
    SUM(cost_usd) as total_cost,
    COUNT(*) as requests
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '30 days'
GROUP BY DATE(timestamp), model
ORDER BY date DESC, total_cost DESC;

-- Top 10 users by cost
SELECT
    user_id,
    SUM(cost_usd) as total_cost,
    COUNT(*) as requests
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '30 days'
GROUP BY user_id
ORDER BY total_cost DESC
LIMIT 10;

-- Projected monthly cost
SELECT
    (SUM(cost_usd) / 7) * 30 as projected_monthly
FROM token_ledger_events
WHERE timestamp >= NOW() - INTERVAL '7 days';
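The same monthly projection can be reproduced in Python with the query helpers above (a sketch; it assumes get_daily_costs returns one row per day with a total_cost field, as in the earlier example):
from tokenledger.queries import TokenLedgerQueries

queries = TokenLedgerQueries()

# Mirror the SQL projection: average the last 7 days, scale to 30
daily = queries.get_daily_costs(days=7)
week_total = sum(d.total_cost for d in daily)
print(f"Projected monthly cost: ${week_total / 7 * 30:.2f}")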
Framework Integration
FastAPI
from fastapi import FastAPI
from tokenledger.middleware import FastAPIMiddleware
app = FastAPI()
app.add_middleware(FastAPIMiddleware)
# User ID from X-User-ID header is automatically tracked
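Putting the pieces together, a minimal sketch of a FastAPI service (the /summarize route is illustrative; the middleware and patch calls are the ones shown above):
from fastapi import FastAPI
import openai

import tokenledger
from tokenledger.middleware import FastAPIMiddleware

tokenledger.configure(database_url="postgresql://...")
tokenledger.patch_openai()

app = FastAPI()
app.add_middleware(FastAPIMiddleware)

@app.post("/summarize")
def summarize(text: str):
    # Requests carrying an X-User-ID header are attributed to that user by the middleware
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": f"Summarize this: {text}"}],
    )
    return {"summary": response.choices[0].message.content}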
Flask
from flask import Flask
from tokenledger.middleware import TokenLedger
app = Flask(__name__)
TokenLedger(app)
Manual Tracking
from tokenledger import track_cost
# Track manually if you need to
track_cost(
    input_tokens=150,
    output_tokens=500,
    model="gpt-4o",
    user_id="user_123",
)
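Manual tracking is also the escape hatch for providers that don't have an interceptor yet (for example Mistral, see the roadmap below). A sketch, assuming you already have token counts from the provider's response; only the track_cost parameters shown above are used:
from tokenledger import track_cost

def record_unpatched_call(model: str, user_id: str, prompt_tokens: int, completion_tokens: int) -> None:
    # Record usage for a provider TokenLedger doesn't patch automatically
    track_cost(
        input_tokens=prompt_tokens,
        output_tokens=completion_tokens,
        model=model,
        user_id=user_id,
    )

# e.g. after a raw HTTP call that returned usage counts
record_unpatched_call("mistral-large-latest", "user_123", prompt_tokens=220, completion_tokens=480)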
Supabase Setup
TokenLedger works perfectly with Supabase:
1. Get your connection string from the Supabase Dashboard → Settings → Database
2. Run the migrations:
   DATABASE_URL="postgresql://postgres:password@db.xxx.supabase.co:5432/postgres" tokenledger db init
3. Configure TokenLedger:
   tokenledger.configure(
       database_url="postgresql://postgres:password@db.xxx.supabase.co:5432/postgres"
   )
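If you would rather not hard-code credentials, the same connection string can be read from an environment variable. A minimal sketch; the DATABASE_URL variable name is just a common convention, not something TokenLedger requires:
import os

import tokenledger

# Pull the Supabase connection string from the environment instead of hard-coding it
tokenledger.configure(database_url=os.environ["DATABASE_URL"])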
Project Structure
tokenledger/
├── tokenledger/           # Python package
│   ├── __init__.py        # Main exports
│   ├── config.py          # Configuration
│   ├── tracker.py         # Core tracking logic
│   ├── pricing.py         # LLM pricing data
│   ├── queries.py         # Analytics queries
│   ├── decorators.py      # @track_llm decorator
│   ├── middleware.py      # FastAPI/Flask middleware
│   ├── server.py          # Dashboard API server
│   └── interceptors/      # SDK patches
│       ├── openai.py
│       ├── anthropic.py
│       └── google.py
├── dashboard/             # React dashboard
├── migrations/            # SQL migrations
└── examples/              # Usage examples
💰 Supported Models & Pricing
TokenLedger includes up-to-date pricing (January 2026) for 74+ models across 3 providers:
OpenAI (38 text models + audio/image)
| Model Family | Input ($ / 1M tokens) | Output ($ / 1M tokens) | Notes |
|---|---|---|---|
| GPT-5 (5.2, 5.1, 5, mini, nano) | $0.05-1.75 | $0.40-14.00 | Cached input support |
| GPT-5 Pro | $15.00 | $120.00 | Premium reasoning |
| GPT-4.1 (4.1, mini, nano) | $0.10-2.00 | $0.40-8.00 | 1M context window |
| GPT-4o (4o, 4o-mini) | $0.15-2.50 | $0.60-10.00 | 128K context |
| O-Series (o1, o3, o4-mini) | $1.10-20.00 | $4.40-80.00 | Reasoning models |
| Audio (Whisper, TTS) | $0.003-0.012/min | - | Per-minute billing |
| Images (DALL-E 3, GPT-Image) | $0.04-0.12/image | - | Per-image billing |
Anthropic (23 models)
| Model Family | Input ($ / 1M tokens) | Output ($ / 1M tokens) | Notes |
|---|---|---|---|
| Claude 4.5 (Opus, Sonnet, Haiku) | $1.00-5.00 | $5-25 | Latest generation |
| Claude 4 (Opus, Sonnet) | $3.00-15.00 | $15-75 | Prompt caching |
| Claude 3.7 (Sonnet) | $3.00 | $15.00 | Prompt caching |
| Claude 3.5 (Sonnet, Haiku) | $0.80-3.00 | $4-15 | Prompt caching |
| Claude 3 (Opus, Sonnet, Haiku) | $0.25-15.00 | $1.25-75 | Legacy |
Google Gemini (13 models)
| Model Family | Input ($ / 1M tokens) | Output ($ / 1M tokens) | Notes |
|---|---|---|---|
| Gemini 3 (Pro, Flash preview) | $0.50-2.00 | $4-12 | Latest preview |
| Gemini 2.5 (Pro, Flash, Lite) | $0.10-1.25 | $0.40-10 | Production ready |
| Gemini 2.0 (Flash, Lite) | $0.075-0.10 | $0.30-0.40 | Fast inference |
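Cost is derived from these per-million-token rates. As a worked example using the gpt-4o row above ($2.50 input / $10.00 output per 1M tokens, ignoring cached-input discounts):
# Illustrative cost calculation for a single gpt-4o call
input_tokens, output_tokens = 1_200, 350
input_price, output_price = 2.50, 10.00  # $ per 1M tokens, from the table above

cost_usd = (input_tokens / 1_000_000) * input_price + (output_tokens / 1_000_000) * output_price
print(f"${cost_usd:.6f}")  # $0.006500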
Coming Soon
- Mistral (pricing data included, interceptor planned)
- Custom/self-hosted models
Development
# Clone the repo
git clone https://github.com/yourusername/tokenledger
cd tokenledger
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Start local development
docker compose up postgres
python -m tokenledger.server
Roadmap
- Alerts & notifications (budget thresholds)
- Cost allocation tags (feature, team, project, cost_center)
- Team/project grouping via attribution context
- Google Gemini support
- OpenAI audio/image API tracking
- pydantic-ai framework compatibility
- OpenAI streaming support
- Anthropic streaming support
- Google streaming support
- Grafana integration
- CLI for querying
- More LLM providers (Mistral, Cohere)
- TimescaleDB optimization guide
License
TokenLedger is licensed under the Elastic License 2.0 (ELv2).
What this means:
- ✅ Free to use — Use TokenLedger in your projects, even commercial ones
- ✅ Modify freely — Fork it, extend it, make it yours
- ✅ Self-host — Run it on your own infrastructure
- ❌ No SaaS — You cannot offer TokenLedger as a hosted/managed service
This license protects the project while keeping it free for the community.
Contributing
Contributions are welcome! Please read our Contributing Guide first.
Built with ❤️ for the AI startup community