token-copilot

Your AI copilot for LLM costs - Modern, plugin-based cost tracking for LangChain, LangGraph, and LlamaIndex

🚀 What is token-copilot?

A lightweight, production-ready library for tracking and optimizing LLM costs. Track costs in real-time, enforce budgets, analyze usage patterns, and export data for reporting - all with minimal configuration.

Key Benefits:

🎯 Zero Config - Start tracking with 2 lines of code
💰 Budget Control - Automatic budget enforcement and alerts
📊 Multi-Tenant - Track costs by user, org, session, or any dimension
🔌 Plugin-Based - Add features only when needed
🌐 Framework Agnostic - Works with LangChain, LangGraph, LlamaIndex
☁️ Azure OpenAI - Full support with automatic cost calculation

Verified Working:

✅ 12/12 core tests passing with Azure OpenAI
✅ Production-tested with real workloads
✅ 19+ supported LLM models (OpenAI, Anthropic, Ollama)

⚡ Quick Start

pip install token-copilot

from token_copilot import TokenCoPilot
from langchain_openai import ChatOpenAI

# Create copilot with budget
copilot = TokenCoPilot(budget_limit=10.00)
llm = ChatOpenAI(model="gpt-4o-mini", callbacks=[copilot])

# Use normally
result = llm.invoke("What is Python?")

# Get metrics
print(f"Cost: ${copilot.cost:.4f}")
print(f"Tokens: {copilot.tokens:,}")
print(f"Remaining: ${copilot.get_remaining_budget():.2f}")

That's it! You're now tracking costs and enforcing budgets.

📦 Installation

Basic (Core Features)

pip install token-copilot

With Analytics

pip install "token-copilot[analytics]"

With Streaming

pip install "token-copilot[streaming]"

All Features

pip install "token-copilot[all]"

✨ Features

Core Features

Feature	Description
Cost Tracking	Automatic token and cost tracking for all LLM calls
Budget Enforcement	Hard stops at budget limits with configurable actions
Multi-Tenant	Track costs by user, organization, session, or custom dimensions
DataFrame Export	Export to pandas for advanced analytics and reporting
19+ Models	Support for OpenAI, Anthropic, Azure OpenAI, and Ollama models
Real-time Stats	Get total cost, tokens, averages, and remaining budget instantly

Optional Plugins

Plugin	Description
Persistence	Save cost history to SQLite or JSON for long-term tracking
Analytics	Detect waste, anomalies, and efficiency issues automatically
Streaming	Stream events to Webhook, Kafka, Syslog, or OpenTelemetry
Routing	Intelligent model selection based on cost and quality
Adaptive	Auto-adjust parameters based on remaining budget
Forecasting	Predict when budget will be exhausted
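
The table describes the Forecasting plugin only as predicting when the budget will be exhausted. One simple way such a prediction could work is a linear burn-rate extrapolation. The sketch below is a standalone illustration using only the standard library; `forecast_exhaustion` and its parameters are hypothetical names and are not taken from token-copilot's API.

```python
from datetime import datetime, timedelta
from typing import Optional


def forecast_exhaustion(
    spent: float,
    budget_limit: float,
    started_at: datetime,
    now: datetime,
) -> Optional[datetime]:
    """Estimate when spending reaches budget_limit at a constant burn rate.

    Hypothetical helper for illustration; not token-copilot's algorithm.
    Returns None when there is no spend or no elapsed time to extrapolate from.
    """
    elapsed_hours = (now - started_at).total_seconds() / 3600
    if spent <= 0 or elapsed_hours <= 0:
        return None
    rate_per_hour = spent / elapsed_hours
    remaining = max(budget_limit - spent, 0.0)
    return now + timedelta(hours=remaining / rate_per_hour)


start = datetime(2025, 1, 1, 0, 0)
now = datetime(2025, 1, 1, 6, 0)
# $3 spent in 6 hours is $0.50/hour, so the remaining $7 lasts 14 more hours.
eta = forecast_exhaustion(spent=3.0, budget_limit=10.0, started_at=start, now=now)
print(eta)
```

A real forecaster would likely weight recent usage more heavily; the constant-rate assumption is the simplest possible baseline.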

Framework Support

✅ LangChain - Full support via callbacks
✅ LangGraph - Works with graph-based workflows
✅ LlamaIndex - Dedicated callback handler
✅ Azure OpenAI - Automatic model detection and cost calculation

🎯 Usage Patterns

Choose the style that fits your needs:

1. Minimal (Simplest)

from token_copilot import TokenCoPilot
from langchain_openai import ChatOpenAI

copilot = TokenCoPilot(budget_limit=10.00)
llm = ChatOpenAI(callbacks=[copilot])
result = llm.invoke("Hello!")
print(f"Cost: ${copilot.cost:.4f}")

2. Builder (Fluent API)

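from token_copilot import TokenCoPilot
from token_copilot.plugins import SQLiteBackend
from langchain_openai import ChatOpenAI

# Chain plugins onto the copilot; .build() finalizes the configuration
copilot = (TokenCoPilot(budget_limit=100.00)
    .with_persistence(backend=SQLiteBackend(db_path="costs.db"))
    .with_analytics(detect_anomalies=True)
    .with_adaptive()
    .build()
)
llm = ChatOpenAI(callbacks=[copilot])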

3. Factory Presets

from token_copilot.presets import production

copilot = production(
    budget_limit=1000.00,
    webhook_url="https://monitoring.example.com",
    detect_anomalies=True
)
llm = ChatOpenAI(callbacks=[copilot])

4. Context Manager

from token_copilot import track_costs

with track_costs(budget_limit=5.00) as copilot:
    llm = ChatOpenAI(callbacks=[copilot])
    result = llm.invoke("Hello!")
    print(f"Cost: ${copilot.cost:.4f}")
# Automatic summary on exit
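
The "automatic summary on exit" behaviour is a natural fit for Python's context-manager protocol. As a rough sketch of the pattern (not token-copilot's implementation), the following standard-library example uses hypothetical names `CostLedger` and `track` to show how a summary can be printed when the `with` block ends, even if an exception occurs inside it.

```python
from contextlib import contextmanager


class CostLedger:
    """Hypothetical stand-in for a cost tracker; amounts are in USD."""

    def __init__(self, budget_limit):
        self.budget_limit = budget_limit
        self.cost = 0.0

    def record(self, amount):
        self.cost += amount


@contextmanager
def track(budget_limit):
    ledger = CostLedger(budget_limit)
    try:
        yield ledger
    finally:
        # Runs on normal exit and on exceptions, so the summary is always printed.
        print(f"Summary: spent ${ledger.cost:.4f} of ${ledger.budget_limit:.2f}")


with track(budget_limit=5.00) as ledger:
    ledger.record(0.0125)
    ledger.record(0.0300)
```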

5. Decorator

from token_copilot.decorators import track_cost

@track_cost(budget_limit=5.00)
def process_text(text):
    llm = ChatOpenAI(callbacks=[process_text.copilot])
    return llm.invoke(f"Process: {text}")

result = process_text("my text")
print(f"Cost: ${process_text.copilot.cost:.4f}")
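
It may look odd that `process_text.copilot` is used inside the function it decorates. This works because, by the time the function runs, the name `process_text` already refers to the decorated wrapper. The sketch below illustrates that mechanism with the standard library only; `Ledger` and `with_ledger` are hypothetical names, and the actual `track_cost` decorator may be implemented differently.

```python
import functools


class Ledger:
    """Hypothetical stand-in for a cost tracker."""

    def __init__(self, budget_limit):
        self.budget_limit = budget_limit
        self.cost = 0.0


def with_ledger(budget_limit):
    """Decorator factory that attaches a Ledger to the wrapped function."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return func(*args, **kwargs)

        # Attach the tracker as an attribute of the wrapper function object.
        wrapper.ledger = Ledger(budget_limit)
        return wrapper

    return decorator


@with_ledger(budget_limit=5.00)
def process_text(text):
    # At call time the global name process_text refers to the wrapper,
    # so the attached ledger is reachable from inside the function body.
    process_text.ledger.cost += 0.002
    return text.upper()


result = process_text("my text")
print(result, process_text.ledger.cost)
```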

📖 Core Examples

Budget Enforcement

from token_copilot import TokenCoPilot, BudgetExceededError

copilot = TokenCoPilot(
    budget_limit=1.00,
    on_budget_exceeded="raise"  # Options: "raise", "warn", "ignore"
)

llm = ChatOpenAI(callbacks=[copilot])

try:
    result = llm.invoke("Expensive task...")
except BudgetExceededError as e:
    print(f"Budget exceeded: {e}")
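
To make the three `on_budget_exceeded` options concrete, here is a minimal standalone sketch of how "raise", "warn", and "ignore" could differ. `BudgetGuard` and `BudgetExceeded` are hypothetical names for this illustration; the README does not say whether token-copilot checks the budget before or after a call, so this version simply checks after each cost is added.

```python
import warnings


class BudgetExceeded(Exception):
    """Hypothetical exception used only in this sketch."""


class BudgetGuard:
    MODES = ("raise", "warn", "ignore")

    def __init__(self, budget_limit, on_exceeded="raise"):
        if on_exceeded not in self.MODES:
            raise ValueError(f"on_exceeded must be one of {self.MODES}")
        self.budget_limit = budget_limit
        self.on_exceeded = on_exceeded
        self.spent = 0.0

    def add_cost(self, amount):
        self.spent += amount
        if self.spent <= self.budget_limit:
            return
        message = f"spent ${self.spent:.2f} of ${self.budget_limit:.2f}"
        if self.on_exceeded == "raise":
            raise BudgetExceeded(message)
        if self.on_exceeded == "warn":
            warnings.warn(message)
        # "ignore": keep counting without any signal


guard = BudgetGuard(budget_limit=1.00, on_exceeded="raise")
guard.add_cost(0.75)
try:
    guard.add_cost(0.50)
except BudgetExceeded as exc:
    print(f"Budget exceeded: {exc}")
```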

Multi-Tenant Tracking

copilot = TokenCoPilot(budget_limit=100.00)
llm = ChatOpenAI(callbacks=[copilot])

# Track per user
result = llm.invoke(
    "Hello",
    config={
        "metadata": {
            "user_id": "user_123",
            "org_id": "org_456",
            "session_id": "session_789"
        }
    }
)

# Get costs by dimension
user_costs = copilot.tracker.get_costs_by("user_id")
org_costs = copilot.tracker.get_costs_by("org_id")

print(f"User user_123: ${user_costs['user_123']:.4f}")
print(f"Org org_456: ${org_costs['org_456']:.4f}")
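
Conceptually, grouping costs by a dimension is a sum over call records keyed by one metadata field. The following sketch shows that aggregation with plain dictionaries; the record layout and the `costs_by` helper are assumptions made for illustration and do not reflect how `copilot.tracker` stores data internally.

```python
from collections import defaultdict

# Hypothetical call records: one entry per LLM call, cost in USD.
calls = [
    {"cost": 0.0010, "metadata": {"user_id": "user_123", "org_id": "org_456"}},
    {"cost": 0.0025, "metadata": {"user_id": "user_123", "org_id": "org_456"}},
    {"cost": 0.0040, "metadata": {"user_id": "user_999", "org_id": "org_456"}},
]


def costs_by(records, dimension):
    """Sum costs per distinct value of one metadata key."""
    totals = defaultdict(float)
    for record in records:
        key = record["metadata"].get(dimension)
        if key is not None:  # skip calls that were not tagged with this dimension
            totals[key] += record["cost"]
    return dict(totals)


user_costs = costs_by(calls, "user_id")
org_costs = costs_by(calls, "org_id")
print(user_costs)
print(org_costs)
```

Calls that lack the requested key are left out of the result rather than grouped under a placeholder, which is one possible design choice among several.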

DataFrame Export & Analytics

import pandas as pd

copilot = TokenCoPilot()
llm = ChatOpenAI(callbacks=[copilot])

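# Make calls, tagging each with a user so it can be grouped later
for i in range(100):
    result = llm.invoke(f"Task {i}", config={"metadata": {"user_id": "user_123"}})

# Export to DataFrame
df = copilot.to_dataframe()

# Analyze
print(df.groupby('user_id')['cost'].sum())
print(df.groupby('model')['cost'].mean())

# Time series (resample needs a DatetimeIndex; 'h' replaces the deprecated 'H' alias)
hourly_costs = df.resample('h')['cost'].sum()

# Save reports
df.to_csv('llm_costs.csv')
df.to_excel('llm_costs.xlsx')  # requires an Excel writer such as openpyxl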

Statistics & Metrics

copilot = TokenCoPilot(budget_limit=50.00)
llm = ChatOpenAI(callbacks=[copilot])

# Make some calls...

# Get statistics
stats = copilot.get_stats()
print(f"Total Calls: {stats['total_calls']}")
print(f"Total Cost: ${stats['total_cost']:.4f}")
print(f"Total Tokens: {stats['total_tokens']:,}")
print(f"Avg Cost/Call: ${stats['avg_cost_per_call']:.4f}")
print(f"Avg Tokens/Call: {stats['avg_tokens_per_call']:.1f}")

# Check remaining budget
remaining = copilot.get_remaining_budget()
print(f"Remaining: ${remaining:.2f}")
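
The statistics keys shown above are straightforward aggregates. As a rough illustration of how they relate to each other, the sketch below computes the same keys from a list of hypothetical call records; `summarize` and the record layout are assumptions, not the library's internals.

```python
def summarize(records):
    """Compute summary statistics from per-call cost and token counts."""
    total_calls = len(records)
    total_cost = sum(r["cost"] for r in records)
    total_tokens = sum(r["tokens"] for r in records)
    return {
        "total_calls": total_calls,
        "total_cost": total_cost,
        "total_tokens": total_tokens,
        # Guard against division by zero before any calls are made.
        "avg_cost_per_call": total_cost / total_calls if total_calls else 0.0,
        "avg_tokens_per_call": total_tokens / total_calls if total_calls else 0.0,
    }


records = [
    {"cost": 0.002, "tokens": 150},
    {"cost": 0.004, "tokens": 450},
]
stats = summarize(records)
print(f"Avg Cost/Call: ${stats['avg_cost_per_call']:.4f}")
print(f"Avg Tokens/Call: {stats['avg_tokens_per_call']:.1f}")
```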

🔌 Plugin Examples

Persistence (Save History)

from token_copilot import TokenCoPilot
from token_copilot.plugins import SQLiteBackend

# SQLite backend (production-ready)
backend = SQLiteBackend(db_path="costs.db")
copilot = (TokenCoPilot(budget_limit=100.00)
    .with_persistence(backend=backend, session_id="session_123")
)

llm = ChatOpenAI(callbacks=[copilot])
response = llm.invoke("Hello!")

# Query historical data
plugin = copilot._plugin_manager.get_plugins()[0]
summary = plugin.get_summary()
print(f"Total cost: ${summary['total_cost']:.2f}")
print(f"Total calls: {summary['total_calls']}")

# Get recent events
from datetime import datetime, timedelta
events = plugin.get_events(
    start_time=datetime.now() - timedelta(hours=24)
)

Analytics (Detect Issues)

copilot = (TokenCoPilot(budget_limit=100.00)
    .with_analytics(detect_anomalies=True, track_waste=True)
)

llm = ChatOpenAI(callbacks=[copilot])

# Make calls...
for i in range(50):
    result = llm.invoke(f"Task {i}")

# Get analytics
from token_copilot.plugins.analytics import AnalyticsPlugin
analytics = copilot._plugin_manager.get_plugins(AnalyticsPlugin)[0]

# Check for anomalies
anomalies = analytics.get_anomalies(minutes=60)
for anomaly in anomalies:
    print(f"[{anomaly.severity}] {anomaly.message}")

Adaptive (Budget-Based Adjustments)

copilot = (TokenCoPilot(budget_limit=100.00)
    .with_adaptive()
)

from token_copilot.plugins.adaptive import AdaptivePlugin
adaptive = copilot._plugin_manager.get_plugins(AdaptivePlugin)[0]

# Get current budget tier
tier_info = adaptive.get_tier_info()
print(f"Budget tier: {tier_info['tier_name']}")  # abundant, comfortable, constrained, critical
print(f"Remaining: ${tier_info['remaining']:.2f}")

# Operations automatically adjust based on tier
# - abundant: high quality, max tokens
# - comfortable: balanced
# - constrained: conservative
# - critical: minimal usage
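
The README names four tiers but does not state where the boundaries lie. Purely as an illustration, the sketch below maps the remaining budget fraction to a tier name using assumed thresholds of 75%, 50%, and 25%; both the `budget_tier` function and these cut-offs are hypothetical.

```python
def budget_tier(remaining, budget_limit):
    """Map remaining budget to a tier name using assumed thresholds."""
    fraction = remaining / budget_limit if budget_limit > 0 else 0.0
    if fraction > 0.75:
        return "abundant"
    if fraction > 0.50:
        return "comfortable"
    if fraction > 0.25:
        return "constrained"
    return "critical"


for remaining in (90.0, 60.0, 30.0, 5.0):
    print(f"${remaining:.2f} left -> {budget_tier(remaining, 100.0)}")
```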

🌐 Azure OpenAI Support

Full support with automatic cost calculation:

from token_copilot import TokenCoPilot
from langchain_openai import AzureChatOpenAI
import os

# Configure Azure OpenAI
llm = AzureChatOpenAI(
    azure_deployment="gpt-4o-mini",
    api_version="2024-02-15-preview",
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_API_KEY")
)

# Use with token-copilot (costs tracked automatically)
copilot = TokenCoPilot(budget_limit=10.00)
response = llm.invoke("Hello!", config={"callbacks": [copilot]})

print(f"Cost: ${copilot.cost:.6f}")
print(f"Tokens: {copilot.tokens:,}")

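Environment Setup (.env file; os.getenv does not read .env files on its own, so load the file with a tool such as python-dotenv or export the variables in your shell):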

AZURE_OPENAI_API_KEY=your-api-key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_VERSION=2024-02-15-preview
AZURE_OPENAI_CHAT_DEPLOYMENT=gpt-4o-mini

Supported Models:

✅ gpt-4o-mini (all versions)
✅ gpt-4o (all versions)
✅ gpt-4-turbo (all versions)
✅ gpt-3.5-turbo (all versions)

🎯 LangGraph Support

Works seamlessly with LangGraph workflows:

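from langgraph.graph import StateGraph, MessagesState, START, END
from langchain_openai import ChatOpenAI
from token_copilot import TokenCoPilot

copilot = TokenCoPilot(budget_limit=10.00)
llm = ChatOpenAI(model="gpt-4o-mini")

# Single node that calls the model on the conversation so far
def agent_node(state: MessagesState):
    return {"messages": [llm.invoke(state["messages"])]}

# Create graph
builder = StateGraph(MessagesState)
builder.add_node("agent", agent_node)
builder.add_edge(START, "agent")
builder.add_edge("agent", END)
graph = builder.compile()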

# Run with cost tracking
result = graph.invoke(
    {"messages": [("user", "Hello")]},
    config={"callbacks": [copilot]}
)

print(f"Total cost: ${copilot.cost:.4f}")
print(f"Total tokens: {copilot.tokens:,}")

📚 API Reference

TokenCoPilot Class

copilot = TokenCoPilot(
    budget_limit=100.00,           # Optional: Budget limit in USD
    budget_period="total",         # "total", "daily", "monthly", "per_user", "per_org"
    on_budget_exceeded="raise"     # "raise", "warn", "ignore"
)

Properties:

copilot.cost - Total cost in USD
copilot.tokens - Total tokens used
copilot.budget_limit - Current budget limit

Core Methods:

get_total_cost() - Get total cost
get_total_tokens() - Get total tokens
get_stats() - Get summary statistics (dict)
get_remaining_budget(metadata=None) - Get remaining budget
to_dataframe() - Export to pandas DataFrame

Builder Methods:

.with_persistence(backend, session_id) - Add persistence
.with_analytics(detect_anomalies=True) - Add analytics
.with_streaming(webhook_url=...) - Add streaming
.with_adaptive() - Add adaptive operations
.with_forecasting(forecast_hours=48) - Add forecasting
.build() - Finalize (optional)

Pricing Utilities

from token_copilot import get_model_config, calculate_cost, list_models

# Get model configuration
config = get_model_config("gpt-4o-mini")
print(config.input_cost_per_1m)   # Cost per 1M input tokens
print(config.output_cost_per_1m)  # Cost per 1M output tokens

# Calculate cost
cost = calculate_cost("gpt-4o-mini", input_tokens=1000, output_tokens=500)
print(f"Cost: ${cost:.6f}")

# List all supported models
models = list_models()  # Returns 19+ model IDs
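
Given per-1M-token prices such as `input_cost_per_1m` and `output_cost_per_1m`, the cost of a call is simple arithmetic. The worked example below spells it out with the standard library only; `cost_usd` is a hypothetical helper, and the prices are placeholder values chosen for easy arithmetic rather than figures read from the library's pricing table.

```python
def cost_usd(input_tokens, output_tokens, input_cost_per_1m, output_cost_per_1m):
    """Cost in USD given token counts and prices per 1M tokens."""
    return (
        input_tokens * input_cost_per_1m + output_tokens * output_cost_per_1m
    ) / 1_000_000


# Placeholder prices in USD per 1M tokens (assumed for this example).
cost = cost_usd(
    input_tokens=1000,
    output_tokens=500,
    input_cost_per_1m=0.15,
    output_cost_per_1m=0.60,
)
# 1000 * 0.15 + 500 * 0.60 = 450, and 450 / 1,000,000 = 0.00045
print(f"Cost: ${cost:.6f}")
```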

Direct Tracker Usage (Without LangChain)

from token_copilot.tracking import MultiTenantTracker

tracker = MultiTenantTracker()
entry = tracker.track(
    model="gpt-4o-mini",
    input_tokens=100,
    output_tokens=50,
    metadata={"user_id": "user_123"}
)

print(f"Cost: ${entry.cost:.6f}")
print(f"Total: ${tracker.get_total_cost():.6f}")

🏭 Factory Presets

Pre-configured setups for common scenarios:

from token_copilot.presets import basic, development, production, enterprise

# Basic - Just cost tracking
copilot = basic(budget_limit=10.00)

# Development - With logging and anomaly detection
copilot = development(budget_limit=50.00, detect_anomalies=True)

# Production - Full monitoring with alerts
copilot = production(
    budget_limit=1000.00,
    webhook_url="https://monitoring.example.com",
    detect_anomalies=True,
    enable_forecasting=True
)

# Enterprise - All features enabled
copilot = enterprise(
    budget_limit=10000.00,
    kafka_brokers=["kafka:9092"],
    otlp_endpoint="http://collector:4318",
    enable_all=True
)

🔍 Real-World Example

Complete chatbot with cost tracking:

from token_copilot import TokenCoPilot, BudgetExceededError
from langchain_openai import ChatOpenAI

def chatbot():
    copilot = TokenCoPilot(budget_limit=5.00, on_budget_exceeded="warn")
    llm = ChatOpenAI(model="gpt-4o-mini", callbacks=[copilot])

    print("Chatbot started! (type 'quit' to exit)")

    while True:
        user_input = input("\nYou: ")
        if user_input.lower() == 'quit':
            break

        # Check budget
        if copilot.get_remaining_budget() <= 0:
            print("Budget exhausted!")
            break

        try:
            response = llm.invoke(user_input)
            print(f"Bot: {response.content}")
            print(f"Cost this turn: ${copilot.tracker.get_last_cost():.6f}")
        except BudgetExceededError:
            print("Budget limit reached!")
            break

    # Final stats
    stats = copilot.get_stats()
    print(f"\n📊 Session Summary:")
    print(f"  Total turns: {stats['total_calls']}")
    print(f"  Total cost: ${stats['total_cost']:.4f}")
    print(f"  Avg cost/turn: ${stats['avg_cost_per_call']:.4f}")

if __name__ == "__main__":
    chatbot()

❓ FAQ

Q: Does this work with streaming responses? A: Currently tracks costs after completion. Streaming support coming in v1.0.3.

Q: Can I use without LangChain? A: Yes! Use MultiTenantTracker directly (see API Reference above).

Q: How accurate is the cost tracking? A: It uses official pricing from OpenAI and Anthropic, updated regularly, so costs for supported models are calculated from the published per-token rates.

Q: Which usage pattern should I use? A:

Getting started: Minimal or Factory presets
Production: Builder or Production preset
Reusable code: Decorators or Context managers

Q: Can I create custom plugins? A: Yes! Extend the Plugin base class:

from token_copilot import TokenCoPilot
from token_copilot.core import Plugin

class MyPlugin(Plugin):
    def on_cost_tracked(self, model, tokens, cost, metadata):
        print(f"Custom logic: ${cost:.6f}")

copilot = TokenCoPilot()
copilot.add_plugin(MyPlugin())
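
How the copilot invokes plugin hooks is not described here. A common way to build this kind of extension point is the observer pattern, sketched below with the standard library. The `on_cost_tracked` signature mirrors the FAQ example; the `Tracker` and `Collector` classes and the dispatch loop are assumptions for illustration only.

```python
class Plugin:
    """Base class with a no-op hook that subclasses override."""

    def on_cost_tracked(self, model, tokens, cost, metadata):
        pass


class Collector(Plugin):
    """Example plugin that records every event it receives."""

    def __init__(self):
        self.events = []

    def on_cost_tracked(self, model, tokens, cost, metadata):
        self.events.append((model, tokens, cost))


class Tracker:
    """Hypothetical host object that notifies registered plugins."""

    def __init__(self):
        self._plugins = []

    def add_plugin(self, plugin):
        self._plugins.append(plugin)

    def track(self, model, tokens, cost, metadata=None):
        # Notify each plugin in registration order.
        for plugin in self._plugins:
            plugin.on_cost_tracked(model, tokens, cost, metadata or {})


tracker = Tracker()
collector = Collector()
tracker.add_plugin(collector)
tracker.track("gpt-4o-mini", tokens=150, cost=0.0003)
print(collector.events)
```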

📚 Documentation

Full Documentation

📖 Complete Documentation (Single Page HTML) - Open in your browser for a comprehensive guide with all features, examples, and API reference.

🤝 Contributing

Contributions are welcome! We appreciate all contributions, from bug reports to new features.

How to Contribute

Report Bugs - Open an issue with details
Suggest Features - Share your ideas in discussions
Submit PRs - Fork, create a branch, and submit a pull request
Improve Docs - Help make documentation better
Share Examples - Contribute real-world usage examples

Development Setup

# Clone the repository
git clone https://github.com/scionoftech/token-copilot.git
cd token-copilot

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
python test_core.py

Guidelines

Write clear commit messages
Add tests for new features
Update documentation as needed
Follow existing code style
Be respectful and constructive

All contributions, big or small, are appreciated!

🔗 Links

GitHub: https://github.com/scionoftech/token-copilot
PyPI: https://pypi.org/project/token-copilot/
Issues: https://github.com/scionoftech/token-copilot/issues
Documentation: DOCUMENTATION.html

📝 License

MIT License - see LICENSE file.

🙏 Acknowledgments

Built with ❤️ by Sai Kumar Yava

If you find this useful, please ⭐ star the repo!

🚀 Quick Start Checklist

Install: pip install token-copilot
Import: from token_copilot import TokenCoPilot
Create: copilot = TokenCoPilot(budget_limit=10.00)
Use: llm = ChatOpenAI(callbacks=[copilot])
Track: print(f"Cost: ${copilot.cost:.4f}")
You're done! 🎉

Need help? Open an issue or check the documentation.

Release history

1.0.2 (this version) - Dec 5, 2025
1.0.1 - Nov 27, 2025
1.0.0 - Nov 27, 2025

Download files

Source Distribution

token_copilot-1.0.2.tar.gz (134.9 kB view details)

Uploaded Dec 5, 2025 Source

Built Distribution

token_copilot-1.0.2-py3-none-any.whl (99.6 kB view details)

Uploaded Dec 5, 2025 Python 3

File details

Details for the file token_copilot-1.0.2.tar.gz.

File metadata

Download URL: token_copilot-1.0.2.tar.gz
Upload date: Dec 5, 2025
Size: 134.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for token_copilot-1.0.2.tar.gz
Algorithm	Hash digest
SHA256	`989dcd6c22f142d009aff706144e7178d56f52217fb6749fe4b129b5a5249049`
MD5	`c5c87709268359aeeab7c1a0165184ab`
BLAKE2b-256	`f349bc31e59ef09d0a0112b6344c78f8c2ccc2495af068cecbf87d97cc5fc297`

File details

Details for the file token_copilot-1.0.2-py3-none-any.whl.

File metadata

Download URL: token_copilot-1.0.2-py3-none-any.whl
Upload date: Dec 5, 2025
Size: 99.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.9

File hashes

Hashes for token_copilot-1.0.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`680478b42be71d2ae94c975e94c73951d743a86b046e287fa1b793dbe6e5d3b6`
MD5	`6698f614fd87df9bff6247ba2d37e1ce`
BLAKE2b-256	`94e359a12dd8381548ff8b090b9371069f3e757962400b2a4f527befb1200d6d`
