This package helps you track your LLM costs.
a2a-llm-tracker
Track LLM usage and costs across providers (OpenAI, Gemini, Anthropic, etc.) from a single place.
Installation
pip install a2a-llm-tracker
Quick Start (Recommended Pattern)
For applications making multiple LLM calls, use a singleton pattern to initialize once and reuse everywhere.
Step 1: Create a tracking module
Create tracking.py in your project:
# tracking.py
from dotenv import load_dotenv
import os
import asyncio
import concurrent.futures

load_dotenv()

_meter = None

def get_meter():
    """Get or initialize the global meter singleton."""
    global _meter
    if _meter is None:
        try:
            from a2a_llm_tracker import init
            client_id = os.getenv("CLIENT_ID", "")
            client_secret = os.getenv("CLIENT_SECRET", "")
            client_server = os.getenv("CLIENT_SERVER", "https://a2aorchestra.com")
            with concurrent.futures.ThreadPoolExecutor() as executor:
                future = executor.submit(
                    asyncio.run,
                    init(client_id, client_secret, "my-app", client_server)
                )
                _meter = future.result(timeout=5)
        except Exception as e:
            print(f"LLM tracking initialization failed: {e}")
            return None
    return _meter
Step 2: Use it anywhere
import os
from openai import OpenAI
from tracking import get_meter

def call_openai(prompt: str):
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    # Track usage; a tracking failure must never break the LLM call itself
    try:
        from a2a_llm_tracker import analyze_response, ResponseType
        meter = get_meter()
        agent_id = os.getenv("AGENT_ID")  # Add AGENT_ID to your .env file (must parse as an integer)
        if meter:
            analyze_response(response, ResponseType.OPENAI, meter, agent_id=int(agent_id))
    except Exception as e:
        print(f"LLM tracking skipped: {e}")
    return response
Environment Variables
Set your credentials in a .env file or export them:
CLIENT_ID=your_client_id
CLIENT_SECRET=your_client_secret
CLIENT_SERVER=https://a2aorchestra.com # optional, this is the default
AGENT_ID=123 # optional, integer ID of the agent that made the call
OPENAI_API_KEY=sk-xxxxx
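Because the Step 2 code passes `int(agent_id)`, the `AGENT_ID` value has to parse as an integer. A minimal sketch of a defensive reader is shown below; `read_agent_id` is a hypothetical helper, not part of the package, and it assumes that treating a missing or non-numeric value as "no agent ID" is acceptable for your application.

```python
import os
from typing import Optional


def read_agent_id(env_name: str = "AGENT_ID") -> Optional[int]:
    """Return the agent ID from the environment as an int, or None if unset or invalid."""
    raw = os.getenv(env_name)
    if raw is None or not raw.strip():
        return None
    try:
        return int(raw.strip())
    except ValueError:
        # A non-numeric value such as "my-agent" cannot be converted.
        return None
```

Calling `read_agent_id()` before tracking lets you decide explicitly what to do when the variable is absent, instead of relying on the surrounding `try`/`except` to swallow the conversion error.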
Query Total Usage & Costs
Retrieve your accumulated costs and token usage from CCS:
import os
import asyncio
from a2a_llm_tracker import init
from a2a_llm_tracker.sources import CCSSource

async def get_total_usage():
    client_id = os.getenv("CLIENT_ID")
    client_secret = os.getenv("CLIENT_SECRET")
    await init(
        client_id=client_id,
        client_secret=client_secret,
        application_name="my-app",
    )
    source = CCSSource(int(client_id))
    total_cost = await source.count_cost()
    total_tokens = await source.count_total_tokens()
    print(f"Total cost: ${total_cost:.4f}")
    print(f"Total tokens: {total_tokens}")

asyncio.run(get_total_usage())
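For intuition about where a cost figure like the one printed above comes from, the sketch below multiplies token counts by per-million-token prices. It is only an illustration of the arithmetic: `estimate_cost` is a hypothetical helper, the prices are made-up placeholders rather than real provider rates, and the package's own pricing logic may differ.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Return the cost in dollars for one call, given per-million-token prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Placeholder prices: $1.00 per million input tokens, $2.00 per million output tokens.
cost = estimate_cost(1_000, 500, 1.00, 2.00)
print(f"Estimated cost: ${cost:.4f}")
```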
Request Tracking (Multiple LLM Calls per Request)
Track multiple LLM calls as a single request using set_request_id and set_session_id. These work with any framework - no Starlette required.
Basic Usage (Any Framework)
from a2a_llm_tracker import set_request_id, set_session_id, generate_id

def handle_request():
    # Set at the start of each request - all LLM calls get these IDs automatically
    set_request_id(generate_id())
    set_session_id("user-session-123")
    # All LLM calls anywhere in this request share the same IDs
    step_one()
    step_two()
    step_three()
Flask
from flask import Flask, request
from a2a_llm_tracker import set_request_id, set_session_id, generate_id
app = Flask(__name__)

@app.before_request
def before_request():
    set_request_id(request.headers.get("X-Request-ID") or generate_id())
    set_session_id(request.headers.get("X-Session-ID") or generate_id())
Django
# middleware.py
from a2a_llm_tracker import set_request_id, set_session_id, generate_id

class LLMTrackerMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        set_request_id(request.headers.get("X-Request-ID") or generate_id())
        set_session_id(request.headers.get("X-Session-ID") or generate_id())
        return self.get_response(request)
FastAPI/Starlette (Optional)
If you have Starlette installed, you can use the built-in middleware:
from fastapi import FastAPI
from a2a_llm_tracker import TrackerMiddleware
app = FastAPI()
app.add_middleware(TrackerMiddleware)
TrackerMiddleware now also reads and propagates call-lineage headers when present:
- X-Call-ID
- X-Parent-Call-ID
- X-Sequence-Number
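To make the three lineage headers concrete, here is a minimal sketch of how a chain of calls could be represented. The semantics are assumptions for illustration (a fresh ID per call, the parent's ID copied into the child, a sequence number that increases by one along the chain); `CallLineage` and `next_call` are hypothetical names, and the middleware's actual rules may differ.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CallLineage:
    call_id: str
    parent_call_id: Optional[str]
    sequence_number: int

    def to_headers(self) -> Dict[str, str]:
        """Render the lineage as HTTP headers, omitting the parent for a root call."""
        headers = {
            "X-Call-ID": self.call_id,
            "X-Sequence-Number": str(self.sequence_number),
        }
        if self.parent_call_id is not None:
            headers["X-Parent-Call-ID"] = self.parent_call_id
        return headers


def next_call(parent: Optional[CallLineage]) -> CallLineage:
    """Create lineage for a new call, chaining from `parent` if one is given."""
    if parent is None:
        return CallLineage(uuid.uuid4().hex, None, 1)
    return CallLineage(uuid.uuid4().hex, parent.call_id, parent.sequence_number + 1)


root = next_call(None)
child = next_call(root)
```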
Calling Backend Proxy Agents With Tracking IDs
Use call_agent to call your backend agent URL and automatically forward the current tracking IDs:
- X-Request-ID
- X-Session-ID
- X-Trace-ID
from a2a_llm_tracker import call_agent, set_request_id, set_session_id, set_context
set_request_id("req-123")
set_session_id("sess-123")
set_context(trace_id="trace-123")
response = call_agent(
    "https://my-backend-agent.example.com/run",
    payload={"task": "summarize this"},
)
Async usage:
import asyncio
from a2a_llm_tracker import call_agent_async

async def main():
    return await call_agent_async(
        "https://my-backend-agent.example.com/run",
        payload={"task": "summarize this"},
    )

response = asyncio.run(main())
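What "forwarding the current tracking IDs" amounts to at the HTTP level can be sketched with the standard library alone. The helpers `tracking_headers` and `build_request` are hypothetical and only construct a request object without sending it; how `call_agent` builds its request internally is not shown in this README.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def tracking_headers(
    request_id: Optional[str] = None,
    session_id: Optional[str] = None,
    trace_id: Optional[str] = None,
) -> Dict[str, str]:
    """Return only the tracking headers whose IDs are actually set."""
    candidates = {
        "X-Request-ID": request_id,
        "X-Session-ID": session_id,
        "X-Trace-ID": trace_id,
    }
    return {name: value for name, value in candidates.items() if value}


def build_request(url: str, payload: Dict[str, Any], **ids: Optional[str]) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the tracking headers."""
    headers = {"Content-Type": "application/json", **tracking_headers(**ids)}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


req = build_request(
    "https://my-backend-agent.example.com/run",
    {"task": "summarize this"},
    request_id="req-123",
    session_id="sess-123",
    trace_id="trace-123",
)
```

IDs that are not set are left out entirely, so the receiving agent never sees empty header values.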
High-Level A2A/Orchestra Agent Call (Recommended)
Use call_orchestra_agent when you just want to pass text (or parts) and let the SDK:
- build JSON-RPC payload
- handle message/send vs message/stream
- parse response text
- persist returned tracking headers
- auto-generate call lineage headers (X-Call-ID, X-Parent-Call-ID, X-Sequence-Number)
- automatically send bearer auth using CCS token when available (or user_token override)
from a2a_llm_tracker import call_orchestra_agent

result = call_orchestra_agent(
    input_message="Summarize this contract in 3 bullet points.",
    agent_url="https://node.a2aorchestra.com/api/v1/agents/sendmessage/123/.well-known/agent-card.json",
    stream=False,
    debug=True,  # include request headers + payload in result (auth redacted)
)
print(result["success"], result["text"])
print(result["trace_id"], result["session_id"], result["request_id"])
print(result["call_id"], result["parent_call_id"], result["sequence_number"])
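For readers curious what a JSON-RPC payload for `message/send` or `message/stream` might look like, the sketch below builds one by hand. The envelope fields (`jsonrpc`, `id`, `method`, `params`) follow JSON-RPC 2.0; the layout of `params.message` is an assumption based on the parts format used in this README, and `build_message_payload` is a hypothetical helper, not the SDK's implementation.

```python
import uuid
from typing import Any, Dict, List, Optional


def build_message_payload(
    text: Optional[str] = None,
    parts: Optional[List[Dict[str, Any]]] = None,
    stream: bool = False,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request for a single user message."""
    if parts is None:
        if text is None:
            raise ValueError("provide either text or parts")
        parts = [{"kind": "text", "text": text}]
    return {
        "jsonrpc": "2.0",
        "id": uuid.uuid4().hex,
        # The method name selects between a single reply and a streamed one.
        "method": "message/stream" if stream else "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": parts,
                "messageId": uuid.uuid4().hex,  # assumed field name
            }
        },
    }


payload = build_message_payload(text="Summarize this contract in 3 bullet points.")
```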
Streaming mode (SSE parsed automatically):
result = call_orchestra_agent(
    agent_id="123",  # optional if agent_url already includes agent path
    input_message="Write a short poem.",
    stream=True,
)
print(result["text"])  # aggregated text
print(len(result["events"]))  # parsed SSE events
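As a rough picture of what parsing SSE and aggregating text involves, the sketch below handles a simplified stream in which every event is a JSON object on `data:` lines. The `"text"` key inside each event is a hypothetical shape chosen for the demo; the real events returned by an agent, and the SDK's own parser, may look different.

```python
import json
from typing import Any, Dict, List


def parse_sse(stream_text: str) -> List[Dict[str, Any]]:
    """Parse SSE text into a list of JSON-decoded `data:` payloads."""
    events: List[Dict[str, Any]] = []
    data_lines: List[str] = []
    # The extra "" guarantees the final event is flushed even without a trailing blank line.
    for line in stream_text.splitlines() + [""]:
        if line == "":
            if data_lines:
                events.append(json.loads("\n".join(data_lines)))
                data_lines = []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    return events


def aggregate_text(events: List[Dict[str, Any]]) -> str:
    """Concatenate the hypothetical "text" field of every event."""
    return "".join(event.get("text", "") for event in events)


sample = 'data: {"text": "Roses "}\n\n: keep-alive\n\ndata: {"text": "are red"}\n\n'
events = parse_sse(sample)
text = aggregate_text(events)
```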
Override parent linkage manually (optional):
result = call_orchestra_agent(
    agent_id="123",
    input_message="Next step",
    parent_call_id="my-parent-call-id",
)
Multimodal call (text/image/video parts) by passing parts directly:
result = call_orchestra_agent(
    agent_id="123",
    parts=[
        {"kind": "text", "text": "Describe this image"},
        {"kind": "image", "url": "https://example.com/image.jpg"},
    ],
    stream=False,
)
Google ADK Integration
Track LLM usage in Google Agent Development Kit (ADK) agents using the built-in callback:
from google.adk.agents import LlmAgent
from a2a_llm_tracker import create_adk_callback
from tracking import get_meter

meter = get_meter()

agent = LlmAgent(
    name="my_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant.",
    after_model_callback=create_adk_callback(
        meter=meter,
        agent_id=123,  # Your agent concept ID (integer)
    ),
)
The callback automatically extracts token usage from ADK's LlmResponse.usage_metadata and records it to CCS.
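The extraction step can be pictured with a stand-in object. In the sketch below, `extract_token_usage` is a hypothetical helper and the attribute names `prompt_token_count`, `candidates_token_count`, and `total_token_count` are assumed to match what `usage_metadata` exposes; a `SimpleNamespace` replaces a real ADK response so the example runs without ADK installed.

```python
from types import SimpleNamespace
from typing import Any, Dict


def extract_token_usage(llm_response: Any) -> Dict[str, int]:
    """Read token counts from a response's usage_metadata, defaulting missing values to 0."""
    usage = getattr(llm_response, "usage_metadata", None)

    def count(name: str) -> int:
        # Handles both a missing usage object and attributes that are None.
        return int(getattr(usage, name, 0) or 0)

    prompt = count("prompt_token_count")
    completion = count("candidates_token_count")
    total = count("total_token_count") or prompt + completion
    return {"input_tokens": prompt, "output_tokens": completion, "total_tokens": total}


# Stand-in for an ADK response; the numbers are made up.
fake_response = SimpleNamespace(
    usage_metadata=SimpleNamespace(
        prompt_token_count=12,
        candidates_token_count=30,
        total_token_count=42,
    )
)
usage = extract_token_usage(fake_response)
```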
Supported Providers
| Provider | ResponseType |
|---|---|
| OpenAI | ResponseType.OPENAI |
| Google Gemini | ResponseType.GEMINI |
| Anthropic | ResponseType.ANTHROPIC |
| Cohere | ResponseType.COHERE |
| Mistral | ResponseType.MISTRAL |
| Groq | ResponseType.GROQ |
| Together AI | ResponseType.TOGETHER |
| AWS Bedrock | ResponseType.BEDROCK |
| Google Vertex AI | ResponseType.VERTEX |
| Google ADK | ResponseType.ADK |
Documentation
Full documentation is available on GitHub:
- LiteLLM Wrapper - Auto-tracking via LiteLLM
- CCS Integration - Centralized tracking setup
- Response Analysis - Direct SDK tracking
- Pricing - Custom pricing configuration
- Building - Development and publishing
What This Package Does NOT Do
- Guess exact billing from raw text
- Replace provider SDKs
- Upload data anywhere automatically
- Require a backend or SaaS
File details
Details for the file a2a_llm_tracker-0.0.26.tar.gz.
File metadata
- Download URL: a2a_llm_tracker-0.0.26.tar.gz
- Upload date:
- Size: 51.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 062b2bfca46078b643aad8f76838a6cc2107cdd3ec34e776186f855492872d95 |
| MD5 | c9062d75fa164577090002e2bcc12ad6 |
| BLAKE2b-256 | dd771e88f49be3b64be417b080379e68ad9960555dcdef7fbe85afe61d5da956 |
File details
Details for the file a2a_llm_tracker-0.0.26-py3-none-any.whl.
File metadata
- Download URL: a2a_llm_tracker-0.0.26-py3-none-any.whl
- Upload date:
- Size: 52.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 05091c05d8482a1fa6c683f29ee716f7762cbc6206ee2d727c29c8f3b8684965 |
| MD5 | 63ab2de9e65f7dd6a957aba1489b68d3 |
| BLAKE2b-256 | fc298be25bca93d024e9bbdc045ed3db7a1f4e9a84667ce17d721beca13dc07c |