This package helps you track your LLM costs.

a2a-llm-tracker

Track LLM usage and costs across providers (OpenAI, Gemini, Anthropic, etc.) from a single place.

Installation

pip install a2a-llm-tracker

Quick Start (Recommended Pattern)

For applications making multiple LLM calls, use a singleton pattern to initialize once and reuse everywhere.

Step 1: Create a tracking module

Create tracking.py in your project:

# tracking.py
from dotenv import load_dotenv
import os
import asyncio
import concurrent.futures

load_dotenv()

_meter = None

def get_meter():
    """Get or initialize the global meter singleton."""
    global _meter
    if _meter is None:
        try:
            from a2a_llm_tracker import init

            client_id = os.getenv("CLIENT_ID", "")
            client_secret = os.getenv("CLIENT_SECRET", "")
            client_server = os.getenv("CLIENT_SERVER", "https://a2aorchestra.com")

            with concurrent.futures.ThreadPoolExecutor() as executor:
                future = executor.submit(
                    asyncio.run,
                    init(client_id, client_secret, "my-app", client_server)
                )
                _meter = future.result(timeout=5)

        except Exception as e:
            print(f"LLM tracking initialization failed: {e}")
            return None
    return _meter
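The worker thread in get_meter is there because asyncio.run cannot be called from a thread that already has a running event loop. One caveat: leaving a `with ThreadPoolExecutor()` block waits for the worker to finish, so a call that hits the `future.result` timeout can still block on exit. Below is a minimal sketch of the same pattern with a non-blocking shutdown. The names `fake_init` and `run_coro_in_thread` are hypothetical stand-ins for illustration and are not part of a2a-llm-tracker.

```python
import asyncio
import concurrent.futures


async def fake_init(name: str) -> dict:
    """Hypothetical stand-in for an async initializer such as init()."""
    await asyncio.sleep(0.01)
    return {"app": name}


def run_coro_in_thread(coro_factory, timeout: float):
    """Run a coroutine on a worker thread and wait at most `timeout` seconds."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        # asyncio.run creates a fresh event loop inside the worker thread
        future = executor.submit(lambda: asyncio.run(coro_factory()))
        return future.result(timeout=timeout)
    finally:
        # wait=False: do not block the caller if the worker is still running
        executor.shutdown(wait=False)


meter = run_coro_in_thread(lambda: fake_init("my-app"), timeout=5)
print(meter)
```

Passing a factory instead of an already-created coroutine means the coroutine is created in the thread that runs it, which avoids "coroutine was never awaited" warnings if submission fails.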

Step 2: Use it anywhere

import os
from openai import OpenAI
from tracking import get_meter

def call_openai(prompt: str):
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )

    # Track usage
    try:
        from a2a_llm_tracker import analyze_response, ResponseType

        meter = get_meter()
        agent_id = os.getenv("AGENT_ID")  # Add AGENT_ID to your .env file

        if meter:
            analyze_response(response, ResponseType.OPENAI, meter, agent_id=int(agent_id))
    except Exception as e:
        # Tracking must never break the LLM call itself; report why it was skipped
        print(f"LLM tracking skipped: {e}")

    return response
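In the function above, `int(agent_id)` raises if AGENT_ID is unset or not numeric, and the except branch then skips tracking. A small helper can make that case explicit. This is only a sketch; `read_agent_id` is a hypothetical name and not part of the library.

```python
import os
from typing import Optional


def read_agent_id(env_var: str = "AGENT_ID") -> Optional[int]:
    """Return the agent ID as an int, or None if it is unset or not numeric."""
    raw = os.getenv(env_var, "").strip()
    if not raw:
        return None
    try:
        return int(raw)
    except ValueError:
        return None


# Demonstration only: normally the value comes from your .env file
os.environ["AGENT_ID"] = "123"
print(read_agent_id())
```

With a helper like this, the caller can decide whether to skip tracking or log a warning when the ID is missing, instead of relying on the exception handler.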

Environment Variables

Set your credentials in a .env file, or export them as environment variables:

CLIENT_ID=your_client_id
CLIENT_SECRET=your_client_secret
CLIENT_SERVER=https://a2aorchestra.com  # optional, this is the default
AGENT_ID=123  # optional, integer ID of the agent that made the call
OPENAI_API_KEY=sk-xxxxx

Query Total Usage & Costs

Retrieve your accumulated costs and token usage from CCS:

import os
import asyncio
from a2a_llm_tracker import init
from a2a_llm_tracker.sources import CCSSource

async def get_total_usage():
    client_id = os.getenv("CLIENT_ID")
    client_secret = os.getenv("CLIENT_SECRET")

    await init(
        client_id=client_id,
        client_secret=client_secret,
        application_name="my-app",
    )

    source = CCSSource(int(client_id))
    total_cost = await source.count_cost()
    total_tokens = await source.count_total_tokens()

    print(f"Total cost: ${total_cost:.4f}")
    print(f"Total tokens: {total_tokens}")

asyncio.run(get_total_usage())

Request Tracking (Multiple LLM Calls per Request)

Track multiple LLM calls as a single request using set_request_id and set_session_id. These work with any framework; Starlette is not required.

Basic Usage (Any Framework)

from a2a_llm_tracker import set_request_id, set_session_id, generate_id

def handle_request():
    # Set at the start of each request - all LLM calls get these IDs automatically
    set_request_id(generate_id())
    set_session_id("user-session-123")

    # All LLM calls anywhere in this request share the same IDs
    step_one()
    step_two()
    step_three()
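One common way to make an ID visible to every call in a request without passing it as an argument is the standard-library contextvars module. The sketch below illustrates that general technique. It is an assumption about how such propagation can work, not a description of the library's internals, and all `demo_*` names are hypothetical.

```python
import contextvars
import uuid
from typing import Optional

# Context-local storage: each thread or asyncio task sees its own value
_request_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "request_id", default=None
)


def demo_set_request_id(value: str) -> None:
    _request_id.set(value)


def demo_get_request_id() -> Optional[str]:
    return _request_id.get()


def demo_generate_id() -> str:
    return uuid.uuid4().hex


def step_one() -> Optional[str]:
    # No argument needed: the ID is read from the current context
    return demo_get_request_id()


def step_two() -> Optional[str]:
    return demo_get_request_id()


def handle_request():
    demo_set_request_id(demo_generate_id())
    return step_one(), step_two()


first, second = handle_request()
print(first == second)
```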

Flask

from flask import Flask, request
from a2a_llm_tracker import set_request_id, set_session_id, generate_id

app = Flask(__name__)

@app.before_request
def before_request():
    set_request_id(request.headers.get("X-Request-ID") or generate_id())
    set_session_id(request.headers.get("X-Session-ID") or generate_id())

Django

# middleware.py
from a2a_llm_tracker import set_request_id, set_session_id, generate_id

class LLMTrackerMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        set_request_id(request.headers.get("X-Request-ID") or generate_id())
        set_session_id(request.headers.get("X-Session-ID") or generate_id())
        return self.get_response(request)
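Django only runs middleware that is listed in the MIDDLEWARE setting. Assuming the class above lives in a hypothetical myproject/middleware.py, the registration might look like this:

```python
# settings.py (fragment); "myproject" is a placeholder for your project package
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.common.CommonMiddleware",
    # Sets the request and session IDs before your views run
    "myproject.middleware.LLMTrackerMiddleware",
]
```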

FastAPI/Starlette (Optional)

If you have Starlette installed, you can use the built-in middleware:

from fastapi import FastAPI
from a2a_llm_tracker import TrackerMiddleware

app = FastAPI()
app.add_middleware(TrackerMiddleware)

TrackerMiddleware now also reads and propagates call-lineage headers when present:

X-Call-ID
X-Parent-Call-ID
X-Sequence-Number

Calling Backend Proxy Agents With Tracking IDs

Use call_agent to call your backend agent URL and automatically forward the current tracking IDs:

X-Request-ID
X-Session-ID
X-Trace-ID

from a2a_llm_tracker import call_agent, set_request_id, set_session_id, set_context

set_request_id("req-123")
set_session_id("sess-123")
set_context(trace_id="trace-123")

response = call_agent(
    "https://my-backend-agent.example.com/run",
    payload={"task": "summarize this"},
)
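To make the forwarding concrete, the sketch below builds the three headers listed above from the current IDs and leaves out any that are not set. `build_tracking_headers` is a hypothetical helper written for illustration; it does not show how call_agent is implemented.

```python
from typing import Dict, Optional


def build_tracking_headers(
    request_id: Optional[str] = None,
    session_id: Optional[str] = None,
    trace_id: Optional[str] = None,
) -> Dict[str, str]:
    """Map tracking IDs to HTTP header names, skipping unset values."""
    candidates = {
        "X-Request-ID": request_id,
        "X-Session-ID": session_id,
        "X-Trace-ID": trace_id,
    }
    return {name: value for name, value in candidates.items() if value}


headers = build_tracking_headers("req-123", "sess-123", "trace-123")
print(headers)
```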

Async usage:

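import asyncio
from a2a_llm_tracker import call_agent_async

async def main():
    # await is only valid inside an async function
    response = await call_agent_async(
        "https://my-backend-agent.example.com/run",
        payload={"task": "summarize this"},
    )
    return response

response = asyncio.run(main())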

High-Level A2A/Orchestra Agent Call (Recommended)

Use call_orchestra_agent when you just want to pass text (or parts) and let the SDK:

build JSON-RPC payload
handle message/send vs message/stream
parse response text
persist returned tracking headers
auto-generate call lineage headers (X-Call-ID, X-Parent-Call-ID, X-Sequence-Number)
automatically send bearer auth using CCS token when available (or user_token override)

from a2a_llm_tracker import call_orchestra_agent

result = call_orchestra_agent(
    input_message="Summarize this contract in 3 bullet points.",
    agent_url="https://node.a2aorchestra.com/api/v1/agents/sendmessage/123/.well-known/agent-card.json",
    stream=False,
)

print(result["success"], result["text"])
print(result["trace_id"], result["session_id"], result["request_id"])
print(result["call_id"], result["parent_call_id"], result["sequence_number"])
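For orientation, here is a rough sketch of the kind of JSON-RPC 2.0 request that the "build JSON-RPC payload" and "message/send vs message/stream" steps refer to. The envelope follows the A2A protocol's general shape as an assumption; the payload the SDK actually sends may differ, and `build_a2a_request` is a hypothetical name.

```python
import json
import uuid
from typing import Any, Dict, List


def build_a2a_request(parts: List[Dict[str, Any]], stream: bool = False) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 envelope for an A2A-style message call."""
    return {
        "jsonrpc": "2.0",
        "id": uuid.uuid4().hex,
        # The method name switches between one-shot and streaming delivery
        "method": "message/stream" if stream else "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": parts,
                "messageId": uuid.uuid4().hex,
            }
        },
    }


payload = build_a2a_request(
    [{"kind": "text", "text": "Summarize this contract in 3 bullet points."}]
)
print(json.dumps(payload, indent=2))
```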

Streaming mode (SSE parsed automatically):

result = call_orchestra_agent(
    agent_id="123",  # optional if agent_url already includes agent path
    input_message="Write a short poem.",
    stream=True,
)

print(result["text"])      # aggregated text
print(len(result["events"]))  # parsed SSE events
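Server-sent events arrive as blocks of `data:` lines separated by blank lines. The sketch below shows one simple way such a stream can be parsed and its text aggregated. The event shape (`{"text": ...}`) and the `parse_sse` helper are assumptions for illustration and may not match what the SDK receives or how it parses it.

```python
import json
from typing import Any, Dict, List


def parse_sse(raw: str) -> List[Dict[str, Any]]:
    """Parse an SSE body whose data fields contain JSON objects."""
    events: List[Dict[str, Any]] = []
    data_lines: List[str] = []
    for line in raw.splitlines():
        if line.startswith("data:"):
            # Keep the payload after the field name, without leading whitespace
            data_lines.append(line[len("data:"):].lstrip())
        elif line == "" and data_lines:
            # A blank line ends the current event
            events.append(json.loads("\n".join(data_lines)))
            data_lines = []
    if data_lines:
        # Handle a final event that is not followed by a blank line
        events.append(json.loads("\n".join(data_lines)))
    return events


sample = 'data: {"text": "Roses "}\n\ndata: {"text": "are red"}\n\n'
events = parse_sse(sample)
text = "".join(event["text"] for event in events)
print(len(events), text)
```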

Override parent linkage manually (optional):

result = call_orchestra_agent(
    agent_id="123",
    input_message="Next step",
    parent_call_id="my-parent-call-id",
)

Multimodal call (text/image/video parts) by passing parts directly:

result = call_orchestra_agent(
    agent_id="123",
    parts=[
        {"kind": "text", "text": "Describe this image"},
        {"kind": "image", "url": "https://example.com/image.jpg"},
    ],
    stream=False,
)

Google ADK Integration

Track LLM usage in Google Agent Development Kit (ADK) agents using the built-in callback:

from google.adk.agents import LlmAgent
from a2a_llm_tracker import create_adk_callback
from tracking import get_meter

meter = get_meter()

agent = LlmAgent(
    name="my_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant.",
    after_model_callback=create_adk_callback(
        meter=meter,
        agent_id=123,  # Your agent concept ID (integer)
    ),
)

The callback automatically extracts token usage from ADK's LlmResponse.usage_metadata and records it to CCS.
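As a rough illustration of that extraction step, the sketch below reads token counts from an object shaped like a response with usage_metadata. The field names follow the Gemini usage metadata (prompt_token_count, candidates_token_count, total_token_count). `extract_usage` is a hypothetical helper, and the SimpleNamespace object is a stand-in so the example runs without ADK installed; the library's own callback may work differently.

```python
from types import SimpleNamespace
from typing import Dict


def extract_usage(llm_response) -> Dict[str, int]:
    """Read token counts from usage_metadata, treating missing values as 0."""
    usage = getattr(llm_response, "usage_metadata", None)
    return {
        "prompt_tokens": getattr(usage, "prompt_token_count", 0) or 0,
        "completion_tokens": getattr(usage, "candidates_token_count", 0) or 0,
        "total_tokens": getattr(usage, "total_token_count", 0) or 0,
    }


# Stand-in for an ADK LlmResponse, used only for this demonstration
fake_response = SimpleNamespace(
    usage_metadata=SimpleNamespace(
        prompt_token_count=12,
        candidates_token_count=30,
        total_token_count=42,
    )
)
print(extract_usage(fake_response))
```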

Supported Providers

Provider	ResponseType
OpenAI	`ResponseType.OPENAI`
Google Gemini	`ResponseType.GEMINI`
Anthropic	`ResponseType.ANTHROPIC`
Cohere	`ResponseType.COHERE`
Mistral	`ResponseType.MISTRAL`
Groq	`ResponseType.GROQ`
Together AI	`ResponseType.TOGETHER`
AWS Bedrock	`ResponseType.BEDROCK`
Google Vertex AI	`ResponseType.VERTEX`
Google ADK	`ResponseType.ADK`

Documentation

Full documentation is available on GitHub:

LiteLLM Wrapper - Auto-tracking via LiteLLM
CCS Integration - Centralized tracking setup
Response Analysis - Direct SDK tracking
Pricing - Custom pricing configuration
Building - Development and publishing

What This Package Does NOT Do

Guess exact billing from raw text
Replace provider SDKs
Upload data anywhere automatically
Require a backend or SaaS

Project details

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3

a2a-llm-tracker 0.0.24
