tollgateai (Python SDK)

Track real LLM model usage and compute live gross margin with Tollgate. The SDK reads the actual usage off each provider response — you never hand-count tokens. Zero dependencies.

Published on PyPI: tollgateai (v0.1.2).

pip install tollgateai

Create an API key in Tollgate → Integrations, then set:

export TOLLGATE_API_KEY=tg_live_xxx
# optional, defaults to the hosted app:
export TOLLGATE_BASE_URL=https://tollgateai.vercel.app
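A small pre-flight check can catch a missing key before any provider call is made. The sketch below only mirrors the two environment variables described above; `read_tollgate_env` is a hypothetical helper, not part of the SDK, and the SDK's own handling of these variables may differ.

```python
import os

# Default named in the text above; used when TOLLGATE_BASE_URL is unset or empty.
DEFAULT_BASE_URL = "https://tollgateai.vercel.app"


def read_tollgate_env(env=None):
    """Return (api_key, base_url), failing fast if the API key is missing.

    Hypothetical helper for illustration; not an SDK function.
    """
    env = os.environ if env is None else env
    api_key = (env.get("TOLLGATE_API_KEY") or "").strip()
    if not api_key:
        raise RuntimeError("TOLLGATE_API_KEY is not set")
    base_url = (env.get("TOLLGATE_BASE_URL") or DEFAULT_BASE_URL).rstrip("/")
    return api_key, base_url
```

Calling it once at startup, before `create_tollgate_client()`, turns a silent misconfiguration into an immediate error.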

Auto-instrumentation (recommended)

Wrap your provider client once; every call reports real usage in the background.

Anthropic

from anthropic import Anthropic
from tollgate import create_tollgate_client, wrap_anthropic

tollgate = create_tollgate_client()  # reads TOLLGATE_API_KEY

# Pin a run_id so every call in this run is grouped and reports cost only.
run_id = "ticket_8842"
anthropic = wrap_anthropic(
    Anthropic(), tollgate,
    customer_id="cust_A",     # your end customer
    run_id=run_id,
)

# Use the client normally — usage is tracked automatically.
anthropic.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    messages=[{"role": "user", "content": "Resolve this ticket…"}],
)

# Book revenue once, when the run finishes — "no outcome, no charge".
tollgate.resolve(
    run_id=run_id,
    customer_id="cust_A",
    outcome="resolved",       # "resolved" | "escalated" | "failed"
    revenue_unit_cents=50,    # charge for this resolved unit ($0.50)
)

Outcome-based pricing

Under per-resolution (outcome) pricing, only a resolved run earns revenue. An escalated or failed run earns $0, but its provider cost still counts against your margin. Wrap your client to meter cost on every call, then call resolve() once at the end of the run to book the outcome. For simple per-call billing, you can instead pass revenue_unit_cents in the wrap options and skip resolve().
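To make the arithmetic concrete, here is a minimal sketch of how margin works out when only resolved runs earn revenue. It assumes the usual definition, gross margin = (revenue - cost) / revenue; the `Run` class, the `gross_margin` function, and the cost figures are illustrative and do not come from Tollgate.

```python
from dataclasses import dataclass


@dataclass
class Run:
    run_id: str
    outcome: str        # "resolved" | "escalated" | "failed"
    cost_cents: float   # provider cost metered across the run's calls
    price_cents: int    # what one resolved unit is billed at


def gross_margin(runs):
    """Return (revenue_cents, cost_cents, margin_pct) for a batch of runs."""
    # Only resolved runs earn revenue ...
    revenue = sum(r.price_cents for r in runs if r.outcome == "resolved")
    # ... but every run, resolved or not, incurs provider cost.
    cost = sum(r.cost_cents for r in runs)
    margin_pct = None if revenue == 0 else 100.0 * (revenue - cost) / revenue
    return revenue, cost, margin_pct


runs = [
    Run("ticket_8842", "resolved", 6.0, 50),
    Run("ticket_8843", "escalated", 9.0, 50),  # earns nothing, still costs 9 cents
    Run("ticket_8844", "resolved", 5.0, 50),
]
revenue, cost, margin_pct = gross_margin(runs)
print(revenue, cost, margin_pct)  # 100 20.0 80.0
```

The escalated run contributes nothing to revenue but nearly half of the total cost, which is why cost has to be metered on every call rather than only on successful runs.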

OpenAI

from openai import OpenAI
from tollgate import create_tollgate_client, wrap_openai

tollgate = create_tollgate_client()
openai = wrap_openai(OpenAI(), tollgate, customer_id="cust_A")

openai.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

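revenue_unit_cents can also be a callable that receives the provider response and returns the charge in cents, e.g. revenue_unit_cents=lambda res: 50 if res.something else 0 (res.something is a placeholder for whatever condition you bill on).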
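As an illustration of the callable form, the sketch below charges only when an OpenAI chat completion finished normally. The billing rule and the name `revenue_for` are hypothetical; `choices[0].finish_reason` is a standard field on chat completion responses, and the stand-in objects exist only so the function can be exercised without a network call.

```python
from types import SimpleNamespace


def revenue_for(res):
    """Charge 50 cents only when the first choice finished with reason "stop"."""
    choices = getattr(res, "choices", None) or []
    if choices and choices[0].finish_reason == "stop":
        return 50
    return 0


# Stand-ins shaped like chat completion responses.
finished = SimpleNamespace(choices=[SimpleNamespace(finish_reason="stop")])
truncated = SimpleNamespace(choices=[SimpleNamespace(finish_reason="length")])
print(revenue_for(finished), revenue_for(truncated))  # 50 0

# Assumed wiring, following the wrap options described above:
# openai = wrap_openai(OpenAI(), tollgate, customer_id="cust_A",
#                      revenue_unit_cents=revenue_for)
```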

Manual tracking

For providers without a wrapper (Bedrock, custom gateways), or when you want full control over what is reported:

from tollgate import create_tollgate_client

tollgate = create_tollgate_client()

tollgate.track({
    "customerId": "cust_A",
    "runId": "run_12345",
    "provider": "anthropic",
    "model": "claude-sonnet-4-6",
    "tokensIn": 1200,
    "tokensOut": 450,
    "reasoningTokens": 0,
    "cachedTokens": 0,
    "revenueUnitCents": 50,
    "idempotencyKey": "run_12345#step_1",  # exactly-once: safe to retry
})
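For Bedrock, one way to fill in the event is to read the `usage` block that the Converse API returns (`inputTokens`, `outputTokens`). This is a sketch under assumptions: `bedrock_converse_event` is a hypothetical helper, the `"bedrock"` provider label is a guess at what Tollgate expects, and the model id in the usage example is a placeholder.

```python
def bedrock_converse_event(response, *, customer_id, run_id, model, step,
                           revenue_unit_cents):
    """Build a tollgate.track() payload from a Bedrock Converse response dict."""
    usage = response.get("usage", {})
    return {
        "customerId": customer_id,
        "runId": run_id,
        "provider": "bedrock",  # assumed label; check what Tollgate expects
        "model": model,
        "tokensIn": usage.get("inputTokens", 0),
        "tokensOut": usage.get("outputTokens", 0),
        "reasoningTokens": 0,
        "cachedTokens": 0,
        "revenueUnitCents": revenue_unit_cents,
        # Same key on a retry, so the event is counted once.
        "idempotencyKey": f"{run_id}#{step}",
    }


# A response trimmed to the fields used here.
response = {"usage": {"inputTokens": 1200, "outputTokens": 450, "totalTokens": 1650}}
event = bedrock_converse_event(
    response,
    customer_id="cust_A",
    run_id="run_12345",
    model="example-model-id",  # placeholder
    step="step_1",
    revenue_unit_cents=50,
)
# tollgate.track(event)
```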

Notes

  • Idempotent. Events dedupe on idempotencyKey (auto-set to the provider response id by the wrappers), so retries never double-count.
  • No prompt content is ever sent — only token counts and metadata.
  • Streaming responses are not auto-tracked yet: the wrappers report only when a non-streaming response includes usage. Track streamed calls manually for now.
  • Non-blocking. Auto-instrumented tracking runs on a background thread; failures go to on_error (default: log a warning) and never break your call.
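Until streaming is auto-tracked, a streamed Anthropic call can be reported manually from the final message once the stream ends. The sketch below is one possible approach, not the SDK's own: `event_from_anthropic_message` is a hypothetical helper, mapping `cache_read_input_tokens` to `cachedTokens` is an assumption, and the response id is reused as the idempotency key to match what the wrappers are described as doing.

```python
from types import SimpleNamespace


def event_from_anthropic_message(message, *, customer_id, run_id,
                                 revenue_unit_cents):
    """Build a tollgate.track() payload from a completed Anthropic message."""
    usage = message.usage
    return {
        "customerId": customer_id,
        "runId": run_id,
        "provider": "anthropic",
        "model": message.model,
        "tokensIn": usage.input_tokens,
        "tokensOut": usage.output_tokens,
        "reasoningTokens": 0,
        # Assumed mapping; the field may be absent or None on some responses.
        "cachedTokens": getattr(usage, "cache_read_input_tokens", None) or 0,
        "revenueUnitCents": revenue_unit_cents,
        "idempotencyKey": message.id,  # provider response id
    }


# Stand-in for the object returned by stream.get_final_message().
final = SimpleNamespace(
    id="msg_example",
    model="claude-sonnet-4-6",
    usage=SimpleNamespace(input_tokens=1200, output_tokens=450),
)
event = event_from_anthropic_message(
    final, customer_id="cust_A", run_id="ticket_8842", revenue_unit_cents=0
)

# With an unwrapped Anthropic client, the flow would look roughly like:
# with client.messages.stream(model=..., max_tokens=..., messages=[...]) as stream:
#     for text in stream.text_stream:
#         ...
#     final = stream.get_final_message()
# tollgate.track(event_from_anthropic_message(final, ...))
```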

Licensed for use with Tollgate. Not open source.
