One-line HTTP-level auto-instrumentation for AI provider cost tracking. Catches every SDK, framework, and custom wrapper via httpx/requests interception. Supports OpenAI, Anthropic, Gemini, Cohere, Mistral, and 8 OpenAI-compatible providers (Groq, xAI, Together, Fireworks, Perplexity, DeepSeek, OpenRouter, Vercel AI Gateway).

aicostguard

One line. Zero secrets. Real-time AI cost tracking for OpenAI, Anthropic, and Gemini.

Drop-in observability for AI provider costs. No proxy, no shared keys, no per-call code.

Status: Beta. v0.1 is feature-complete for the supported configurations listed below. Not yet recommended for production-critical workloads without your own validation.

Install

pip install aicostguard-dev

Activate

Add one line at the top of your application entry file (e.g. app.py, main.py, manage.py):

import aicostguard.auto  # done.

Then export your AI Cost Guard ingestion key and backend URL:

export AICG_KEY=aicg_xxxxxxxxxxxxxxxxxxxx
export AICG_URL=https://your-aicg-instance.example.com   # or our hosted URL

That's the entire integration. Every OpenAI / Anthropic / Gemini call your application makes is now automatically tracked.

What gets sent

Only this, per AI call:

{
  "provider": "openai",
  "model": "gpt-4o",
  "input_tokens": 1240,
  "output_tokens": 312,
  "latency_ms": 842,
  "feature": "generate_rag_answer"
}

Never sent:

Your AI provider API keys
Your prompts
The AI's responses
Any user data

The feature field is inferred automatically from the calling function name. You can override it explicitly:

import aicostguard as aicg

with aicg.feature("doc-parse"):
    completion = openai_client.chat.completions.create(...)
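
One way an override like this could be scoped is sketched below, assuming a `contextvars`-based implementation. This is not the package's source: `_current_feature`, `feature`, and `resolve_feature` are hypothetical names used only for illustration.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical module-level state: the explicit feature tag, if one is set.
_current_feature = contextvars.ContextVar("aicg_feature", default=None)


@contextmanager
def feature(name):
    """Tag every AI call made inside the `with` block with `name`."""
    token = _current_feature.set(name)
    try:
        yield
    finally:
        # Restore the previous tag so nested blocks unwind correctly.
        _current_feature.reset(token)


def resolve_feature(inferred_name):
    """Prefer an explicit tag; otherwise fall back to the inferred caller name."""
    explicit = _current_feature.get()
    return explicit if explicit is not None else inferred_name
```

A context variable, rather than a plain global, keeps the tag isolated per thread and per asyncio task, which matters when several requests are in flight at once.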

Supported configurations (v0.1)

Provider	SDK	Sync	Async	Non-streaming	Streaming (with usage opt-in)
OpenAI	`openai` ≥1.30, <2.0	✅	✅	✅	✅
Anthropic	`anthropic` ≥0.40, <1.0	✅	✅	✅	✅
Google Gemini	`google-generativeai` ≥0.8	✅	✅	✅	✅

Not in v0.1 (planned for v0.2):

LangChain / LlamaIndex auto-tagging
Azure OpenAI, AWS Bedrock SDK shapes
Cohere, Mistral SDKs
Streaming WITHOUT usage opt-in (we warn loudly today; tiktoken fallback in v0.2)

If your stack isn't listed yet, use Manual POST — fully supported and language-agnostic.
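
A manual receipt could be sent along the lines of the sketch below. The payload keys are the ones listed under "What gets sent"; the ingestion path `/v1/receipts` and the `Authorization: Bearer` header are assumptions made for illustration and are not documented here, so check your AI Cost Guard instance for the real values.

```python
import json
import os
import urllib.request

# ASSUMPTION: placeholder ingestion path, not taken from this README.
RECEIPT_PATH = "/v1/receipts"


def build_receipt_request(base_url, key, receipt):
    """Build (but do not send) a POST request carrying one receipt."""
    body = json.dumps(receipt).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + RECEIPT_PATH,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",  # ASSUMPTION: auth scheme
        },
    )


def send_receipt(receipt, timeout=2.0):
    """POST one receipt; return the HTTP status, or None on any failure."""
    try:
        request = build_receipt_request(
            os.environ["AICG_URL"], os.environ["AICG_KEY"], receipt
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except Exception:
        # Cost tracking should never break the calling application.
        return None
```

The same request shape can be reproduced from any language that can issue an HTTP POST with a JSON body.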

Runtime support

Runtime	Supported	Sender mode	Notes
CPython 3.9+ long-running server (Flask, FastAPI, Django, Gunicorn, Celery, containers, local dev)	✅	`background`	Daemon thread + bounded queue + `atexit` flush
Vercel Python functions	✅	`inline`	Each receipt POSTs synchronously before the handler returns (Python has no `waitUntil` equivalent we can use globally). Adds ~50–200 ms to AI-route response.
AWS Lambda Python	✅	`inline`	Same as Vercel — synchronous send before handler return
GCP Cloud Functions Python	✅	`inline`	Detected via `K_SERVICE` / `FUNCTION_NAME`
Azure Functions Python	✅	`inline`	Detected via `AZURE_FUNCTIONS_ENVIRONMENT`
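
The `background` mode in the first row can be pictured with the minimal sketch below: a bounded queue drained by a daemon thread, with an `atexit` flush. `BackgroundSender` and its `post` callable are hypothetical names, and the package's real sender may differ in detail.

```python
import atexit
import queue
import threading


class BackgroundSender:
    """Hypothetical sketch: bounded queue drained by a daemon thread."""

    def __init__(self, post, maxsize=1000):
        self._post = post  # callable that delivers one receipt
        self._queue = queue.Queue(maxsize=maxsize)
        self._thread = threading.Thread(target=self._drain, daemon=True)
        self._thread.start()
        atexit.register(self.flush)  # drain what is left at interpreter exit

    def submit(self, receipt):
        """Enqueue without blocking; drop the receipt if the queue is full."""
        try:
            self._queue.put_nowait(receipt)
            return True
        except queue.Full:
            return False

    def _drain(self):
        while True:
            receipt = self._queue.get()
            try:
                self._post(receipt)
            except Exception:
                pass  # delivery errors never reach the application
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until every queued receipt has been handed to `post`."""
        self._queue.join()
```

The bounded queue caps memory use if the backend is slow, at the cost of dropping receipts under sustained overload. A production sender would also bound how long `flush` may wait; this sketch does not.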

Why the modes exist. On serverless platforms, the host freezes the container the moment the function returns. A long-running background drain thread does not survive that freeze — receipts queued after the response is sent are silently dropped. The package detects serverless and switches to inline mode: every submit() POSTs synchronously before returning, so by the time the host function returns, the receipt has hit the wire. There is no Python equivalent of Vercel's waitUntil() we can call from a module-scoped sender, so synchronous send is the only safe strategy.
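
The mode switch described above amounts to an environment check at import time. The sketch below assumes detection is purely environment-variable based. `VERCEL`, `K_SERVICE`, `FUNCTION_NAME`, and `AZURE_FUNCTIONS_ENVIRONMENT` are named in this README; `AWS_LAMBDA_FUNCTION_NAME` is an assumption about how Lambda would be detected, and `detect_sender_mode` is a hypothetical function name.

```python
import os

# Variables whose presence marks a serverless host.
SERVERLESS_ENV_VARS = (
    "VERCEL",
    "AWS_LAMBDA_FUNCTION_NAME",  # ASSUMPTION: not stated in this README
    "K_SERVICE",
    "FUNCTION_NAME",
    "AZURE_FUNCTIONS_ENVIRONMENT",
)


def detect_sender_mode(environ=None):
    """Return "inline" on a serverless host, otherwise "background"."""
    env = os.environ if environ is None else environ
    # Empty values are treated as unset.
    if any(env.get(name) for name in SERVERLESS_ENV_VARS):
        return "inline"
    return "background"
```

Passing the environment in as a parameter keeps the check easy to test without mutating `os.environ`.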

Async-context caveat. Inline mode uses urllib.request.urlopen, which is synchronous. Inside an asyncio event loop (FastAPI, async Flask), this blocks the loop for the duration of the receipt POST. A native async sender built on httpx.AsyncClient is planned for a future release. Tracking correctness is unaffected.

Trust contract (CI-enforced)

These properties are runtime-asserted in CI. No release ships without them passing. The relevant test file is named in each item below.

Cannot break your AI calls. test_safety_cannot_break_calls.py — observer exceptions are caught and swallowed; the SDK's original return value is always passed through.
Cannot send prompts or responses. test_safety_payload_keys_only.py — receipts only contain {provider, model, input_tokens, output_tokens, latency_ms, feature?}. Any other key fails the build.
Cannot leak your AI provider API key. test_safety_no_api_key_leak.py — receipt payloads are scanned for the literal API key values during every test run.
Cannot block your call thread. test_safety_overhead_under_2ms.py — observer overhead is asserted <2ms p99 across 1,000 iterations.
No silent failure modes. test_safety_warnings_fire.py — known issues (unsupported SDK version, streaming-without-usage, wrapper-before-import) each emit a clear warning.
No silent receipt loss on serverless. test_safety_serverless_flush.py — with VERCEL=1 (or other serverless env), submit() POSTs synchronously before returning; no background thread is started; errors during the inline POST do not propagate.
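
The payload and key-leak guarantees above could be checked with an assertion helper like the one sketched here. It is an illustration of the idea, not the contents of the test files named in the list; `assert_receipt_is_safe` is a hypothetical name.

```python
import json

# Keys a receipt may contain, per "What gets sent"; `feature` is optional.
ALLOWED_KEYS = frozenset(
    {"provider", "model", "input_tokens", "output_tokens", "latency_ms", "feature"}
)
REQUIRED_KEYS = ALLOWED_KEYS - {"feature"}


def assert_receipt_is_safe(receipt, secrets=()):
    """Fail if the receipt has unexpected keys or contains a secret value."""
    keys = set(receipt)
    extra = keys - ALLOWED_KEYS
    missing = REQUIRED_KEYS - keys
    assert not extra, f"unexpected keys in receipt: {sorted(extra)}"
    assert not missing, f"missing keys in receipt: {sorted(missing)}"
    # Scan the serialized payload for literal secret values (e.g. API keys).
    serialized = json.dumps(receipt)
    for secret in secrets:
        assert secret not in serialized, "secret value found in receipt payload"
```

Checking an allowlist rather than a denylist means that any newly added field fails the check until it is reviewed.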

This is the entire trust story. Read the source. Run the tests.

Diagnostics

Check what's instrumented and confirm receipts are flowing:

aicg-diagnose
# or:
python -m aicostguard.diagnostics

Output:

AI Cost Guard auto-instrumentation v0.1.0b1
─────────────────────────────────────────────
Instruments:
  ✅ openai 1.50.0          patched
  ✅ anthropic 0.42.0       patched
  ❌ google.generativeai    not installed
─────────────────────────────────────────────
Config:
  AICG_URL  https://your-aicg.example.com (reachable, 142 ms)
  AICG_KEY  aicg_xxxx••••••••••••••••  (valid format)
─────────────────────────────────────────────
Last receipt: 14:32:01 UTC  (200 OK)

How it works (in one paragraph)

When you import aicostguard.auto, the package scans sys.modules for known AI provider SDKs and monkey-patches their response-returning methods. Each patched method delegates to the original SDK call (so your code's behaviour is unchanged), then reads the usage field from the response object and submits a fire-and-forget receipt to AI Cost Guard. Everything is wrapped in try/except at multiple layers so that observer errors do not propagate to your application. Import-time patching of this kind is the same general technique that Sentry, Datadog APM, and OpenTelemetry use for auto-instrumentation, and it is well established in production monitoring.
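
The patching step can be sketched as a generic wrapper, shown below under stated assumptions. `instrument` and `submit` are hypothetical names, and the `prompt_tokens` / `completion_tokens` fields follow the OpenAI chat-completions usage shape; other providers name these fields differently, so a real implementation needs a per-provider mapping.

```python
import functools
import time


def instrument(owner, method_name, provider, submit):
    """Replace owner.method_name with a wrapper that reports token usage."""
    original = getattr(owner, method_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        # Errors raised by the SDK itself propagate untouched.
        response = original(*args, **kwargs)
        try:
            usage = getattr(response, "usage", None)
            if usage is not None:
                submit({
                    "provider": provider,
                    "model": getattr(response, "model", None),
                    # ASSUMPTION: OpenAI-style usage field names.
                    "input_tokens": getattr(usage, "prompt_tokens", None),
                    "output_tokens": getattr(usage, "completion_tokens", None),
                    "latency_ms": round((time.perf_counter() - start) * 1000),
                })
        except Exception:
            pass  # observer failures never reach the caller
        return response

    setattr(owner, method_name, wrapper)
    return original  # kept so the patch can be undone
```

Only the observer code sits inside the `try` block, so the original return value is passed through whether or not the receipt could be built or submitted.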

Configuration reference

Env var	Default	Description
`AICG_KEY`	(none — package is no-op)	Your ingestion key from the AI Cost Guard dashboard.
`AICG_URL`	(none — package is no-op)	Base URL of your AI Cost Guard backend.
`AICG_FEATURE_DEFAULT`	inferred from caller frame	Fallback feature tag when neither inference nor `aicg.feature(...)` applies.
`AICG_DISABLED`	unset	Set to `1` to short-circuit all tracking without removing the package.
`AICG_DEBUG`	unset	Set to `1` to emit verbose debug logs to stderr (do not use in production).
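
The no-op behaviour implied by the table can be summarised in a small loader. This is a sketch of the documented semantics only; `Config` and `load_config` are hypothetical names, and `AICG_FEATURE_DEFAULT` is left out for brevity.

```python
import os
from typing import NamedTuple, Optional


class Config(NamedTuple):
    key: str
    url: str
    debug: bool


def load_config(environ=None) -> Optional[Config]:
    """Return the active configuration, or None when tracking is a no-op."""
    env = os.environ if environ is None else environ
    if env.get("AICG_DISABLED") == "1":
        return None  # explicit kill switch
    key, url = env.get("AICG_KEY"), env.get("AICG_URL")
    if not key or not url:
        return None  # both are required; otherwise the package does nothing
    return Config(key=key, url=url.rstrip("/"), debug=env.get("AICG_DEBUG") == "1")
```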

License

Apache 2.0. See LICENSE.

Release history

0.3.0b2 (pre-release, this version): May 24, 2026

0.0.1: May 13, 2026

Download files

Download the file for your platform.

Source Distribution

aicostguard_dev-0.3.0b2.tar.gz (49.1 kB)

Uploaded May 24, 2026 Source

Built Distribution

aicostguard_dev-0.3.0b2-py3-none-any.whl (53.9 kB)

Uploaded May 24, 2026 Python 3

File details

Details for the file aicostguard_dev-0.3.0b2.tar.gz.

File metadata

Download URL: aicostguard_dev-0.3.0b2.tar.gz
Upload date: May 24, 2026
Size: 49.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for aicostguard_dev-0.3.0b2.tar.gz
Algorithm	Hash digest
SHA256	`0f9e95b06a4a57da4277498a0b21cd960f6cd2f969ea0f1ee17dd2548ab65769`
MD5	`0dec0931e0d1b95376e24613b652163d`
BLAKE2b-256	`c4455742ac797c57c9fa07cb099ffff28e9c5f9c9b2488663062a99a5a9b1d57`

File details

Details for the file aicostguard_dev-0.3.0b2-py3-none-any.whl.

File metadata

Download URL: aicostguard_dev-0.3.0b2-py3-none-any.whl
Upload date: May 24, 2026
Size: 53.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.3

File hashes

Hashes for aicostguard_dev-0.3.0b2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e9493b17ea73c700d9c5598d1bd8ba2ceeac7eca48fde4691bb1980fc0e927ea`
MD5	`9ea9047a27fb601220d57533cda8951a`
BLAKE2b-256	`e79ec17502f24dc8310877c2ffdfa5759a777f28ff5b022ed8fbc4b1e524b044`
