AI API gateway SDK with cost tracking, budget enforcement, and multi-provider routing for LLM applications

LLMKit

Track what your AI agents cost. One line of code.

Cost tracking for LLM APIs. Works with OpenAI, Anthropic, Gemini, Groq, Mistral, Together, and Cohere SDKs. 730+ models priced. Zero config, zero account needed for local tracking.

pip install llmkit-sdk

Zero-config cost tracking

Wrap any OpenAI-compatible client. Costs are estimated locally from a bundled pricing table - no proxy, no account, no network calls:

from llmkit import tracked
from openai import OpenAI

client = OpenAI(http_client=tracked())
res = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "explain CQRS"}],
)
# costs calculated automatically from response usage data

Works the same with Anthropic, Gemini, Groq, Mistral, Together, and Cohere:

from anthropic import Anthropic

client = Anthropic(http_client=tracked())
msg = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "explain event sourcing"}],
)
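
The Groq SDK follows the same drop-in pattern (a sketch, assuming Groq's client accepts an httpx client via http_client the way the OpenAI and Anthropic SDKs above do; the model name is illustrative):

from groq import Groq
from llmkit import tracked

client = Groq(http_client=tracked())
chat = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "explain sagas"}],
)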

Collect costs

costs = []
client = OpenAI(http_client=tracked(on_cost=costs.append))

# ... run your agent ...

total = sum(c.total_cost for c in costs if c.total_cost)
print(f"Agent run cost: ${total:.4f}")

Estimate from any response

from llmkit import estimate_cost

# response: any supported provider SDK response that carries usage data
cost = estimate_cost(response)
print(f"~${cost.total_cost:.6f}")

How it compares

| Feature | llmkit-sdk | tokencost | litellm |
| --- | --- | --- | --- |
| Zero-config tracking | yes (httpx transport) | no (manual call) | no (callback setup) |
| Works with existing SDK code | yes (drop-in) | no (separate function) | yes (but requires litellm wrapper) |
| Local estimation (no proxy) | yes | yes | no |
| Budget enforcement | yes (via proxy) | no | yes (but 9+ bypass bugs) |
| Streaming cost tracking | yes | no | yes |
| Session grouping | yes | no | no |
| Models priced | 730+ | 400+ | 100+ |
| Install size | ~200KB | ~50KB | ~50MB |

Session tracking

Group costs by agent run:

from llmkit import LLMKit

client = LLMKit(api_key="llmk_...")
agent = client.session()

for task in tasks:  # tasks: your agent's work items
    completion, cost = agent.chat(
        model="gpt-4.1",
        messages=[{"role": "user", "content": task}],
    )

print(f"Session: ${agent.stats.total_cost:.4f} across {agent.stats.request_count} requests")

Streaming

stream = client.chat_stream(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "write a haiku"}],
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

print(f"\nCost: ${stream.cost.total_cost:.6f}")

Proxy mode (budget enforcement)

Route through the LLMKit proxy for hard budget limits, per-key rate limiting, and dashboard analytics:

client = LLMKit(api_key="llmk_...")
completion, cost = client.chat(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "hello"}],
)
print(f"${cost.total_cost:.4f} via {cost.provider}")

Set a $10 daily budget in the dashboard. When it's hit, requests get a 402 - not a log message, an actual block. No more runaway agents.

Async

from llmkit import AsyncLLMKit

client = AsyncLLMKit(api_key="llmk_...")
completion, cost = await client.chat(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "hello"}],
)
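
Fan-out is where async pays off; a sketch running prompts concurrently and summing their costs (it assumes client.chat is safe to call concurrently, which is not stated above):

import asyncio
from llmkit import AsyncLLMKit

async def main():
    client = AsyncLLMKit(api_key="llmk_...")
    prompts = ["define CQRS", "define event sourcing", "define sagas"]
    results = await asyncio.gather(*(
        client.chat(model="gpt-4.1", messages=[{"role": "user", "content": p}])
        for p in prompts
    ))
    total = sum(cost.total_cost for _, cost in results)
    print(f"Fan-out cost: ${total:.4f}")

asyncio.run(main())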

No SDK needed

LLMKit is OpenAI-compatible. Any client works:

from openai import OpenAI

client = OpenAI(
    base_url="https://llmkit-proxy.smigolsmigol.workers.dev/v1",
    api_key="llmk_...",
)
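
When a budget is exhausted, the proxy's 402 surfaces through the OpenAI SDK as a status error you can catch (a minimal sketch; 402 has no dedicated OpenAI error class, so it arrives as a generic APIStatusError):

import openai

try:
    client.chat.completions.create(
        model="gpt-4.1",
        messages=[{"role": "user", "content": "hello"}],
    )
except openai.APIStatusError as e:
    if e.status_code == 402:
        print("Daily budget exhausted - request blocked by the proxy")
    else:
        raise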

Part of LLMKit

This is the Python SDK for LLMKit, an open-source AI API gateway. The full platform includes:

  • Proxy (Cloudflare Workers) - budget enforcement, cost tracking, provider routing
  • Dashboard (Next.js) - analytics, API key management, budget configuration
  • MCP server - 11 tools for Claude Code, Cursor, and Cline cost tracking
  • TypeScript SDK - same features for Node.js/Deno/Bun
  • CLI - wrap any command with llmkit wrap -- node agent.js

License

MIT
