AI API gateway SDK with cost tracking, budget enforcement, and multi-provider routing for LLM applications

LLMKit

Track what your AI agents cost. One line of code.

Cost tracking for LLM APIs. Works with OpenAI, Anthropic, Gemini, Groq, Mistral, Together, and Cohere SDKs. 730+ models priced. Zero config, zero account needed for local tracking.

pip install llmkit-sdk

Zero-config cost tracking

Wrap any OpenAI-compatible client. Costs are estimated locally from a bundled pricing table - no proxy, no account, no network calls:

from llmkit import tracked
from openai import OpenAI

client = OpenAI(http_client=tracked())
res = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "explain CQRS"}],
)
# costs calculated automatically from response usage data

Works the same with Anthropic, Gemini, Groq, Mistral, Together, and Cohere:

from anthropic import Anthropic

client = Anthropic(http_client=tracked())
msg = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "explain event sourcing"}],
)
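
The Groq SDK follows the same drop-in pattern (a sketch, assuming Groq's client accepts an httpx client via http_client the way the OpenAI and Anthropic SDKs above do; the model name is illustrative):

from groq import Groq
from llmkit import tracked

client = Groq(http_client=tracked())
chat = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "explain sagas"}],
)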

Collect costs

costs = []
client = OpenAI(http_client=tracked(on_cost=costs.append))

# ... run your agent ...

total = sum(c.total_cost for c in costs if c.total_cost)
print(f"Agent run cost: ${total:.4f}")

Estimate from any response

from llmkit import estimate_cost

# response: any supported provider SDK response that carries usage data
cost = estimate_cost(response)
print(f"~${cost.total_cost:.6f}")

How it compares

| Feature | llmkit-sdk | tokencost | litellm |
| --- | --- | --- | --- |
| Zero-config tracking | yes (httpx transport) | no (manual call) | no (callback setup) |
| Works with existing SDK code | yes (drop-in) | no (separate function) | yes (but requires litellm wrapper) |
| Local estimation (no proxy) | yes | yes | no |
| Budget enforcement | yes (via proxy) | no | yes (but 9+ bypass bugs) |
| Streaming cost tracking | yes | no | yes |
| Session grouping | yes | no | no |
| Models priced | 730+ | 400+ | 100+ |
| Install size | ~200KB | ~50KB | ~50MB |

Session tracking

Group costs by agent run:

from llmkit import LLMKit

client = LLMKit(api_key="llmk_...")
agent = client.session()

for task in tasks:  # tasks: your agent's work items
    completion, cost = agent.chat(
        model="gpt-4.1",
        messages=[{"role": "user", "content": task}],
    )

print(f"Session: ${agent.stats.total_cost:.4f} across {agent.stats.request_count} requests")

Streaming

stream = client.chat_stream(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "write a haiku"}],
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

print(f"\nCost: ${stream.cost.total_cost:.6f}")

Proxy mode (budget enforcement)

Route through the LLMKit proxy for hard budget limits, per-key rate limiting, and dashboard analytics:

client = LLMKit(api_key="llmk_...")
completion, cost = client.chat(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "hello"}],
)
print(f"${cost.total_cost:.4f} via {cost.provider}")

Set a $10 daily budget in the dashboard. When it's hit, requests get a 402 - not a log message, an actual block. No more runaway agents.

Async

from llmkit import AsyncLLMKit

client = AsyncLLMKit(api_key="llmk_...")
completion, cost = await client.chat(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "hello"}],
)
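
Fan-out is where async pays off; a sketch running prompts concurrently and summing their costs (it assumes client.chat is safe to call concurrently, which is not stated above):

import asyncio
from llmkit import AsyncLLMKit

async def main():
    client = AsyncLLMKit(api_key="llmk_...")
    prompts = ["define CQRS", "define event sourcing", "define sagas"]
    results = await asyncio.gather(*(
        client.chat(model="gpt-4.1", messages=[{"role": "user", "content": p}])
        for p in prompts
    ))
    total = sum(cost.total_cost for _, cost in results)
    print(f"Fan-out cost: ${total:.4f}")

asyncio.run(main())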

No SDK needed

LLMKit is OpenAI-compatible. Any client works:

from openai import OpenAI

client = OpenAI(
    base_url="https://llmkit-proxy.smigolsmigol.workers.dev/v1",
    api_key="llmk_...",
)
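
When a budget is exhausted, the proxy's 402 surfaces through the OpenAI SDK as a status error you can catch (a minimal sketch; 402 has no dedicated OpenAI error class, so it arrives as a generic APIStatusError):

import openai

try:
    client.chat.completions.create(
        model="gpt-4.1",
        messages=[{"role": "user", "content": "hello"}],
    )
except openai.APIStatusError as e:
    if e.status_code == 402:
        print("Daily budget exhausted - request blocked by the proxy")
    else:
        raise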

Part of LLMKit

This is the Python SDK for LLMKit, an open-source AI API gateway. The full platform includes:

  • Proxy (Cloudflare Workers) - budget enforcement, cost tracking, provider routing
  • Dashboard (Next.js) - analytics, API key management, budget configuration
  • MCP server - 11 tools for Claude Code, Cursor, and Cline cost tracking
  • TypeScript SDK - same features for Node.js/Deno/Bun
  • CLI - wrap any command with llmkit wrap -- node agent.js

License

MIT
