Production-grade LLM client for Python - 100+ providers, 11,000+ models. Rust-powered.
LLMKit Python
The production-grade LLM client for Python. Native Rust performance with a Pythonic API.
Why LLMKit?
- Rust Core — Native performance, memory safety, no GIL limitations
- 100+ Providers — OpenAI, Anthropic, Google, AWS Bedrock, Azure, Groq, and more
- 11,000+ Models — Built-in registry with pricing and capabilities
- Prompt Caching — Save up to 90% on API costs with native caching support
- Extended Thinking — Unified reasoning API across 5 providers
- Production Ready — No memory leaks, no worker restarts, runs forever
Installation
pip install llmkit-python
Quick Start
from llmkit import LLMKitClient, CompletionRequest, Message

# Create client from environment variables
client = LLMKitClient.from_env()

# Make a completion request
response = client.complete(CompletionRequest(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[Message.user("Hello!")]
))

print(response.text_content())
Async Support
import asyncio

from llmkit import AsyncLLMKitClient, CompletionRequest, Message

async def main():
    client = AsyncLLMKitClient.from_env()
    response = await client.complete(CompletionRequest(
        model="openai/gpt-4o",
        messages=[Message.user("Hello!")]
    ))
    print(response.text_content())

asyncio.run(main())
Streaming
# Sync streaming
for chunk in client.complete_stream(request):
    if chunk.text:
        print(chunk.text, end="", flush=True)

# Async streaming
async for chunk in await async_client.complete_stream(request):
    if chunk.text:
        print(chunk.text, end="", flush=True)
Tool Calling
from llmkit import ToolBuilder

# Build tools with a fluent API
weather_tool = (
    ToolBuilder("get_weather")
    .description("Get current weather for a location")
    .string_param("city", "City name", required=True)
    .enum_param("unit", "Temperature unit", ["celsius", "fahrenheit"])
    .build()
)

request = CompletionRequest(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[Message.user("What's the weather in Tokyo?")]
).with_tools([weather_tool])

response = client.complete(request)
for tool_call in response.tool_calls():
    print(f"Call {tool_call.name} with {tool_call.arguments}")
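Executing the returned tool calls is up to application code. A minimal dispatch sketch, under the assumption that `tool_call.arguments` arrives as a JSON string (the actual type may differ), using a stand-in object in place of a live `response.tool_calls()` entry:

```python
import json
from types import SimpleNamespace

# Local implementations for each tool the model may call
def get_weather(city: str, unit: str = "celsius") -> str:
    return f"22 degrees {unit} in {city}"  # stub result

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call):
    # Look up the handler by name and apply the parsed arguments
    handler = TOOLS[tool_call.name]
    return handler(**json.loads(tool_call.arguments))

# Stand-in for one entry from response.tool_calls()
call = SimpleNamespace(name="get_weather",
                       arguments='{"city": "Tokyo", "unit": "celsius"}')
print(dispatch(call))  # 22 degrees celsius in Tokyo
```

In a real loop you would send each tool's return value back to the model as a tool-result message before requesting the next completion.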
Prompt Caching
Save up to 90% on repeated prompts:
# Large system prompts are automatically cached
request = CompletionRequest(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[
        Message.system(large_system_prompt),  # Cached after first call
        Message.user("Question 1")
    ]
).with_cache()

# Subsequent calls reuse the cached system prompt
response = client.complete(request)
print(f"Cache savings: {response.usage.cache_read_tokens} tokens")
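The "up to 90%" figure follows from the provider's discount on cache reads. A back-of-envelope sketch, assuming cache reads bill at 10% of the base input-token rate (Anthropic's documented ratio; the example price is illustrative, not from the registry):

```python
# Back-of-envelope cache savings for repeated prompt tokens
input_price_per_mtok = 3.00   # example rate: $ per 1M input tokens
cache_read_discount = 0.10    # assumption: cached reads bill at 10% of base

cache_read_tokens = 50_000    # e.g. response.usage.cache_read_tokens
full_cost = cache_read_tokens / 1_000_000 * input_price_per_mtok
cached_cost = full_cost * cache_read_discount

print(f"without cache: ${full_cost:.4f}, with cache: ${cached_cost:.4f}")
# 50k cached tokens: $0.15 uncached vs $0.015 cached, a 90% saving
```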
Extended Thinking
Unified reasoning across Anthropic, OpenAI, Google, DeepSeek, and OpenRouter:
request = CompletionRequest(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[Message.user("Solve this step by step: ...")]
).with_thinking(budget_tokens=10000)

response = client.complete(request)
print("Reasoning:", response.thinking_content())
print("Answer:", response.text_content())
Model Registry
11,000+ models with pricing and capabilities — no API calls needed:
from llmkit import get_model_info, get_models_by_provider, get_models_with_capability
# Get model details instantly
info = get_model_info("anthropic/claude-sonnet-4-20250514")
print(f"Context: {info.context_window:,} tokens")
print(f"Input: ${info.input_price}/1M tokens")
print(f"Output: ${info.output_price}/1M tokens")
# Find models by provider
anthropic_models = get_models_by_provider("anthropic")
# Find models with specific capabilities
vision_models = get_models_with_capability(vision=True)
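Because the registry ships pricing locally, request-cost estimation is a pure local computation. A sketch with hypothetical example rates standing in for `info.input_price` and `info.output_price` (both quoted in $ per 1M tokens, per the fields above):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    # Prices are quoted in $ per 1M tokens
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Example rates standing in for info.input_price / info.output_price
cost = estimate_cost(input_tokens=2_000, output_tokens=500,
                     input_price=3.00, output_price=15.00)
print(f"${cost:.4f}")  # $0.0135
```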
Features
| Feature | Status |
|---|---|
| Chat Completions | Supported |
| Streaming | Supported |
| Tool Calling | Supported |
| Structured Output | Supported |
| Extended Thinking | Supported |
| Prompt Caching | Supported |
| Vision/Images | Supported |
| Embeddings | Supported |
| Image Generation | Supported |
| Audio STT/TTS | Supported |
| Video Generation | Supported |
Documentation
License
MIT OR Apache-2.0
Download files
Download the file for your platform.
Source Distributions
Built Distributions
File details
Details for the file llmkit_python-0.1.3-cp38-abi3-win_amd64.whl.
File metadata
- Download URL: llmkit_python-0.1.3-cp38-abi3-win_amd64.whl
- Upload date:
- Size: 9.5 MB
- Tags: CPython 3.8+, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 848fad4d25ddea0cdabe8f69a2bdac3a789a2e1b1c98c6d98410d82bb2306626 |
| MD5 | 60dd0d64d4a9be50f71ab99bfce34e2b |
| BLAKE2b-256 | 03aedae3ff295ac6155509caa43eb820b2355f906d808b7a668b233cbbbdb89b |
File details
Details for the file llmkit_python-0.1.3-cp38-abi3-manylinux_2_39_x86_64.whl.
File metadata
- Download URL: llmkit_python-0.1.3-cp38-abi3-manylinux_2_39_x86_64.whl
- Upload date:
- Size: 10.0 MB
- Tags: CPython 3.8+, manylinux: glibc 2.39+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1a09d8652bdab2f5e6adafd6358cb468aa723ac101bbb18740684eeff543645f |
| MD5 | 46221cc8bd9aa77ee0b320c5017fd950 |
| BLAKE2b-256 | 4ef8f6a87c20888aedbafa5527e64eade59b8f20fe4978d93ef08d2b953a95ef |
File details
Details for the file llmkit_python-0.1.3-cp38-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: llmkit_python-0.1.3-cp38-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 9.2 MB
- Tags: CPython 3.8+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ad440f6d3ea06e01bf1753bc8fc929a115e437429193afba313e779854ab631b |
| MD5 | 32fcd98d3671b82e573fc82fa2a98d33 |
| BLAKE2b-256 | 8f3433a7b8a46038b2a43e97a797bb6b9b715901614b4c22569eb4999fc52ad8 |
File details
Details for the file llmkit_python-0.1.3-cp38-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: llmkit_python-0.1.3-cp38-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 9.5 MB
- Tags: CPython 3.8+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 338337d7ce15526dd0657cc20864138a5df91e446a9c6ab50512d3ed33e1eab5 |
| MD5 | 97e94c5d5b487c9f17910938a46ae2fa |
| BLAKE2b-256 | 048ccc8d8ed44aeb519142bd983bb0c6d948d26acee717a28284c1192fad792d |