A minimal, generic client for AI models (OpenAI, Anthropic, Google, xAI) with middleware support.
aiclient-llm
A minimal, unified, and resilient Python client for modern LLMs.
Supports OpenAI, Anthropic (Claude 3), Google (Gemini), and xAI (Grok) with a single, consistent interface.
Documentation 📚
Key Features
- 🦄 Unified API: Works with OpenAI, Anthropic, Google Gemini, xAI, and Ollama.
- ⚡ Streaming Support: Real-time responses with a simple iterator interface.
- 👁️ Multimodal (Vision): Send images (paths, URLs, base64) to vision-capable models.
- 🚀 Prompt Caching: Native support for Anthropic Prompt Caching headers.
- 🏗️ Structured Outputs: Native strict JSON Schema support for OpenAI.
- 🛡️ Resilient: Circuit Breakers, Rate Limiters, and automatic retries.
- 🔭 Observability: Tracing and OpenTelemetry hooks.
- 🤖 Agent Primitives: Built-in ReAct loop for tool-using agents.
- 🔌 Model Context Protocol (MCP): Connect to 16K+ external tools (GitHub, Postgres, filesystem).
- 📊 Middleware: Inspect requests, track costs, or log data.
- 🧠 Memory Management: Built-in conversation history with token-aware truncation.
- 🧪 Testing Utilities: Mock providers for deterministic unit tests.
- 📦 Batch Processing: Efficiently process thousands of requests concurrently.
- 🛡️ Type-Safe Errors: Specific exception types for better error handling.
Architecture at a Glance
Installation
pip install aiclient-llm
Quick Start
Basic Chat
from aiclient import Client
client = Client(
    openai_api_key="sk-...",
    anthropic_api_key="sk-ant-..."
)
# Call OpenAI
response = client.chat("gpt-4o").generate("Hello!")
print(response.text)
# Call Claude
response = client.chat("claude-3-opus-20240229").generate("Hello!")
print(response.text)
Multimodal (Vision)
from aiclient.data_types import UserMessage, Text, Image
msg = UserMessage(content=[
    Text(text="What's in this image?"),
    Image(path="./image.png")  # Handles base64 automatically
])
response = client.chat("gpt-4o").generate([msg])
print(response.text)
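To make "handles base64 automatically" concrete, here is a minimal sketch of how a local image might be turned into a base64 data URL before being sent to a vision model. It uses only the standard library; the function names `to_data_url` and `image_path_to_data_url` are hypothetical and are not part of aiclient-llm, whose internal encoding may differ.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_path_to_data_url(path: str) -> str:
    """Read an image file and encode it, guessing the MIME type from the extension."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image type: {path}")
    return to_data_url(Path(path).read_bytes(), mime_type)


print(to_data_url(b"abc", "image/png"))  # data:image/png;base64,YWJj
```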
Agents (Tool Use)
from aiclient.agent import Agent
def get_weather(location: str):
    return "Sunny in " + location

agent = Agent(
    model=client.chat("gpt-4o"),
    tools=[get_weather]
)
print(agent.run("Weather in SF?"))
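Tool-using agents generally need a machine-readable description of each function they can call. The sketch below shows one way such a description could be derived from a plain Python function with `inspect`. It illustrates the general idea only; `describe_tool` is a hypothetical helper, and the schema aiclient-llm actually generates is not documented here.

```python
import inspect
from typing import Any, Callable

# Mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_tool(func: Callable[..., Any]) -> dict:
    """Build a JSON-Schema-style description from a function signature."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown types fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def get_weather(location: str):
    return "Sunny in " + location


schema = describe_tool(get_weather)
print(schema["parameters"])
```

Adding a docstring to the tool function would populate the `description` field, which is typically what the model reads when deciding whether to call the tool.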
MCP Integration 🔌
Connect to external tools using the Model Context Protocol.
agent = Agent(
    model=client.chat("gpt-4o"),
    mcp_servers={
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "./workspace"]
        }
    }
)
# Agent can now use file system tools!
print(agent.run("List all Python files in the current directory"))
Local LLMs (Ollama) 🏠
Use the provider:model syntax to route requests to local models (e.g., via Ollama).
# Connects to http://localhost:11434/v1 by default
client.chat("ollama:llama3").generate("Why is the sky blue?")
# Connect to custom URL (e.g. LMStudio)
client = Client(ollama_base_url="http://localhost:1234/v1")
client.chat("ollama:mistral").generate("Hi")
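The provider:model syntax can be pictured as a simple prefix split. The following is a sketch under assumptions: `split_model_id` and the `KNOWN_PROVIDERS` set are hypothetical, and the library's real routing rules (including which prefixes it accepts) may differ.

```python
from typing import Optional, Tuple

# Hypothetical set of recognized prefixes; the real library may accept others.
KNOWN_PROVIDERS = {"openai", "anthropic", "google", "xai", "ollama"}


def split_model_id(model_id: str) -> Tuple[Optional[str], str]:
    """Split "provider:model" into its parts; return (None, model_id) if no known prefix."""
    prefix, sep, rest = model_id.partition(":")
    if sep and prefix in KNOWN_PROVIDERS and rest:
        return prefix, rest
    return None, model_id


print(split_model_id("ollama:llama3"))     # ('ollama', 'llama3')
print(split_model_id("gpt-4o"))            # (None, 'gpt-4o')
print(split_model_id("ollama:llama3:8b"))  # ('ollama', 'llama3:8b')
```

Splitting only on the first colon keeps tagged model names such as `llama3:8b` intact.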
Streaming
for chunk in client.chat("gpt-4o").stream("Write a poem"):
    print(chunk.text, end="", flush=True)
Configuration
Embeddings
# Generate embeddings using the unified interface
# (embed and embed_batch are awaited, so call them from inside an async function)
vector = await client.embed("Hello world", model="text-embedding-3-small")
# Batch generation
vectors = await client.embed_batch(["Hello", "World"], model="text-embedding-3-small")
Structured Outputs
from pydantic import BaseModel
class Character(BaseModel):
    name: str
    class_type: str

# Guaranteed JSON response
char = client.chat("gpt-4o").generate(
    "Create a wizard",
    response_model=Character
)
print(char.name)
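Conceptually, a structured output is model-produced JSON that gets parsed and checked against a schema before it reaches your code. The sketch below shows that validation step with standard-library dataclasses standing in for Pydantic, so it runs without extra dependencies. `parse_structured` is a hypothetical helper, not the library's API.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Character:
    name: str
    class_type: str


def parse_structured(raw: str, model):
    """Parse JSON text into a dataclass, rejecting missing or mistyped fields."""
    data = json.loads(raw)
    kwargs = {}
    for field in fields(model):
        if field.name not in data:
            raise ValueError(f"missing field: {field.name}")
        value = data[field.name]
        # This sketch only supports plain class annotations such as str or int.
        if not isinstance(value, field.type):
            raise ValueError(f"wrong type for field: {field.name}")
        kwargs[field.name] = value
    return model(**kwargs)


char = parse_structured('{"name": "Merlin", "class_type": "wizard"}', Character)
print(char.name)  # Merlin
```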
Production Resilience 🛡️
Circuit Breakers
Prevent cascade failures when a provider is down.
from aiclient import CircuitBreaker
cb = CircuitBreaker(failure_threshold=5, recovery_timeout=60)
client.add_middleware(cb)
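For readers unfamiliar with the pattern, here is a minimal sketch of what a circuit breaker with a failure threshold and a recovery timeout does. It is a generic illustration, not the library's `CircuitBreaker`; the class name `SimpleCircuitBreaker`, the `call` method, and the injectable `clock` parameter are assumptions made for this example.

```python
import time
from typing import Callable


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class SimpleCircuitBreaker:
    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.recovery_timeout:
                raise CircuitOpenError("circuit is open; skipping call")
            # Timeout elapsed: allow one trial call (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the circuit is open, calls fail immediately instead of waiting on a provider that is already known to be down, which is what prevents failures from cascading.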
Rate Limiters
Respect API rate limits automatically.
from aiclient import RateLimiter
rl = RateLimiter(requests_per_minute=60)
client.add_middleware(rl)
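One common way to enforce a requests-per-minute limit is a sliding window over recent request timestamps. The sketch below shows that technique in isolation. It is not the library's `RateLimiter`, whose algorithm is not specified here; `SlidingWindowLimiter` and its methods are hypothetical names.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    def __init__(self, requests_per_minute: int, clock: Callable[[], float] = time.monotonic):
        self.limit = requests_per_minute
        self._clock = clock
        self._sent = deque()  # timestamps of requests in the last 60 seconds

    def wait_time(self) -> float:
        """Seconds to wait before the next request is allowed (0.0 if allowed now)."""
        now = self._clock()
        # Drop timestamps that have left the 60-second window.
        while self._sent and now - self._sent[0] >= 60.0:
            self._sent.popleft()
        if len(self._sent) < self.limit:
            return 0.0
        return 60.0 - (now - self._sent[0])

    def acquire(self) -> None:
        """Block until a slot is free, then record the request."""
        delay = self.wait_time()
        if delay > 0:
            time.sleep(delay)
        self._sent.append(self._clock())
```

Calling `acquire()` before each request keeps the caller at or below the configured rate; `wait_time()` can be used on its own when blocking is not acceptable.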
Fallback Chains
Stay available when a provider fails by falling back to the next model in the chain.
from aiclient import FallbackChain
fallback = FallbackChain(client, ["gpt-4o", "claude-3-opus-20240229", "gemini-1.5-pro"])
response = fallback.generate("Critical query")
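The idea behind a fallback chain can be sketched in a few lines: try each model in order and return the first result that succeeds. This is a generic illustration, assuming a caller-supplied `call` function; `generate_with_fallback` is a hypothetical name and the library's `FallbackChain` may handle errors differently.

```python
from typing import Callable, Sequence


def generate_with_fallback(models: Sequence[str], call: Callable[[str], str]) -> str:
    """Try each model in order and return the first successful result."""
    errors = []
    for model in models:
        try:
            return call(model)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def fake_call(model: str) -> str:
    # Stand-in for a provider request: the first model is "down".
    if model == "primary":
        raise ConnectionError("provider unavailable")
    return f"answer from {model}"


print(generate_with_fallback(["primary", "secondary"], fake_call))  # answer from secondary
```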
Observability 🔭
Cost Tracking
Track spending in real-time across all providers.
from aiclient import CostTrackingMiddleware
tracker = CostTrackingMiddleware()
client.add_middleware(tracker)
# ... after requests ...
print(f"Total Cost: ${tracker.total_cost_usd:.4f}")
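Cost tracking boils down to multiplying token counts by per-token prices and keeping a running total. The sketch below shows that arithmetic. The price table is entirely hypothetical (it does not reflect real provider pricing), and `CostTracker.record` is an illustrative name rather than the middleware's actual interface.

```python
# Hypothetical prices in USD per million tokens: (input, output). Not real pricing.
PRICES = {"model-a": (2.50, 10.00), "model-b": (0.50, 1.50)}


class CostTracker:
    def __init__(self, prices):
        self.prices = prices
        self.total_cost_usd = 0.0

    def record(self, model: str, input_tokens: int, output_tokens: int) -> float:
        """Add the cost of one request to the running total and return it."""
        in_price, out_price = self.prices[model]
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        self.total_cost_usd += cost
        return cost


tracker = CostTracker(PRICES)
tracker.record("model-a", input_tokens=1000, output_tokens=500)
print(f"Total Cost: ${tracker.total_cost_usd:.4f}")  # Total Cost: $0.0075
```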
Logging & OpenTelemetry
Full visibility into your AI calls.
from aiclient import LoggingMiddleware, OpenTelemetryMiddleware
# Redact API keys from logs automatically
client.add_middleware(LoggingMiddleware(redact_keys=True))
# Export traces to Jaeger/Zipkin/etc
client.add_middleware(OpenTelemetryMiddleware(service_name="my-app"))
Advanced Features
Semantic Caching
Save money by caching responses based on meaning.
from aiclient import SemanticCacheMiddleware
# my_embedder is an embedder you supply; it is not defined in this snippet
cache = SemanticCacheMiddleware(embedder=my_embedder, threshold=0.9)
client.add_middleware(cache)
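Caching "based on meaning" usually means comparing the embedding of a new prompt against embeddings of earlier prompts and reusing the stored response when the similarity clears a threshold. Here is a minimal sketch of that lookup using cosine similarity. `TinySemanticCache` is a hypothetical class, and the similarity metric used by `SemanticCacheMiddleware` is an assumption.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinySemanticCache:
    def __init__(self, threshold: float = 0.9):
        self.threshold = threshold
        self._entries = []  # list of (embedding, response) pairs

    def get(self, embedding):
        """Return the closest cached response, or None if nothing is similar enough."""
        best_score, best_response = 0.0, None
        for cached, response in self._entries:
            score = cosine_similarity(embedding, cached)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, embedding, response):
        self._entries.append((list(embedding), response))


cache = TinySemanticCache(threshold=0.9)
cache.put([1.0, 0.0], "cached answer")
print(cache.get([0.99, 0.1]))  # cached answer
print(cache.get([0.0, 1.0]))   # None
```

A higher threshold trades hit rate for safety: fewer prompts reuse a cached answer, but those that do are closer in meaning to the original.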
Batch Processing
Efficiently process thousands of requests.
# process_func is a callable you supply; run this from inside an async function
results = await client.batch(
    ["Q1", "Q2", "Q3"],
    process_func,
    concurrency=10
)
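A `concurrency` limit like the one above is commonly implemented with an `asyncio.Semaphore`. The sketch below shows that technique on its own; `run_batch` and `shout` are hypothetical names, and this is not necessarily how `client.batch` is implemented.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_batch(items: Sequence[T], worker: Callable[[T], Awaitable[R]],
                    concurrency: int = 10) -> List[R]:
    """Run worker over items with at most `concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(item):
        async with semaphore:
            return await worker(item)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(item) for item in items))


async def shout(text: str) -> str:
    await asyncio.sleep(0)  # stand-in for a network call
    return text.upper()


results = asyncio.run(run_batch(["q1", "q2", "q3"], shout, concurrency=2))
print(results)  # ['Q1', 'Q2', 'Q3']
```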
Testing 🧪
Write deterministic unit tests without API keys.
from aiclient import MockProvider
def test_feature():
    provider = MockProvider()
    provider.add_response("Mocked AI response")
    # The provider returns the queued response instead of hitting the API
    response = provider.parse_response({})
    assert response.text == "Mocked AI response"
Community & Support 🤝
Contributing
We welcome contributions! Please see our Contributing Guide for details on how to set up the dev environment and submit PRs.
- Found a bug? Open an Issue
- Have a feature request? Start a Discussion
Support the Project
If aiclient-llm helps you build something cool, consider buying me a coffee or connecting on LinkedIn! ☕
License 📄
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
File details
Details for the file aiclient_llm-1.0.0.tar.gz.
File metadata
- Download URL: aiclient_llm-1.0.0.tar.gz
- Upload date:
- Size: 745.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6ad83c2e32a5c91d9223303ddf2367f7adedd3d5f68b27f1c9e824497650897d |
| MD5 | e589a520db09b94a5636fa80c22d8d20 |
| BLAKE2b-256 | 1b22aa0d474581da7ec7394de11d15b3adda1fb13b8cf57b4d3aad99cb92b92e |
File details
Details for the file aiclient_llm-1.0.0-py3-none-any.whl.
File metadata
- Download URL: aiclient_llm-1.0.0-py3-none-any.whl
- Upload date:
- Size: 48.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 63a0b039490c4e5c5bf40b22bde8e798cbe194b755e6b69b80600ad65fef46b6 |
| MD5 | 09886dba31de55a133a5a60ed363191f |
| BLAKE2b-256 | 10da76ecb59ff0771c9b0332fea882c6322b17a3bf80766413c4beb50cbd5737 |