A minimal, generic client for AI models (OpenAI, Anthropic, Google, xAI) with middleware support.

aiclient-llm

A minimal, unified, and resilient Python client for modern LLMs.

Supports OpenAI, Anthropic (Claude 3), Google (Gemini), and xAI (Grok) with a single, consistent interface.

Documentation 📚

Key Features

  • 🦄 Unified API: Works with OpenAI, Anthropic, Google Gemini, xAI (Grok), and local models via Ollama.
  • Streaming Support: Real-time responses with a simple iterator interface.
  • 👁️ Multimodal (Vision): Send images (paths, URLs, base64) to vision-capable models.
  • 🚀 Prompt Caching: Native support for Anthropic Prompt Caching headers.
  • 🏗️ Structured Outputs: Native strict JSON Schema support for OpenAI.
  • 🛡️ Resilient: Circuit Breakers, Rate Limiters, and automatic retries.
  • 🔭 Observability: Tracing and OpenTelemetry hooks.
  • 🤖 Agent Primitives: Built-in ReAct loop for tool-using agents.
  • 🔌 Model Context Protocol (MCP): Connect to 16K+ external tools (GitHub, Postgres, filesystem).
  • 📊 Middleware: Inspect requests, track costs, or log data.
  • 🧠 Memory Management: Built-in conversation history with token-aware truncation.
  • 🧪 Testing Utilities: Mock providers for deterministic unit tests.
  • 📦 Batch Processing: Efficiently process thousands of requests concurrently.
  • 🛡️ Type-Safe Errors: Specific exception types for better error handling.

Architecture at a Glance

[Figure: aiclient-llm architecture diagram]

Installation

pip install aiclient-llm

Quick Start

Basic Chat

from aiclient import Client

client = Client(
    openai_api_key="sk-...",
    anthropic_api_key="sk-ant-..."
)

# Call OpenAI
response = client.chat("gpt-4o").generate("Hello!")
print(response.text)

# Call Claude
response = client.chat("claude-3-opus-20240229").generate("Hello!")
print(response.text)

Multimodal (Vision)

from aiclient.data_types import UserMessage, Text, Image

msg = UserMessage(content=[
    Text(text="What's in this image?"),
    Image(path="./image.png") # Handles base64 automatically
])

response = client.chat("gpt-4o").generate([msg])
print(response.text)

Agents (Tool Use)

from aiclient.agent import Agent

def get_weather(location: str) -> str:
    """Return a short weather summary for a location."""
    return "Sunny in " + location

agent = Agent(
    model=client.chat("gpt-4o"),
    tools=[get_weather]
)

print(agent.run("Weather in SF?"))

MCP Integration 🔌

Connect to external tools using the Model Context Protocol.

agent = Agent(
    model=client.chat("gpt-4o"),
    mcp_servers={
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "./workspace"]
        }
    }
)

# Agent can now use file system tools!
print(agent.run("List all Python files in the current directory"))

Local LLMs (Ollama) 🏠

Use the provider:model syntax to route requests to local models (e.g., via Ollama).

# Connects to http://localhost:11434/v1 by default
client.chat("ollama:llama3").generate("Why is the sky blue?")

# Connect to custom URL (e.g. LMStudio)
client = Client(ollama_base_url="http://localhost:1234/v1")
client.chat("ollama:mistral").generate("Hi")

Streaming

for chunk in client.chat("gpt-4o").stream("Write a poem"):
    print(chunk.text, end="", flush=True)

Configuration

Embeddings

# embed() and embed_batch() are awaited, so run this inside an async function
# Generate embeddings using the unified interface
vector = await client.embed("Hello world", model="text-embedding-3-small")

# Batch generation
vectors = await client.embed_batch(["Hello", "World"], model="text-embedding-3-small")

Structured Outputs

from pydantic import BaseModel

class Character(BaseModel):
    name: str
    class_type: str

# The response is parsed into a Character instance
char = client.chat("gpt-4o").generate(
    "Create a wizard",
    response_model=Character
)
print(char.name)

Production Resilience 🛡️

Circuit Breakers

Prevent cascade failures when a provider is down.

from aiclient import CircuitBreaker

cb = CircuitBreaker(failure_threshold=5, recovery_timeout=60)
client.add_middleware(cb)
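
To make the two parameters above concrete, here is a minimal standalone sketch of the circuit-breaker pattern. It is not aiclient-llm's implementation: the class name `SimpleCircuitBreaker`, its `call` method, and the injectable `clock` argument are assumptions made for illustration. Only `failure_threshold` and `recovery_timeout` come from the snippet above.

```python
import time


class SimpleCircuitBreaker:
    """Illustrative breaker (hypothetical; not the library's class)."""

    def __init__(self, failure_threshold=5, recovery_timeout=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock          # injectable so tests need not sleep
        self._failures = 0           # consecutive failures seen so far
        self._opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.recovery_timeout:
                # Open: fail fast instead of hitting the provider again.
                raise RuntimeError("circuit open: call rejected")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

After `failure_threshold` consecutive failures, calls are rejected immediately until `recovery_timeout` seconds have passed, at which point a single trial call decides whether the circuit closes again.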

Rate Limiters

Respect API rate limits automatically.

from aiclient import RateLimiter

rl = RateLimiter(requests_per_minute=60)
client.add_middleware(rl)
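
One common way to enforce a requests-per-minute budget is a sliding window over recent request timestamps. The sketch below shows that technique in isolation; it is an assumption about how such a limiter can work, not a description of aiclient-llm's `RateLimiter`. The names `SlidingWindowLimiter` and `acquire`, and the injectable `clock` and `sleep` arguments, are hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Illustrative limiter (hypothetical; not the library's class)."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.limit = requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # timestamps of requests in the last 60 s

    def acquire(self):
        """Block until a request slot is free; return the seconds waited."""
        now = self._clock()
        # Drop timestamps that have left the 60-second window.
        while self._stamps and now - self._stamps[0] >= 60.0:
            self._stamps.popleft()
        waited = 0.0
        if len(self._stamps) >= self.limit:
            # Wait until the oldest request ages out of the window.
            waited = 60.0 - (now - self._stamps[0])
            self._sleep(waited)
            now = self._clock()
            self._stamps.popleft()
        self._stamps.append(now)
        return waited
```

Calling `acquire()` before each request keeps the caller under the limit without any extra bookkeeping at the call site.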

Fallback Chains

Automatically ensure high availability.

from aiclient import FallbackChain

fallback = FallbackChain(client, ["gpt-4o", "claude-3-opus", "gemini-1.5-pro"])
response = fallback.generate("Critical query")
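
Conceptually, a fallback chain tries each model in order and returns the first successful result. The standalone sketch below illustrates that idea with plain callables. The function `generate_with_fallback` and the `(name, callable)` pairs are hypothetical and do not reflect how `FallbackChain` is implemented.

```python
def generate_with_fallback(providers, prompt):
    """Try each (name, callable) pair in order; return the first success.

    Illustrative only: `providers` is a hypothetical list of
    (name, function) pairs, not an aiclient-llm type.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:
            # Remember why this provider failed, then try the next one.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Collecting the individual errors makes the final exception useful for debugging when every provider in the chain is unavailable.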

Observability 🔭

Cost Tracking

Track spending in real-time across all providers.

from aiclient import CostTrackingMiddleware

tracker = CostTrackingMiddleware()
client.add_middleware(tracker)

# ... after requests ...
print(f"Total Cost: ${tracker.total_cost_usd:.4f}")
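
The arithmetic behind a running cost total is simple: multiply token counts by a per-token price and accumulate. The sketch below shows one way to do that. The `CostTracker` class, its `record` method, and the price table are assumptions for illustration; the prices are placeholders, not real provider rates. Only the attribute name `total_cost_usd` is taken from the snippet above.

```python
from dataclasses import dataclass


@dataclass
class CostTracker:
    """Illustrative tracker (hypothetical; not CostTrackingMiddleware)."""

    # model name -> (input USD per 1M tokens, output USD per 1M tokens)
    prices: dict
    total_cost_usd: float = 0.0

    def record(self, model, input_tokens, output_tokens):
        """Add the cost of one request to the running total and return it."""
        in_price, out_price = self.prices[model]
        cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
        self.total_cost_usd += cost
        return cost


# Placeholder prices for a made-up model name.
tracker = CostTracker(prices={"example-model": (2.0, 10.0)})
tracker.record("example-model", input_tokens=1000, output_tokens=500)
print(f"Total Cost: ${tracker.total_cost_usd:.4f}")
```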

Logging & OpenTelemetry

Full visibility into your AI calls.

from aiclient import LoggingMiddleware, OpenTelemetryMiddleware

# Redact API keys from logs automatically
client.add_middleware(LoggingMiddleware(redact_keys=True))

# Export traces to Jaeger/Zipkin/etc
client.add_middleware(OpenTelemetryMiddleware(service_name="my-app"))

Advanced Features

Semantic Caching

Save money by caching responses based on meaning.

from aiclient import SemanticCacheMiddleware

# my_embedder is your own embedding provider (not defined in this snippet)
cache = SemanticCacheMiddleware(embedder=my_embedder, threshold=0.9)
client.add_middleware(cache)
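
"Caching based on meaning" usually means comparing prompt embeddings and reusing a stored response when the similarity clears a threshold. The sketch below illustrates that lookup with cosine similarity. It assumes the embedder is a plain function from text to a vector. `TinySemanticCache`, `get`, and `put` are hypothetical names, and this is not the library's implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinySemanticCache:
    """Illustrative cache (hypothetical; not SemanticCacheMiddleware)."""

    def __init__(self, embedder, threshold=0.9):
        self.embedder = embedder      # assumed: callable text -> list of floats
        self.threshold = threshold
        self._entries = []            # list of (vector, response) pairs

    def put(self, prompt, response):
        self._entries.append((self.embedder(prompt), response))

    def get(self, prompt):
        """Return the closest cached response, or None below the threshold."""
        vec = self.embedder(prompt)
        best = max(self._entries, key=lambda e: cosine(vec, e[0]), default=None)
        if best is not None and cosine(vec, best[0]) >= self.threshold:
            return best[1]
        return None
```

A higher threshold reduces the chance of returning a response to a prompt that only looks similar, at the cost of fewer cache hits.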

Batch Processing

Efficiently process thousands of requests.

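# process_func is your own per-item handler (not defined in this snippet);
# batch() is awaited, so run this inside an async function
results = await client.batch(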
    ["Q1", "Q2", "Q3"],
    process_func,
    concurrency=10
)
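
Bounded concurrency of this kind is typically built from `asyncio.Semaphore` and `asyncio.gather`. The sketch below shows that pattern on its own. The function `run_batch` is hypothetical and assumes `process_func` is an async callable taking one item; it does not describe how `client.batch` works internally.

```python
import asyncio


async def run_batch(items, process_func, concurrency=10):
    """Run process_func over items with at most `concurrency` in flight.

    Illustrative only (hypothetical helper, not client.batch).
    Results are returned in the same order as `items`.
    """
    semaphore = asyncio.Semaphore(concurrency)

    async def worker(item):
        # The semaphore caps how many workers run process_func at once.
        async with semaphore:
            return await process_func(item)

    return await asyncio.gather(*(worker(item) for item in items))
```

Because `asyncio.gather` preserves argument order, each result lines up with its input even when requests finish out of order.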

Testing 🧪

Write deterministic unit tests without API keys.

from aiclient import MockProvider

def test_feature():
    provider = MockProvider()
    provider.add_response("Mocked AI response")
    
    # The provider returns the queued response instead of calling the API
    response = provider.parse_response({})
    assert response.text == "Mocked AI response"

Community & Support 🤝

Contributing

We welcome contributions! Please see our Contributing Guide for details on how to set up the dev environment and submit PRs.

Support the Project

If aiclient-llm helps you build something cool, consider buying me a coffee or connecting on LinkedIn! ☕


License 📄

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
