Python client for the LiveLLM Server

Reason this release was yanked: unstable

LiveLLM Python Client

Python client library for the LiveLLM Server - a unified proxy for AI agent, audio, and transcription services.

Features

  • 🚀 Async-first design - Built on httpx for high-performance async operations
  • 🔒 Type-safe - Full type hints and Pydantic validation
  • 🎯 Multi-provider support - OpenAI, Google, Anthropic, Groq, ElevenLabs
  • 🔄 Streaming support - Real-time streaming for agent and audio responses
  • 🛠️ Agent tools - Web search and MCP server integration
  • 🎙️ Audio services - Text-to-speech and transcription
  • Fallback strategies - Sequential and parallel fallback handling
  • 📦 Context manager support - Automatic cleanup with async context managers

Installation

pip install livellm

Or with development dependencies:

pip install "livellm[testing]"

Quick Start

import asyncio
from livellm import LivellmClient
from livellm.models import Settings, ProviderKind, AgentRequest, TextMessage, MessageRole
from pydantic import SecretStr

async def main():
    # Initialize the client with context manager for automatic cleanup
    async with LivellmClient(base_url="http://localhost:8000") as client:
        # Configure a provider
        config = Settings(
            uid="my-openai-config",
            provider=ProviderKind.OPENAI,
            api_key=SecretStr("your-api-key")
        )
        await client.update_config(config)
        
        # Run an agent query
        request = AgentRequest(
            provider_uid="my-openai-config",
            model="gpt-4",
            messages=[
                TextMessage(role=MessageRole.USER, content="Hello, how are you?")
            ],
            tools=[]
        )
        
        response = await client.agent_run(request)
        print(response.output)

asyncio.run(main())

Configuration

Client Initialization

from livellm import LivellmClient

# Basic initialization
client = LivellmClient(base_url="http://localhost:8000")

# With timeout
client = LivellmClient(
    base_url="http://localhost:8000",
    timeout=30.0
)

# With pre-configured providers (sync operation)
from livellm.models import Settings, ProviderKind
from pydantic import SecretStr

configs = [
    Settings(
        uid="openai-config",
        provider=ProviderKind.OPENAI,
        api_key=SecretStr("sk-..."),
        base_url="https://api.openai.com/v1"  # Optional custom base URL
    ),
    Settings(
        uid="anthropic-config",
        provider=ProviderKind.ANTHROPIC,
        api_key=SecretStr("sk-ant-..."),
        blacklist_models=["claude-instant-1"]  # Optional model blacklist
    )
]

client = LivellmClient(
    base_url="http://localhost:8000",
    configs=configs
)
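
The snippets above hard-code API keys for brevity. One possible way to keep keys out of source files is to read them from environment variables. The sketch below uses only the standard library; the helper name `require_env` and the variable name `OPENAI_API_KEY` are illustrative assumptions, not part of the library.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set or is empty")
    return value
```

The returned string can then be wrapped when building a config, for example `api_key=SecretStr(require_env("OPENAI_API_KEY"))`.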

Provider Configuration

Supported providers:

  • OPENAI - OpenAI GPT models
  • GOOGLE - Google Gemini models
  • ANTHROPIC - Anthropic Claude models
  • GROQ - Groq models
  • ELEVENLABS - ElevenLabs text-to-speech

# Add a provider configuration
config = Settings(
    uid="unique-provider-id",
    provider=ProviderKind.OPENAI,
    api_key=SecretStr("your-api-key"),
    base_url="https://custom-endpoint.com",  # Optional
    blacklist_models=["deprecated-model"]     # Optional
)
await client.update_config(config)

# Get all configurations
configs = await client.get_configs()

# Delete a configuration
await client.delete_config("unique-provider-id")

Usage Examples

Agent Services

Basic Agent Run

from livellm.models import AgentRequest, TextMessage, MessageRole

request = AgentRequest(
    provider_uid="my-openai-config",
    model="gpt-4",
    messages=[
        TextMessage(role=MessageRole.SYSTEM, content="You are a helpful assistant."),
        TextMessage(role=MessageRole.USER, content="Explain quantum computing")
    ],
    tools=[],
    gen_config={"temperature": 0.7, "max_tokens": 500}
)

response = await client.agent_run(request)
print(f"Output: {response.output}")
print(f"Tokens used - Input: {response.usage.input_tokens}, Output: {response.usage.output_tokens}")

Streaming Agent Response

request = AgentRequest(
    provider_uid="my-openai-config",
    model="gpt-4",
    messages=[
        TextMessage(role=MessageRole.USER, content="Tell me a story")
    ],
    tools=[]
)

stream = await client.agent_run_stream(request)
async for chunk in stream:
    print(chunk.output, end="", flush=True)
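
If the full text is needed after streaming finishes, the chunks can be accumulated as they arrive. This is a minimal sketch, assuming each chunk's `output` holds only the newly generated text (a delta), which the example above suggests but does not state. The helper name `collect_output` is hypothetical.

```python
from typing import Any, AsyncIterator


async def collect_output(stream: AsyncIterator[Any]) -> str:
    """Concatenate the `output` attribute of every chunk into one string.

    Assumes each chunk carries only new text, not the cumulative response.
    """
    parts: list[str] = []
    async for chunk in stream:
        parts.append(chunk.output)
    return "".join(parts)
```

With the stream from the example above, this would be called as `text = await collect_output(stream)`.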

Agent with Binary Messages

import base64

from livellm.models import AgentRequest, BinaryMessage, MessageRole

# Read the image and base64-encode it so it can be sent as text
with open("image.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

request = AgentRequest(
    provider_uid="my-openai-config",
    model="gpt-4-vision",
    messages=[
        BinaryMessage(
            role=MessageRole.USER,
            content=image_data,
            mime_type="image/jpeg",
            caption="What's in this image?"
        )
    ],
    tools=[]
)

response = await client.agent_run(request)
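
When sending several files, the encoding step and the MIME type lookup can be wrapped in one helper. The sketch below relies only on the standard library and infers the MIME type from the file extension; the function name `encode_file` is hypothetical and not part of the library.

```python
import base64
import mimetypes
from pathlib import Path


def encode_file(path: str) -> tuple[str, str]:
    """Return (base64 text, MIME type) for a file on disk."""
    # guess_type looks only at the file extension, not at the file contents
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"Cannot infer a MIME type from {path!r}")
    data = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return data, mime_type
```

The two return values map onto the `content` and `mime_type` fields of `BinaryMessage` shown above.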

Agent with Web Search Tool

from livellm.models import WebSearchInput, ToolKind

request = AgentRequest(
    provider_uid="my-openai-config",
    model="gpt-4",
    messages=[
        TextMessage(role=MessageRole.USER, content="What's the latest news about AI?")
    ],
    tools=[
        WebSearchInput(
            kind=ToolKind.WEB_SEARCH,
            search_context_size="high"  # Options: "low", "medium", "high"
        )
    ]
)

response = await client.agent_run(request)

Agent with MCP Server Tool

from livellm.models import MCPStreamableServerInput, ToolKind

request = AgentRequest(
    provider_uid="my-openai-config",
    model="gpt-4",
    messages=[
        TextMessage(role=MessageRole.USER, content="Execute tool")
    ],
    tools=[
        MCPStreamableServerInput(
            kind=ToolKind.MCP_STREAMABLE_SERVER,
            url="http://mcp-server:8080",
            prefix="mcp_",
            timeout=15,
            kwargs={"custom_param": "value"}
        )
    ]
)

response = await client.agent_run(request)

Audio Services

Text-to-Speech

from livellm.models import SpeakRequest, SpeakMimeType

request = SpeakRequest(
    provider_uid="elevenlabs-config",
    model="eleven_turbo_v2",
    text="Hello, this is a test of text to speech.",
    voice="rachel",
    mime_type=SpeakMimeType.MP3,
    sample_rate=44100,
    gen_config={"stability": 0.5, "similarity_boost": 0.75}
)

# Get audio as bytes
audio_bytes = await client.speak(request)
with open("output.mp3", "wb") as f:
    f.write(audio_bytes)

Streaming Text-to-Speech

request = SpeakRequest(
    provider_uid="elevenlabs-config",
    model="eleven_turbo_v2",
    text="This is a longer text that will be streamed.",
    voice="rachel",
    mime_type=SpeakMimeType.MP3,
    sample_rate=44100,
    chunk_size=20  # Chunk size in milliseconds
)

# Stream audio chunks
stream = await client.speak_stream(request)
with open("output.mp3", "wb") as f:
    async for chunk in stream:
        f.write(chunk)
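
To get a feel for what a chunk size in milliseconds means in bytes, the arithmetic for uncompressed PCM is straightforward. The sketch below assumes 16-bit mono samples, which this document does not specify. Compressed formats such as MP3 do not map to bytes this way, and how the server applies `chunk_size` is not described here.

```python
def pcm_chunk_bytes(sample_rate: int, chunk_ms: int,
                    bytes_per_sample: int = 2, channels: int = 1) -> int:
    """Bytes of raw PCM audio covering `chunk_ms` milliseconds."""
    samples = sample_rate * chunk_ms // 1000
    return samples * bytes_per_sample * channels


print(pcm_chunk_bytes(44100, 20))  # 882 samples * 2 bytes = 1764
```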

Audio Transcription (Multipart)

# Using multipart upload
with open("audio.mp3", "rb") as f:
    file_tuple = ("audio.mp3", f.read(), "audio/mpeg")

response = await client.transcribe(
    provider_uid="openai-config",
    file=file_tuple,
    model="whisper-1",
    language="en",
    gen_config={"temperature": 0.2}
)

print(f"Transcription: {response.text}")
print(f"Detected language: {response.language}")

Audio Transcription (JSON)

import base64
from livellm.models import TranscribeRequest

with open("audio.mp3", "rb") as f:
    audio_data = base64.b64encode(f.read()).decode("utf-8")

request = TranscribeRequest(
    provider_uid="openai-config",
    model="whisper-1",
    file=("audio.mp3", audio_data, "audio/mpeg"),
    language="en"
)

response = await client.transcribe_json(request)

Fallback Strategies

Sequential Fallback (Try each provider in order)

from livellm.models import AgentFallbackRequest, FallbackStrategy

fallback_request = AgentFallbackRequest(
    requests=[
        AgentRequest(
            provider_uid="primary-provider",
            model="gpt-4",
            messages=[TextMessage(role=MessageRole.USER, content="Hello")],
            tools=[]
        ),
        AgentRequest(
            provider_uid="backup-provider",
            model="claude-3",
            messages=[TextMessage(role=MessageRole.USER, content="Hello")],
            tools=[]
        )
    ],
    strategy=FallbackStrategy.SEQUENTIAL,
    timeout_per_request=30
)

response = await client.agent_run(fallback_request)
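
Conceptually, sequential fallback tries each candidate in turn, gives each a time limit, and returns the first success. The sketch below illustrates that idea with plain asyncio. It is not the server's implementation, and the function name `sequential_fallback` is hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, TypeVar

T = TypeVar("T")


async def sequential_fallback(
    attempts: Sequence[Callable[[], Awaitable[T]]],
    timeout_per_request: float,
) -> T:
    """Run each attempt in order and return the first result that succeeds."""
    last_error: Exception | None = None
    for attempt in attempts:
        try:
            return await asyncio.wait_for(attempt(), timeout=timeout_per_request)
        except Exception as exc:  # includes timeouts raised by wait_for
            last_error = exc
    raise RuntimeError("All fallback attempts failed") from last_error
```

The cost of this strategy is latency: a slow or failing primary delays the answer by up to `timeout_per_request` before the backup is even started.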

Parallel Fallback (Try all providers simultaneously)

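# Shared conversation sent to every provider
messages = [TextMessage(role=MessageRole.USER, content="Hello")]

fallback_request = AgentFallbackRequest(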
    requests=[
        AgentRequest(provider_uid="provider-1", model="gpt-4", messages=messages, tools=[]),
        AgentRequest(provider_uid="provider-2", model="claude-3", messages=messages, tools=[]),
        AgentRequest(provider_uid="provider-3", model="gemini-pro", messages=messages, tools=[])
    ],
    strategy=FallbackStrategy.PARALLEL,
    timeout_per_request=10
)

response = await client.agent_run(fallback_request)
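
Conceptually, parallel fallback starts every candidate at once and keeps whichever succeeds first. The sketch below shows that idea with plain asyncio. It is an illustration under that assumption, not the server's implementation, and the function name `parallel_fallback` is hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, TypeVar

T = TypeVar("T")


async def parallel_fallback(
    attempts: Sequence[Callable[[], Awaitable[T]]],
    timeout_per_request: float,
) -> T:
    """Start every attempt at once and return the first one that succeeds."""
    tasks = [
        asyncio.ensure_future(asyncio.wait_for(attempt(), timeout_per_request))
        for attempt in attempts
    ]
    try:
        for finished in asyncio.as_completed(tasks):
            try:
                return await finished
            except Exception:
                continue  # this attempt failed or timed out; wait for the next
        raise RuntimeError("All fallback attempts failed")
    finally:
        for task in tasks:
            task.cancel()  # no-op for tasks that already finished
```

Compared with trying providers one at a time, this trades extra provider calls (and their cost) for lower latency.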

Audio Fallback

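from livellm.models import AudioFallbackRequest

# Text to synthesize; shared by every request in the fallback chain
text = "Hello, this is a test of text to speech."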

fallback_request = AudioFallbackRequest(
    requests=[
        SpeakRequest(provider_uid="elevenlabs", model="model-1", text=text, voice="voice1", 
                     mime_type=SpeakMimeType.MP3, sample_rate=44100),
        SpeakRequest(provider_uid="openai", model="tts-1", text=text, voice="alloy",
                     mime_type=SpeakMimeType.MP3, sample_rate=44100)
    ],
    strategy=FallbackStrategy.SEQUENTIAL
)

audio = await client.speak(fallback_request)

Context Manager Support

The client supports async context managers for automatic cleanup:

async with LivellmClient(base_url="http://localhost:8000") as client:
    config = Settings(uid="temp-config", provider=ProviderKind.OPENAI, 
                      api_key=SecretStr("key"))
    await client.update_config(config)
    
    # Use client...
    response = await client.ping()
    
# Automatically cleans up configs and closes HTTP client

Or manually:

client = LivellmClient(base_url="http://localhost:8000")
try:
    # Use client...
    pass
finally:
    await client.cleanup()

API Reference

Client Methods

Health Check

  • ping() -> SuccessResponse - Check server health

Configuration Management

  • update_config(config: Settings) -> SuccessResponse - Add/update a provider config
  • update_configs(configs: List[Settings]) -> SuccessResponse - Add/update multiple configs
  • get_configs() -> List[Settings] - Get all provider configurations
  • delete_config(config_uid: str) -> SuccessResponse - Delete a provider config

Agent Services

  • agent_run(request: AgentRequest | AgentFallbackRequest) -> AgentResponse - Run agent query
  • agent_run_stream(request: AgentRequest | AgentFallbackRequest) -> AsyncIterator[AgentResponse] - Stream agent response

Audio Services

  • speak(request: SpeakRequest | AudioFallbackRequest) -> bytes - Text-to-speech
  • speak_stream(request: SpeakRequest | AudioFallbackRequest) -> AsyncIterator[bytes] - Streaming TTS
  • transcribe(provider_uid, file, model, language?, gen_config?) -> TranscribeResponse - Multipart transcription
  • transcribe_json(request: TranscribeRequest | TranscribeFallbackRequest) -> TranscribeResponse - JSON transcription

Cleanup

  • cleanup() -> None - Clean up resources and close client
  • __aenter__() / __aexit__() - Async context manager support

Models

Common Models

  • Settings - Provider configuration
  • ProviderKind - Enum of supported providers
  • SuccessResponse - Generic success response
  • BaseRequest - Base class for all requests

Agent Models

  • AgentRequest - Agent query request
  • AgentResponse - Agent query response
  • AgentResponseUsage - Token usage information
  • TextMessage - Text-based message
  • BinaryMessage - Binary message (images, audio, etc.)
  • MessageRole - Enum: USER, MODEL, SYSTEM

Tool Models

  • ToolKind - Enum: WEB_SEARCH, MCP_STREAMABLE_SERVER
  • WebSearchInput - Web search tool configuration
  • MCPStreamableServerInput - MCP server tool configuration

Audio Models

  • SpeakRequest - Text-to-speech request
  • SpeakMimeType - Enum: PCM, WAV, MP3, ULAW, ALAW
  • TranscribeRequest - Transcription request
  • TranscribeResponse - Transcription response

Fallback Models

  • FallbackStrategy - Enum: SEQUENTIAL, PARALLEL
  • AgentFallbackRequest - Agent fallback configuration
  • AudioFallbackRequest - Audio fallback configuration
  • TranscribeFallbackRequest - Transcription fallback configuration

Error Handling

The client raises an exception when a request fails. To catch any failure:

try:
    response = await client.agent_run(request)
except Exception as e:
    print(f"Error: {e}")

For more granular error handling, catch the httpx exceptions separately:

import httpx

try:
    response = await client.ping()
except httpx.HTTPStatusError as e:
    print(f"HTTP error: {e.response.status_code}")
except httpx.RequestError as e:
    print(f"Request error: {e}")
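
Transient network failures are often worth retrying before giving up. The sketch below is one possible retry helper with exponential backoff, written against plain asyncio so it does not depend on the client. The name `retry_async` and its parameters are hypothetical and not part of the library.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_async(
    call: Callable[[], Awaitable[T]],
    retry_on: tuple[type[BaseException], ...] = (Exception,),
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Await `call()` up to `attempts` times, doubling the delay after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return await call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error unchanged
            await asyncio.sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the client, this could be used as `await retry_async(client.ping, retry_on=(httpx.RequestError,))`, so that connection problems are retried while HTTP status errors propagate immediately.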

Development

Running Tests

# Install development dependencies
pip install -e ".[testing]"

# Run tests
pytest tests/

Type Checking

The library is fully typed. Run type checking with:

pip install mypy
mypy livellm

Requirements

  • Python 3.10+
  • httpx >= 0.27.0
  • pydantic >= 2.0.0

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Changelog

See CHANGELOG.md for version history and changes.
