
SmartLLM

A unified async Python wrapper for multiple LLM providers with a consistent interface.


Features

  • Unified Interface - Single API for multiple LLM providers (OpenAI, AWS Bedrock)
  • Async/Await - Built on asyncio for high-performance concurrent requests
  • Smart Caching - Automatic response caching to reduce costs and latency
  • Auto Retry - Exponential backoff retry logic for transient failures
  • Structured Output - Native Pydantic model support for type-safe responses
  • Streaming - Real-time streaming responses for better UX
  • Rate Limiting - Built-in concurrency control per model
  • Colored Logging - Beautiful console output for debugging

Installation

pip install smartllm

Optional Dependencies

Install only the providers you need:

# For OpenAI
pip install "smartllm[openai]"

# For AWS Bedrock
pip install "smartllm[bedrock]"

# For all providers
pip install "smartllm[all]"

Quick Start

Basic Usage

import asyncio
from smartllm import LLMClient, TextRequest

async def main():
    # Auto-detects provider from environment variables
    async with LLMClient(provider="openai") as client:
        response = await client.generate_text(
            TextRequest(prompt="What is the capital of France?")
        )
        print(response.text)

asyncio.run(main())

Multi-turn Conversations

from smartllm import LLMClient, MessageRequest, Message

async with LLMClient(provider="openai") as client:
    messages = [
        Message(role="user", content="My name is Alice."),
        Message(role="assistant", content="Nice to meet you, Alice!"),
        Message(role="user", content="What's my name?"),
    ]
    
    response = await client.send_message(
        MessageRequest(messages=messages)
    )
    print(response.text)  # "Your name is Alice."

Streaming Responses

from smartllm import LLMClient, TextRequest

async with LLMClient(provider="openai") as client:
    request = TextRequest(
        prompt="Write a short poem about Python.",
        stream=True
    )
    
    async for chunk in client.generate_text_stream(request):
        print(chunk.text, end="", flush=True)

Structured Output with Pydantic

from pydantic import BaseModel
from smartllm import LLMClient, TextRequest

class Person(BaseModel):
    name: str
    age: int
    occupation: str

async with LLMClient(provider="openai") as client:
    response = await client.generate_text(
        TextRequest(
            prompt="Generate a person profile for a software engineer named John, age 30.",
            response_format=Person
        )
    )
    
    person = response.structured_data
    print(f"{person.name} is a {person.age} year old {person.occupation}")

Configuration

Environment Variables

OpenAI:

export OPENAI_API_KEY="your-api-key"
export OPENAI_MODEL="gpt-4o-mini"  # Optional

AWS Bedrock:

export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION="us-east-1"
export BEDROCK_MODEL="anthropic.claude-3-sonnet-20240229-v1:0"  # Optional

Programmatic Configuration

from smartllm import LLMClient, LLMConfig

config = LLMConfig(
    provider="openai",
    api_key="your-api-key",
    default_model="gpt-4o",
    temperature=0.7,
    max_tokens=2048,
    max_retries=3,
)

async with LLMClient(config) as client:
    # Use client...
    pass

Customizing Defaults

from smartllm import defaults

# Modify global defaults
defaults.DEFAULT_TEMPERATURE = 0.7
defaults.DEFAULT_MAX_TOKENS = 4096
defaults.DEFAULT_MAX_RETRIES = 5
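
The retry behavior these defaults control follows the usual exponential-backoff pattern: double the delay after each failure, with jitter. As an illustration of that pattern (a sketch, not SmartLLM's actual internals; `with_backoff`, `call`, and `base_delay` are hypothetical names):

```python
import asyncio
import random

async def with_backoff(call, max_retries=3, base_delay=0.5):
    """Retry an async callable, doubling the delay after each failure.

    Illustrative only; these names are not part of the SmartLLM API.
    """
    for attempt in range(max_retries + 1):
        try:
            return await call()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Delays grow 0.5s, 1s, 2s, ... with jitter to spread out retries
            await asyncio.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.0))
```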

Advanced Features

Caching

Responses are automatically cached when temperature=0:

# First call - hits API
response1 = await client.generate_text(
    TextRequest(prompt="What is 2+2?", temperature=0)
)

# Second call - uses cache (instant, free)
response2 = await client.generate_text(
    TextRequest(prompt="What is 2+2?", temperature=0)
)

# Clear cache for specific request
response3 = await client.generate_text(
    TextRequest(prompt="What is 2+2?", temperature=0, clear_cache=True)
)

Concurrent Requests

import asyncio
from smartllm import LLMClient, TextRequest

async with LLMClient(provider="openai") as client:
    prompts = ["Question 1", "Question 2", "Question 3"]
    
    tasks = [
        client.generate_text(TextRequest(prompt=p))
        for p in prompts
    ]
    
    responses = await asyncio.gather(*tasks)

Rate Limiting

# Limit concurrent requests
client = LLMClient(provider="openai", max_concurrent=5)
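
A concurrency cap like this is typically enforced with an `asyncio.Semaphore`: each request acquires a slot before running, so at most N are in flight at once. A self-contained sketch of the idea (not SmartLLM's implementation; `run_limited` is a hypothetical helper):

```python
import asyncio

async def run_limited(coros, max_concurrent=5):
    """Run coroutines with at most `max_concurrent` in flight at once."""
    sem = asyncio.Semaphore(max_concurrent)

    async def guarded(coro):
        async with sem:  # wait for a free slot before starting
            return await coro

    return await asyncio.gather(*(guarded(c) for c in coros))
```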

Provider-Specific Clients

For advanced use cases, access provider-specific clients:

from smartllm.openai import OpenAILLMClient, OpenAIConfig
from smartllm.bedrock import BedrockLLMClient, BedrockConfig

# OpenAI-specific features
openai_config = OpenAIConfig(api_key="...", organization="...")
async with OpenAILLMClient(openai_config) as client:
    models = await client.list_available_models()

# Bedrock-specific features
bedrock_config = BedrockConfig(aws_region="us-east-1")
async with BedrockLLMClient(bedrock_config) as client:
    models = await client.list_available_model_ids()

Supported Providers

  • OpenAI - GPT models via OpenAI API
  • AWS Bedrock - Claude, Llama, Mistral, and Titan models

API Reference

Core Classes

  • LLMClient - Unified client for all providers
  • LLMConfig - Unified configuration
  • TextRequest - Single prompt request
  • MessageRequest - Multi-turn conversation request
  • TextResponse - LLM response with metadata
  • Message - Conversation message
  • StreamChunk - Streaming response chunk

Request Parameters

| Parameter | Type | Description | Default |
|---|---|---|---|
| `prompt` | `str` | Input text prompt | Required |
| `model` | `str` | Model ID to use | Config default |
| `temperature` | `float` | Sampling temperature (0-1) | `0` |
| `max_tokens` | `int` | Maximum output tokens | `2048` |
| `top_p` | `float` | Nucleus sampling | `1.0` |
| `system_prompt` | `str` | System context | `None` |
| `stream` | `bool` | Enable streaming | `False` |
| `response_format` | `BaseModel` | Pydantic model for structured output | `None` |
| `use_cache` | `bool` | Enable caching | `True` |
| `clear_cache` | `bool` | Clear cache before request | `False` |

Error Handling

from smartllm import LLMClient, TextRequest

async with LLMClient(provider="openai") as client:
    try:
        response = await client.generate_text(
            TextRequest(prompt="Hello")
        )
    except ValueError as e:
        print(f"Configuration error: {e}")
    except Exception as e:
        print(f"API error: {e}")

Development

Setup

git clone https://github.com/Redundando/smartllm.git
cd smartllm
pip install -r requirements-dev.txt

Running Tests

# Unit tests
pytest tests/unit/ -v

# Integration tests (requires API keys)
export OPENAI_API_KEY="your-key"
pytest tests/integration/ -v

# All tests
pytest tests/ -v

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

Version 0.1.0

  • Initial public release
  • Unified interface for multiple providers
  • OpenAI support (GPT models)
  • AWS Bedrock support (Claude, Llama, Mistral, Titan)
  • Async/await architecture
  • Smart caching with temperature=0
  • Auto retry with exponential backoff
  • Structured output with Pydantic models
  • Streaming responses
  • Rate limiting and concurrency control
  • Comprehensive test suite

