
Unified SDK for multi-provider LLM comparison (Cerebras, AWS Bedrock) with OpenAI-compatible interface.

Project description

UnifiedAI SDK

OpenAI-compatible Python SDK unifying multiple providers (Cerebras, AWS Bedrock) with Solo and Comparison modes, strict Pydantic models, and built-in telemetry.

Highlights

  • 🔄 100% Backwards Compatible: Drop-in replacements for the Cerebras SDK and boto3 Bedrock
  • OpenAI-compatible API: UnifiedAI().chat.completions.create(...) (sync) and AsyncUnifiedAI (async)
  • Multi-Provider Support: Cerebras and AWS Bedrock (extensible architecture)
  • Dual Modes: Solo execution or side-by-side Comparison
  • Rich Metrics: Duration, TTFB, tokens/sec, provider-specific timing
  • Observability: Structured logs, Prometheus metrics, OpenTelemetry tracing hooks
  • Flexible Credentials: Pass at client construction or use environment variables
  • 🌟 Cross-Provider Access: Use Cerebras models through the Bedrock API (and vice versa!)

🎯 Three Ways to Use UnifiedAI

UnifiedAI offers three interfaces; choose the one that fits your needs:

Interface         Use Case                      Migration Effort
Cerebras Compat   Migrating from Cerebras SDK   Change 1 line
Bedrock Compat    Migrating from boto3 Bedrock  Change 1 line
UnifiedAI Native  New projects, multi-provider  Learn new API

🔷 Interface 1: Cerebras SDK Compatibility

Drop-in Replacement for Cerebras Cloud SDK

Migration: Change one line in your code:

# BEFORE (Cerebras SDK)
from cerebras.cloud.sdk import Cerebras

client = Cerebras(api_key="sk-...")
response = client.chat.completions.create(
    model="llama3.1-8b",
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message.content)
# AFTER (UnifiedAI - 100% compatible!)
from unifiedai import Cerebras  # ← Only change this line!

client = Cerebras(api_key="sk-...")  # Same constructor
response = client.chat.completions.create(  # Same method
    model="llama3.1-8b",  # Same model IDs
    messages=[{"role": "user", "content": "Hello"}]  # Same message format
)
print(response.choices[0].message.content)  # Same response format

✅ What's Compatible

  • Constructor: Cerebras(api_key="...", base_url="...")
  • Chat Completions: client.chat.completions.create(...)
  • List Models: client.models.list()
  • Streaming: stream=True parameter
  • Parameters: temperature, max_tokens, top_p, etc.
  • Response Format: OpenAI-style choices, usage, model
  • Async Support: AsyncCerebras for async/await

🌟 NEW: Access Bedrock Models (Cross-Provider!)

from unifiedai import Cerebras

client = Cerebras(api_key="sk-...")

# Use Cerebras models (native)
response = client.chat.completions.create(
    model="llama3.1-8b",
    messages=[{"role": "user", "content": "Hello"}]
)

# Use AWS Bedrock models (NEW!)
response = client.chat.completions.create(
    model="bedrock.anthropic.claude-3-haiku-20240307-v1:0",  # ← Bedrock model!
    messages=[{"role": "user", "content": "Hello from Bedrock!"}]
)
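Cross-provider routing above hinges on the model-ID prefix. As an illustration of the idea only (a hypothetical helper, not the SDK's actual internals), a `bedrock.`-prefixed ID can be split off and dispatched to the other provider, while a bare ID stays with the native one:

```python
def route_model_id(model: str, native_provider: str = "cerebras") -> tuple[str, str]:
    """Illustrative sketch: split a cross-provider prefix off a model ID.

    Hypothetical helper, NOT part of the unifiedai public API.
    """
    known_prefixes = ("bedrock", "cerebras")
    prefix, _, rest = model.partition(".")
    if prefix in known_prefixes and rest:
        return prefix, rest          # prefixed ID: dispatch to that provider
    return native_provider, model    # bare ID: stay with the native provider

# Prefixed ID goes to Bedrock; bare ID stays with Cerebras.
print(route_model_id("bedrock.anthropic.claude-3-haiku-20240307-v1:0"))
print(route_model_id("llama3.1-8b"))
```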

Async Example

from unifiedai import AsyncCerebras

async with AsyncCerebras(api_key="sk-...") as client:
    response = await client.chat.completions.create(
        model="llama3.1-8b",
        messages=[{"role": "user", "content": "Hello"}]
    )
    print(response.choices[0].message.content)

🔶 Interface 2: AWS Bedrock Compatibility

Drop-in Replacement for boto3 bedrock-runtime

Migration: Replace boto3.client() with BedrockRuntime():

# BEFORE (boto3 bedrock-runtime)
import boto3

client = boto3.client('bedrock-runtime', region_name='us-east-1')
response = client.converse(
    modelId='anthropic.claude-3-haiku-20240307-v1:0',
    messages=[
        {
            "role": "user",
            "content": [{"text": "Hello"}]
        }
    ]
)
print(response['output']['message']['content'][0]['text'])
# AFTER (UnifiedAI - 100% compatible!)
from unifiedai import BedrockRuntime  # ← Replace boto3.client()

client = BedrockRuntime(region_name='us-east-1')  # Same parameters
response = client.converse(  # Same method
    modelId='anthropic.claude-3-haiku-20240307-v1:0',  # Same model IDs
    messages=[  # Same message format
        {
            "role": "user",
            "content": [{"text": "Hello"}]
        }
    ]
)
print(response['output']['message']['content'][0]['text'])  # Same response format

✅ What's Compatible

  • Constructor: BedrockRuntime(region_name="...", aws_access_key_id="...", ...)
  • Converse API: client.converse(modelId="...", messages=[...], inferenceConfig={...})
  • List Models: client.list_foundation_models(byProvider="...")
  • Message Format: Bedrock-style with content as list of dicts
  • Response Format: boto3-style dict with output, usage, metrics
  • Inference Config: temperature, maxTokens, topP, stopSequences

🌟 NEW: Access Cerebras Models (Cross-Provider!)

from unifiedai import BedrockRuntime

client = BedrockRuntime(
    region_name='us-east-1',
    cerebras_api_key='sk-...'  # Add Cerebras key
)

# Use Bedrock models (native)
response = client.converse(
    modelId='anthropic.claude-3-haiku-20240307-v1:0',
    messages=[
        {"role": "user", "content": [{"text": "Hello"}]}
    ]
)

# Use Cerebras models (NEW!)
response = client.converse(
    modelId='cerebras.llama3.1-8b',  # ← Cerebras model!
    messages=[
        {"role": "user", "content": [{"text": "Hello from Cerebras!"}]}
    ]
)

List Foundation Models (boto3 compatible)

# List all models
response = client.list_foundation_models()
for model in response['modelSummaries']:
    print(f"{model['modelId']} - {model['providerName']}")

# Filter by provider
response = client.list_foundation_models(byProvider="Anthropic")
response = client.list_foundation_models(byProvider="Cerebras")  # NEW!

🔹 Interface 3: UnifiedAI Native (Multi-Provider from Day 1)

For new projects or when you want multi-provider features from the start:

Single Provider (Solo Mode)

from unifiedai import UnifiedAI

# Cerebras
client = UnifiedAI(
    provider="cerebras",
    model="llama3.1-8b",
    credentials={"api_key": "sk-..."}
)

# Bedrock
client = UnifiedAI(
    provider="bedrock",
    model="anthropic.claude-3-haiku-20240307-v1:0",
    credentials={
        "aws_access_key_id": "...",
        "aws_secret_access_key": "...",
        "region_name": "us-east-1"
    }
)

# Use
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}]
)
print(response.choices[0].message["content"])
print(f"Tokens: {response.usage.total_tokens}")
print(f"Duration: {response.metrics.duration_ms:.2f}ms")

Multi-Provider (Comparison Mode)

A/B test two providers simultaneously:

from unifiedai import UnifiedAI

client = UnifiedAI(
    credentials_by_provider={
        "cerebras": {"api_key": "sk-..."},
        "bedrock": {
            "aws_access_key_id": "...",
            "aws_secret_access_key": "...",
            "region_name": "us-east-1"
        }
    }
)

# Compare same model on different providers
result = client.chat.completions.compare(
    providers=["cerebras", "bedrock"],
    models={
        "cerebras": "llama3.1-8b",
        "bedrock": "meta.llama3-1-8b-instruct-v1:0"  # Same model, different ID
    },
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

# Analyze results
print(f"Winner: {result.winner}")
print(f"Cerebras: {result.provider_a.metrics.duration_ms:.2f}ms")
print(f"Bedrock: {result.provider_b.metrics.duration_ms:.2f}ms")
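How `winner` is decided isn't specified above; one plausible heuristic (an assumption for illustration, not necessarily the SDK's actual rule) is simply the provider with the lowest end-to-end duration:

```python
def pick_winner(durations_ms: dict[str, float]) -> str:
    """Illustrative sketch: the provider with the lowest duration wins."""
    return min(durations_ms, key=durations_ms.get)

# Hypothetical timings for the comparison above
print(pick_winner({"cerebras": 412.8, "bedrock": 973.1}))  # cerebras
```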

Async (for FastAPI, web backends)

from unifiedai import AsyncUnifiedAI

async with AsyncUnifiedAI(provider="cerebras", model="llama3.1-8b") as client:
    response = await client.chat.completions.create(
        messages=[{"role": "user", "content": "Hello"}]
    )

Streaming

async with AsyncUnifiedAI(provider="cerebras", model="llama3.1-8b") as client:
    async for chunk in client.chat.completions.stream(
        messages=[{"role": "user", "content": "Write a story"}]
    ):
        print(chunk.delta.get("content", ""), end="")

List Models

# List models from specific provider
models = client.models.list(provider="cerebras")
for model in models:
    print(f"{model.id} - {model.owned_by}")

# List from all configured providers
all_models = client.models.list()

📊 Interface Comparison

Feature           Cerebras Compat          Bedrock Compat             UnifiedAI Native
Migration         1 line change            1 line change              New API
Cerebras Models   ✅ Native                ✅ With cerebras. prefix   ✅ Native
Bedrock Models    ✅ With bedrock. prefix  ✅ Native                  ✅ Native
Response Format   OpenAI-like              boto3 dict                 OpenAI-like
Comparison Mode   ❌                       ❌                         ✅
Async Support     ✅ AsyncCerebras         ❌ (sync only)             ✅ AsyncUnifiedAI
Streaming         ✅                       ❌ (not yet)               ✅
Rich Metrics      ✅ Basic                 ✅ Basic                   ✅ Comprehensive

Recommendation:

  • Migrating existing code? → Use compatibility layers (Cerebras or Bedrock)
  • New project? → Use UnifiedAI Native for maximum features
  • Need comparison mode? → UnifiedAI Native is the only option

🚀 Quick Start Examples

Example 1: Migrate from Cerebras SDK (5 seconds)

# Change one line:
# from cerebras.cloud.sdk import Cerebras
from unifiedai import Cerebras

# Everything else stays the same!

Example 2: Migrate from boto3 Bedrock (10 seconds)

# Change one line:
# client = boto3.client('bedrock-runtime', region_name='us-east-1')
from unifiedai import BedrockRuntime
client = BedrockRuntime(region_name='us-east-1')

# Everything else stays the same!

Example 3: New Project with A/B Testing

from unifiedai import AsyncUnifiedAI

async with AsyncUnifiedAI(credentials_by_provider={...}) as client:
    result = await client.chat.completions.compare(
        providers=["cerebras", "bedrock"],
        models={"cerebras": "llama3.1-8b", "bedrock": "meta.llama3-1-8b-instruct-v1:0"},
        messages=[{"role": "user", "content": "Hello"}]
    )
    print(f"Winner: {result.winner}")

Install

From PyPI (core):

pip install unifiedai-sdk

Optional extras:

# AWS Bedrock support (requires boto3)
pip install "unifiedai-sdk[bedrock]"

# HTTP/2 support for httpx
pip install "unifiedai-sdk[http2]"

From GitHub (optional):

pip install git+https://github.com/<your-org-or-user>/<your-repo>.git#subdirectory=cerebras

🔑 Credentials Setup

Option 1: Environment Variables (Recommended)

# Cerebras
export CEREBRAS_API_KEY="sk-..."

# AWS Bedrock
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"  # optional, defaults to us-east-1

Option 2: Pass Directly in Code

Cerebras Compat:

from unifiedai import Cerebras
client = Cerebras(api_key="sk-...")

Bedrock Compat:

from unifiedai import BedrockRuntime
client = BedrockRuntime(
    region_name='us-east-1',
    aws_access_key_id='...',
    aws_secret_access_key='...'
)

UnifiedAI Native (Single Provider):

from unifiedai import UnifiedAI

# Cerebras
client = UnifiedAI(
    provider="cerebras",
    credentials={"api_key": "sk-..."}
)

# Bedrock
client = UnifiedAI(
    provider="bedrock",
    credentials={
        "aws_access_key_id": "...",
        "aws_secret_access_key": "...",
        "region_name": "us-east-1"
    }
)

UnifiedAI Native (Multi-Provider for Comparison):

from unifiedai import UnifiedAI

client = UnifiedAI(
    credentials_by_provider={
        "cerebras": {"api_key": "sk-..."},
        "bedrock": {
            "aws_access_key_id": "...",
            "aws_secret_access_key": "...",
            "region_name": "us-east-1"
        }
    }
)

Credential Precedence:
Direct credentials > Environment variables > IAM roles (for Bedrock)

📚 Examples & Demos

Comprehensive Demos (Recommended Starting Point)

  • examples/cerebras_backward_compat_demo.py - Full Cerebras SDK compatibility demo
  • examples/bedrock_backward_compat_demo.py - Full AWS Bedrock compatibility demo

Run these to see all features in action!

Basic Examples

  • examples/solo_chat.py - Simple chat with UnifiedAI
  • examples/comparison_chat.py - A/B testing two providers
  • examples/streaming.py - Streaming responses
  • examples/list_models.py - List available models

FastAPI Demo (Swagger UI)

cd apps/chat/backend
pip install -r requirements.txt
uvicorn backend:app --reload --port 8000
# Swagger UI: http://localhost:8000/docs

Supported Models

Cerebras

  • llama3.1-8b - Llama 3.1 8B
  • llama3.1-70b - Llama 3.1 70B
  • qwen-3-32b - Qwen 3 32B

AWS Bedrock

  • qwen.qwen3-32b-v1:0 - Qwen 3 32B
  • anthropic.claude-3-haiku-20240307-v1:0 - Claude 3 Haiku (fastest)
  • anthropic.claude-3-sonnet-20240229-v1:0 - Claude 3 Sonnet
  • anthropic.claude-3-5-sonnet-20240620-v1:0 - Claude 3.5 Sonnet
  • meta.llama3-70b-instruct-v1:0 - Llama 3 70B

Note: Some Bedrock models require requesting access through the AWS Bedrock console.

Response Metrics

All responses include comprehensive metrics:

  • duration_ms: Total SDK round-trip time
  • ttfb_ms: Time to first byte
  • round_trip_time_s: Total time in seconds
  • inference_time_s: Provider-reported inference time
  • output_tokens_per_sec: Output generation speed
  • total_tokens_per_sec: Overall token throughput
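The throughput metrics follow directly from token counts and timings. A back-of-the-envelope sketch (not the SDK's internal code) of how `output_tokens_per_sec` relates to the fields above:

```python
def output_tokens_per_sec(output_tokens: int, inference_time_s: float) -> float:
    """Tokens generated per second of provider-reported inference time."""
    if inference_time_s <= 0:
        return 0.0  # guard against a missing or zero timing
    return output_tokens / inference_time_s

# e.g. 256 output tokens generated in 0.5 s of inference time
rate = output_tokens_per_sec(256, 0.5)
print(f"{rate:.1f} tokens/sec")  # 512.0 tokens/sec
```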

๐Ÿ“ Project Structure

cerebras/
├── src/unifiedai/              # SDK implementation
│   ├── _client.py              # Sync UnifiedAI client
│   ├── _async_client.py        # Async UnifiedAI client
│   ├── _cerebras_compat.py     # Cerebras SDK compatibility layer
│   ├── _bedrock_compat.py      # AWS Bedrock compatibility layer
│   ├── adapters/               # Provider implementations
│   │   ├── base.py             # Base adapter with retries, circuit breakers
│   │   ├── cerebras.py         # Cerebras Cloud SDK adapter
│   │   └── bedrock.py          # AWS Bedrock adapter
│   ├── models/                 # Pydantic data models
│   │   ├── request.py          # ChatRequest, Message
│   │   ├── response.py         # UnifiedChatResponse, Usage, Metrics
│   │   ├── comparison.py       # ComparisonResult, ProviderResult
│   │   └── model.py            # Model, ModelList
│   ├── core/                   # Core orchestration
│   │   └── comparison.py       # Comparison mode implementation
│   ├── metrics/                # Observability
│   │   └── emitter.py          # Prometheus metrics
│   └── resilience/             # Resilience patterns
│       └── circuit_breaker.py  # Circuit breaker implementation
├── examples/                   # Usage examples & demos
│   ├── cerebras_backward_compat_demo.py  # ⭐ Cerebras compat demo
│   ├── bedrock_backward_compat_demo.py   # ⭐ Bedrock compat demo
│   ├── solo_chat.py            # Basic usage
│   ├── comparison_chat.py      # A/B testing
│   └── streaming.py            # Streaming responses
├── apps/chat/backend/          # FastAPI demo application
│   ├── backend.py              # REST API with Swagger UI
│   └── requirements.txt        # Backend dependencies
└── tests/                      # Test suite (90% coverage)
    ├── unit/                   # Unit tests
    ├── integration/            # Integration tests with real providers
    └── benchmarks/             # Performance benchmarks

🎯 Why UnifiedAI?

For Teams Already Using Cerebras or Bedrock

  • Zero Migration Cost: Change 1 line of code, everything else stays the same
  • Immediate Benefits: Gain cross-provider capabilities without rewriting your code
  • Risk-Free: 100% backward compatible, gradual migration path
  • Future-Proof: Access new providers without changing your interface

For New Projects

  • Multi-Provider from Day 1: Don't lock yourself into a single provider
  • Built-in A/B Testing: Compare providers with compare() method
  • Production-Ready: Retries, circuit breakers, timeouts, metrics
  • OpenAI-Compatible: Familiar API if you've used OpenAI SDK

Key Differentiators

✅ Three interfaces in one SDK - Use what fits your needs
✅ Cross-provider access - Cerebras models via Bedrock API (and vice versa)
✅ Comparison mode - Side-by-side provider testing built-in
✅ Enhanced metrics - TTFB, tokens/sec, inference time, round-trip time
✅ Production resilience - Circuit breakers, retries, timeouts
✅ Full observability - Structured logs, Prometheus metrics, OpenTelemetry hooks
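The resilience claim centers on the circuit-breaker pattern (see resilience/circuit_breaker.py in the tree above). As a rough illustration of the idea, not the SDK's actual implementation: after N consecutive failures the breaker opens and rejects calls immediately until a cooldown elapses:

```python
import time

class CircuitBreaker:
    """Minimal open/closed circuit breaker (illustrative sketch only)."""

    def __init__(self, failure_threshold: int = 3, reset_timeout_s: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.failures = 0
        self.opened_at: float | None = None

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True  # closed: requests flow normally
        if time.monotonic() - self.opened_at >= self.reset_timeout_s:
            self.opened_at = None  # cooldown elapsed: half-open, try again
            self.failures = 0
            return True
        return False  # open: fail fast without calling the provider

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()  # trip the breaker

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None  # close the breaker again
```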


🚀 Getting Started

1. Install:

pip install unifiedai-sdk

2. Choose your interface:

  • Migrating from Cerebras SDK? → Start with examples/cerebras_backward_compat_demo.py
  • Migrating from boto3 Bedrock? → Start with examples/bedrock_backward_compat_demo.py
  • New project? → Start with examples/solo_chat.py and examples/comparison_chat.py

3. Set credentials:

export CEREBRAS_API_KEY="sk-..."
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."

4. Start coding!


📖 Documentation

  • README (this file) - Complete usage guide
  • examples/README.md - Detailed example walkthroughs
  • apps/chat/README.md - FastAPI backend setup
  • Inline docstrings - Every method has comprehensive Google-style docstrings

๐Ÿค Contributing

Contributions welcome! The SDK follows production-grade best practices:

  • ✅ 90%+ test coverage
  • ✅ Strict type checking (mypy)
  • ✅ Code formatting (ruff, black)
  • ✅ Pre-commit hooks
  • ✅ Comprehensive CI/CD

License

MIT
