UnifiedAI SDK

OpenAI-compatible Python SDK that unifies multiple providers (Cerebras, AWS Bedrock) behind one interface, with Solo and Comparison modes, strict Pydantic models, and built-in telemetry.

Highlights

  • OpenAI-compatible API: UnifiedAI().chat.completions.create(...) (sync) and AsyncUnifiedAI (async)
  • Multi-Provider Support: Cerebras and AWS Bedrock (extensible architecture)
  • Dual Modes: Solo execution or side-by-side Comparison
  • Rich Metrics: Duration, TTFB, tokens/sec, provider-specific timing
  • Observability: Structured logs, Prometheus metrics, OpenTelemetry tracing hooks
  • Flexible Credentials: Pass at client construction or use environment variables

Install

From PyPI (core):

pip install unifiedai-sdk

Optional extras:

# AWS Bedrock support (requires boto3)
pip install "unifiedai-sdk[bedrock]"

# HTTP/2 support for httpx
pip install "unifiedai-sdk[http2]"

From GitHub (optional):

pip install git+https://github.com/<your-org-or-user>/<your-repo>.git#subdirectory=cerebras

Usage

Sync (scripts/CLI)

Cerebras Example:

from unifiedai import UnifiedAI

client = UnifiedAI(
    provider="cerebras",
    model="llama3.1-8b",
    credentials={"api_key": "csk-..."},  # or set CEREBRAS_KEY in env
)
resp = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}]
)
print(resp.choices[0].message["content"])
print(f"Tokens: {resp.usage.total_tokens}")

Bedrock Example:

from unifiedai import UnifiedAI

client = UnifiedAI(
    provider="bedrock",
    model="qwen.qwen3-32b-v1:0",
    credentials={
        "aws_access_key_id": "...",
        "aws_secret_access_key": "...",
        "region_name": "us-east-1"
    }
)
resp = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}]
)
print(resp.choices[0].message["content"])

Async (web backends)

from unifiedai import AsyncUnifiedAI

async with AsyncUnifiedAI(provider="cerebras", model="llama3.1-8b") as client:
    resp = await client.chat.completions.create(
        messages=[{"role": "user", "content": "Hello"}]
    )

Streaming (async)

async with AsyncUnifiedAI(provider="cerebras", model="llama3.1-8b") as client:
    async for chunk in client.chat.completions.stream(
        messages=[{"role": "user", "content": "Stream this"}]
    ):
        print(chunk.delta.get("content", ""), end="")
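
Each streamed chunk carries an incremental delta, and the full reply is the concatenation of the `content` fields. A minimal sketch of that accumulation, using plain dicts to stand in for the chunk deltas (the exact chunk type is illustrative, not the SDK's):

```python
def accumulate_deltas(deltas):
    """Join the incremental `content` fields of streamed deltas into the full reply.

    Deltas without a `content` key (e.g. a role-only delta) contribute nothing.
    """
    return "".join(d.get("content", "") for d in deltas)

# Stand-in deltas in the order a stream might yield them:
deltas = [{"role": "assistant"}, {"content": "Hel"}, {"content": "lo"}, {"content": "!"}]
print(accumulate_deltas(deltas))  # Hello!
```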

Comparison (two providers)

Option 1: Shared Model ID (when both providers use the same identifier)

from unifiedai import AsyncUnifiedAI

async with AsyncUnifiedAI(credentials_by_provider={...}) as client:
    result = await client.chat.completions.compare(
        providers=["cerebras", "bedrock"],
        model="llama3.1-8b",  # Same ID for both providers
        messages=[{"role": "user", "content": "Capital of France?"}]
    )
    print(f"Winner: {result.winner}")

Option 2: Per-Provider Models (when model IDs differ)

# Example: Same Llama 3.1 8B model, different IDs per provider
result = await client.chat.completions.compare(
    providers=["cerebras", "bedrock"],
    models={
        "cerebras": "llama3.1-8b",
        "bedrock": "meta.llama3-1-8b-instruct-v1:0"  # Bedrock uses different ID
    },
    messages=[{"role": "user", "content": "Explain quantum computing."}]
)

Synchronous Comparison (for scripts/CLI):

from unifiedai import UnifiedAI

client = UnifiedAI(credentials_by_provider={...})

result = client.chat.completions.compare(
    providers=["cerebras", "bedrock"],
    models={
        "cerebras": "llama3.1-8b",
        "bedrock": "meta.llama3-1-8b-instruct-v1:0"
    },
    messages=[{"role": "user", "content": "Hello"}]
)

print(f"Winner: {result.winner}")
print(f"Cerebras ({result.provider_a.model}): {result.provider_a.metrics.duration_ms:.2f}ms")
print(f"Bedrock ({result.provider_b.model}): {result.provider_b.metrics.duration_ms:.2f}ms")
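
One plausible criterion for `result.winner` is lowest round-trip latency; the sketch below shows that rule over the per-provider `duration_ms` values (the SDK's actual selection logic may weigh other factors):

```python
def pick_winner(timings_ms):
    """Given {provider_name: duration_ms}, return the provider with the lowest latency."""
    return min(timings_ms, key=timings_ms.get)

timings = {"cerebras": 412.3, "bedrock": 987.6}
print(pick_winner(timings))  # cerebras
```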

Credentials

Cerebras

Set the environment variable or pass credentials directly:

# Environment variable
export CEREBRAS_KEY="csk-..."

# Or pass directly
client = UnifiedAI(
    provider="cerebras",
    credentials={"api_key": "csk-..."}
)

AWS Bedrock

Set AWS credentials via environment variables or pass them directly:

# Environment variables
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"

# Or pass directly
client = AsyncUnifiedAI(
    provider="bedrock",
    credentials={
        "aws_access_key_id": "...",
        "aws_secret_access_key": "...",
        "region_name": "us-east-1"
    }
)

Multi-Provider (Comparison Mode)

client = AsyncUnifiedAI(
    credentials_by_provider={
        "cerebras": {"api_key": "csk-..."},
        "bedrock": {
            "aws_access_key_id": "...",
            "aws_secret_access_key": "...",
            "region_name": "us-east-1"
        }
    }
)

Precedence: Direct credentials > Environment variables > IAM roles (for Bedrock)
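
That precedence can be illustrated with a small resolver (a hypothetical helper for illustration, not part of the SDK): explicit credentials win, then environment variables, and `None` means fall through to ambient configuration such as an IAM role.

```python
import os

def resolve_cerebras_key(explicit=None):
    """Direct credentials take precedence; fall back to the CEREBRAS_KEY env var.

    Returns None if neither is set, signalling reliance on ambient config.
    """
    if explicit and explicit.get("api_key"):
        return explicit["api_key"]
    return os.environ.get("CEREBRAS_KEY")

os.environ["CEREBRAS_KEY"] = "csk-env"
print(resolve_cerebras_key({"api_key": "csk-direct"}))  # csk-direct (explicit wins)
print(resolve_cerebras_key())                           # csk-env (env fallback)
```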

FastAPI demo (Swagger UI)

uvicorn apps.chat.backend:app --reload --port 8000
# Swagger UI: http://localhost:8000/docs

Supported Models

Cerebras

  • llama3.1-8b - Llama 3.1 8B
  • llama3.1-70b - Llama 3.1 70B
  • qwen-3-32b - Qwen 3 32B

AWS Bedrock

  • qwen.qwen3-32b-v1:0 - Qwen 3 32B
  • anthropic.claude-3-haiku-20240307-v1:0 - Claude 3 Haiku (fastest)
  • anthropic.claude-3-sonnet-20240229-v1:0 - Claude 3 Sonnet
  • anthropic.claude-3-5-sonnet-20240620-v1:0 - Claude 3.5 Sonnet
  • meta.llama3-70b-instruct-v1:0 - Llama 3 70B

Note: Some Bedrock models require requesting access through the AWS Bedrock console.

Response Metrics

All responses include comprehensive metrics:

  • duration_ms: Total SDK round-trip time
  • ttfb_ms: Time to first byte
  • round_trip_time_s: Total time in seconds
  • inference_time_s: Provider-reported inference time
  • output_tokens_per_sec: Output generation speed
  • total_tokens_per_sec: Overall token throughput
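
The throughput figures are straightforward ratios of token counts to elapsed time. A sketch of how `output_tokens_per_sec` could be derived from the raw values (field names as listed above; the exact computation inside the SDK is assumed):

```python
def output_tokens_per_sec(output_tokens, inference_time_s):
    """Output generation speed: completion tokens divided by provider inference time."""
    if inference_time_s <= 0:
        raise ValueError("inference_time_s must be positive")
    return output_tokens / inference_time_s

print(output_tokens_per_sec(128, 0.64))  # 200.0
```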

Project Structure

  • src/unifiedai/: SDK implementation
    • _client.py, _async_client.py: Public API clients
    • adapters/: Provider implementations (Cerebras, Bedrock)
    • models/: Pydantic data models
    • core/: Comparison orchestration
    • metrics/: Prometheus metrics
  • examples/: Usage examples (solo, streaming, comparison, Bedrock)
  • apps/chat/: Demo FastAPI backend with comparison UI
  • tests/: Unit and integration tests

License

MIT
