
Pellet AI Python SDK


The official Python client for the Pellet AI API.

Pellet is an intelligent LLM routing platform — send a prompt and Pellet automatically picks the best model for cost, speed, and quality. One API key, 11+ models across Groq, Together, and Fireworks.

Installation

pip install pellet-ai

Requires Python 3.9+.
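If you prefer not to pass the key in code, the SDK also reads it from the environment. A minimal shell setup; the key value below is a placeholder:

```shell
# Store the Pellet API key in the environment so code never hard-codes it.
# Replace the placeholder with your real key.
export PELLET_API_KEY="pk_live_your_key_here"

# Optional: point the SDK at a different endpoint (defaults to https://getpellet.io/v1).
export PELLET_BASE_URL="https://getpellet.io/v1"
```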

Quick Start

from pellet import Pellet

client = Pellet(api_key="pk_live_...")  # or set PELLET_API_KEY env var

# Pellet auto-routes to the best model
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "What is machine learning?"}],
)
print(response.choices[0].message.content)
print(f"Model used: {response.model}")
print(f"Task type: {response.pellet_metadata.task_type}")
print(f"Cost: ${response.pellet_metadata.cost_usd:.6f}")

Streaming

with client.chat.completions.stream(
    messages=[{"role": "user", "content": "Tell me a story"}],
) as stream:
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
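If you need the complete text after the stream finishes, you can accumulate the deltas as they arrive. A minimal sketch; the stand-in chunk classes below only mimic the shape used above (`choices[0].delta.content`) and are not part of the SDK:

```python
from dataclasses import dataclass
from typing import List, Optional

# Stand-in types mirroring the chunk shape used above; the real SDK
# provides its own chunk classes.
@dataclass
class _Delta:
    content: Optional[str]

@dataclass
class _Choice:
    delta: _Delta

@dataclass
class _Chunk:
    choices: List[_Choice]

def collect_stream_text(chunks) -> str:
    """Join the delta contents of a chat completion stream into one string."""
    parts = []
    for chunk in chunks:
        # Skip keep-alive chunks that carry no choices or no content.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

fake_stream = [
    _Chunk([_Choice(_Delta("Once "))]),
    _Chunk([_Choice(_Delta(None))]),   # keep-alive chunk with no content
    _Chunk([_Choice(_Delta("upon a time"))]),
]
print(collect_stream_text(fake_stream))  # → Once upon a time
```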

Async

import asyncio
from pellet import AsyncPellet

async def main():
    async with AsyncPellet(api_key="pk_live_...") as client:
        response = await client.chat.completions.create(
            messages=[{"role": "user", "content": "Hello!"}],
        )
        print(response.choices[0].message.content)

asyncio.run(main())

Routing Control

Pellet auto-routes by default. Use pellet_config to control routing behavior:

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Write a sorting algorithm"}],
    pellet_config={
        "routing_mode": "quality",       # "auto", "fastest", "cheapest", "quality"
        "max_latency_ms": 5000,          # reject models slower than this
        "provider_preference": ["groq"], # prefer specific providers
    },
)

Or bypass routing entirely by specifying a model:

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}],
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
)

Preview routing decisions without running inference:

decision = client.routing.explain(
    messages=[{"role": "user", "content": "What is ML?"}],
)
print(f"Would route to: {decision.model}")
print(f"Task type: {decision.task_type}")
print(f"Confidence: {decision.confidence:.2f}")
print(f"Alternatives: {[a.model for a in decision.alternatives]}")

Audio Transcription

with open("audio.mp3", "rb") as f:  # context manager closes the file after upload
    transcript = client.audio.transcriptions.create(
        file=f,
        model="whisper-large-v3-turbo",  # or "whisper-large-v3", "distil-whisper-large-v3-en"
    )
print(transcript.text)
print(f"Latency: {transcript.pellet_metadata.latency_ms}ms")

Models & Health

# List all available models
models = client.models.list()
for m in models.data:
    print(f"{m.id} ({m.tier}, {m.params}) — {m.providers}")

# Check platform health
health = client.health.check()
print(f"Status: {health.status}")  # "ok" or "degraded"
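The model listing can feed provider_preference, for example by keeping only models served by a given provider. The dict-based catalog below is a hypothetical stand-in mirroring the fields printed above (id, tier, providers):

```python
def models_for_provider(models, provider: str):
    """Return ids of models whose provider list includes `provider`."""
    return [m["id"] for m in models if provider in m["providers"]]

# Hypothetical catalog entries shaped like the listing above.
catalog = [
    {"id": "fast-model-8b", "tier": "fast", "providers": ["groq"]},
    {"id": "meta-llama/Llama-3.3-70B-Instruct-Turbo", "tier": "quality",
     "providers": ["together", "fireworks"]},
]
print(models_for_provider(catalog, "groq"))  # → ['fast-model-8b']
```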

Configuration

client = Pellet(
    api_key="pk_live_...",       # or set PELLET_API_KEY env var
    base_url="https://...",      # or set PELLET_BASE_URL env var (default: https://getpellet.io/v1)
    timeout=30.0,                # read timeout in seconds (default: 60)
    max_retries=3,               # retries on 429/5xx (default: 2)
)

# Per-request timeout override
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Quick question"}],
    timeout=5.0,
)

# Explicit cleanup when done:
client.close()

# ...or manage the client's lifetime with a context manager:
# with Pellet(api_key="pk_live_...") as client:
#     ...

Error Handling

import pellet

try:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": "Hello"}],
    )
except pellet.AuthenticationError:
    print("Invalid API key")
except pellet.RateLimitError:
    print("Too many requests — will auto-retry")
except pellet.InsufficientCreditsError:
    print("Add credits at https://getpellet.io/dashboard/wallet")
except pellet.UpstreamError:
    print("Upstream provider unavailable")
except pellet.APIStatusError as e:
    print(f"HTTP {e.status_code}: {e.message}")
except pellet.APIConnectionError:
    print("Network error")

Migrating from OpenAI SDK

# Before (OpenAI SDK)
from openai import OpenAI
client = OpenAI(api_key="pk_live_...", base_url="https://getpellet.io/v1")
response = client.chat.completions.create(
    model="",  # left empty so Pellet's router picks the model
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={"pellet_config": {"routing_mode": "fastest"}},
)
metadata = response.model_extra.get("pellet_metadata", {})

# After (Pellet SDK)
from pellet import Pellet
client = Pellet(api_key="pk_live_...")
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}],
    pellet_config={"routing_mode": "fastest"},
)
metadata = response.pellet_metadata  # typed!

License

MIT — see LICENSE for details.
