
Pellet AI Python SDK


The official Python client for the Pellet AI API.

Pellet is an intelligent LLM routing platform — send a prompt and Pellet automatically picks the best model for cost, speed, and quality. One API key, 11+ models across Groq, Together, and Fireworks.

Installation

pip install pellet-ai

Requires Python 3.9+.

Quick Start

from pellet import Pellet

client = Pellet(api_key="pk_live_...")  # or set PELLET_API_KEY env var

# Pellet auto-routes to the best model
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "What is machine learning?"}],
)
print(response.choices[0].message.content)
print(f"Model used: {response.model}")
print(f"Task type: {response.pellet_metadata.task_type}")
print(f"Cost: ${response.pellet_metadata.cost_usd:.6f}")

Streaming

with client.chat.completions.stream(
    messages=[{"role": "user", "content": "Tell me a story"}],
) as stream:
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
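If you want the complete text as well as live output, the same chunk shape can be folded into a single string. A small helper sketch, with attribute names (`choices[0].delta.content`) taken from the loop above:

```python
def collect_stream_text(chunks):
    """Join delta text from streamed chunks into the full response string."""
    parts = []
    for chunk in chunks:
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)
```

Pass the stream object itself as `chunks`; chunks without choices or with empty deltas are skipped, mirroring the guard in the printing loop.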

Async

import asyncio
from pellet import AsyncPellet

async def main():
    async with AsyncPellet(api_key="pk_live_...") as client:
        response = await client.chat.completions.create(
            messages=[{"role": "user", "content": "Hello!"}],
        )
        print(response.choices[0].message.content)

asyncio.run(main())
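For fan-out workloads, AsyncPellet pairs naturally with asyncio.gather. A sketch assuming the same create signature shown above (the helper name is ours, not part of the SDK):

```python
import asyncio

async def fan_out(client, prompts):
    """Send several prompts concurrently; results come back in prompt order."""
    tasks = [
        client.chat.completions.create(
            messages=[{"role": "user", "content": p}],
        )
        for p in prompts
    ]
    return await asyncio.gather(*tasks)
```

Because gather preserves order, `results[i]` corresponds to `prompts[i]` even though the requests complete in whatever order the providers respond.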

Routing Control

Pellet auto-routes by default. Use pellet_config to control routing behavior:

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Write a sorting algorithm"}],
    pellet_config={
        "routing_mode": "quality",       # "auto", "fastest", "cheapest", "quality"
        "max_latency_ms": 5000,          # reject models slower than this
        "provider_preference": ["groq"], # prefer specific providers
    },
)

Or bypass routing entirely by specifying a model:

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}],
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
)

Preview routing decisions without running inference:

decision = client.routing.explain(
    messages=[{"role": "user", "content": "What is ML?"}],
)
print(f"Would route to: {decision.model}")
print(f"Task type: {decision.task_type}")
print(f"Confidence: {decision.confidence:.2f}")
print(f"Alternatives: {[a.model for a in decision.alternatives]}")
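routing.explain composes with the model= override above: preview the decision, then pin it, at the cost of one extra round trip. A sketch (the helper name is ours, not part of the SDK):

```python
def create_pinned(client, messages):
    """Preview the routing decision, then run inference pinned to that model.

    Useful when the routed choice should be logged or vetoed before spending
    inference credits on it.
    """
    decision = client.routing.explain(messages=messages)
    return client.chat.completions.create(messages=messages, model=decision.model)
```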

Audio Transcription

with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        file=f,
        model="whisper-large-v3-turbo",  # or "whisper-large-v3", "distil-whisper-large-v3-en"
    )
print(transcript.text)
print(f"Latency: {transcript.pellet_metadata.latency_ms}ms")

Models & Health

# List all available models
models = client.models.list()
for m in models.data:
    print(f"{m.id} ({m.tier}, {m.params}) — {m.providers}")

# Check platform health
health = client.health.check()
print(f"Status: {health.status}")  # "ok" or "degraded"
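The models listing can also be filtered client-side, e.g. to honor a provider preference before issuing requests. A sketch assuming the attribute names (.id, .providers) shown in the listing loop above:

```python
def models_for_provider(models, provider):
    """Return the subset of model entries served by a given provider."""
    return [m for m in models if provider in m.providers]
```

Pass `models.data` from `client.models.list()` as the first argument.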

Configuration

client = Pellet(
    api_key="pk_live_...",       # or set PELLET_API_KEY env var
    base_url="https://...",      # or set PELLET_BASE_URL env var (default: https://getpellet.io/v1)
    timeout=30.0,                # read timeout in seconds (default: 60)
    max_retries=3,               # retries on 429/5xx (default: 2)
)

# Per-request timeout override
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Quick question"}],
    timeout=5.0,
)

# Explicit cleanup (or use context manager)
client.close()

Error Handling

import pellet

try:
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": "Hello"}],
    )
except pellet.AuthenticationError:
    print("Invalid API key")
except pellet.RateLimitError:
    print("Too many requests — will auto-retry")
except pellet.InsufficientCreditsError:
    print("Add credits at https://getpellet.io/dashboard/wallet")
except pellet.UpstreamError:
    print("Upstream provider unavailable")
except pellet.APIStatusError as e:
    print(f"HTTP {e.status_code}: {e.message}")
except pellet.APIConnectionError:
    print("Network error")
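The client already retries 429s and 5xx responses up to max_retries on its own. If you want an additional application-level retry with jittered backoff on top of that, a sketch (helper names are ours, not part of the SDK):

```python
import random
import time

def backoff_delay(attempt, base=0.5, cap=8.0):
    """Full-jitter exponential backoff: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

def create_with_retry(client, messages, attempts=4):
    """Retry rate-limited requests after the SDK's own retries are exhausted."""
    import pellet  # only needed here for the exception type
    for attempt in range(attempts):
        try:
            return client.chat.completions.create(messages=messages)
        except pellet.RateLimitError:
            if attempt == attempts - 1:
                raise
            time.sleep(backoff_delay(attempt))
```

Full jitter keeps concurrent clients from retrying in lockstep; the cap bounds the worst-case wait regardless of attempt count.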

Migrating from OpenAI SDK

# Before (OpenAI SDK)
from openai import OpenAI
client = OpenAI(api_key="pk_live_...", base_url="https://getpellet.io/v1")
response = client.chat.completions.create(
    model="",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={"pellet_config": {"routing_mode": "fastest"}},
)
metadata = response.model_extra.get("pellet_metadata", {})

# After (Pellet SDK)
from pellet import Pellet
client = Pellet(api_key="pk_live_...")
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello"}],
    pellet_config={"routing_mode": "fastest"},
)
metadata = response.pellet_metadata  # typed!

License

MIT — see LICENSE for details.
