Official Perf SDK for Python - AI Runtime Orchestrator

perf-sdk

Official Python SDK for Perf - the AI Runtime Orchestrator.

Perf automatically picks a model for each prompt, taking into account:

  • Task type and complexity
  • Cost constraints
  • Output reliability
  • Fallback logic

Installation

pip install perf-sdk

Quick Start

from perf import PerfClient

client = PerfClient(api_key="pk_live_your_api_key")

# Simple chat completion
response = client.chat(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)
# Example output: "The capital of France is Paris."

# Access Perf metadata
print(response.perf)
# PerfMetadata(model_used='gpt-4o-mini', cost_usd=0.0001, latency_ms=234, ...)
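
The Quick Start hard-codes the API key for brevity. A minimal sketch of reading it from the environment instead is shown below. The variable name PERF_API_KEY and the helper load_api_key are assumptions made for illustration and are not part of the SDK.

```python
import os


def load_api_key(env_var: str = "PERF_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing.

    PERF_API_KEY is a hypothetical name chosen for this sketch; the SDK
    itself only documents the `api_key` constructor argument.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Perf API key"
        )
    return key


# Intended usage (requires perf-sdk to be installed):
#   from perf import PerfClient
#   client = PerfClient(api_key=load_api_key())
```

Keeping the key out of source files makes it harder to commit it to version control by accident.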

Streaming

# Stream responses
for chunk in client.chat_stream(
    messages=[{"role": "user", "content": "Write a haiku about coding"}]
):
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

# Or collect into a string
content = client.chat_stream_to_string(
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)

Async Support

import asyncio
from perf import AsyncPerfClient

async def main():
    async with AsyncPerfClient(api_key="pk_live_your_api_key") as client:
        # Async chat completion
        response = await client.chat(
            messages=[{"role": "user", "content": "Hello!"}]
        )
        print(response.choices[0].message.content)

        # Async streaming
        async for chunk in client.chat_stream(
            messages=[{"role": "user", "content": "Tell me a story"}]
        ):
            if chunk.choices[0].delta.content:
                print(chunk.choices[0].delta.content, end="", flush=True)

asyncio.run(main())
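
One common reason to use the async client is to send several prompts concurrently. The sketch below shows the pattern with asyncio.gather and a semaphore that caps the number of requests in flight. The ask callable is a hypothetical wrapper you would write around client.chat; fake_ask stands in for it so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, List


async def ask_all(
    ask: Callable[[str], Awaitable[str]],
    prompts: List[str],
    limit: int = 5,
) -> List[str]:
    """Run `ask` for every prompt, with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(prompt: str) -> str:
        async with semaphore:
            return await ask(prompt)

    # gather returns results in the same order as the prompts
    return await asyncio.gather(*(bounded(p) for p in prompts))


async def fake_ask(prompt: str) -> str:
    """Stand-in for a coroutine that would call `await client.chat(...)`."""
    await asyncio.sleep(0)
    return prompt.upper()


print(asyncio.run(ask_all(fake_ask, ["hello", "world"])))  # ['HELLO', 'WORLD']
```

The limit is a client-side courtesy; it does not replace handling RateLimitError.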

Error Handling

import time

from perf import (
    PerfClient,
    PerfError,
    RateLimitError,
    AuthenticationError,
)

client = PerfClient(api_key="pk_live_your_api_key")

try:
    response = client.chat(
        messages=[{"role": "user", "content": "Hello!"}]
    )
except RateLimitError as e:
    # Back off for the suggested interval (60 s if none is given);
    # the request is not re-sent here
    retry_after = e.retry_after or 60
    print(f"Rate limited. Retry after {retry_after} seconds")
    time.sleep(retry_after)
except AuthenticationError:
    print("Invalid API key")
except PerfError as e:
    print(f"API Error: {e.code} - {e.message}")
except Exception as e:
    print(f"Unexpected error: {e}")
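
The example above waits after a rate limit but does not re-send the request. A minimal sketch of a retry loop around any callable follows. RateLimited is a hypothetical stand-in for perf.RateLimitError so that the block runs without the SDK installed, and call_with_rate_limit_retry is not part of the SDK. Because the client already retries on its own (see max_retries under Configuration Options), a wrapper like this may add to the total number of attempts.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Stand-in for perf.RateLimitError (assumed to expose retry_after)."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def call_with_rate_limit_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    default_wait: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    rate_limit_error: type = RateLimited,
) -> T:
    """Call `fn`, sleeping and retrying when it raises `rate_limit_error`."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except rate_limit_error as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            wait = getattr(exc, "retry_after", None) or default_wait
            sleep(wait)
    raise ValueError("max_attempts must be at least 1")
```

With the real SDK, this could be used as call_with_rate_limit_retry(lambda: client.chat(messages=[...]), rate_limit_error=RateLimitError).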

Configuration Options

client = PerfClient(
    api_key="pk_live_your_api_key",      # Required
    base_url="https://api.withperf.pro",  # Optional, default shown
    timeout=120.0,                         # Request timeout in seconds (default: 120)
    max_retries=3,                         # Retry attempts (default: 3)
    retry_delay=1.0,                       # Base retry delay in seconds (default: 1)
)
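
To give a feel for how max_retries and retry_delay interact under exponential backoff, the sketch below computes a wait schedule. The doubling factor and the absence of jitter are assumptions; the SDK's actual schedule is not documented here and may differ.

```python
from typing import List


def backoff_schedule(
    retry_delay: float = 1.0,
    max_retries: int = 3,
    factor: float = 2.0,
) -> List[float]:
    """Return the wait before each retry, growing by `factor` each time."""
    return [retry_delay * factor ** attempt for attempt in range(max_retries)]


print(backoff_schedule())  # [1.0, 2.0, 4.0]
```

Under these assumptions the defaults add up to 7 seconds of waiting across three retries, on top of the time spent in the requests themselves.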

Request Options

response = client.chat(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    model="gpt-4o",           # Optional: override model selection
    max_tokens=1000,          # Optional: limit response length
    temperature=0.7,          # Optional: sampling temperature
    max_cost_per_call=0.01,   # Optional: cost budget in USD
    metadata={                # Optional: custom metadata
        "user_id": "123",
        "session_id": "abc",
    },
)

Context Manager

Both sync and async clients support context managers for proper resource cleanup:

# Sync
with PerfClient(api_key="pk_live_xxx") as client:
    response = client.chat(messages=[...])

# Async (must run inside an async function)
async with AsyncPerfClient(api_key="pk_live_xxx") as client:
    response = await client.chat(messages=[...])

Features

  • Full type hints with Pydantic models
  • Sync and async clients for different use cases
  • Streaming support with iterators/async iterators
  • Automatic retries with exponential backoff
  • Typed exceptions for different error types
  • Timeout handling with configurable limits

Requirements

  • Python 3.9 or higher
  • httpx
  • pydantic

License

MIT
