Python SDK for PolarGrid Edge AI Infrastructure with Full API Support

PolarGrid SDK

The official Python SDK for PolarGrid Edge AI Infrastructure with full API support and mock data capabilities.

Features

  • Text Inference: Completions and chat completions (streaming support)
  • Voice: Text-to-speech and speech-to-text
  • Model Management: Load, unload, and check model status
  • GPU Management: Monitor and manage GPU resources
  • Mock Data Mode: Develop without a backend (useful for frontend work)
  • Full Type Hints: Complete type annotations with Pydantic models
  • Async & Sync: Both async and synchronous clients
  • Error Handling: Comprehensive error types
  • Retry Logic: Automatic retry with exponential backoff

Installation

pip install polargrid-sdk

Quick Start

Async Client (Recommended)

import asyncio
from polargrid import PolarGrid

async def main():
    # Development mode (with mock data)
    client = PolarGrid(
        use_mock_data=True,  # Enable mock mode
        debug=True,          # See what's happening
    )

    # All methods work with realistic mock data
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [
            {"role": "user", "content": "Hello!"}
        ]
    })

    print(response.choices[0].message.content)

asyncio.run(main())

Sync Client

from polargrid import PolarGridSync

# For synchronous code
client = PolarGridSync(
    api_key="pg_your_api_key",
    use_mock_data=False,
)

response = client.chat_completion({
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Hello!"}]
})

print(response.choices[0].message.content)

API Reference

Text Inference

Chat Completions

# Non-streaming
response = await client.chat_completion({
    "model": "llama-3.1-8b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is quantum computing?"}
    ],
    "max_tokens": 150,
    "temperature": 0.7,
})

print(response.choices[0].message.content)

# Streaming
async for chunk in client.chat_completion_stream({
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Tell me a story"}],
}):
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
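When the complete text is needed once streaming finishes, the chunks can be accumulated as they arrive. The sketch below is self-contained so that it runs without the SDK: `fake_stream` is a hypothetical stand-in for `client.chat_completion_stream(...)` that yields objects with the same `choices[0].delta.content` shape used above, and `collect_stream` is an illustrative helper, not part of the SDK.

```python
import asyncio
from types import SimpleNamespace


async def fake_stream(pieces):
    """Hypothetical stand-in for client.chat_completion_stream(...)."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


async def collect_stream(stream):
    """Join the non-empty delta contents of a chunk stream into one string."""
    parts = []
    async for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:  # Skip chunks that carry no text (None or "")
            parts.append(content)
    return "".join(parts)


full_text = asyncio.run(
    collect_stream(fake_stream(["Once ", None, "upon ", "a time"]))
)
print(full_text)
```

In real code, the argument to `collect_stream` would be the async iterator returned by `client.chat_completion_stream(request)`.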

Text Completions

response = await client.completion({
    "prompt": "Once upon a time",
    "model": "llama-3.1-8b",
    "max_tokens": 100,
    "temperature": 0.8,
})

print(response.choices[0].text)

Voice / Audio

Text-to-Speech

# Generate audio
audio_buffer = await client.text_to_speech({
    "model": "tts-1",
    "input": "Hello from PolarGrid!",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0,
})

# Save to file
with open("speech.mp3", "wb") as f:
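    f.write(audio_buffer)

# Streaming TTS: write chunks to a file as they arrive
# (the .mp3 extension assumes MP3 output; match it to the format you request)
with open("speech_stream.mp3", "wb") as audio_stream:
    async for chunk in client.text_to_speech_stream({
        "model": "tts-1",
        "input": "Long text to convert...",
        "voice": "nova",
    }):
        audio_stream.write(chunk)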

Speech-to-Text

from pathlib import Path

# Transcribe audio
transcription = await client.transcribe(
    file=Path("recording.mp3"),
    request={
        "model": "whisper-1",
        "language": "en",
        "response_format": "json",
    }
)

print(transcription.text)

# Verbose transcription with timestamps
from polargrid.types import VerboseTranscriptionResponse

verbose = await client.transcribe(
    file=Path("recording.mp3"),
    request={
        "model": "whisper-1",
        "response_format": "verbose_json",
    }
)

if isinstance(verbose, VerboseTranscriptionResponse):
    for segment in verbose.segments:
        print(f"[{segment.start}s - {segment.end}s]: {segment.text}")

# Translate to English
translation = await client.translate(
    file=Path("spanish_audio.mp3"),
    request={
        "model": "whisper-1",
        "response_format": "json",
    }
)

print(translation.text)
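The verbose transcription example above prints segment boundaries as raw seconds. For longer recordings, a clock-style timestamp is often easier to read. The sketch below shows one way to format them; `format_timestamp` is a hypothetical helper, and the segments are stand-in objects that only mimic the `start`, `end`, and `text` fields used above.

```python
from types import SimpleNamespace


def format_timestamp(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


# Stand-in segments with the same start/end/text fields used above
segments = [
    SimpleNamespace(start=0.0, end=2.5, text="Hello there."),
    SimpleNamespace(start=2.5, end=65.25, text="This is a test."),
]

lines = [
    f"[{format_timestamp(s.start)} - {format_timestamp(s.end)}] {s.text}"
    for s in segments
]
print("\n".join(lines))
```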

Model Management

# List available models
response = await client.list_models()
for model in response.data:
    print(f"{model.id} ({model.owned_by})")

# Load a model
result = await client.load_model({
    "model_name": "llama-3.1-70b",
    "force_reload": False,
})

print(result.message)

# Check model status
status = await client.get_model_status()
print("Loaded models:", status.loaded)
print("Loading status:", status.loading_status)

# Unload a model
await client.unload_model({"model_name": "gpt2"})

# Unload all models
result = await client.unload_all_models()
print(f"Unloaded {result.total_unloaded} models")

GPU Management

# Get detailed GPU status
gpu_status = await client.get_gpu_status()
for gpu in gpu_status.gpus:
    print(f"GPU {gpu.index}: {gpu.name}")
    print(f"  Memory: {gpu.memory.used_gb}GB / {gpu.memory.total_gb}GB")
    print(f"  Utilization: {gpu.utilization.gpu_percent}%")
    print(f"  Temperature: {gpu.temperature_c}°C")

# Get simplified memory info
memory = await client.get_gpu_memory()
print(f"Memory used: {memory.memory[0].used_gb}GB ({memory.memory[0].percent_used}%)")

# Purge GPU memory
purge_result = await client.purge_gpu({"force": False})
print(f"Freed {purge_result.memory_freed_gb}GB")
print(f"Unloaded models: {purge_result.models_unloaded}")
print(purge_result.recommendation)

Health Check

health = await client.health()
print(f"Status: {health.status}")
print(f"Backend healthy: {health.backend.healthy}")
print(f"Models loaded: {health.backend.info.models_loaded}")

Error Handling

from polargrid import (
    PolarGrid,
    is_polargrid_error,
    AuthenticationError,
    ValidationError,
    RateLimitError,
    ServerError,
    NetworkError,
)

try:
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [{"role": "user", "content": "Hello"}],
    })
except Exception as error:
    if is_polargrid_error(error):
        print(f"PolarGrid Error: {error.message}")
        print(f"Request ID: {error.request_id}")

        if isinstance(error, AuthenticationError):
            # Handle auth errors
            pass
        elif isinstance(error, ValidationError):
            # Handle validation errors
            print("Details:", error.details)
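        elif isinstance(error, RateLimitError):
            # Handle rate limits
            print(f"Retry after: {error.retry_after}s")
        elif isinstance(error, (ServerError, NetworkError)):
            # Handle server-side and network failures
            pass
    else:
        # Not a PolarGrid error: re-raise rather than swallowing it
        raise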

Configuration Options

client = PolarGrid(
    # API key (required for production, optional for mock mode)
    api_key="pg_your_api_key",

    # Base URL (default: auto-route via https://autorouter.polargrid.ai)
    base_url="https://autorouter.polargrid.ai",

    # Request timeout in seconds (default: 30.0)
    timeout=30.0,

    # Maximum retry attempts (default: 3)
    max_retries=3,

    # Enable debug logging (default: False)
    debug=True,

    # Use mock data instead of real API (default: False)
    use_mock_data=True,
)
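The feature list mentions automatic retry with exponential backoff, but the exact schedule is not documented here. As a rough illustration of what a setting such as `max_retries=3` can mean for worst-case waiting time, the sketch below computes a generic exponential backoff schedule. The base delay, multiplier, and cap are assumptions chosen for illustration, not the SDK's actual values, and `backoff_delays` is a hypothetical helper.

```python
def backoff_delays(
    max_retries: int,
    base: float = 0.5,    # Assumed delay before the first retry, in seconds
    factor: float = 2.0,  # Assumed multiplier applied after each retry
    cap: float = 30.0,    # Assumed upper bound on any single delay
) -> list[float]:
    """Delay before each retry: base * factor**attempt, limited to cap."""
    return [min(cap, base * factor**attempt) for attempt in range(max_retries)]


delays = backoff_delays(3)
print(delays)       # [0.5, 1.0, 2.0]
print(sum(delays))  # 3.5 seconds of waiting in total, excluding request time
```

Production backoff implementations commonly add random jitter to these delays so that many clients do not retry in lockstep.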

Mock Data for Development

The SDK includes comprehensive mock data that matches the API spec.

Why Use Mock Data?

  1. Frontend Development: Build UI components before the backend is ready
  2. Testing: Predictable responses for unit tests
  3. Demos: Show realistic flows without production infrastructure
  4. Development: Faster iteration without API calls

What's Mocked?

  • ✅ All text inference endpoints with realistic responses
  • ✅ Voice TTS and STT with proper audio formats
  • ✅ Model management with state simulation
  • ✅ GPU metrics with realistic utilization data
  • ✅ Streaming responses (both text and audio)
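One way to take advantage of predictable responses in unit tests is to write application code that accepts the client as a parameter, so that a test can pass in a mock-mode client or a plain stub. The sketch below uses `unittest.mock.AsyncMock` from the standard library as the stub so that it runs without the SDK installed. `ask` is a hypothetical application function, and the stubbed response only mimics the `choices[0].message.content` shape shown in the examples above.

```python
import asyncio
from types import SimpleNamespace
from unittest.mock import AsyncMock


async def ask(client, question: str) -> str:
    """Hypothetical application code: send one user message, return the reply."""
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [{"role": "user", "content": question}],
    })
    return response.choices[0].message.content


# Stub client standing in for PolarGrid(use_mock_data=True)
stub = SimpleNamespace(
    chat_completion=AsyncMock(
        return_value=SimpleNamespace(
            choices=[SimpleNamespace(message=SimpleNamespace(content="Hi!"))]
        )
    )
)

reply = asyncio.run(ask(stub, "Hello!"))
print(reply)
```

Because `ask` never constructs its own client, the same function works unchanged with a mock-mode client during development and a real client in production.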

Environment Variables

# API Key
export POLARGRID_API_KEY=pg_your_api_key

# Base URL (optional)
export NEXT_PUBLIC_INFERENCE_URL=https://autorouter.polargrid.ai
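Whether the client reads these variables automatically is not stated here, so application code may prefer to pass them explicitly. The following is a minimal sketch, assuming the `api_key`, `base_url`, and `use_mock_data` constructor parameters from Configuration Options. `client_kwargs_from_env` is a hypothetical helper, and falling back to mock mode when no key is set is a design choice of this example rather than documented SDK behavior.

```python
import os


def client_kwargs_from_env(env=os.environ) -> dict:
    """Build PolarGrid(...) keyword arguments from environment variables."""
    kwargs = {}
    api_key = env.get("POLARGRID_API_KEY")
    if api_key:
        kwargs["api_key"] = api_key
    base_url = env.get("NEXT_PUBLIC_INFERENCE_URL")
    if base_url:
        kwargs["base_url"] = base_url
    # Example policy: with no API key available, fall back to mock mode
    kwargs["use_mock_data"] = "api_key" not in kwargs
    return kwargs


kwargs = client_kwargs_from_env({"POLARGRID_API_KEY": "pg_your_api_key"})
print(kwargs)
# client = PolarGrid(**kwargs)  # Requires the SDK to be installed
```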

Type Support

Full type hints with Pydantic models:

from polargrid.types import (
    ChatCompletionRequest,
    ChatCompletionResponse,
    ModelInfo,
    GPUStatusResponse,
)

Best Practices

1. Use Mock Data During Development

import os

is_development = os.environ.get("ENV") == "development"

client = PolarGrid(
    api_key=os.environ.get("POLARGRID_API_KEY"),
    use_mock_data=is_development,
    debug=is_development,
)

2. Handle Errors Gracefully

import asyncio

from polargrid import RateLimitError

async def with_retry(request):
    try:
        return await client.chat_completion(request)
    except RateLimitError as e:
        # Wait for the suggested delay (60s if none is given), then retry once
        await asyncio.sleep(e.retry_after or 60)
        return await client.chat_completion(request)
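The helper above retries exactly once. A variant with a bounded number of attempts might look like the sketch below. To keep it runnable without the SDK, `RateLimitError` here is a local stand-in that exposes only the `retry_after` attribute used above, and `flaky_call` is a hypothetical request function. In real code, import `RateLimitError` from `polargrid` and pass a function that calls `client.chat_completion`.

```python
import asyncio


class RateLimitError(Exception):
    """Local stand-in for polargrid.RateLimitError (illustration only)."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


async def retry_on_rate_limit(call, max_attempts=3, default_delay=60.0):
    """Await call() up to max_attempts times, sleeping after each rate limit."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except RateLimitError as e:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller
            delay = default_delay if e.retry_after is None else e.retry_after
            await asyncio.sleep(delay)


state = {"attempts": 0}


async def flaky_call():
    """Hypothetical request that is rate limited twice, then succeeds."""
    state["attempts"] += 1
    if state["attempts"] < 3:
        raise RateLimitError(retry_after=0.01)
    return "ok"


result = asyncio.run(retry_on_rate_limit(flaky_call))
print(result, state["attempts"])
```

Since the client is described as retrying automatically, an outer loop like this may multiply the total number of attempts, so keep `max_attempts` small.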

3. Use Streaming for Long Responses

# Better user experience for long-form content
async for chunk in client.chat_completion_stream(request):
    if chunk.choices[0].delta.content:
        update_ui(chunk.choices[0].delta.content)

Development

# Install dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov

# Type checking
mypy src/polargrid

# Linting
ruff check src/polargrid

License

MIT
