Python SDK for PolarGrid Edge AI Infrastructure with Full API Support

PolarGrid SDK

The official Python SDK for PolarGrid Edge AI Infrastructure with full API support and mock data capabilities.

Features

  • Text Inference: Completions and chat completions (streaming support)
  • Voice: Text-to-speech and speech-to-text
  • Model Management: Load, unload, and check model status
  • GPU Management: Monitor and manage GPU resources
  • Mock Data Mode: Develop without a backend (well suited to frontend work)
  • Full Type Hints: Complete type annotations with Pydantic models
  • Async & Sync: Both async and synchronous clients
  • Error Handling: Comprehensive error types
  • Retry Logic: Automatic retry with exponential backoff

Installation

pip install polargrid-sdk

Quick Start

Async Client (Recommended)

import asyncio
from polargrid import PolarGrid

async def main():
    # Development mode (with mock data)
    client = PolarGrid(
        use_mock_data=True,  # Enable mock mode
        debug=True,          # See what's happening
    )

    # All methods work with realistic mock data
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [
            {"role": "user", "content": "Hello!"}
        ]
    })

    print(response.choices[0].message.content)

asyncio.run(main())

Sync Client

from polargrid import PolarGridSync

# For synchronous code
client = PolarGridSync(
    api_key="pg_your_api_key",
    use_mock_data=False,
)

response = client.chat_completion({
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Hello!"}]
})

print(response.choices[0].message.content)

API Reference

Text Inference

Chat Completions

# Non-streaming
response = await client.chat_completion({
    "model": "llama-3.1-8b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is quantum computing?"}
    ],
    "max_tokens": 150,
    "temperature": 0.7,
})

print(response.choices[0].message.content)

# Streaming
async for chunk in client.chat_completion_stream({
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Tell me a story"}],
}):
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
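
When the complete reply is also needed after streaming (for logging or caching, say), the deltas can be collected as they arrive. The following is a minimal sketch, assuming chunks have the shape used above (`chunk.choices[0].delta.content`). `collect_stream`, `fake_stream`, and `_chunk` are hypothetical names that are not part of the SDK; the fake stream stands in for `client.chat_completion_stream(...)` so the example runs without a backend.

```python
import asyncio
from types import SimpleNamespace


def _chunk(content):
    """Build an object shaped like a streaming chunk (illustrative stand-in)."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


async def fake_stream():
    """Hypothetical stand-in for client.chat_completion_stream(...)."""
    for piece in ["Once ", None, "upon ", "a time"]:
        yield _chunk(piece)


async def collect_stream(stream, on_delta=None):
    """Consume a chunk stream and return the concatenated text."""
    parts = []
    async for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:  # skip empty or None deltas
            parts.append(content)
            if on_delta is not None:
                on_delta(content)  # e.g. update a UI incrementally
    return "".join(parts)


if __name__ == "__main__":
    full_text = asyncio.run(
        collect_stream(fake_stream(), on_delta=lambda s: print(s, end=""))
    )
    print()
```

With a real client, the same helper would be called as `await collect_stream(client.chat_completion_stream(request))`.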

Text Completions

response = await client.completion({
    "prompt": "Once upon a time",
    "model": "llama-3.1-8b",
    "max_tokens": 100,
    "temperature": 0.8,
})

print(response.choices[0].text)

Voice / Audio

Text-to-Speech

# Generate audio
audio_buffer = await client.text_to_speech({
    "model": "tts-1",
    "input": "Hello from PolarGrid!",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0,
})

# Save to file
with open("speech.mp3", "wb") as f:
    f.write(audio_buffer)

# Streaming TTS: write chunks to a file as they arrive
with open("speech_stream.mp3", "wb") as audio_stream:
    async for chunk in client.text_to_speech_stream({
        "model": "tts-1",
        "input": "Long text to convert...",
        "voice": "nova",
    }):
        audio_stream.write(chunk)

Speech-to-Text

from pathlib import Path

# Transcribe audio
transcription = await client.transcribe(
    file=Path("recording.mp3"),
    request={
        "model": "whisper-1",
        "language": "en",
        "response_format": "json",
    }
)

print(transcription.text)

# Verbose transcription with timestamps
from polargrid.types import VerboseTranscriptionResponse

verbose = await client.transcribe(
    file=Path("recording.mp3"),
    request={
        "model": "whisper-1",
        "response_format": "verbose_json",
    }
)

if isinstance(verbose, VerboseTranscriptionResponse):
    for segment in verbose.segments:
        print(f"[{segment.start}s - {segment.end}s]: {segment.text}")

# Translate to English
translation = await client.translate(
    file=Path("spanish_audio.mp3"),
    request={
        "model": "whisper-1",
        "response_format": "json",
    }
)

print(translation.text)

Model Management

# List available models
response = await client.list_models()
for model in response.data:
    print(f"{model.id} ({model.owned_by})")

# Load a model
result = await client.load_model({
    "model_name": "llama-3.1-70b",
    "force_reload": False,
})

print(result.message)

# Check model status
status = await client.get_model_status()
print("Loaded models:", status.loaded)
print("Loading status:", status.loading_status)

# Unload a model
await client.unload_model({"model_name": "gpt2"})

# Unload all models
result = await client.unload_all_models()
print(f"Unloaded {result.total_unloaded} models")
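
Loading a large model may take a while, so callers sometimes poll the status endpoint before sending requests. The sketch below is one way to do that, assuming `status.loaded` is a collection of model names (the README only shows it being printed). `wait_until_loaded` and `FakeClient` are hypothetical names, not SDK APIs; the fake client exists only so the example runs without a backend.

```python
import asyncio
from types import SimpleNamespace


async def wait_until_loaded(client, model_name, poll_interval=1.0, timeout=60.0):
    """Poll get_model_status() until model_name appears in status.loaded.

    Returns True once the model is listed, False if the timeout elapses first.
    Assumes status.loaded is a collection of model names.
    """
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        status = await client.get_model_status()
        if model_name in status.loaded:
            return True
        if loop.time() >= deadline:
            return False
        await asyncio.sleep(poll_interval)


class FakeClient:
    """Hypothetical stand-in: reports the model as loaded on the third poll."""

    def __init__(self):
        self.polls = 0

    async def get_model_status(self):
        self.polls += 1
        loaded = ["llama-3.1-70b"] if self.polls >= 3 else []
        return SimpleNamespace(loaded=loaded, loading_status={})
```

With a real client this would follow `load_model`, for example `await wait_until_loaded(client, "llama-3.1-70b")`.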

GPU Management

# Get detailed GPU status
gpu_status = await client.get_gpu_status()
for gpu in gpu_status.gpus:
    print(f"GPU {gpu.index}: {gpu.name}")
    print(f"  Memory: {gpu.memory.used_gb}GB / {gpu.memory.total_gb}GB")
    print(f"  Utilization: {gpu.utilization.gpu_percent}%")
    print(f"  Temperature: {gpu.temperature_c}°C")

# Get simplified memory info
memory = await client.get_gpu_memory()
print(f"Memory used: {memory.memory[0].used_gb}GB ({memory.memory[0].percent_used}%)")

# Purge GPU memory
purge_result = await client.purge_gpu({"force": False})
print(f"Freed {purge_result.memory_freed_gb}GB")
print(f"Unloaded models: {purge_result.models_unloaded}")
print(purge_result.recommendation)
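
Since purging unloads models, it may be worth checking memory pressure first and purging only when a GPU is nearly full. A small sketch under the assumption that each entry in `memory.memory` exposes `percent_used`, as in the snippet above. `gpus_over_threshold` is a hypothetical helper, and the sample values are illustrative rather than real measurements.

```python
from types import SimpleNamespace


def gpus_over_threshold(memory_entries, threshold_percent=90.0):
    """Return list positions of GPUs whose percent_used is at or above the threshold."""
    return [
        position
        for position, entry in enumerate(memory_entries)
        if entry.percent_used >= threshold_percent
    ]


# Illustrative data shaped like memory.memory from get_gpu_memory().
sample = [
    SimpleNamespace(used_gb=70.2, percent_used=87.8),
    SimpleNamespace(used_gb=76.0, percent_used=95.0),
]
print(gpus_over_threshold(sample))  # -> [1]
```

A caller could then run `purge_gpu` only when the returned list is non-empty.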

Health Check

health = await client.health()
print(f"Status: {health.status}")
print(f"Backend healthy: {health.backend.healthy}")
print(f"Models loaded: {health.backend.info.models_loaded}")

Error Handling

from polargrid import (
    PolarGrid,
    is_polargrid_error,
    AuthenticationError,
    ValidationError,
    RateLimitError,
    ServerError,
    NetworkError,
)

try:
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [{"role": "user", "content": "Hello"}],
    })
except Exception as error:
    if is_polargrid_error(error):
        print(f"PolarGrid Error: {error.message}")
        print(f"Request ID: {error.request_id}")

        if isinstance(error, AuthenticationError):
            # Handle auth errors
            pass
        elif isinstance(error, ValidationError):
            # Handle validation errors
            print("Details:", error.details)
        elif isinstance(error, RateLimitError):
            # Handle rate limits
            print(f"Retry after: {error.retry_after}s")
    else:
        # Not a PolarGrid error: re-raise rather than silently swallow it
        raise

Configuration Options

client = PolarGrid(
    # API key (required for production, optional for mock mode)
    api_key="pg_your_api_key",

    # Base URL (default: https://api.polargrid.ai)
    base_url="https://api.polargrid.ai",

    # JWT token exchange URL (default: /api/auth/inference-token)
    auth_url="/api/auth/inference-token",

    # Request timeout in seconds (default: 30.0)
    timeout=30.0,

    # Maximum retry attempts (default: 3)
    max_retries=3,

    # Enable debug logging (default: False)
    debug=True,

    # Use mock data instead of real API (default: False)
    use_mock_data=True,
)

Mock Data for Development

The SDK includes comprehensive mock data that matches the API spec exactly.

Why Use Mock Data?

  1. Frontend Development: Build UI components before the backend is ready
  2. Testing: Predictable responses for unit tests
  3. Demos: Show realistic flows without production infrastructure
  4. Development: Faster iteration without API calls

What's Mocked?

  • ✅ All text inference endpoints with realistic responses
  • ✅ Voice TTS and STT with proper audio formats
  • ✅ Model management with state simulation
  • ✅ GPU metrics with realistic utilization data
  • ✅ Streaming responses (both text and audio)
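
One way to benefit from mock mode in unit tests is to pass the client into application code instead of constructing it inside, so a mock-mode client, a real client, or a hand-written stub can be swapped freely. The sketch below illustrates the pattern. `ask` and `StubClient` are hypothetical names; the stub stands in for `PolarGrid(use_mock_data=True)` so the example runs without the SDK installed, and the response shape follows the Quick Start.

```python
import asyncio
from types import SimpleNamespace


async def ask(client, prompt, model="llama-3.1-8b"):
    """Application code: depends only on the client's chat_completion method."""
    response = await client.chat_completion({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return response.choices[0].message.content


class StubClient:
    """Hypothetical stand-in for PolarGrid(use_mock_data=True)."""

    def __init__(self, reply):
        self.reply = reply
        self.requests = []  # record requests so tests can inspect them

    async def chat_completion(self, request):
        self.requests.append(request)
        message = SimpleNamespace(role="assistant", content=self.reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def test_ask_returns_message_content():
    stub = StubClient("Hi there!")
    assert asyncio.run(ask(stub, "Hello!")) == "Hi there!"
    assert stub.requests[0]["messages"][0]["content"] == "Hello!"
```

In a project that has the SDK installed, the test could pass `PolarGrid(use_mock_data=True)` in place of the stub and assert only on the type of the result, since the mock reply text is not specified here.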

Environment Variables

# API Key
export POLARGRID_API_KEY=pg_your_api_key

# Base URL (optional)
export NEXT_PUBLIC_INFERENCE_URL=https://api.polargrid.ai
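
The README does not say whether the client reads these variables automatically, so the sketch below passes them explicitly. `client_kwargs_from_env` is a hypothetical helper, and falling back to mock mode when no key is set is a choice made for this example, not documented SDK behavior.

```python
import os


def client_kwargs_from_env(env=None):
    """Build keyword arguments for PolarGrid(...) from environment variables."""
    env = os.environ if env is None else env
    kwargs = {}

    api_key = env.get("POLARGRID_API_KEY")
    if api_key:
        kwargs["api_key"] = api_key

    base_url = env.get("NEXT_PUBLIC_INFERENCE_URL")
    if base_url:
        kwargs["base_url"] = base_url.rstrip("/")  # normalize trailing slash

    # Assumption of this sketch: use mock data when no API key is configured.
    kwargs["use_mock_data"] = "api_key" not in kwargs
    return kwargs
```

The result can be unpacked into the constructor: `client = PolarGrid(**client_kwargs_from_env())`.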

Type Support

Full type hints with Pydantic models:

from polargrid.types import (
    ChatCompletionRequest,
    ChatCompletionResponse,
    ModelInfo,
    GPUStatusResponse,
)

Best Practices

1. Use Mock Data During Development

import os

is_development = os.environ.get("ENV") == "development"

client = PolarGrid(
    api_key=os.environ.get("POLARGRID_API_KEY"),
    use_mock_data=is_development,
    debug=is_development,
)

2. Handle Errors Gracefully

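import asyncio

from polargrid import RateLimitError

async def with_retry(request):
    # `client` is a PolarGrid instance created elsewhere
    try:
        return await client.chat_completion(request)
    except RateLimitError as e:
        # Wait for the server-suggested delay (60s fallback), then retry once
        await asyncio.sleep(e.retry_after or 60)
        return await client.chat_completion(request)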
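
The helper above retries exactly once. Where several attempts are acceptable, a bounded loop with exponential backoff is a common alternative. The sketch below is generic and not the SDK's built-in retry logic: `call_with_backoff` is a hypothetical name, the exception type is passed in by the caller, and the `retry_after` attribute is assumed to behave as in the error handling example.

```python
import asyncio
import random


async def call_with_backoff(
    make_call,
    retry_on,
    max_attempts=4,
    base_delay=1.0,
    max_delay=30.0,
    sleep=asyncio.sleep,
):
    """Await make_call(), retrying on `retry_on` with exponential backoff.

    Honors a `retry_after` attribute on the exception when present;
    otherwise waits base_delay * 2**attempt (capped at max_delay) with jitter.
    Re-raises the last error once max_attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return await make_call()
        except retry_on as error:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = getattr(error, "retry_after", None)
            if delay is None:
                delay = min(max_delay, base_delay * 2 ** attempt)
                delay *= random.uniform(0.5, 1.0)  # jitter spreads out retries
            await sleep(delay)
```

With the SDK this might be called as `await call_with_backoff(lambda: client.chat_completion(request), RateLimitError)`. Because the client already retries internally (`max_retries`), an outer loop like this is probably only worth adding for rate limits that outlast the built-in retries.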

3. Use Streaming for Long Responses

# Better user experience for long-form content
# (`request` is a chat completion dict; `update_ui` is your own callback)
async for chunk in client.chat_completion_stream(request):
    if chunk.choices[0].delta.content:
        update_ui(chunk.choices[0].delta.content)

Development

# Install dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov

# Type checking
mypy src/polargrid

# Linting
ruff check src/polargrid

License

MIT
