Python SDK for PolarGrid Edge AI Infrastructure with Full API Support

PolarGrid SDK

The official Python SDK for PolarGrid Edge AI Infrastructure with full API support and mock data capabilities.

Features

  • Text Inference: Completions and chat completions (streaming support)
  • Voice: Text-to-speech and speech-to-text
  • Model Management: Load, unload, and check model status
  • GPU Management: Monitor and manage GPU resources
  • Mock Data Mode: Develop without a backend (well suited to frontend work)
  • Full Type Hints: Complete type annotations with Pydantic models
  • Async & Sync: Both async and synchronous clients
  • Error Handling: Comprehensive error types
  • Retry Logic: Automatic retry with exponential backoff

Installation

pip install polargrid-sdk

Quick Start

Async Client (Recommended)

import asyncio
from polargrid import PolarGrid

async def main():
    # Development mode (with mock data)
    client = PolarGrid(
        use_mock_data=True,  # Enable mock mode
        debug=True,          # See what's happening
    )

    # All methods work with realistic mock data
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [
            {"role": "user", "content": "Hello!"}
        ]
    })

    print(response.choices[0].message.content)

asyncio.run(main())

Sync Client

from polargrid import PolarGridSync

# For synchronous code
client = PolarGridSync(
    api_key="pg_your_api_key",
    use_mock_data=False,
)

response = client.chat_completion({
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Hello!"}]
})

print(response.choices[0].message.content)

API Reference

Text Inference

Chat Completions

# Non-streaming
response = await client.chat_completion({
    "model": "llama-3.1-8b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is quantum computing?"}
    ],
    "max_tokens": 150,
    "temperature": 0.7,
})

print(response.choices[0].message.content)

# Streaming
async for chunk in client.chat_completion_stream({
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Tell me a story"}],
}):
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
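
When the full reply is needed after streaming finishes (for logging or caching, say), the chunks can be joined as they arrive. The following is a minimal sketch, not part of the SDK: `collect_stream` is a hypothetical helper, and `fake_stream` is a stand-in that only mimics the `chunk.choices[0].delta.content` shape used above so the example runs without a backend.

```python
import asyncio
from types import SimpleNamespace


def _fake_chunk(text):
    # Stand-in chunk exposing the same attribute path as the streaming example.
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


async def fake_stream():
    # Hypothetical replacement for client.chat_completion_stream(...).
    for text in ["Once ", "upon ", None, "a time"]:
        yield _fake_chunk(text)


async def collect_stream(stream):
    """Join the non-empty delta.content pieces of a chunk stream."""
    parts = []
    async for chunk in stream:
        content = chunk.choices[0].delta.content
        if content:  # skip chunks that carry no text
            parts.append(content)
    return "".join(parts)


full_text = asyncio.run(collect_stream(fake_stream()))
print(full_text)
```

With a real client, the same helper would presumably be called as `await collect_stream(client.chat_completion_stream(request))`.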

Text Completions

response = await client.completion({
    "prompt": "Once upon a time",
    "model": "llama-3.1-8b",
    "max_tokens": 100,
    "temperature": 0.8,
})

print(response.choices[0].text)

Voice / Audio

Text-to-Speech

# Generate audio
audio_buffer = await client.text_to_speech({
    "model": "tts-1",
    "input": "Hello from PolarGrid!",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0,
})

# Save to file
with open("speech.mp3", "wb") as f:
    f.write(audio_buffer)

# Streaming TTS: write each chunk to a file as it arrives
with open("speech_stream.mp3", "wb") as audio_stream:
    async for chunk in client.text_to_speech_stream({
        "model": "tts-1",
        "input": "Long text to convert...",
        "voice": "nova",
    }):
        audio_stream.write(chunk)

Speech-to-Text

from pathlib import Path

# Transcribe audio
transcription = await client.transcribe(
    file=Path("recording.mp3"),
    request={
        "model": "whisper-1",
        "language": "en",
        "response_format": "json",
    }
)

print(transcription.text)

# Verbose transcription with timestamps
from polargrid.types import VerboseTranscriptionResponse

verbose = await client.transcribe(
    file=Path("recording.mp3"),
    request={
        "model": "whisper-1",
        "response_format": "verbose_json",
    }
)

if isinstance(verbose, VerboseTranscriptionResponse):
    for segment in verbose.segments:
        print(f"[{segment.start}s - {segment.end}s]: {segment.text}")

# Translate to English
translation = await client.translate(
    file=Path("spanish_audio.mp3"),
    request={
        "model": "whisper-1",
        "response_format": "json",
    }
)

print(translation.text)

Model Management

# List available models
response = await client.list_models()
for model in response.data:
    print(f"{model.id} ({model.owned_by})")

# Load a model
result = await client.load_model({
    "model_name": "llama-3.1-70b",
    "force_reload": False,
})

print(result.message)

# Check model status
status = await client.get_model_status()
print("Loaded models:", status.loaded)
print("Loading status:", status.loading_status)

# Unload a model
await client.unload_model({"model_name": "gpt2"})

# Unload all models
result = await client.unload_all_models()
print(f"Unloaded {result.total_unloaded} models")

GPU Management

# Get detailed GPU status
gpu_status = await client.get_gpu_status()
for gpu in gpu_status.gpus:
    print(f"GPU {gpu.index}: {gpu.name}")
    print(f"  Memory: {gpu.memory.used_gb}GB / {gpu.memory.total_gb}GB")
    print(f"  Utilization: {gpu.utilization.gpu_percent}%")
    print(f"  Temperature: {gpu.temperature_c}°C")

# Get simplified memory info
memory = await client.get_gpu_memory()
print(f"Memory used: {memory.memory[0].used_gb}GB ({memory.memory[0].percent_used}%)")

# Purge GPU memory
purge_result = await client.purge_gpu({"force": False})
print(f"Freed {purge_result.memory_freed_gb}GB")
print(f"Unloaded models: {purge_result.models_unloaded}")
print(purge_result.recommendation)
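
The per-GPU memory fields shown above can feed simple placement decisions, such as checking which GPU has the most headroom before loading a large model. This is a sketch under assumptions: `Memory` and `Gpu` are stand-in dataclasses that copy only the `index`, `memory.used_gb`, and `memory.total_gb` fields from the example, and `gpu_with_most_free_memory` is a hypothetical helper, not an SDK function.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    # Stand-in for gpu.memory in the status response.
    used_gb: float
    total_gb: float


@dataclass
class Gpu:
    # Stand-in for one entry of gpu_status.gpus.
    index: int
    memory: Memory


def free_gb(gpu):
    """Free memory on one GPU, in GB."""
    return gpu.memory.total_gb - gpu.memory.used_gb


def gpu_with_most_free_memory(gpus):
    """Return the GPU with the most free memory, or None for an empty list."""
    return max(gpus, key=free_gb, default=None)


gpus = [Gpu(0, Memory(70.0, 80.0)), Gpu(1, Memory(12.5, 80.0))]
best = gpu_with_most_free_memory(gpus)
print(f"GPU {best.index} has {free_gb(best):.1f}GB free")
```

Against a live client, `gpu_status.gpus` from `get_gpu_status()` would take the place of the hand-built list, assuming the response objects expose the fields as shown.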

Health Check

health = await client.health()
print(f"Status: {health.status}")
print(f"Backend healthy: {health.backend.healthy}")
print(f"Models loaded: {health.backend.info.models_loaded}")

Error Handling

from polargrid import (
    PolarGrid,
    is_polargrid_error,
    AuthenticationError,
    ValidationError,
    RateLimitError,
    ServerError,
    NetworkError,
)

try:
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [{"role": "user", "content": "Hello"}],
    })
except Exception as error:
    if is_polargrid_error(error):
        print(f"PolarGrid Error: {error.message}")
        print(f"Request ID: {error.request_id}")

        if isinstance(error, AuthenticationError):
            # Handle auth errors
            pass
        elif isinstance(error, ValidationError):
            # Handle validation errors
            print("Details:", error.details)
        elif isinstance(error, RateLimitError):
            # Handle rate limits
            print(f"Retry after: {error.retry_after}s")
    else:
        # Not a PolarGrid error: re-raise rather than swallow it
        raise

Configuration Options

client = PolarGrid(
    # API key (required for production, optional for mock mode)
    api_key="pg_your_api_key",

    # Base URL (default: https://api.polargrid.ai)
    base_url="https://api.polargrid.ai",

    # JWT token exchange URL (default: /api/auth/inference-token)
    auth_url="/api/auth/inference-token",

    # Request timeout in seconds (default: 30.0)
    timeout=30.0,

    # Maximum retry attempts (default: 3)
    max_retries=3,

    # Enable debug logging (default: False)
    debug=True,

    # Use mock data instead of real API (default: False)
    use_mock_data=True,
)
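
The `timeout` and `max_retries` options together bound how long a single call can block. The SDK's exact backoff schedule is not documented here, so the following is only an illustration of exponential backoff in general: the `base`, `factor`, and `cap` values are assumptions, as is the reading of `max_retries` as the number of retries after the first attempt.

```python
def backoff_delays(max_retries=3, base=0.5, factor=2.0, cap=30.0):
    """Delay in seconds before each retry, doubling each time up to `cap`."""
    return [min(cap, base * factor ** attempt) for attempt in range(max_retries)]


def worst_case_seconds(timeout=30.0, max_retries=3, **backoff):
    """Upper bound on blocking time if every attempt runs to the timeout."""
    delays = backoff_delays(max_retries, **backoff)
    # One initial attempt plus max_retries retries, each up to `timeout` long.
    return (max_retries + 1) * timeout + sum(delays)


print(backoff_delays())      # [0.5, 1.0, 2.0]
print(worst_case_seconds())  # 123.5
```

Under these assumed values, the defaults allow a call to block for roughly two minutes in the worst case, which is worth keeping in mind when choosing `timeout` for interactive use.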

Mock Data for Development

The SDK includes comprehensive mock data that matches the API spec exactly.

Why Use Mock Data?

  1. Frontend Development: Build UI components before the backend is ready
  2. Testing: Predictable responses for unit tests
  3. Demos: Show realistic flows without production infrastructure
  4. Development: Faster iteration without API calls

What's Mocked?

  • ✅ All text inference endpoints with realistic responses
  • ✅ Voice TTS and STT with proper audio formats
  • ✅ Model management with state simulation
  • ✅ GPU metrics with realistic utilization data
  • ✅ Streaming responses (both text and audio)

Environment Variables

# API Key
export POLARGRID_API_KEY=pg_your_api_key

# Base URL (optional)
export NEXT_PUBLIC_INFERENCE_URL=https://api.polargrid.ai
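
Whether the client reads these variables on its own is not stated here, so passing them explicitly is the safer route. The sketch below builds keyword arguments for the constructor from the environment; `client_kwargs_from_env` is a hypothetical helper, and falling back to mock mode when no key is set is a design choice of this example, not SDK behavior.

```python
import os


def client_kwargs_from_env(env=None):
    """Build PolarGrid(...) keyword arguments from environment variables."""
    env = os.environ if env is None else env
    api_key = env.get("POLARGRID_API_KEY")
    # Assumption of this sketch: no API key means mock mode.
    kwargs = {"api_key": api_key, "use_mock_data": not api_key}
    base_url = env.get("NEXT_PUBLIC_INFERENCE_URL")
    if base_url:
        kwargs["base_url"] = base_url  # otherwise leave the SDK default in place
    return kwargs


print(client_kwargs_from_env({"POLARGRID_API_KEY": "pg_your_api_key"}))
```

The result can then be unpacked into the constructor, for example `PolarGrid(**client_kwargs_from_env())`.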

Type Support

Full type hints with Pydantic models:

from polargrid.types import (
    ChatCompletionRequest,
    ChatCompletionResponse,
    ModelInfo,
    GPUStatusResponse,
)

Best Practices

1. Use Mock Data During Development

import os

is_development = os.environ.get("ENV") == "development"

client = PolarGrid(
    api_key=os.environ.get("POLARGRID_API_KEY"),
    use_mock_data=is_development,
    debug=is_development,
)

2. Handle Errors Gracefully

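import asyncio

from polargrid import RateLimitError

async def with_retry(request):
    # `client` is a PolarGrid instance created elsewhere
    try:
        return await client.chat_completion(request)
    except RateLimitError as e:
        # Wait for the suggested delay (or 60s if none is given), then retry once
        await asyncio.sleep(e.retry_after or 60)
        return await client.chat_completion(request)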
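
The function above retries exactly once, and a second rate-limit error propagates to the caller. Where several attempts are acceptable, a bounded loop is one option. This is a sketch only: `call_with_retries` is a hypothetical helper, and the `RateLimitError` class defined here is a stand-in for the SDK's exception so the example runs on its own.

```python
import asyncio


class RateLimitError(Exception):
    """Stand-in for polargrid.RateLimitError (assumed to expose retry_after)."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


async def call_with_retries(call, max_attempts=3, default_wait=60.0):
    """Await call() up to max_attempts times, sleeping after rate-limit errors."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except RateLimitError as error:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            wait = error.retry_after if error.retry_after is not None else default_wait
            await asyncio.sleep(wait)


async def demo():
    attempts = 0

    async def flaky():
        # Fails twice with a rate-limit error, then succeeds.
        nonlocal attempts
        attempts += 1
        if attempts < 3:
            raise RateLimitError(retry_after=0.01)
        return "ok"

    return await call_with_retries(flaky), attempts


result, attempts = asyncio.run(demo())
print(result, attempts)
```

With the real client, `call` would be something like `lambda: client.chat_completion(request)`, and the exception would be imported from `polargrid` instead of defined locally.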

3. Use Streaming for Long Responses

# Better user experience for long-form content.
# `request` is a chat completion request dict; `update_ui` is your own callback.
async for chunk in client.chat_completion_stream(request):
    if chunk.choices[0].delta.content:
        update_ui(chunk.choices[0].delta.content)

Development

# Install dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov

# Type checking
mypy src/polargrid

# Linting
ruff check src/polargrid

License

MIT
