Python SDK for PolarGrid Edge AI Infrastructure with Full API Support
PolarGrid SDK
The official Python SDK for PolarGrid Edge AI Infrastructure with full API support and mock data capabilities.
Features
- ✅ Text Inference: Completions and chat completions (streaming support)
- ✅ Voice: Text-to-speech and speech-to-text
- ✅ Model Management: Load, unload, and check model status
- ✅ GPU Management: Monitor and manage GPU resources
- ✅ Mock Data Mode: Develop without backend (perfect for frontend work)
- ✅ Full Type Hints: Complete type annotations with Pydantic models
- ✅ Async & Sync: Both async and synchronous clients
- ✅ Error Handling: Comprehensive error types
- ✅ Retry Logic: Automatic retry with exponential backoff
Installation
pip install polargrid-sdk
Quick Start
Async Client (Recommended)
import asyncio

from polargrid import PolarGrid

async def main():
    # Development mode (with mock data)
    client = PolarGrid(
        use_mock_data=True,  # Enable mock mode
        debug=True,  # See what's happening
    )
    # All methods work with realistic mock data
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [
            {"role": "user", "content": "Hello!"}
        ]
    })
    print(response.choices[0].message.content)

asyncio.run(main())
Sync Client
from polargrid import PolarGridSync

# For synchronous code
client = PolarGridSync(
    api_key="pg_your_api_key",
    use_mock_data=False,
)
response = client.chat_completion({
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Hello!"}]
})
print(response.choices[0].message.content)
API Reference
Text Inference
Chat Completions
# Non-streaming
response = await client.chat_completion({
    "model": "llama-3.1-8b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is quantum computing?"}
    ],
    "max_tokens": 150,
    "temperature": 0.7,
})
print(response.choices[0].message.content)
# Streaming
async for chunk in client.chat_completion_stream({
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "Tell me a story"}],
}):
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Text Completions
response = await client.completion({
    "prompt": "Once upon a time",
    "model": "llama-3.1-8b",
    "max_tokens": 100,
    "temperature": 0.8,
})
print(response.choices[0].text)
Voice / Audio
Text-to-Speech
# Generate audio
audio_buffer = await client.text_to_speech({
    "model": "tts-1",
    "input": "Hello from PolarGrid!",
    "voice": "alloy",
    "response_format": "mp3",
    "speed": 1.0,
})
# Save to file
with open("speech.mp3", "wb") as f:
    f.write(audio_buffer)
# Streaming TTS: write chunks as they arrive. Any writable binary stream
# works; a file is used here, so adjust the extension to the response format.
with open("speech_stream.mp3", "wb") as audio_stream:
    async for chunk in client.text_to_speech_stream({
        "model": "tts-1",
        "input": "Long text to convert...",
        "voice": "nova",
    }):
        audio_stream.write(chunk)
Speech-to-Text
from pathlib import Path

# Transcribe audio
transcription = await client.transcribe(
    file=Path("recording.mp3"),
    request={
        "model": "whisper-1",
        "language": "en",
        "response_format": "json",
    }
)
print(transcription.text)
# Verbose transcription with timestamps
from polargrid.types import VerboseTranscriptionResponse

verbose = await client.transcribe(
    file=Path("recording.mp3"),
    request={
        "model": "whisper-1",
        "response_format": "verbose_json",
    }
)
if isinstance(verbose, VerboseTranscriptionResponse):
    for segment in verbose.segments:
        print(f"[{segment.start}s - {segment.end}s]: {segment.text}")
# Translate to English
translation = await client.translate(
    file=Path("spanish_audio.mp3"),
    request={
        "model": "whisper-1",
        "response_format": "json",
    }
)
print(translation.text)
Model Management
# List available models
response = await client.list_models()
for model in response.data:
    print(f"{model.id} ({model.owned_by})")
# Load a model
result = await client.load_model({
    "model_name": "llama-3.1-70b",
    "force_reload": False,
})
print(result.message)
# Check model status
status = await client.get_model_status()
print("Loaded models:", status.loaded)
print("Loading status:", status.loading_status)
# Unload a model
await client.unload_model({"model_name": "gpt2"})
# Unload all models
result = await client.unload_all_models()
print(f"Unloaded {result.total_unloaded} models")
GPU Management
# Get detailed GPU status
gpu_status = await client.get_gpu_status()
for gpu in gpu_status.gpus:
    print(f"GPU {gpu.index}: {gpu.name}")
    print(f"  Memory: {gpu.memory.used_gb}GB / {gpu.memory.total_gb}GB")
    print(f"  Utilization: {gpu.utilization.gpu_percent}%")
    print(f"  Temperature: {gpu.temperature_c}°C")
# Get simplified memory info
memory = await client.get_gpu_memory()
print(f"Memory used: {memory.memory[0].used_gb}GB ({memory.memory[0].percent_used}%)")
# Purge GPU memory
purge_result = await client.purge_gpu({"force": False})
print(f"Freed {purge_result.memory_freed_gb}GB")
print(f"Unloaded models: {purge_result.models_unloaded}")
print(purge_result.recommendation)
Health Check
health = await client.health()
print(f"Status: {health.status}")
print(f"Backend healthy: {health.backend.healthy}")
print(f"Models loaded: {health.backend.info.models_loaded}")
Error Handling
from polargrid import (
    PolarGrid,
    is_polargrid_error,
    AuthenticationError,
    ValidationError,
    RateLimitError,
    ServerError,
    NetworkError,
)

try:
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [{"role": "user", "content": "Hello"}],
    })
except Exception as error:
    if is_polargrid_error(error):
        print(f"PolarGrid Error: {error.message}")
        print(f"Request ID: {error.request_id}")
        if isinstance(error, AuthenticationError):
            # Handle auth errors
            pass
        elif isinstance(error, ValidationError):
            # Handle validation errors
            print("Details:", error.details)
        elif isinstance(error, RateLimitError):
            # Handle rate limits
            print(f"Retry after: {error.retry_after}s")
    else:
        # Not a PolarGrid error: re-raise instead of silently swallowing it
        raise
Configuration Options
client = PolarGrid(
    # API key (required for production, optional for mock mode)
    api_key="pg_your_api_key",
    # Base URL (default: auto-route via https://autorouter.polargrid.ai)
    base_url="https://autorouter.polargrid.ai",
    # Request timeout in seconds (default: 30.0)
    timeout=30.0,
    # Maximum retry attempts (default: 3)
    max_retries=3,
    # Enable debug logging (default: False)
    debug=True,
    # Use mock data instead of real API (default: False)
    use_mock_data=True,
)
Mock Data for Development
The SDK includes comprehensive mock data that matches the API spec exactly.
Why Use Mock Data?
- Frontend Development: Build UI components before backend is ready
- Testing: Predictable responses for unit tests
- Demos: Show realistic flows without production infrastructure
- Development: Faster iteration without API calls
What's Mocked?
- ✅ All text inference endpoints with realistic responses
- ✅ Voice TTS and STT with proper audio formats
- ✅ Model management with state simulation
- ✅ GPU metrics with realistic utilization data
- ✅ Streaming responses (both text and audio)
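For the testing use case, one common pattern is to pass the client into application code so that a test can supply a mock-mode client. The sketch below is illustrative only: `summarize` and `StubClient` are hypothetical names, and the stub merely mimics the `response.choices[0].message.content` shape shown in Quick Start so that the example runs without the SDK installed. In a project that has the SDK, a `PolarGrid(use_mock_data=True)` instance would presumably take the stub's place.

```python
import asyncio
from types import SimpleNamespace


async def summarize(client, text: str) -> str:
    """Application code under test: it depends only on chat_completion()."""
    response = await client.chat_completion({
        "model": "llama-3.1-8b",
        "messages": [{"role": "user", "content": f"Summarize: {text}"}],
    })
    return response.choices[0].message.content.strip()


class StubClient:
    """Hypothetical stand-in with the same call shape as the SDK client."""

    def __init__(self, reply: str):
        self.reply = reply
        self.requests = []

    async def chat_completion(self, request: dict):
        self.requests.append(request)  # record the call for later assertions
        message = SimpleNamespace(role="assistant", content=self.reply)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


def test_summarize_strips_whitespace():
    stub = StubClient("  short summary \n")
    result = asyncio.run(summarize(stub, "a long document"))
    assert result == "short summary"
    sent = stub.requests[0]["messages"][0]["content"]
    assert sent == "Summarize: a long document"
```

Because `summarize` never constructs its own client, the same function can run against the real API, mock mode, or a stub without modification.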
Environment Variables
# API Key
export POLARGRID_API_KEY=pg_your_api_key
# Base URL (optional)
export NEXT_PUBLIC_INFERENCE_URL=https://autorouter.polargrid.ai
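This README does not say whether the client reads these variables on its own, so the sketch below passes them explicitly. `client_kwargs_from_env` is a hypothetical helper, not part of the SDK; only the variable names and the `api_key`, `base_url`, and `use_mock_data` parameters come from this document. Falling back to mock mode when no key is set is an illustrative choice, not documented SDK behavior.

```python
import os
from typing import Mapping, Optional


def client_kwargs_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build constructor keyword arguments from environment variables.

    Hypothetical helper: the SDK itself may or may not read these variables.
    """
    env = os.environ if env is None else env
    kwargs = {}
    api_key = env.get("POLARGRID_API_KEY")
    if api_key:
        kwargs["api_key"] = api_key
    else:
        # No key available: use mock mode, where the key is optional.
        kwargs["use_mock_data"] = True
    base_url = env.get("NEXT_PUBLIC_INFERENCE_URL")
    if base_url:
        kwargs["base_url"] = base_url
    return kwargs
```

A client could then be created with `PolarGrid(**client_kwargs_from_env())`.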
Type Support
Full type hints with Pydantic models:
from polargrid.types import (
    ChatCompletionRequest,
    ChatCompletionResponse,
    ModelInfo,
    GPUStatusResponse,
)
Best Practices
1. Use Mock Data During Development
import os

is_development = os.environ.get("ENV") == "development"
client = PolarGrid(
    api_key=os.environ.get("POLARGRID_API_KEY"),
    use_mock_data=is_development,
    debug=is_development,
)
2. Handle Errors Gracefully
import asyncio

from polargrid import RateLimitError

async def with_retry(request):
    try:
        return await client.chat_completion(request)
    except RateLimitError as e:
        # Wait, then retry once
        await asyncio.sleep(e.retry_after or 60)
        return await client.chat_completion(request)
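The helper above retries a single time. Where several attempts are acceptable, a bounded loop with exponential backoff is a common alternative. This is a sketch under stated assumptions, not the SDK's built-in retry logic: `call_with_backoff` is a hypothetical helper, and `RateLimited` is a local stand-in for `RateLimitError` so the block runs without the SDK installed. Whether the client's own `max_retries` already covers rate-limit responses is not stated in this README.

```python
import asyncio
import random


class RateLimited(Exception):
    """Local stand-in for the SDK's RateLimitError (carries retry_after)."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


async def call_with_backoff(call, *, max_attempts=4, base_delay=1.0,
                            max_delay=60.0, sleep=asyncio.sleep):
    """Await call() until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except RateLimited as exc:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Prefer the server's hint; otherwise double the delay each time.
            delay = exc.retry_after or base_delay * 2 ** (attempt - 1)
            # Cap the delay, then add a little jitter so concurrent
            # clients do not all retry at the same instant.
            delay = min(delay, max_delay) + random.uniform(0, 0.1)
            await sleep(delay)
```

With the SDK, the `except` clause would catch `RateLimitError` instead, and a call might look like `await call_with_backoff(lambda: client.chat_completion(request))`. The `sleep` parameter exists so tests can substitute a fake that returns immediately.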
3. Use Streaming for Long Responses
# Better user experience for long-form content
async for chunk in client.chat_completion_stream(request):
    if chunk.choices[0].delta.content:
        update_ui(chunk.choices[0].delta.content)
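When the complete text is also needed once streaming ends (for example, to store it), the deltas can be accumulated while they are forwarded to the UI. A minimal sketch: `collect_stream` is a hypothetical helper, and `fake_stream` only imitates the `chunk.choices[0].delta.content` shape used above so the example runs without the SDK.

```python
import asyncio
from types import SimpleNamespace


async def collect_stream(stream, on_delta=None) -> str:
    """Consume a chat stream, forward each delta, and return the full text."""
    parts = []
    async for chunk in stream:
        if not chunk.choices:
            continue  # defensive: skip chunks that carry no choices
        content = chunk.choices[0].delta.content
        if content:
            parts.append(content)
            if on_delta is not None:
                on_delta(content)
    return "".join(parts)


async def fake_stream(pieces):
    """Hypothetical stand-in yielding chunks shaped like the SDK's stream."""
    for piece in pieces:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])
```

With the SDK, usage might look like `text = await collect_stream(client.chat_completion_stream(request), update_ui)`.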
Development
# Install dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run tests with coverage
pytest --cov
# Type checking
mypy src/polargrid
# Linting
ruff check src/polargrid
License
MIT
Support
- Documentation: https://docs.polargrid.ai
- Issues: https://github.com/your-org/polargrid-sdk/issues
- Email: support@polargrid.ai
Made with ❄️ by the PolarGrid team.
File details
Details for the file polargrid_sdk-0.7.0.tar.gz.
File metadata
- Download URL: polargrid_sdk-0.7.0.tar.gz
- Upload date:
- Size: 37.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 71c47eb0eeb515d5a462dc38c63f36c985a1ea04a1a08a739e15248721b2b6ef |
| MD5 | d5f33f1bf32a9f87c5ab04feb8c2c54c |
| BLAKE2b-256 | 682ab86b2300db12b9de96da2b855d120823d20d32cfb44d8eed43695ef1d693 |
File details
Details for the file polargrid_sdk-0.7.0-py3-none-any.whl.
File metadata
- Download URL: polargrid_sdk-0.7.0-py3-none-any.whl
- Upload date:
- Size: 32.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.4
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2c871aaace3e308467ce62c4df6a793bdfaa8949059f58bb6236a0ca25865ff8 |
| MD5 | 7469e220423450469e48a8b098cd8385 |
| BLAKE2b-256 | 3c9e1bca42abf8df5d40a62341a08c0951eac3b6c069d889de0341e8d4469616 |