
Basilica Python SDK

The official Python SDK for deploying containerized applications on the Basilica GPU cloud platform.


Installation

pip install basilica-sdk

Requirements: Python 3.10+

Quick Start

1. Get an API Token

# Install the Basilica CLI
pip install basilica-cli

# Create an API token
basilica tokens create

# Set the environment variable
export BASILICA_API_TOKEN="basilica_..."

2. Deploy Your First App

from basilica import BasilicaClient

client = BasilicaClient()

# Deploy inline Python code
deployment = client.deploy(
    name="hello",
    source="""
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'Hello from Basilica!')

HTTPServer(('', 8000), Handler).serve_forever()
""",
    port=8000,
    ttl_seconds=600,  # Auto-delete after 10 minutes
)

print(f"Live at: {deployment.url}")

3. Deploy a FastAPI Application

from basilica import BasilicaClient

client = BasilicaClient()

deployment = client.deploy(
    name="my-api",
    source="""
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def root():
    return {"message": "Hello from FastAPI!"}

@app.get("/items/{item_id}")
def get_item(item_id: int):
    return {"item_id": item_id, "name": f"Item {item_id}"}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
""",
    port=8000,
    pip_packages=["fastapi", "uvicorn"],
    ttl_seconds=600,
)

print(f"API docs: {deployment.url}/docs")

Features

High-Level API

The SDK provides a simple deploy() method that handles:

  • Source code packaging
  • Container image selection
  • Dependency installation
  • Health checking and readiness waiting
  • Public URL provisioning

deployment = client.deploy(
    name="my-app",           # Deployment name
    source="app.py",         # File path or inline code
    port=8000,               # Application port
    pip_packages=["flask"],  # Dependencies
    storage=True,            # Persistent storage at /data
    ttl_seconds=3600,        # Auto-cleanup (optional)
)

print(deployment.url)        # Public URL
print(deployment.logs())     # Application logs
deployment.delete()          # Manual cleanup

Decorator API

Define deployments as decorated functions:

import basilica

@basilica.deployment(
    name="my-service",
    port=8000,
    pip_packages=["fastapi", "uvicorn"],
    ttl_seconds=600,
)
def serve():
    from fastapi import FastAPI
    import uvicorn

    app = FastAPI()

    @app.get("/")
    def root():
        return {"status": "running"}

    uvicorn.run(app, host="0.0.0.0", port=8000)

# Deploy by calling the function
deployment = serve()
print(f"Live at: {deployment.url}")

GPU Deployments

Deploy applications with GPU access:

deployment = client.deploy(
    name="pytorch-inference",
    source="inference.py",
    image="pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime",
    port=8000,
    gpu_count=1,
    gpu_models=["NVIDIA-RTX-A4000"],  # Optional: specific GPU models
    memory="8Gi",
    timeout=300,
)

GPU Flavour Preferences

Filter GPU offerings and control hardware placement with interconnect, region, and spot preferences.

Query GPU offerings

from basilica import BasilicaClient, GpuPriceQuery

client = BasilicaClient()

# SXM interconnect in the US
gpus = client.list_secure_cloud_gpus(
    query=GpuPriceQuery(interconnect="SXM", region="US")
)

# Non-spot offerings only
gpus = client.list_secure_cloud_gpus(
    query=GpuPriceQuery(exclude_spot=True)
)

# Spot offerings only
gpus = client.list_secure_cloud_gpus(
    query=GpuPriceQuery(spot_only=True)
)

Deploy with flavour constraints

All deployment methods (deploy, create_deployment, deploy_vllm, deploy_sglang, and the @deployment decorator) accept flavour parameters:

# create_deployment with flavour
resp = client.create_deployment(
    instance_name="my-inference",
    image="nginx:alpine",
    port=80,
    gpu_count=1,
    gpu_models=["H100"],
    interconnect="SXM",       # SXM or PCIe
    geo="US",                 # US, EU, CA, APAC
    spot=True,                # True=prefer spot, False=exclude spot
    ttl_seconds=600,
)

# deploy() with flavour (waits until ready)
deployment = client.deploy(
    name="gpu-app",
    source="app.py",
    port=8000,
    gpu_count=1,
    gpu_models=["H100"],
    interconnect="SXM",
    geo="US",
    spot=False,
)

# deploy_vllm with flavour
deployment = client.deploy_vllm(
    model="Qwen/Qwen3-0.6B",
    interconnect="SXM",
    geo="US",
    spot=True,
)

# Decorator API with flavour
@basilica.deployment(
    name="my-service",
    port=8000,
    gpu_count=1,
    interconnect="SXM",
    geo="US",
    spot=True,
)
def serve():
    ...

Flavour parameters:

  • interconnect: GPU interconnect type, "SXM" or "PCIe"
  • geo: Geographic region, "US", "EU", "CA", or "APAC"
  • spot: True = prefer spot instances, False = exclude spot
  • infiniband: True = require InfiniBand networking

Persistent Storage

Enable persistent storage mounted at /data:

# Simple: Enable storage at /data
deployment = client.deploy(
    name="stateful-app",
    source="app.py",
    port=8000,
    storage=True,
)

# Custom mount path
deployment = client.deploy(
    name="stateful-app",
    source="app.py",
    port=8000,
    storage="/custom/path",
)

Using volumes with the decorator API:

import basilica

cache = basilica.Volume.from_name("my-cache", create_if_missing=True)

@basilica.deployment(
    name="app-with-storage",
    port=8000,
    volumes={"/data": cache},
)
def serve():
    # Your app can read/write to /data
    pass

Health Checks

Large model deployments (vLLM, SGLang) can take minutes to download and load into GPU memory. Configure custom health check probes to prevent Kubernetes from killing pods before models are ready:

from basilica import BasilicaClient, HealthCheckConfig, ProbeConfig

client = BasilicaClient()

# Deploy SGLang with a 20-minute startup tolerance
deployment = client.deploy_sglang(
    model="Qwen/Qwen2.5-3B-Instruct",
    health_check=HealthCheckConfig(
        startup=ProbeConfig(
            path="/health",
            port=30000,
            initial_delay_seconds=0,
            period_seconds=10,
            timeout_seconds=5,
            failure_threshold=120,  # 120 * 10s = 20 minutes
        ),
        liveness=ProbeConfig(
            path="/health",
            port=30000,
            initial_delay_seconds=120,
            period_seconds=30,
            timeout_seconds=10,
            failure_threshold=5,
        ),
        readiness=ProbeConfig(
            path="/health",
            port=30000,
            initial_delay_seconds=60,
            period_seconds=15,
            timeout_seconds=10,
            failure_threshold=5,
        ),
    ),
    timeout=1200,
)

deploy_vllm() and deploy_sglang() include sensible defaults (10-minute startup tolerance) when no health_check is provided. For very large models, pass your own HealthCheckConfig to extend the startup window.

Health checks work with any deployment method:

# Generic deploy()
deployment = client.deploy(
    name="my-gpu-app",
    source="app.py",
    port=8000,
    gpu_count=1,
    health_check=HealthCheckConfig(
        startup=ProbeConfig(path="/ready", port=8000, failure_threshold=60),
    ),
)

# Decorator API
@basilica.deployment(
    name="my-service",
    port=8000,
    health_check=HealthCheckConfig(
        startup=ProbeConfig(path="/health", port=8000, failure_threshold=60),
    ),
)
def serve():
    ...

ProbeConfig fields:

  • path: HTTP endpoint to probe (e.g. "/health")
  • port: Port to probe (defaults to container port if None)
  • initial_delay_seconds: Seconds before first probe (default: 30)
  • period_seconds: Interval between probes (default: 10)
  • timeout_seconds: Probe timeout (default: 5)
  • failure_threshold: Consecutive failures before action (default: 3)
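The worst-case startup window implied by these settings is roughly initial_delay_seconds + period_seconds * failure_threshold. A small helper (hypothetical, not part of the SDK) makes the arithmetic explicit:

```python
def max_startup_seconds(initial_delay_seconds=30, period_seconds=10, failure_threshold=3):
    """Worst-case time before a startup probe gives up, using the ProbeConfig defaults."""
    return initial_delay_seconds + period_seconds * failure_threshold

# The SGLang example above: 0 + 10 * 120 = 1200 seconds (20 minutes)
print(max_startup_seconds(0, 10, 120))
```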

Pre-built Container Images

Deploy any Docker image:

deployment = client.deploy(
    name="nginx",
    image="nginxinc/nginx-unprivileged:alpine",
    port=8080,
    replicas=1,
    cpu="250m",
    memory="256Mi",
)

API Reference

BasilicaClient

class BasilicaClient:
    def __init__(
        self,
        base_url: str = None,  # Default: https://api.basilica.ai
        api_key: str = None,   # Default: BASILICA_API_TOKEN env var
    ): ...

deploy()

The primary method for deploying applications:

def deploy(
    name: str,                              # Deployment name (DNS-safe)
    source: Optional[str | Path] = None,    # File path or inline code
    image: str = "python:3.11-slim",        # Container image
    port: int = 8000,                       # Application port
    env: Optional[Dict[str, str]] = None,   # Environment variables
    cpu: str = "500m",                      # CPU allocation
    memory: str = "512Mi",                  # Memory allocation
    storage: Union[bool, str] = False,      # Persistent storage
    gpu_count: Optional[int] = None,        # Number of GPUs
    gpu_models: Optional[List[str]] = None, # GPU model requirements
    min_cuda_version: Optional[str] = None, # Minimum CUDA version
    min_gpu_memory_gb: Optional[int] = None,# Minimum GPU VRAM
    interconnect: Optional[str] = None,     # GPU interconnect (SXM, PCIe)
    geo: Optional[str] = None,              # Region (US, EU, CA, APAC)
    spot: Optional[bool] = None,            # Spot preference (True/False)
    infiniband: Optional[bool] = None,      # Require InfiniBand
    replicas: int = 1,                      # Number of instances
    ttl_seconds: Optional[int] = None,      # Auto-delete timeout
    public: bool = True,                    # Create public URL
    timeout: int = 300,                     # Deployment timeout
    pip_packages: Optional[List[str]] = None,  # pip dependencies
    health_check: Optional[HealthCheckConfig] = None,  # Custom health probes
) -> Deployment

Deployment Object

class Deployment:
    name: str                    # Deployment name
    url: str                     # Public URL
    namespace: str               # Kubernetes namespace
    user_id: str                 # Owner user ID
    state: str                   # Current state
    created_at: str              # Creation timestamp

    def status() -> DeploymentStatus     # Get detailed status
    def logs(tail=None) -> str           # Get application logs
    def wait_until_ready(timeout=300)    # Block until ready
    def delete() -> None                 # Delete deployment
    def refresh() -> Deployment          # Refresh state

DeploymentStatus

@dataclass
class DeploymentStatus:
    state: str                   # Pending, Active, Running, Failed, Terminating
    replicas_ready: int          # Ready replica count
    replicas_desired: int        # Desired replica count
    message: Optional[str]       # Status message
    phase: Optional[str]         # Detailed phase
    progress: Optional[ProgressInfo]  # Progress information

    @property
    def is_ready(self) -> bool   # Check if fully ready

    @property
    def is_failed(self) -> bool  # Check if failed

    @property
    def is_pending(self) -> bool # Check if still starting
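wait_until_ready() blocks for you, but if you want custom progress reporting you can poll status() yourself. A generic polling helper along these lines (not part of the SDK) would work:

```python
import time

def poll_until(check, timeout=300.0, interval=5.0):
    """Call check() every `interval` seconds until it returns True or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Usage sketch against the Deployment API described above:
# poll_until(lambda: deployment.status().is_ready, timeout=600)
```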

Exception Handling

The SDK provides a comprehensive exception hierarchy:

from basilica import (
    BasilicaError,        # Base exception
    AuthenticationError,  # Invalid/missing token
    AuthorizationError,   # Permission denied
    ValidationError,      # Invalid parameters
    DeploymentError,      # Base deployment error
    DeploymentNotFound,   # Deployment doesn't exist
    DeploymentTimeout,    # Timeout waiting for ready
    DeploymentFailed,     # Deployment crashed
    ResourceError,        # Resource unavailable
    StorageError,         # Storage configuration error
    NetworkError,         # API communication error
    RateLimitError,       # Rate limit exceeded
    SourceError,          # Source file error
)

try:
    deployment = client.deploy(...)
except DeploymentTimeout:
    print("Deployment took too long to start")
except DeploymentFailed as e:
    print(f"Deployment failed: {e}")
except AuthenticationError:
    print("Invalid API token")
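RateLimitError and NetworkError are often transient, so a retry with exponential backoff is a reasonable pattern. This generic wrapper is a sketch (not part of the SDK); pass it whichever exception types you consider retryable:

```python
import time

def with_retries(fn, retryable, attempts=3, base_delay=1.0):
    """Call fn(), retrying on the given exception types with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))

# Usage sketch:
# with_retries(lambda: client.deploy(...), retryable=(RateLimitError, NetworkError))
```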

Low-Level API

For advanced use cases, access the low-level API methods:

# Create deployment with full control
response = client.create_deployment(
    instance_name="my-app",
    image="python:3.11-slim",
    command=["python", "-m", "http.server", "8000"],
    port=8000,
    cpu="1",
    memory="1Gi",
)

# Get deployment details
response = client.get_deployment("my-app")

# Delete deployment
client.delete_deployment("my-app")

# List all deployments
response = client.list_deployments()

# Get logs
logs = client.get_deployment_logs("my-app", tail=100)

GPU Offerings (Secure Cloud)

from basilica import GpuPriceQuery

# List all GPU offerings
gpus = client.list_secure_cloud_gpus()

# Filter by interconnect, region, spot
gpus = client.list_secure_cloud_gpus(
    query=GpuPriceQuery(interconnect="SXM", region="US", exclude_spot=True)
)
for g in gpus:
    print(f"{g.gpu_type} x{g.gpu_count}  {g.interconnect}  {g.region}  ${g.hourly_rate}/hr")

# Start a secure cloud GPU rental
rental = client.start_secure_cloud_rental(offering_id=gpus[0].id)
print(f"SSH: {rental.ssh_command}")

# Stop rental
client.stop_secure_cloud_rental(rental.rental_id)

GPU Rentals (Legacy API)

For direct GPU node access via SSH:

# List available nodes
nodes = client.list_nodes(gpu_type="A100", min_gpu_count=1)

# Start a rental
rental = client.start_rental(
    gpu_type="A100",
    container_image="pytorch/pytorch:latest",
)
print(f"Rental ID: {rental.rental_id}")

# Get SSH credentials
status = client.get_rental(rental.rental_id)
if status.ssh_credentials:
    print(f"SSH: {status.ssh_credentials.username}@{status.ssh_credentials.host}")

# Stop rental
client.stop_rental(rental.rental_id)

Environment Variables

  • BASILICA_API_TOKEN: API authentication token (required)
  • BASILICA_API_URL: API endpoint URL (default: https://api.basilica.ai)
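Constructor arguments take precedence over these environment variables. The lookup order can be sketched as follows (resolve_token is a hypothetical name, not an SDK function):

```python
import os

def resolve_token(explicit=None):
    """Prefer an explicitly passed token; fall back to the BASILICA_API_TOKEN env var."""
    token = explicit or os.environ.get("BASILICA_API_TOKEN")
    if not token:
        raise RuntimeError("Set BASILICA_API_TOKEN or pass api_key to BasilicaClient")
    return token
```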

Examples

For complete working examples, see the examples directory:

  • 01_hello_world.py: Simplest deployment with inline code
  • 02_with_storage.py: Persistent storage example
  • 03_fastapi.py: Production FastAPI deployment
  • 04_gpu.py: GPU/CUDA deployment with PyTorch
  • 05_decorator_*.py: Decorator API patterns
  • 06_vllm_qwen.py: LLM inference with vLLM
  • 07_sglang_model.py: SGLang model serving
  • 08_external_file.py: Deploy from an external Python file
  • 09_container_image.py: Pre-built Docker image deployment
  • 10_custom_docker/: Multi-file project with Dockerfile
  • 11_agentgym.py: RL environment deployment
  • 12_lobe_chat.py: Self-hosted chat UI
  • 13_lobe_chat_vllm.py: Full AI stack (LobeChat + vLLM)
  • 14_streamlit.py: Interactive Streamlit app
  • 34_gpu_flavour_preferences.py: Query GPU offerings with flavour filters
  • 35_deploy_with_flavour.py: Deploy with interconnect, region, and spot preferences
  • 36_gpu_flavour_cli/: CLI usage with flavour flags
  • deploy_sglang_health_check.py: SGLang with custom health check probes

Development

Building from Source

git clone https://github.com/one-covenant/basilica.git
cd basilica/crates/basilica-sdk-python

# Create virtual environment
python -m venv .venv
source .venv/bin/activate

# Install maturin and build
pip install maturin
maturin develop

# Test import
python -c "from basilica import BasilicaClient; print('OK')"

Running Tests

pip install pytest
pytest tests/ -v

Architecture

The SDK is built with:

  • PyO3: Rust bindings for high-performance HTTP operations
  • Async runtime: Tokio-based async operations exposed as sync Python calls
  • Type safety: Full type hints with IDE support

For detailed architecture documentation, see PYTHON-SDK-ARCHITECTURE.md.

License

MIT OR Apache-2.0

Download files

Download the file for your platform.

Source Distribution

basilica_sdk-0.26.0.tar.gz (690.9 kB)

Uploaded: Source

Built Distributions


basilica_sdk-0.26.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.2 MB)

Uploaded: CPython 3.10+, manylinux (glibc 2.17+), x86-64

basilica_sdk-0.26.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.6 MB)

Uploaded: CPython 3.10+, manylinux (glibc 2.17+), ARM64

basilica_sdk-0.26.0-cp310-abi3-macosx_11_0_arm64.whl (3.4 MB)

Uploaded: CPython 3.10+, macOS 11.0+, ARM64

basilica_sdk-0.26.0-cp310-abi3-macosx_10_12_x86_64.whl (3.6 MB)

Uploaded: CPython 3.10+, macOS 10.12+, x86-64

File details

Details for the file basilica_sdk-0.26.0.tar.gz.

File metadata

  • Download URL: basilica_sdk-0.26.0.tar.gz
  • Upload date:
  • Size: 690.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for basilica_sdk-0.26.0.tar.gz
  • SHA256: 6d0ba387442e7ec5023eae9f819f8de7a348235017c9eeaee2dfb812c20db870
  • MD5: e5b0d793a0954add232cdafd78cfb744
  • BLAKE2b-256: 04e271fedd265c5ea25b1178cf0f8a56184123f070c14b6d958719ea57955a7a

See more details on using hashes here.

Provenance

The following attestation bundles were made for basilica_sdk-0.26.0.tar.gz:

Publisher: release-python-sdk.yml on one-covenant/basilica

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file basilica_sdk-0.26.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for basilica_sdk-0.26.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
  • SHA256: 6d93c51b9338982f1f755a569c109491c49e4c85818da3cde6d54f900bdc1592
  • MD5: 886964c87d3911977a1b53eeec866b84
  • BLAKE2b-256: 79008b2f0935bf883ea48f4b7bea4a7d3d08eb2ad0453f25ed461970550821fd


Provenance

The following attestation bundles were made for basilica_sdk-0.26.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release-python-sdk.yml on one-covenant/basilica

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file basilica_sdk-0.26.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for basilica_sdk-0.26.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
  • SHA256: a51d07b7ced347dcd576d1c1dbb6b645350342d758667917c49715c4712dfd1b
  • MD5: 8342546cb26f88b08354a9a00728ad56
  • BLAKE2b-256: ff52e5cdbad7c8fa62db89d9cd3f0bf2441961b8c60387947d6f9cb41bfaf53d


Provenance

The following attestation bundles were made for basilica_sdk-0.26.0-cp310-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl:

Publisher: release-python-sdk.yml on one-covenant/basilica

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file basilica_sdk-0.26.0-cp310-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for basilica_sdk-0.26.0-cp310-abi3-macosx_11_0_arm64.whl
  • SHA256: 65a3f64b6ce572f92ad9f2e6c8e53d54af052321d5b5bfb63426dce5cc8cac83
  • MD5: 8411aedfb88cc1a4d4b0d98efd06a87c
  • BLAKE2b-256: 6c7f972a1595fda3a0a571d59edc8dc6604757f9a4caf20277c1aed8b563d19b


Provenance

The following attestation bundles were made for basilica_sdk-0.26.0-cp310-abi3-macosx_11_0_arm64.whl:

Publisher: release-python-sdk.yml on one-covenant/basilica

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file basilica_sdk-0.26.0-cp310-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for basilica_sdk-0.26.0-cp310-abi3-macosx_10_12_x86_64.whl
  • SHA256: f37e7cfa2b5e52aaa8f53d55a6993679b9d64cdaea2989fe816b3df4a2878f5e
  • MD5: 2916ae8a9e769ebdf875882cf241c0d4
  • BLAKE2b-256: 8b01f1ede99c074306c24302096527af0c24e8cb7a000883309a2e27cb28cc0a


Provenance

The following attestation bundles were made for basilica_sdk-0.26.0-cp310-abi3-macosx_10_12_x86_64.whl:

Publisher: release-python-sdk.yml on one-covenant/basilica

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
