Python SDK for deploying containerized applications on the Basilica GPU cloud
Basilica Python SDK
The official Python SDK for deploying containerized applications on the Basilica GPU cloud platform.
Installation
pip install basilica-sdk
Requirements: Python 3.10+
Quick Start
1. Get an API Token
# Install the Basilica CLI
pip install basilica-cli
# Create an API token
basilica tokens create
# Set the environment variable
export BASILICA_API_TOKEN="basilica_..."
2. Deploy Your First App
from basilica import BasilicaClient

client = BasilicaClient()

# Deploy inline Python code
deployment = client.deploy(
    name="hello",
    source="""
from http.server import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b'Hello from Basilica!')

HTTPServer(('', 8000), Handler).serve_forever()
""",
    port=8000,
    ttl_seconds=600,  # Auto-delete after 10 minutes
)
print(f"Live at: {deployment.url}")
3. Deploy a FastAPI Application
from basilica import BasilicaClient

client = BasilicaClient()

deployment = client.deploy(
    name="my-api",
    source="""
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def root():
    return {"message": "Hello from FastAPI!"}

@app.get("/items/{item_id}")
def get_item(item_id: int):
    return {"item_id": item_id, "name": f"Item {item_id}"}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
""",
    port=8000,
    pip_packages=["fastapi", "uvicorn"],
    ttl_seconds=600,
)
print(f"API docs: {deployment.url}/docs")
Features
High-Level API
The SDK provides a simple deploy() method that handles:
- Source code packaging
- Container image selection
- Dependency installation
- Health checking and readiness waiting
- Public URL provisioning
deployment = client.deploy(
    name="my-app",            # Deployment name
    source="app.py",          # File path or inline code
    port=8000,                # Application port
    pip_packages=["flask"],   # Dependencies
    storage=True,             # Persistent storage at /data
    ttl_seconds=3600,         # Auto-cleanup (optional)
)
print(deployment.url)         # Public URL
print(deployment.logs())      # Application logs
deployment.delete()           # Manual cleanup
Decorator API
Define deployments as decorated functions:
import basilica

@basilica.deployment(
    name="my-service",
    port=8000,
    pip_packages=["fastapi", "uvicorn"],
    ttl_seconds=600,
)
def serve():
    from fastapi import FastAPI
    import uvicorn

    app = FastAPI()

    @app.get("/")
    def root():
        return {"status": "running"}

    uvicorn.run(app, host="0.0.0.0", port=8000)

# Deploy by calling the function
deployment = serve()
print(f"Live at: {deployment.url}")
GPU Deployments
Deploy applications with GPU access:
deployment = client.deploy(
    name="pytorch-inference",
    source="inference.py",
    image="pytorch/pytorch:2.1.0-cuda12.1-cudnn8-runtime",
    port=8000,
    gpu_count=1,
    gpu_models=["NVIDIA-RTX-A4000"],  # Optional: specific GPU models
    memory="8Gi",
    timeout=300,
)
GPU Flavour Preferences
Filter GPU offerings and control hardware placement with interconnect, region, and spot preferences.
Query GPU offerings
from basilica import BasilicaClient, GpuPriceQuery

client = BasilicaClient()

# SXM interconnect in the US
gpus = client.list_secure_cloud_gpus(
    query=GpuPriceQuery(interconnect="SXM", region="US")
)

# Non-spot offerings only
gpus = client.list_secure_cloud_gpus(
    query=GpuPriceQuery(exclude_spot=True)
)

# Spot offerings only
gpus = client.list_secure_cloud_gpus(
    query=GpuPriceQuery(spot_only=True)
)
Deploy with flavour constraints
All deployment methods (deploy, create_deployment, deploy_vllm, deploy_sglang,
and the @deployment decorator) accept flavour parameters:
# create_deployment with flavour
resp = client.create_deployment(
    instance_name="my-inference",
    image="nginx:alpine",
    port=80,
    gpu_count=1,
    gpu_models=["H100"],
    interconnect="SXM",  # SXM or PCIe
    geo="US",            # US, EU, CA, APAC
    spot=True,           # True=prefer spot, False=exclude spot
    ttl_seconds=600,
)

# deploy() with flavour (waits until ready)
deployment = client.deploy(
    name="gpu-app",
    source="app.py",
    port=8000,
    gpu_count=1,
    gpu_models=["H100"],
    interconnect="SXM",
    geo="US",
    spot=False,
)

# deploy_vllm with flavour
deployment = client.deploy_vllm(
    model="Qwen/Qwen3-0.6B",
    interconnect="SXM",
    geo="US",
    spot=True,
)

# Decorator API with flavour
@basilica.deployment(
    name="my-service",
    port=8000,
    gpu_count=1,
    interconnect="SXM",
    geo="US",
    spot=True,
)
def serve():
    ...
| Parameter | Description |
|---|---|
| interconnect | GPU interconnect type: "SXM" or "PCIe" |
| geo | Geographic region: "US", "EU", "CA", "APAC" |
| spot | True = prefer spot instances, False = exclude spot |
| infiniband | True = require InfiniBand networking |
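The allowed values in the table can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the SDK: `validate_flavour` is a hypothetical helper, and the accepted values are copied from the table. Whether the SDK performs similar validation itself is not stated here.

```python
from typing import Optional

# Allowed values copied from the parameter table; the helper itself is hypothetical.
INTERCONNECTS = {"SXM", "PCIe"}
GEOS = {"US", "EU", "CA", "APAC"}


def validate_flavour(
    interconnect: Optional[str] = None,
    geo: Optional[str] = None,
    spot: Optional[bool] = None,
    infiniband: Optional[bool] = None,
) -> dict:
    """Return only the flavour kwargs that were set, raising on unknown values."""
    if interconnect is not None and interconnect not in INTERCONNECTS:
        raise ValueError(f"interconnect must be one of {sorted(INTERCONNECTS)}")
    if geo is not None and geo not in GEOS:
        raise ValueError(f"geo must be one of {sorted(GEOS)}")
    candidates = {
        "interconnect": interconnect,
        "geo": geo,
        "spot": spot,
        "infiniband": infiniband,
    }
    # Drop unset values so that the defaults of the called method apply.
    return {key: value for key, value in candidates.items() if value is not None}


flavour = validate_flavour(interconnect="SXM", geo="US", spot=False)
print(flavour)
```

The resulting dict could then be unpacked into a deployment call, for example `client.deploy(name="gpu-app", source="app.py", gpu_count=1, **flavour)`.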
Persistent Storage
Enable persistent storage mounted at /data:
# Simple: Enable storage at /data
deployment = client.deploy(
    name="stateful-app",
    source="app.py",
    port=8000,
    storage=True,
)

# Custom mount path
deployment = client.deploy(
    name="stateful-app",
    source="app.py",
    port=8000,
    storage="/custom/path",
)
Using volumes with the decorator API:
import basilica

cache = basilica.Volume.from_name("my-cache", create_if_missing=True)

@basilica.deployment(
    name="app-with-storage",
    port=8000,
    volumes={"/data": cache},
)
def serve():
    # Your app can read/write to /data
    pass
Health Checks
Large model deployments (vLLM, SGLang) can take minutes to download and load into GPU memory. Configure custom health check probes to prevent Kubernetes from killing pods before models are ready:
from basilica import BasilicaClient, HealthCheckConfig, ProbeConfig

client = BasilicaClient()

# Deploy SGLang with a 20-minute startup tolerance
deployment = client.deploy_sglang(
    model="Qwen/Qwen2.5-3B-Instruct",
    health_check=HealthCheckConfig(
        startup=ProbeConfig(
            path="/health",
            port=30000,
            initial_delay_seconds=0,
            period_seconds=10,
            timeout_seconds=5,
            failure_threshold=120,  # 120 * 10s = 20 minutes
        ),
        liveness=ProbeConfig(
            path="/health",
            port=30000,
            initial_delay_seconds=120,
            period_seconds=30,
            timeout_seconds=10,
            failure_threshold=5,
        ),
        readiness=ProbeConfig(
            path="/health",
            port=30000,
            initial_delay_seconds=60,
            period_seconds=15,
            timeout_seconds=10,
            failure_threshold=5,
        ),
    ),
    timeout=1200,
)
deploy_vllm() and deploy_sglang() include sensible defaults (10-minute startup
tolerance) when no health_check is provided. For very large models, pass your own
HealthCheckConfig to extend the startup window.
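The startup tolerance follows from the probe settings: as the `120 * 10s = 20 minutes` comment in the SGLang example shows, it is the probe interval multiplied by the failure threshold, plus any initial delay. The sketch below works through that arithmetic. Both function names are hypothetical helpers, not SDK functions, and the formula assumes Kubernetes-style startup probe semantics.

```python
import math


def startup_window_seconds(
    period_seconds: int,
    failure_threshold: int,
    initial_delay_seconds: int = 0,
) -> int:
    """Approximate time a startup probe tolerates before giving up."""
    return initial_delay_seconds + period_seconds * failure_threshold


def failure_threshold_for(window_seconds: int, period_seconds: int = 10) -> int:
    """Smallest failure_threshold whose window covers window_seconds."""
    return math.ceil(window_seconds / period_seconds)


# The SGLang example: 10 s period, 120 failures allowed.
print(startup_window_seconds(period_seconds=10, failure_threshold=120))

# Threshold needed for a 30-minute window at the same period.
print(failure_threshold_for(30 * 60, period_seconds=10))
```

The second helper is the direction you usually need in practice: pick the longest model load time you expect, then derive the `failure_threshold` to pass to `ProbeConfig`.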
Health checks work with any deployment method:
# Generic deploy()
deployment = client.deploy(
    name="my-gpu-app",
    source="app.py",
    port=8000,
    gpu_count=1,
    health_check=HealthCheckConfig(
        startup=ProbeConfig(path="/ready", port=8000, failure_threshold=60),
    ),
)

# Decorator API
@basilica.deployment(
    name="my-service",
    port=8000,
    health_check=HealthCheckConfig(
        startup=ProbeConfig(path="/health", port=8000, failure_threshold=60),
    ),
)
def serve():
    ...
ProbeConfig fields:
- path: HTTP endpoint to probe (e.g. "/health")
- port: Port to probe (defaults to the container port if None)
- initial_delay_seconds: Seconds before the first probe (default: 30)
- period_seconds: Interval between probes (default: 10)
- timeout_seconds: Probe timeout (default: 5)
- failure_threshold: Consecutive failures before action (default: 3)
Pre-built Container Images
Deploy any Docker image:
deployment = client.deploy(
    name="nginx",
    image="nginxinc/nginx-unprivileged:alpine",
    port=8080,
    replicas=1,
    cpu="250m",
    memory="256Mi",
)
Distributed Training (NCCL collectives)
For torch.distributed workloads (DDP, DiLoCo, FSDP, any NCCL-collective workload),
use the @basilica.distributed decorator. The platform provisions GPU spokes,
wires the WireGuard mesh, runs torchrun with the standard RANK / WORLD_SIZE
/ MASTER_* env-var contract, and tears down on context exit.
This is a separate surface from @basilica.deployment because the lifecycle
shape is different: a single function body fans out to N rank pods under
torchrun, the user reads os.environ["RANK"] instead of branching on rank in
SDK code, and the handle exposes scale(), wait_until_*, bench, and
rank_exits instead of url and replicas.
Canonical surface: @basilica.distributed decorator
import basilica
from basilica import ProviderFilter, WorldSize

@basilica.distributed(
    name="dlc-hello",
    image="ghcr.io/one-covenant/basilica/basilica-distributed-trainer:latest",
    world_size=WorldSize(min=2, target=2, max=2),
    gpu_count=1,
    gpu_models=["A100"],
    min_gpu_memory_gb=40,
    cpu="8",
    memory="32Gi",
    provider_filter=ProviderFilter(include=["hyperstack", "verda"]),
    topology_spread="pack",
    bench=True,
    nccl_env={"NCCL_DEBUG": "WARN"},
    ttl_seconds=900,
)
def train() -> None:
    import os
    import torch
    import torch.distributed as dist

    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    local_rank = int(os.environ.get("LOCAL_RANK", 0))

    dist.init_process_group(backend="nccl")
    device = torch.device(f"cuda:{local_rank}")
    x = torch.ones(1024, device=device)
    dist.all_reduce(x)
    if rank == 0:
        print(f"world={world_size} sum={x.sum().item():.0f}", flush=True)
    dist.destroy_process_group()

if __name__ == "__main__":
    with train() as training:
        print(f"Deployed: {training.name}")
        training.wait_until_complete(timeout=1800)
        if training.bench is not None:
            print(f"Measured busbw: {training.bench.busbw_gbps_p50} Gbps")
Calling the decorated function returns a DistributedTraining context-manager.
Use it bare (train()) for fire-and-forget with auto-cleanup; use with train() as training: for mid-run orchestration (scale(), wait_until_*,
logs(), bench).
BYO launcher: basilica.distributed(command=[...]) factory
When you want to drive torchrun (or mpirun, or accelerate) yourself, pass
command=[...] instead of decorating a function. The same basilica.distributed
symbol becomes a factory and returns a DistributedTraining directly:
import basilica
from basilica import ProviderFilter, WorldSize

training = basilica.distributed(
    name="dlc-torchrun",
    image="ghcr.io/one-covenant/basilica/basilica-distributed-trainer:latest",
    command=[
        "torchrun",
        "--rdzv-backend=etcd",
        "--rdzv-endpoint=$BASILICA_RDZV_ENDPOINT",
        "--rdzv-id=$BASILICA_RDZV_ID",
        "--nnodes=$BASILICA_WORLD_TARGET",
        "--nproc-per-node=$BASILICA_GPUS_PER_POD",
        "/workspace/all_reduce_smoke.py",
    ],
    world_size=WorldSize(min=2, target=2, max=4),
    gpu_count=1,
    gpu_models=["A100"],
    provider_filter=ProviderFilter(include=["hyperstack", "verda"]),
    topology_spread="pack",
    ttl_seconds=900,
)

with training:
    training.scale(target=3)
    training.wait_until_target_world(timeout=300)
    print(training.logs(tail=30))
Same DistributedTraining handle as the decorator path. Same context-manager
cleanup. The only difference is command=[...] replaces the decorated function
body. Inside the launcher, expand $BASILICA_RDZV_ENDPOINT / $BASILICA_RANK /
$BASILICA_WORLD_TARGET / $BASILICA_GPUS_PER_POD — the operator injects
these into the worker pod environment.
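The `$BASILICA_*` placeholders in `command=[...]` are ordinary environment-variable references. The sketch below shows what such a command looks like once the variables are present, using `string.Template` from the standard library. The values are made up for illustration, and `expand` is a hypothetical helper. How and where the platform actually performs the expansion is not specified here, so treat this only as a way to preview the final argument list.

```python
from string import Template

# Illustrative values only; the real ones are injected into the worker pod.
pod_env = {
    "BASILICA_RDZV_ENDPOINT": "rdzv.example.internal:2379",
    "BASILICA_RDZV_ID": "dlc-torchrun",
    "BASILICA_WORLD_TARGET": "2",
    "BASILICA_GPUS_PER_POD": "1",
}

command = [
    "torchrun",
    "--rdzv-backend=etcd",
    "--rdzv-endpoint=$BASILICA_RDZV_ENDPOINT",
    "--rdzv-id=$BASILICA_RDZV_ID",
    "--nnodes=$BASILICA_WORLD_TARGET",
    "--nproc-per-node=$BASILICA_GPUS_PER_POD",
    "/workspace/all_reduce_smoke.py",
]


def expand(args, env):
    """Substitute $NAME references; raises KeyError if a variable is missing."""
    return [Template(arg).substitute(env) for arg in args]


expanded = expand(command, pod_env)
print(" ".join(expanded))
```

Using `substitute` rather than `safe_substitute` makes a missing variable fail loudly, which is usually what you want when checking a launcher command.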
Lifecycle on the DistributedTraining handle
| Method | What it does |
|---|---|
| training.world | WorldStatus(ready, target, min, max, below_minimum) snapshot |
| training.phase | Operator-driven phase: pending / ready / succeeded / failed / cancelled |
| training.ranks | Per-rank pod observations (provider, region, phase, restarts) |
| training.rank_exits | Per-rank exit diagnostics, populated after the UD reaches a terminal state |
| training.bench | BenchResult \| None; None means "no measurement", regardless of why |
| training.bench_diagnostics | Debug detail when training.bench is None; rarely needed |
| training.scale(target=N) | Patch worldSize.target mid-run; ranks join/drain asynchronously |
| training.wait_until_min_world(timeout=...) | Block until ready >= min or raise BelowMinimumWorld |
| training.wait_until_target_world(timeout=...) | Block until ready >= target |
| training.wait_until_complete(timeout=...) | Block until terminal phase (succeeded / failed / cancelled) |
| training.logs(tail=..., follow=...) | Stream merged-rank logs |
| training.delete() | Explicit teardown; __exit__ runs this on scope exit |
Every method has an _async counterpart (training.scale_async(...),
async with training: ...).
bench=True — the diagnostic for "is my code slow or is the network slow?"
NCCL collective bandwidth varies with provider, region, GPU model, and current mesh load. Without a measurement, the user has to guess whether slow training is their code or the network.
bench=True opts in to a 2-rank all_reduce_perf probe that runs alongside
your workers on the same provider mesh with the same WireGuard transport. The
result lands on training.bench after the UD reaches a terminal state:
import basilica
from basilica import ProviderFilter, WorldSize

@basilica.distributed(
    name="dlc-with-bench",
    image="ghcr.io/one-covenant/basilica/basilica-distributed-trainer:latest",
    world_size=WorldSize(min=2, target=2, max=2),
    gpu_count=1,
    gpu_models=["A100"],
    provider_filter=ProviderFilter(include=["hyperstack", "verda"]),
    topology_spread="pack",
    bench=True,
)
def train() -> None:
    import torch.distributed as dist
    dist.init_process_group(backend="nccl")
    dist.destroy_process_group()

with train() as training:
    training.wait_until_complete(timeout=1800)
    if training.bench is not None:
        print(f"busbw_gbps_p50: {training.bench.busbw_gbps_p50}")
        print(f"latency_us_at_1mib: {training.bench.latency_us_at_1mib}")
        print(
            f"probe pair: {training.bench.probe_node_a} "
            f"<-> {training.bench.probe_node_b}"
        )
    else:
        # No measurement -- workers too short, probe couldn't co-schedule, etc.
        print(training.bench_diagnostics)
None means "no measurement" regardless of reason. training.bench_diagnostics
exists for the rare case where you want to know WHY the probe didn't land a
result. Default bench=False because the probe costs 2 extra GPU ranks for its
duration — users who don't need the measurement shouldn't pay for it.
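One way to use the measurement is to turn the bus bandwidth into an expected all-reduce time and compare it with what your training loop observes. The sketch below assumes the nccl-tests convention for all-reduce, `busbw = algbw * 2(n-1)/n`, and that the bandwidth value is in GB/s (10^9 bytes per second), as the `BenchResult` docstring states. `estimate_allreduce_seconds` is a hypothetical helper, not an SDK function. It ignores latency and extrapolates a 2-rank probe to larger worlds, so read the result as a rough lower bound for large messages.

```python
def estimate_allreduce_seconds(
    size_bytes: int,
    busbw_gb_per_s: float,
    world_size: int,
) -> float:
    """Estimate all-reduce time from bus bandwidth (nccl-tests convention)."""
    if world_size < 2:
        raise ValueError("world_size must be at least 2")
    if busbw_gb_per_s <= 0:
        raise ValueError("busbw_gb_per_s must be positive")
    # Invert busbw = algbw * 2(n-1)/n to recover the algorithm bandwidth.
    algbw_gb_per_s = busbw_gb_per_s * world_size / (2 * (world_size - 1))
    return size_bytes / (algbw_gb_per_s * 1e9)


# Example: a 1 GB gradient buffer with a measured bus bandwidth of 10 GB/s.
for ranks in (2, 4):
    seconds = estimate_allreduce_seconds(10**9, 10.0, ranks)
    print(f"{ranks} ranks: {seconds:.3f} s")
```

If the per-step communication time you observe is close to this estimate, the network is the likely bottleneck; if it is far larger, the slowdown is more likely in your own code.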
Running an external script
The canonical input is a Callable (what the decorator captures). For an
external .py file shipped in the trainer image, wrap it in a decorated
function:
import basilica
from basilica import ProviderFilter, WorldSize

@basilica.distributed(
    name="dlc-script",
    image="ghcr.io/one-covenant/basilica/basilica-distributed-trainer:latest",
    world_size=WorldSize(min=2, target=2, max=2),
    gpu_count=1,
    gpu_models=["A100"],
    provider_filter=ProviderFilter(include=["hyperstack", "verda"]),
    topology_spread="pack",
)
def run_script() -> None:
    import runpy
    runpy.run_path("/workspace/my_training.py", run_name="__main__")
Migration from the legacy surface
These surfaces emitted DeprecationWarning in 0.29.5-0.29.7 and were
REMOVED in 0.30.0 (SDK-S7, basilica-backend issue 666). If you upgraded
from 0.29.x and see AttributeError / ImportError / ValidationError
on a name below, apply the canonical mapping in the right column.
| Removed in 0.30.0 | Canonical replacement | Removed in | Plan ticket |
|---|---|---|---|
| client.deploy_distributed(...) | @basilica.distributed(...) on a function, call it | 0.30.0 | S1+S7 |
| client.deploy_distributed_managed(...) | with train() as training: on the decorated function | 0.30.0 | S1+S7 |
| bench="on-start" / bench="off" (str) | bench=True / bench=False (bool) | 0.30.0 | S2+S7 |
| training.bench_status (property) | training.bench (result) + training.bench_diagnostics (debug dict) | 0.30.0 | S2+S7 |
| training.wait_until_bench_complete(...) | with train() as training: (auto-blocks) + read training.bench | 0.30.0 | S2+S7 |
| BenchStatus re-export from basilica | BenchResult + bench_diagnostics dict | 0.30.0 | S2+S7 |
| client.deploy_distributed_managed(command=[...]) | basilica.distributed(command=[...]) factory | 0.30.0 | S3+S7 |
| source=Path("./train.py") / source="<inline code>" | decorate a function (Callable shape) | 0.30.0 | S4+S7 |
| DistributedTrainingManaged / DistributedTrainingManagedAsync | DistributedTraining (itself context-manager-able) | 0.30.0 | S1+S7 |
The decorator (@basilica.distributed) is the ONE canonical surface and
routes through the private BasilicaClient._deploy_distributed_impl /
_deploy_distributed_impl_async methods. There is no public
deploy_distributed* method on BasilicaClient in 0.30.0+.
Worked examples in this repo
| Example | Pattern |
|---|---|
| examples/20_distributed_diloco.py | @basilica.distributed decorator + bench=True + DiLoCo (NCCL all-reduce in the outer step) |
| examples/21_distributed_torchrun.py | basilica.distributed(command=[torchrun ...]) factory + mid-run scale() |
| examples/22_distributed_with_bench.py | @basilica.distributed decorator + bench-result inspection + JSON dump |
API Reference
BasilicaClient
class BasilicaClient:
    def __init__(
        self,
        base_url: str = None,  # Default: https://api.basilica.ai
        api_key: str = None,   # Default: BASILICA_API_TOKEN env var
    ): ...
deploy()
The primary method for deploying applications:
def deploy(
    name: str,                                # Deployment name (DNS-safe)
    source: Optional[str | Path] = None,      # File path or inline code
    image: str = "python:3.11-slim",          # Container image
    port: int = 8000,                         # Application port
    env: Optional[Dict[str, str]] = None,     # Environment variables
    cpu: str = "500m",                        # CPU allocation
    memory: str = "512Mi",                    # Memory allocation
    storage: Union[bool, str] = False,        # Persistent storage
    gpu_count: Optional[int] = None,          # Number of GPUs
    gpu_models: Optional[List[str]] = None,   # GPU model requirements
    min_cuda_version: Optional[str] = None,   # Minimum CUDA version
    min_gpu_memory_gb: Optional[int] = None,  # Minimum GPU VRAM
    interconnect: Optional[str] = None,       # GPU interconnect (SXM, PCIe)
    geo: Optional[str] = None,                # Region (US, EU, CA, APAC)
    spot: Optional[bool] = None,              # Spot preference (True/False)
    infiniband: Optional[bool] = None,        # Require InfiniBand
    replicas: int = 1,                        # Number of instances
    ttl_seconds: Optional[int] = None,        # Auto-delete timeout
    public: bool = True,                      # Create public URL
    timeout: int = 300,                       # Deployment timeout
    pip_packages: Optional[List[str]] = None, # pip dependencies
    health_check: Optional[HealthCheckConfig] = None,  # Custom health probes
) -> Deployment
Deployment Object
class Deployment:
    name: str        # Deployment name
    url: str         # Public URL
    namespace: str   # Kubernetes namespace
    user_id: str     # Owner user ID
    state: str       # Current state
    created_at: str  # Creation timestamp

    def status() -> DeploymentStatus   # Get detailed status
    def logs(tail=None) -> str         # Get application logs
    def wait_until_ready(timeout=300)  # Block until ready
    def delete() -> None               # Delete deployment
    def refresh() -> Deployment        # Refresh state
DeploymentStatus
@dataclass
class DeploymentStatus:
    state: str                        # Pending, Active, Running, Failed, Terminating
    replicas_ready: int               # Ready replica count
    replicas_desired: int             # Desired replica count
    message: Optional[str]            # Status message
    phase: Optional[str]              # Detailed phase
    progress: Optional[ProgressInfo]  # Progress information

    @property
    def is_ready(self) -> bool        # Check if fully ready
    @property
    def is_failed(self) -> bool       # Check if failed
    @property
    def is_pending(self) -> bool      # Check if still starting
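The `is_ready` and `is_failed` properties are enough to drive a custom polling loop when `wait_until_ready()` does not fit, for example when you want to log progress between polls. The sketch below is one way to write such a loop. `StatusStub` is a stand-in that only mirrors the fields the loop reads, and `wait_for_ready` is a hypothetical helper, not an SDK function; with a live deployment you would pass `deployment.status` as `get_status`.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StatusStub:
    """Stand-in exposing the fields this loop reads from DeploymentStatus."""
    is_ready: bool = False
    is_failed: bool = False
    message: Optional[str] = None


def wait_for_ready(
    get_status: Callable[[], StatusStub],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> StatusStub:
    """Poll get_status() until it reports ready, failed, or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status.is_ready:
            return status
        if status.is_failed:
            raise RuntimeError(f"deployment failed: {status.message}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not ready after {timeout} seconds")
        sleep(interval)


# Demo with canned statuses instead of a live deployment.
canned = iter([StatusStub(), StatusStub(), StatusStub(is_ready=True)])
result = wait_for_ready(lambda: next(canned), sleep=lambda _: None)
print(result.is_ready)
```

Injecting `sleep` keeps the loop testable without real delays.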
Distributed Training Types
The distributed-training surface lives in basilica (re-exported from
basilica.distributed). The canonical entry point is the
@basilica.distributed decorator; basilica.distributed(command=[...]) is
the factory shape for BYO launchers.
@dataclass(frozen=True)
class WorldSize:
    """Worker-rank triple. `1 <= min <= target <= max`."""
    min: int     # below this the run pauses (torchelastic MIN_NODES)
    target: int  # steady-state replica count
    max: int     # hard ceiling for scale()

@dataclass(frozen=True)
class ProviderFilter:
    """Inclusive/exclusive cloud-provider filter for worker scheduling.

    Match is against the `basilica.ai/provider` node label
    (`hyperstack`, `verda`, `masscompute`, `shadeform`). With
    `topology_spread="pack"`, the autoscaler picks ONE provider from the
    include-list and all workers pack on it.
    """
    include: List[str] = []
    exclude: List[str] = []

@dataclass(frozen=True)
class BenchResult:
    """Per-UD NCCL probe measurement. Read via `training.bench`.

    Bandwidth fields are GB/s (1 GB = 10^9 bytes); latency is at the
    smallest swept message size. `None`-valued fields mean the probe
    could not measure them (partial sweep).
    """
    measured_at: datetime
    busbw_gbps_p10: Optional[float]
    busbw_gbps_p50: Optional[float]
    busbw_gbps_p90: Optional[float]
    algbw_gbps_p50: Optional[float]
    latency_us_at_1mib: Optional[float]
    size_bytes_swept: List[int]
    probe_node_a: str
    probe_node_b: str

class DistributedTraining:
    """Handle returned by `@basilica.distributed` and `basilica.distributed(command=...)`.

    Itself a context manager: `with train() as training:` blocks on the
    decorator call, yields the handle, and best-effort `delete()`s on
    scope exit (success OR exception).
    """
    name: str
    namespace: str
    rendezvous_endpoint: str

    # observation (lazy refresh on first access)
    world: WorldStatus
    phase: str  # operator-driven phase
    is_terminal: bool
    ranks: List[RankStatus]
    rank_exits: List[RankExit]
    bench: Optional[BenchResult]
    bench_diagnostics: Optional[Dict[str, Any]]

    # lifecycle
    def scale(target: int) -> WorldStatus: ...
    def wait_until_min_world(timeout: int = 300) -> None: ...
    def wait_until_target_world(timeout: int = 600) -> None: ...
    def wait_until_complete(timeout: int = 1800) -> WorldStatus: ...
    def logs(tail=None, follow=False, rank=None) -> str: ...
    def delete() -> None: ...

    # context manager
    def __enter__(self) -> "DistributedTraining": ...
    def __exit__(self, exc_type, exc_val, exc_tb) -> None: ...

    # Every method has an `_async` counterpart.
WorldStatus, RankStatus, RankExit, DistributedMetrics, and the
distributed-specific exceptions (BelowMinimumWorld, WorldSizeOutOfBounds,
RendezvousUnavailable, UDTerminalState, QuotaExceeded, DistributedError)
are exported from basilica directly.
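The `1 <= min <= target <= max` invariant on `WorldSize` can be checked locally before a run is submitted. The sketch below uses a stand-in class, `WorldSizeSketch`, which is not the SDK type; it only illustrates the invariant. The `can_scale_to` rule (a new target must stay within `min` and `max`) is an assumption based on `max` being described as the hard ceiling for `scale()`; the SDK reports real violations with `WorldSizeOutOfBounds`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorldSizeSketch:
    """Illustrative stand-in for basilica.WorldSize; not the SDK class."""
    min: int
    target: int
    max: int

    def __post_init__(self) -> None:
        # Enforce the documented ordering of the triple.
        if not (1 <= self.min <= self.target <= self.max):
            raise ValueError(
                f"expected 1 <= min <= target <= max, got "
                f"min={self.min}, target={self.target}, max={self.max}"
            )

    def can_scale_to(self, target: int) -> bool:
        """Assumed rule: a new target must stay within [min, max]."""
        return self.min <= target <= self.max


world = WorldSizeSketch(min=2, target=2, max=4)
print(world.can_scale_to(3))
print(world.can_scale_to(5))
```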
Exception Handling
The SDK provides a comprehensive exception hierarchy:
from basilica import (
    BasilicaError,          # Base exception
    AuthenticationError,    # Invalid/missing token
    AuthorizationError,     # Permission denied
    ValidationError,        # Invalid parameters
    DeploymentError,        # Base deployment error
    DeploymentNotFound,     # Deployment doesn't exist
    DeploymentTimeout,      # Timeout waiting for ready
    DeploymentFailed,       # Deployment crashed
    ResourceError,          # Resource unavailable
    StorageError,           # Storage configuration error
    NetworkError,           # API communication error
    RateLimitError,         # Rate limit exceeded
    SourceError,            # Source file error
    # Distributed-training specific:
    DistributedError,       # Base distributed-training error
    BelowMinimumWorld,      # `ready < min` (timeout or terminal failure)
    WorldSizeOutOfBounds,   # WorldSize triple or scale() target invalid
    RendezvousUnavailable,  # etcd rendezvous service unreachable
    UDTerminalState,        # UD is already terminal (e.g. scale on succeeded)
    QuotaExceeded,          # Namespace rank budget exceeded
)

try:
    deployment = client.deploy(...)
except DeploymentTimeout:
    print("Deployment took too long to start")
except DeploymentFailed as e:
    print(f"Deployment failed: {e}")
except AuthenticationError:
    print("Invalid API token")
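Some of these errors, such as `RateLimitError` and `NetworkError`, are plausibly transient, so a caller may want to retry them with backoff while letting the others propagate. The sketch below is a generic retry helper, not an SDK feature. `call_with_retry` and `TransientError` are hypothetical names; `TransientError` stands in for the SDK exceptions so the example runs without the SDK installed. Whether the SDK already retries internally is not stated here.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a retryable error such as RateLimitError or NetworkError."""


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying with exponential backoff on the given exception types."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


calls = {"count": 0}


def flaky_deploy() -> str:
    """Fails twice, then succeeds, to exercise the retry path."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("try again")
    return "deployed"


print(call_with_retry(flaky_deploy, retry_on=(TransientError,), sleep=lambda _: None))
```

With the SDK, the call might look like `call_with_retry(lambda: client.deploy(...), retry_on=(RateLimitError, NetworkError))`.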
Low-Level API
For advanced use cases, access the low-level API methods:
# Create deployment with full control
response = client.create_deployment(
    instance_name="my-app",
    image="python:3.11-slim",
    command=["python", "-m", "http.server", "8000"],
    port=8000,
    cpu="1",
    memory="1Gi",
)
# Get deployment details
response = client.get_deployment("my-app")
# Delete deployment
client.delete_deployment("my-app")
# List all deployments
response = client.list_deployments()
# Get logs
logs = client.get_deployment_logs("my-app", tail=100)
GPU Offerings (Secure Cloud)
from basilica import GpuPriceQuery
# List all GPU offerings
gpus = client.list_secure_cloud_gpus()
# Filter by interconnect, region, spot
gpus = client.list_secure_cloud_gpus(
    query=GpuPriceQuery(interconnect="SXM", region="US", exclude_spot=True)
)
for g in gpus:
    print(f"{g.gpu_type} x{g.gpu_count} {g.interconnect} {g.region} ${g.hourly_rate}/hr")
# Start a secure cloud GPU rental
rental = client.start_secure_cloud_rental(offering_id=gpus[0].id)
print(f"SSH: {rental.ssh_command}")
# Stop rental
client.stop_secure_cloud_rental(rental.rental_id)
GPU Rentals (Legacy API)
For direct GPU node access via SSH:
# List available nodes
nodes = client.list_nodes(gpu_type="A100", min_gpu_count=1)
# Start a rental
rental = client.start_rental(
    gpu_type="A100",
    container_image="pytorch/pytorch:latest",
)
print(f"Rental ID: {rental.rental_id}")
# Get SSH credentials
status = client.get_rental(rental.rental_id)
if status.ssh_credentials:
    print(f"SSH: {status.ssh_credentials.username}@{status.ssh_credentials.host}")
# Stop rental
client.stop_rental(rental.rental_id)
Environment Variables
| Variable | Description | Default |
|---|---|---|
| BASILICA_API_TOKEN | API authentication token | Required |
| BASILICA_API_URL | API endpoint URL | https://api.basilica.ai |
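A script can resolve both variables up front and fail with a clear message when the token is missing. The sketch below shows one way to do that. `resolve_settings` is a hypothetical helper, not an SDK function, and the token value in the demo is a placeholder.

```python
import os
from typing import Mapping, Tuple

DEFAULT_API_URL = "https://api.basilica.ai"


def resolve_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (api_url, api_token), applying the documented default URL."""
    token = env.get("BASILICA_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "BASILICA_API_TOKEN is not set; create one with `basilica tokens create`"
        )
    url = env.get("BASILICA_API_URL", DEFAULT_API_URL)
    return url, token


# Demo with an explicit mapping instead of the real process environment.
url, token = resolve_settings({"BASILICA_API_TOKEN": "basilica_example"})
print(url)
```

The pair can then be passed explicitly, as in `BasilicaClient(base_url=url, api_key=token)`, using the constructor parameters listed in the API reference.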
Examples
For complete working examples, see the examples directory:
| Example | Description |
|---|---|
| 01_hello_world.py | Simplest deployment with inline code |
| 02_with_storage.py | Persistent storage example |
| 03_fastapi.py | Production FastAPI deployment |
| 04_gpu.py | GPU/CUDA deployment with PyTorch |
| 05_decorator_*.py | Decorator API patterns |
| 06_vllm_qwen.py | LLM inference with vLLM |
| 07_sglang_model.py | SGLang model serving |
| 08_external_file.py | Deploy from external Python file |
| 09_container_image.py | Pre-built Docker image deployment |
| 10_custom_docker/ | Multi-file project with Dockerfile |
| 11_agentgym.py | RL environment deployment |
| 12_lobe_chat.py | Self-hosted chat UI |
| 13_lobe_chat_vllm.py | Full AI stack (LobeChat + vLLM) |
| 14_streamlit.py | Interactive Streamlit app |
| 20_distributed_diloco.py | @basilica.distributed + DiLoCo NCCL training |
| 21_distributed_torchrun.py | BYO command=[torchrun ...] factory + mid-run scale |
| 22_distributed_with_bench.py | bench=True + reading training.bench |
| 34_gpu_flavour_preferences.py | Query GPU offerings with flavour filters |
| 35_deploy_with_flavour.py | Deploy with interconnect, region, spot preferences |
| 36_gpu_flavour_cli/ | CLI usage with flavour flags |
| deploy_sglang_health_check.py | SGLang with custom health check probes |
Development
Building from Source
git clone https://github.com/one-covenant/basilica.git
cd basilica/crates/basilica-sdk-python
# Create virtual environment
python -m venv .venv
source .venv/bin/activate
# Install maturin and build
pip install maturin
maturin develop
# Test import
python -c "from basilica import BasilicaClient; print('OK')"
Running Tests
pip install pytest
pytest tests/ -v
Architecture
The SDK is built with:
- PyO3: Rust bindings for high-performance HTTP operations
- Async runtime: Tokio-based async operations exposed as sync Python calls
- Type safety: Full type hints with IDE support
For detailed architecture documentation, see PYTHON-SDK-ARCHITECTURE.md.
License
MIT OR Apache-2.0