Execute Python and Shell code on remote GPU sessions
Clouditia SDK
Execute Python and Shell code on remote GPU sessions.
Clouditia SDK provides a simple Python interface to run code on remote GPU-powered containers. It is suited to machine learning, deep learning, and other GPU-accelerated workloads.
Installation
pip install clouditia
# With S3 support for saving outputs
pip install clouditia[s3]
Quick Start
from clouditia import GPUSession
# Connect to your GPU session
session_live_gpu = GPUSession("ck_your_api_key")
# Execute Python code on the remote session
result = session_live_gpu.run("""
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU: {torch.cuda.get_device_name(0)}")
""")
print(result.output)
Features
- Python Execution: Run Python code on remote sessions
- Shell Commands: Execute shell commands on the remote session pod
- Persistent Sessions: Keep variables between executions with start()/stop()
- Variable Transfer: Send and retrieve variables between local and remote
- File Transfer: Upload/download files and folders between local and remote
- S3 Output: Save outputs directly to S3 buckets
- Async Jobs: Submit long-running tasks with real-time log monitoring
- Jupyter Magic: Use the %%clouditia magic in notebooks
- Decorator Support: Use @session_live_gpu.remote to run functions on the remote session
Table of Contents
- Getting Your API Key
- Basic Usage
- Persistent Sessions
- Executing Python Code
- Shell Commands
- Variable Transfer
- File Transfer
- S3 Output
- Remote Functions (Decorator)
- Async Jobs (Long-Running Tasks)
- Jupyter Magic
- Error Handling
- API Reference
Getting Your API Key
- Log in to clouditia.com
- Start a GPU session
- Go to API Keys in your session dashboard
- Generate a new API key (starts with ck_ or sk_)
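To keep the key out of source code, one option is to read it from an environment variable and check the documented ck_/sk_ prefix before use. This is a minimal sketch: the variable name CLOUDITIA_API_KEY and the helper load_api_key are assumptions made for this example, not part of the SDK.

```python
import os

VALID_PREFIXES = ("ck_", "sk_")  # prefixes mentioned in the steps above


def load_api_key(env=None, var_name="CLOUDITIA_API_KEY"):
    """Return the API key found in the environment, or raise ValueError.

    `var_name` is a hypothetical variable name chosen for this example.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise ValueError(f"{var_name} is not set")
    if not key.startswith(VALID_PREFIXES):
        raise ValueError(f"{var_name} should start with one of {VALID_PREFIXES}")
    return key
```

The returned value can then be passed to the constructor, for example GPUSession(load_api_key()).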
Basic Usage
Connect to a Session
from clouditia import GPUSession
# Create a session with your API key
session_live_gpu = GPUSession("ck_your_api_key_here")
# Verify the connection
info = session_live_gpu.verify()
print(f"Connected to: {info['session_name']}")
print(f"GPU: {info['gpu_type']}")
print(f"Credit remaining: {info['user_credit']}€")
Waiting for a Session to Be Ready
A GPU session can have several intermediate states before being fully usable:
- creating: the pod is being scheduled on a compute node.
- running but workspace still downloading: when the session is resumed from a custom environment (venv), the workspace (models, datasets, caches…) is streamed back from S3 at pod startup. For a small workspace this takes a few seconds; for a vLLM cache with 70,000+ files and 16 GB of data it can take 10+ minutes.
The SDK exposes three fields to handle this:
- ready (bool): True only when the session is fully usable (status running AND any workspace download complete).
- estimated_ready_in_seconds (int | None): ETA until ready.
- workspace_sync: live progress of the workspace download: {in_progress, bytes_done, bytes_total, files_done, pct, rate_bps, eta_seconds}.
Quick check
info = session_live_gpu.verify()
if info['ready']:
    print("Session is ready!")
else:
    ws = info.get('workspace_sync') or {}
    if ws.get('in_progress'):
        print(f"Workspace: {ws['pct']}% ({ws['bytes_done']}/{ws['bytes_total']} bytes)")
        print(f"ETA: {info['estimated_ready_in_seconds']} seconds")
    else:
        print(f"Waiting: status={info['status']}")
Blocking helper: wait_until_ready()
The cleanest way to wait for a resumed session is to call
wait_until_ready(). It polls verify() every few seconds and prints a
live progress line until the session is ready (or timeout).
session_live_gpu = GPUSession("ck_your_api_key")
# Block until the workspace is fully restored and VS Code/Jupyter is up
if session_live_gpu.wait_until_ready(timeout=1200):  # wait at most 20 min
    # Safe to run code now
    result = session_live_gpu.run("import torch; print(torch.cuda.is_available())")
else:
    print("Session failed to become ready in time")
Output during a typical vLLM resume:
⏳ Workspace: 2.34/16.23 GB (14%) @ 22.1 MB/s — ETA 10min 32s
⏳ Workspace: 3.12/16.23 GB (19%) @ 22.5 MB/s — ETA 9min 41s
⏳ Workspace: 4.01/16.23 GB (25%) @ 22.3 MB/s — ETA 9min 5s
...
✅ Session ready!
Parameters:
- timeout (int, default 1800): max total wait time in seconds (30 min).
- poll_interval (int, default 5): delay between polls in seconds.
- verbose (bool, default True): print progress updates to stdout.
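To make the polling behaviour concrete, here is a sketch of a loop in the spirit of wait_until_ready(), built only on the documented ready and workspace_sync fields of verify(). It is not the SDK's implementation; the function name poll_until_ready and the injectable sleep and clock parameters are assumptions made so the loop can be exercised without a live session.

```python
import time


def poll_until_ready(verify, timeout=1800, poll_interval=5,
                     sleep=time.sleep, clock=time.monotonic):
    """Call `verify()` until it reports ready=True or `timeout` seconds pass.

    `verify` is any callable returning a dict shaped like verify() above.
    Returns True when ready, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        info = verify()
        if info.get("ready"):
            return True
        if clock() >= deadline:
            return False
        # Report workspace restore progress when the server provides it.
        ws = info.get("workspace_sync") or {}
        if ws.get("in_progress"):
            print(f"Workspace: {ws.get('pct')}% restored")
        sleep(poll_interval)
```

With a real session this could be called as poll_until_ready(session_live_gpu.verify, timeout=1200).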
Persistent Sessions
By default, each run() call executes in an isolated environment - variables don't persist between calls. Use start() and stop() to enable persistent sessions where variables are preserved.
Isolated Mode (Default)
# Without start(), variables are NOT persistent
session_live_gpu.run("x = 10")
session_live_gpu.run("print(x)") # Error: x is not defined
Persistent Mode
# Start a persistent session
session_live_gpu.start()
print(f"Session active: {session_live_gpu.is_persistent}") # True
# Variables now persist between run() calls
session_live_gpu.run("x = 10")
session_live_gpu.run("y = 20")
session_live_gpu.run("z = x + y")
result = session_live_gpu.run("print(f'Result: {z}')")
# Output: Result: 30
# Stop the session when done
session_live_gpu.stop()
print(f"Session active: {session_live_gpu.is_persistent}") # False
Full Example
from clouditia import GPUSession
session_live_gpu = GPUSession("ck_your_api_key")
# Start persistent session
session_live_gpu.start()
# Build up state across multiple calls
session_live_gpu.run("import torch")
session_live_gpu.run("model = torch.nn.Linear(10, 5).cuda()")
session_live_gpu.run("data = torch.randn(32, 10).cuda()")
# Use the accumulated state
result = session_live_gpu.run("""
output = model(data)
print(f"Input shape: {data.shape}")
print(f"Output shape: {output.shape}")
""")
# Clean up
session_live_gpu.stop()
Checking Session State
# Check if a persistent session is active
if session_live_gpu.is_persistent:
    print("Persistent session is running")
else:
    print("Running in isolated mode")
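Because a persistent session stays active until stop() is called, it can help to pair start() and stop() so that stop() runs even when the body raises. The context manager below is a sketch of that pattern; the name persistent is hypothetical and it relies only on the start() and stop() methods described above.

```python
from contextlib import contextmanager


@contextmanager
def persistent(session):
    """Run the with-body in persistent mode and always call stop()."""
    session.start()
    try:
        yield session
    finally:
        session.stop()


# Usage with a real session (not executed here):
# with persistent(session_live_gpu) as s:
#     s.run("x = 10")
#     s.run("print(x)")
```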
Executing Python Code
Simple Execution
# Run Python code and get the output
result = session_live_gpu.run("print('Hello from the GPU!')")
print(result.output) # "Hello from the GPU!"
# Check if execution was successful
if result.success:
    print("Code executed successfully!")
else:
    print(f"Error: {result.error}")
output vs result
- result.output: contains all output from the executed code (print statements + last expression), like a Jupyter cell
- result.result: contains only the value of the last line if it's an expression (for programmatic use)
# Single expression
result = session_live_gpu.run("2 + 2")
print(result.output) # "4"
print(result.result) # "4"
# List comprehension
result = session_live_gpu.run("[i**2 for i in range(5)]")
print(result.output) # "[0, 1, 4, 9, 16]"
print(result.result) # "[0, 1, 4, 9, 16]"
# print() + expression: output contains everything, result contains the last value
result = session_live_gpu.run("x = 10\nprint(f'x = {x}')\nx * 2")
print(result.output) # "x = 10\n20"
print(result.result) # "20"
# Statements only (no expression on the last line)
result = session_live_gpu.run("print('hello')")
print(result.output) # "hello"
print(result.result) # None
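The examples above show result.result as text ("4", "[0, 1, 4, 9, 16]"). Assuming the value does arrive as a string, a small helper can turn it back into a Python object for programmatic use. This is a sketch: parse_result is a hypothetical name, and it only handles Python literals, returning anything else unchanged.

```python
import ast


def parse_result(value):
    """Turn the text form of a last-expression value into a Python object.

    Returns None for None, the parsed literal when possible, and the
    original string otherwise (for example a repr such as 'tensor(...)').
    """
    if value is None:
        return None
    if not isinstance(value, str):
        return value
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value
```

ast.literal_eval is used rather than eval so that text coming back from the remote side is never executed locally.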
Multi-line Code
result = session_live_gpu.run("""
import torch
import torch.nn as nn
# Create a simple model
model = nn.Linear(10, 5).cuda()
x = torch.randn(32, 10).cuda()
output = model(x)
print(f"Input shape: {x.shape}")
print(f"Output shape: {output.shape}")
print(f"Model parameters: {sum(p.numel() for p in model.parameters())}")
""")
print(result.output)
run() vs exec()
- run(): returns an ExecutionResult with output, result, success. You handle errors yourself.
- exec(): returns True if OK, raises an ExecutionError exception if the code fails. Shortcut for code that doesn't need a return value.
Both execute code the same way. The only difference is error handling.
Important: Each run() or exec() call is isolated — variables don't persist between calls. To persist variables, use persistent mode (see Persistent Sessions section):
# ERROR: each exec() is isolated, so torch is not defined in the second call
session_live_gpu.exec("import torch")
session_live_gpu.exec("model = torch.nn.Linear(10, 5).cuda()") # NameError!
# CORRECT: everything in a single call
session_live_gpu.exec("""
import torch
model = torch.nn.Linear(10, 5).cuda()
optimizer = torch.optim.Adam(model.parameters())
print(f"Model parameters: {sum(p.numel() for p in model.parameters())}")
""")
# CORRECT: or use persistent mode
session_live_gpu.start()  # Enable persistent mode
session_live_gpu.exec("import torch")
session_live_gpu.exec("model = torch.nn.Linear(10, 5).cuda()")  # torch is defined
session_live_gpu.exec("optimizer = torch.optim.Adam(model.parameters())")
session_live_gpu.stop()
Shell Commands
Execute shell commands on the remote session pod:
# Check current directory
result = session_live_gpu.shell("pwd")
print(result.output) # /home/coder/workspace
# List files (full path or ~/workspace)
result = session_live_gpu.shell("ls -la /home/coder/workspace")
print(result.output)
result = session_live_gpu.shell("ls -la ~/workspace")
print(result.output)
# Create directories and files
result = session_live_gpu.shell("mkdir -p ~/workspace/models && ls ~/workspace")
print(result.output)
# Chain multiple commands
result = session_live_gpu.shell("cd ~/workspace && mkdir -p data && ls -la")
print(result.output)
# Check disk space
result = session_live_gpu.shell("df -h")
print(result.output)
# Check memory
result = session_live_gpu.shell("free -h")
print(result.output)
# Install packages
result = session_live_gpu.shell("pip install transformers datasets")
print(result.output)
# Download files
result = session_live_gpu.shell(
"wget https://archive.ics.uci.edu/static/public/53/iris.zip -O ~/workspace/data.zip"
)
print(result.output)
result = session_live_gpu.shell(
"wget https://huggingface.co/datasets/scikit-learn/iris/resolve/main/Iris.csv -O ~/workspace/data.csv"
)
print(result.output)
Checking Exit Codes
result = session_live_gpu.shell("ls /nonexistent")
print(f"Exit code: {result.exit_code}")
print(f"Success: {result.success}")
print(f"result content : {result}")
print(f"result output : {result.output}")
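Since shell() reports failure through exit_code and success rather than by raising, a thin wrapper can convert a failed command into an exception. The sketch below assumes only the shell() method and the output, exit_code and success attributes shown above; shell_checked is a hypothetical helper name.

```python
def shell_checked(session, command):
    """Run `command` via session.shell() and return its output.

    Raises RuntimeError when the documented `success` flag is false.
    """
    result = session.shell(command)
    if not result.success:
        raise RuntimeError(
            f"{command!r} failed with exit code {result.exit_code}: {result.output}"
        )
    return result.output
```

This keeps multi-step setup scripts from silently continuing after a failed step.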
Variable Transfer
Important: set() and get() require persistent mode (start()/stop())
so that variables persist between calls.
Sending Variables to the Remote Session
# Start persistent mode (variables persist between calls)
session_live_gpu.start()
# Send local data to the remote session
data = [1, 2, 3, 4, 5, 99]
session_live_gpu.set("my_data", data)
# Use it in remote code
session_live_gpu.run("print(f'Data: {my_data}')")
session_live_gpu.run("print(f'Sum: {sum(my_data)}')")
session_live_gpu.stop()
Retrieving Variables from the Remote Session
session_live_gpu.start()
# Compute something on the remote session
session_live_gpu.run("""
import torch
tensor = torch.randn(100, 100).cuda()
result_stats = {
'mean': tensor.mean().item(),
'std': tensor.std().item(),
'shape': list(tensor.shape)
}
""")
# Get the result locally
stats = session_live_gpu.get("result_stats")
print(f"Mean: {stats['mean']:.4f}")
print(f"Std: {stats['std']:.4f}")
print(f"Shape: {stats['shape']}")
session_live_gpu.stop()
Sending Complex Objects
import numpy as np
session_live_gpu.start()
# Send numpy arrays
arr = np.random.randn(100, 100)
session_live_gpu.set("numpy_array", arr)
# Send dictionaries
config = {
"learning_rate": 0.001,
"batch_size": 32,
"epochs": 100
}
session_live_gpu.set("config", config)
# Use in remote code
session_live_gpu.run("""
import torch
tensor = torch.from_numpy(numpy_array).cuda()
print(f"Learning rate: {config['learning_rate']}")
""")
session_live_gpu.stop()
File Transfer
Transfer files and folders between your local machine and the remote GPU session.
Uploading a Single File
# Upload a local file to the remote session
session_live_gpu.upload("./data.csv", "/home/coder/workspace/data.csv")
# Upload with custom path
session_live_gpu.upload("./model.pkl", "/home/coder/workspace/models/trained_model.pkl")
# Disable progress output
session_live_gpu.upload("./config.json", "/home/coder/workspace/config.json", show_progress=False)
Downloading a Single File
# Download a file from the remote session
session_live_gpu.download("/home/coder/workspace/results.csv", "./results.csv")
# Download trained model
session_live_gpu.download("/home/coder/workspace/checkpoints/model.pt", "./local_model.pt")
# Download silently
session_live_gpu.download("/home/coder/workspace/logs.txt", "./logs.txt", show_progress=False)
Uploading a Folder
Upload an entire directory with all its contents:
# Upload a project folder
session_live_gpu.upload_folder("./my_project", "/home/coder/workspace/project")
# Upload with exclusions (default excludes: __pycache__, .git, *.pyc, .DS_Store, node_modules)
session_live_gpu.upload_folder(
"./my_project",
"/home/coder/workspace/project",
exclude=["*.log", ".env", "__pycache__", ".git"]
)
# Upload data folder
session_live_gpu.upload_folder("./datasets", "/home/coder/workspace/data")
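How the SDK applies exclusion patterns is not specified here. The sketch below assumes shell-style globs matched against each path component, which makes it possible to preview locally which files a pattern list would skip before uploading. The helper names is_excluded and preview_upload are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Default excludes listed in the comment above
DEFAULT_EXCLUDES = ["__pycache__", ".git", "*.pyc", ".DS_Store", "node_modules"]


def is_excluded(relative_path, patterns=DEFAULT_EXCLUDES):
    """True if any component of `relative_path` matches any glob pattern."""
    parts = PurePosixPath(relative_path).parts
    return any(fnmatchcase(part, pattern) for part in parts for pattern in patterns)


def preview_upload(relative_paths, patterns=DEFAULT_EXCLUDES):
    """Return the paths that would remain after filtering."""
    return [p for p in relative_paths if not is_excluded(p, patterns)]
```

If the real matching rules differ (for example, matching against the full relative path), the preview will differ too, so treat it as a sanity check only.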
Downloading a Folder
Download an entire directory with all its contents:
# Download results folder
session_live_gpu.download_folder("/home/coder/workspace/results", "./local_results")
# Download checkpoints
session_live_gpu.download_folder(
"/home/coder/workspace/checkpoints",
"./checkpoints",
exclude=["*.tmp", "*.log"]
)
# Download trained models
session_live_gpu.download_folder("/home/coder/workspace/models", "./downloaded_models")
Listing Remote Files
# List files in a directory
files = session_live_gpu.list_files("/home/coder/workspace")
for f in files:
    icon = "📁" if f["is_dir"] else "📄"
    print(f"{icon} {f['name']} - {f['size']} bytes")
# Filter by pattern
python_files = session_live_gpu.list_files("/home/coder/workspace", pattern="*.py")
for f in python_files:
    print(f"📄 {f['name']}")
# List with full details
files = session_live_gpu.list_files("/home/coder/workspace")
for f in files:
    print(f"Name: {f['name']}")
    print(f" Path: {f['path']}")
    print(f" Size: {f['size']} bytes")
    print(f" Is Directory: {f['is_dir']}")
    print(f" Modified: {f['modified']}")
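The dictionaries returned by list_files() can be aggregated locally, for instance to estimate how much data a download would move. This sketch assumes only the size and is_dir keys shown above; summarize_listing and human_size are hypothetical helper names.

```python
def summarize_listing(files):
    """Return (file_count, dir_count, total_bytes) for list_files()-style dicts."""
    file_count = sum(1 for f in files if not f["is_dir"])
    dir_count = sum(1 for f in files if f["is_dir"])
    total_bytes = sum(f["size"] for f in files if not f["is_dir"])
    return file_count, dir_count, total_bytes


def human_size(num_bytes):
    """Format a byte count with binary units (B, KiB, MiB, GiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024 or unit == "GiB":
            return f"{size:.1f} {unit}"
        size /= 1024
```

Directory entries are counted but their size is ignored, since it is unclear whether a directory's reported size includes its contents.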
Checking if a File Exists
# Check before downloading
if session_live_gpu.file_exists("/home/coder/workspace/model.pt"):
    session_live_gpu.download("/home/coder/workspace/model.pt", "./model.pt")
    print("Model downloaded!")
else:
    print("Model not found, training required...")
# Check multiple files
files_to_check = ["config.json", "data.csv", "model.pt"]
for filename in files_to_check:
    path = f"/home/coder/workspace/{filename}"
    exists = session_live_gpu.file_exists(path)
    status = "✓" if exists else "✗"
    print(f"{status} {filename}")
Complete Workflow Example
from clouditia import GPUSession
session_live_gpu = GPUSession("ck_your_api_key")
# 1. Upload training data and code
session_live_gpu.upload_folder("./training_code", "/home/coder/workspace/code")
session_live_gpu.upload("./data/train.csv", "/home/coder/workspace/data/train.csv")
session_live_gpu.upload("./data/test.csv", "/home/coder/workspace/data/test.csv")
# 2. Run training
result = session_live_gpu.run("""
import sys
sys.path.insert(0, '/home/coder/workspace/code')
from train import train_model
model = train_model('/home/coder/workspace/data/train.csv')
model.save('/home/coder/workspace/output/model.pt')
print("Training complete!")
""")
# 3. Check and download results
if session_live_gpu.file_exists("/home/coder/workspace/output/model.pt"):
    session_live_gpu.download("/home/coder/workspace/output/model.pt", "./trained_model.pt")
    print("Model saved locally!")
# 4. Download all outputs
session_live_gpu.download_folder("/home/coder/workspace/output", "./results")
print("All results downloaded!")
# 5. List what was created
files = session_live_gpu.list_files("/home/coder/workspace/output")
print(f"Created {len(files)} files during training")
Working with Different File Types
# CSV files
session_live_gpu.upload("./data.csv", "/home/coder/workspace/data.csv")
# Pickle files (models, data)
session_live_gpu.upload("./model.pkl", "/home/coder/workspace/model.pkl")
# PyTorch models
session_live_gpu.download("/home/coder/workspace/checkpoint.pt", "./checkpoint.pt")
# JSON configuration
session_live_gpu.upload("./config.json", "/home/coder/workspace/config.json")
# Text files
session_live_gpu.upload("./requirements.txt", "/home/coder/workspace/requirements.txt")
# Binary files
session_live_gpu.upload("./image.png", "/home/coder/workspace/image.png")
# Any file type works!
session_live_gpu.upload("./data.parquet", "/home/coder/workspace/data.parquet")
session_live_gpu.upload("./weights.h5", "/home/coder/workspace/weights.h5")
S3 Output
Save your outputs directly to Amazon S3 or compatible storage (MinIO, etc.).
Installation
To use S3 features, install with the s3 extra:
pip install clouditia[s3]
Creating an S3 Connection
Two ways to create an S3 connection:
from clouditia import GPUSession, S3Connection
session_live_gpu = GPUSession("sk_live_your_api_key")
# Method 1: via session (recommended)
s3 = session_live_gpu.s3_connect(
bucket="my-ml-outputs",
access_key="AKIAIOSFODNN7EXAMPLE",
secret_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
region="eu-west-1",
prefix="experiments/run-001/" # Optional: prefix for all uploads
)
# Method 2: create S3Connection directly
s3 = S3Connection(
bucket="my-ml-outputs",
access_key="AKIAIOSFODNN7EXAMPLE",
secret_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
endpoint="https://s3.amazonaws.com", # Default: AWS S3
region="eu-west-1",
prefix="experiments/run-001/"
)
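The keys in the example above are placeholders; in real code it is usually safer to keep credentials out of the source file. The sketch below builds the keyword arguments for s3_connect() from environment variables. It borrows the variable names used by the AWS CLI (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION), which is this example's choice and not something the SDK is known to read itself. The helper name s3_settings_from_env and the eu-west-1 fallback are also assumptions.

```python
import os


def s3_settings_from_env(bucket, prefix="", env=None):
    """Build keyword arguments for s3_connect() from environment variables."""
    env = os.environ if env is None else env
    required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {
        "bucket": bucket,
        "access_key": env["AWS_ACCESS_KEY_ID"],
        "secret_key": env["AWS_SECRET_ACCESS_KEY"],
        # Fallback region taken from the example above (assumption).
        "region": env.get("AWS_DEFAULT_REGION", "eu-west-1"),
        "prefix": prefix,
    }
```

The result can be unpacked into the call, for example session_live_gpu.s3_connect(**s3_settings_from_env("my-ml-outputs", prefix="experiments/run-001/")).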
output() vs output_file()
Two methods to save data to S3:
- output(filename, data, s3): saves a Python object in memory (variable, array, dict) to S3. The SDK serializes it automatically based on the file extension. The object doesn't need to exist on disk.
- output_file(local_path, s3): uploads an existing file from your local disk to S3. Useful for files generated by a script or downloaded.
Saving Python Objects to S3 (output)
The serialization format is auto-detected from the file extension:
- .pt, .pth: PyTorch state dict
- .npy: NumPy array
- .json: JSON data
- .pkl, .pickle: Pickle format (default)
# Save NumPy arrays
import numpy as np
embeddings = np.random.randn(1000, 768)
url = session_live_gpu.output("embeddings_aina_23052026.npy", embeddings, s3)
# Save JSON metrics
metrics = {"accuracy": 0.95, "loss": 0.05, "epoch": 100}
url = session_live_gpu.output("metrics.json", metrics, s3)
# Save any picklable object
results = {"predictions": [1, 2, 3], "embeddings": embeddings, "metrics": metrics}
url = session_live_gpu.output("results.pkl", results, s3)
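The extension-to-format mapping listed above can be written as a small lookup, which is handy for checking in advance how a given filename would be treated. This is an illustration only: it reads "(default)" as meaning that unknown or missing extensions fall back to pickle, and it compares extensions case-insensitively, neither of which is confirmed for the SDK. The name guess_format is hypothetical.

```python
from pathlib import PurePosixPath

# Extension -> format, as listed above; anything else falls back to pickle.
FORMATS = {
    ".pt": "torch",
    ".pth": "torch",
    ".npy": "numpy",
    ".json": "json",
    ".pkl": "pickle",
    ".pickle": "pickle",
}


def guess_format(filename):
    """Return the serialization format implied by the file extension."""
    suffix = PurePosixPath(filename).suffix.lower()
    return FORMATS.get(suffix, "pickle")
```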
Uploading Local Files to S3 (output_file)
# Upload a file that already exists on your local disk
url = session_live_gpu.output_file("./checkpoints/best_model.pt", s3)
print(f"Uploaded to: {url}")
# remote_filename: choose the path and name of the file on S3
# By default, the file keeps its local name (best_model.pt)
# With remote_filename, you choose the S3 path structure
url = session_live_gpu.output_file(
"./model.pt", # local file
s3,
remote_filename="models/production/v2.0/model.pt" # path on S3
)
# Result on S3: s3://my-bucket/prefix/models/production/v2.0/model.pt
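The final location combines the bucket, the connection prefix and remote_filename, as the last comment above shows. A sketch of that composition follows; s3_uri is a hypothetical helper, and the slash handling is an assumption about how the pieces are joined.

```python
import posixpath


def s3_uri(bucket, prefix, remote_filename):
    """Join bucket, connection prefix and object name into an s3:// URI."""
    key = posixpath.join(prefix.strip("/"), remote_filename.lstrip("/"))
    return f"s3://{bucket}/{key}"
```

For example, s3_uri("my-bucket", "prefix/", "models/production/v2.0/model.pt") gives s3://my-bucket/prefix/models/production/v2.0/model.pt.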
Saving Remote Session Data to S3 (remote_output / remote_output_file)
The output() and output_file() methods save local objects/files to S3.
The remote_output() and remote_output_file() methods save objects/files from the remote session directly to S3, without transiting through your local machine.
| Method | Source | Destination | Local transit |
|---|---|---|---|
| output() | local Python object (in memory) | S3 | yes |
| output_file() | local file (on disk) | S3 | yes |
| remote_output() | Python variable on remote session | S3 | no |
| remote_output_file() | file on remote session | S3 | no |
# remote_output() requires persistent mode (variable must stay in memory)
session_live_gpu.start()
session_live_gpu.run("""
import torch
model = torch.nn.Linear(784, 10).cuda()
optimizer = torch.optim.Adam(model.parameters())
# ... training ...
results = {"accuracy": 0.95, "loss": 0.05, "epochs": 100}
torch.save(model.state_dict(), "/home/coder/workspace/model.pt")
""")
# Save the 'results' variable from the remote session to S3 (JSON format)
url = session_live_gpu.remote_output("results.json", "results", s3)
session_live_gpu.stop()
# remote_output_file() does NOT need start()/stop()
# because the file is on the pod's disk (it persists between calls)
url = session_live_gpu.remote_output_file("/home/coder/workspace/model.pt", s3)
# With a custom name on S3
url = session_live_gpu.remote_output_file(
"/home/coder/workspace/model.pt",
s3,
s3_filename="models/production/v3/model.pt"
)
Using with MinIO or Other S3-Compatible Storage
# MinIO connection
s3_minio = session_live_gpu.s3_connect(
bucket="ml-outputs",
access_key="minio_user",
secret_key="minio_password",
endpoint="https://minio.endpoint.url", # Custom endpoint, e.g. "http://minio.local:9000"
region="us-east-1"
)
metrics_minio = {"accuracy": 0.95, "loss": 0.05, "epoch": 100}
session_live_gpu.output("metrics_minio_ok", metrics_minio, s3_minio)
Complete Training Workflow with S3 Output
from clouditia import GPUSession
session_live_gpu = GPUSession("sk_live_your_api_key")
# Configure S3 output
s3 = session_live_gpu.s3_connect(
bucket="my-training-outputs",
access_key="AKIA...",
secret_key="...",
prefix="training/experiment-001/"
)
# Start persistent session for training
session_live_gpu.start()
# Setup
session_live_gpu.run("""
import torch
import torch.nn as nn
model = nn.Linear(100, 10).cuda()
optimizer = torch.optim.Adam(model.parameters())
""")
# Training loop
session_live_gpu.run("""
for epoch in range(100):
    x = torch.randn(32, 100).cuda()
    y = model(x)
    loss = y.sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")
""")
# Get model state and save to S3
session_live_gpu.run("final_state = model.state_dict()")
model_state = session_live_gpu.get("final_state")
url = session_live_gpu.output("trained_model.pt", model_state, s3)
print(f"Model saved to: {url}")
# Save training metrics
metrics = {"final_loss": 0.05, "epochs": 100}
session_live_gpu.output("metrics.json", metrics, s3)
session_live_gpu.stop()
Remote Functions (Decorator)
Use the @session_live_gpu.remote decorator to run functions on the remote session:
from clouditia import GPUSession
session_live_gpu = GPUSession("ck_your_api_key")
@session_live_gpu.remote
def compute_on_gpu(data, power=2):
    import torch
    tensor = torch.tensor(data, device='cuda', dtype=torch.float32)
    result = tensor ** power
    return result.cpu().tolist()
# Call the function - it runs on the remote session!
result = compute_on_gpu([1, 2, 3, 4, 5], power=2)
print(result) # [1.0, 4.0, 9.0, 16.0, 25.0]
Remote Function with Model
@session_live_gpu.remote
def train_step(batch_data, learning_rate=0.01):
    import torch
    import torch.nn as nn
    # Create model (or load from checkpoint)
    model = nn.Sequential(
        nn.Linear(len(batch_data), 64),
        nn.ReLU(),
        nn.Linear(64, 1)
    ).cuda()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    # Training step
    x = torch.tensor(batch_data, dtype=torch.float32).cuda()
    output = model(x)
    loss = output.sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return {"loss": loss.item()}
# Call it like a normal function
result = train_step([1.0, 2.0, 3.0, 4.0], learning_rate=0.001)
print(f"Loss: {result['loss']}")
Async Remote Functions
@session_live_gpu.remote(async_mode=True)
def long_training():
    import torch
    for epoch in range(100):
        print(f"Epoch {epoch}/100")
        # ... training code ...
    return {"status": "completed"}
# Returns an AsyncJob instead of waiting
job = long_training()
print(f"Job submitted: {job.job_id}")
# Wait for completion
result = job.wait(show_logs=True)
Async Jobs (Long-Running Tasks)
For tasks that take hours or days, use async jobs:
Submitting a Job
# Submit a long-running job
job = session_live_gpu.submit("""
import torch
import time
print("Starting training...")
for epoch in range(100):
    print(f"Epoch {epoch + 1}/100")
    time.sleep(1) # Simulate training
print("Training complete!")
torch.save({'epoch': 100}, '/home/coder/workspace/checkpoint.pt')
""", name="my_training")
print(f"Job ID: {job.job_id}")
Monitoring Progress
import time
# Poll for status
while not job.is_done():
    status = job.status()
    print(f"Status: {status}")
    # View recent logs
    if status == "running":
        logs = job.logs(tail=10)
        print(logs)
    time.sleep(30)
print("Job finished!")
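For jobs that run for hours, a fixed 30-second poll can be replaced by a delay that grows after each check. The sketch below shows one way to do this using only the is_done() method from the loop above. The name wait_with_backoff and its parameters are hypothetical, and the injectable sleep exists only so the loop can be tried without waiting.

```python
import time


def wait_with_backoff(job, first_delay=5, max_delay=60, factor=2, sleep=time.sleep):
    """Poll job.is_done(), sleeping a little longer after each check.

    Returns the number of polls performed.
    """
    delay = first_delay
    polls = 0
    while True:
        polls += 1
        if job.is_done():
            return polls
        sleep(delay)
        # Grow the delay geometrically, capped at max_delay seconds.
        delay = min(delay * factor, max_delay)
```

Short jobs are noticed quickly, while long jobs settle at one status request per max_delay seconds.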
Real-Time Log Streaming
# View logs as they come in
while job.is_running():
    new_logs = job.logs(new_only=True)
    if new_logs.strip():
        print(new_logs, end='')
    time.sleep(5)
Waiting for Completion
# Wait with live log output
result = job.wait(show_logs=True)
# Or wait with timeout
try:
    result = job.wait(timeout=3600) # 1 hour max
except TimeoutError:
    print("Job taking too long, cancelling...")
    job.cancel()
Getting Results
# Wait for the job to complete before getting the result
job.wait()  # blocks until completion
# Get the result
result = job.result()
if job.status() == "running":
    print("Job still running...")
elif result.success:
    print("Job completed successfully!")
    print(result.output)
else:
    print(f"Job failed: {result.error}")
Listing Jobs
# List all jobs
jobs = session_live_gpu.jobs()
for j in jobs:
    print(f"{j.name}: {j.status()}")
# List only running jobs
running_jobs = session_live_gpu.jobs(status="running")
# List completed jobs
completed_jobs = session_live_gpu.jobs(status="completed", limit=5)
Cancelling Jobs
if job.is_running():
    job.cancel()
    print("Job cancelled")
Shell Jobs
# Submit a shell command as an async job
job = session_live_gpu.submit(
"pip install transformers && python /home/coder/workspace/train.py",
name="install_and_train",
job_type="shell"
)
Jupyter Magic
Use Clouditia directly in Jupyter notebooks with magic commands.
Loading the Extension
# In a Jupyter cell
%load_ext clouditia
# Set your API key
CLOUDITIA_API_KEY = "ck_your_api_key"
Running Code on Remote Session
%%clouditia
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU: {torch.cuda.get_device_name(0)}")
x = torch.randn(1000, 1000, device='cuda')
y = torch.randn(1000, 1000, device='cuda')
z = torch.matmul(x, y)
print(f"Result shape: {z.shape}")
Specifying API Key Directly
%%clouditia ck_your_api_key
print("Hello from GPU!")
Async Mode in Jupyter
%%clouditia --async
for epoch in range(100):
    print(f"Epoch {epoch}")
    # ... training code ...
# The job is submitted and _clouditia_job variable is set
# Check job status
_clouditia_job.status()
# View logs
print(_clouditia_job.logs())
Utility Magic Commands
# Check session status
%clouditia_status
# List recent jobs
%clouditia_jobs
# List only running jobs
%clouditia_jobs running
Error Handling
The SDK provides specific exceptions for different error types:
from clouditia import (
GPUSession,
ClouditiaError,
AuthenticationError,
SessionError,
ExecutionError,
TimeoutError,
CommandBlockedError
)
session_live_gpu = GPUSession("ck_your_api_key")
try:
    result = session_live_gpu.run("import torch; print(torch.cuda.is_available())")
except AuthenticationError:
    print("Invalid API key")
except SessionError:
    print("Session not running or not accessible")
except ExecutionError as e:
    print(f"Code execution failed: {e}")
except TimeoutError:
    print("Execution timed out - consider using async jobs")
except CommandBlockedError:
    print("Command blocked by security filters")
except ClouditiaError as e:
    print(f"General error: {e}")
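Some of these failures may be temporary, for example a session that is still starting. Under that assumption, a generic retry helper can wrap a call and try again after a growing delay. The sketch below is independent of the SDK: retry is a hypothetical helper, and which exceptions are worth retrying is a judgment call that depends on your workload.

```python
import time


def retry(fn, retry_on, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call fn(); on an exception listed in `retry_on`, wait and try again.

    The delay doubles after each failure. The last failure is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

With the exceptions imported above, a call could look like retry(lambda: session_live_gpu.run("print('ok')"), retry_on=(SessionError, TimeoutError)). Errors such as AuthenticationError are better left to fail immediately.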
Using raise_for_status()
result = session_live_gpu.run("import torch; print(torch.cuda.is_available())")
result.raise_for_status() # Raises ExecutionError if failed
print(result.output)
API Reference
GPUSession
GPUSession(
api_key: str,
base_url: str = "https://clouditia.com/code-editor",
timeout: int = 120,
poll_interval: int = 5
)
Methods:
| Method | Description |
|---|---|
| Connection & Info | |
| verify() | Verify API key and get session info |
| wait_until_ready(timeout=1800) | Wait until session is fully ready (workspace restored) |
| gpu_info() | Get GPU information (name, memory, CUDA version) |
| Code Execution | |
| run(code, timeout=None, stream=True) | Execute Python code (REPL-like: captures last expression) |
| exec(code, timeout=None) | Execute code, raises ExecutionError on failure |
| shell(command, timeout=None) | Execute shell command (security-filtered) |
| Persistent Mode | |
| start() | Start a persistent session (variables persist between calls) |
| stop() | Stop the persistent session |
| set(name, value) | Send a variable to the remote session |
| get(name) | Retrieve a variable from the remote session |
| File Transfer | |
| upload(local_path, remote_path, show_progress=True) | Upload a file (auto-chunked for large files) |
| download(remote_path, local_path, show_progress=True) | Download a file (auto-chunked for large files) |
| upload_folder(local_path, remote_path, exclude=None) | Upload a folder (compressed + chunked) |
| download_folder(remote_path, local_path, exclude=None) | Download a folder (compressed + chunked) |
| list_files(remote_path, pattern=None) | List files in remote directory |
| file_exists(remote_path) | Check if a file exists on remote |
| S3 Output (local) | |
| s3_connect(bucket, access_key, secret_key, ...) | Create S3 connection |
| output(filename, data, s3_connection) | Save local Python object to S3 |
| output_file(local_path, s3_connection, remote_filename=None) | Upload local file to S3 |
| S3 Output (remote, no local transit) | |
| remote_output(filename, variable_name, s3) | Save remote session variable directly to S3 |
| remote_output_file(remote_path, s3, s3_filename=None) | Upload remote session file directly to S3 |
| Async Jobs | |
| submit(code, name=None, job_type="python") | Submit async background job |
| jobs(status=None, limit=10) | List async jobs |
| Decorator | |
| @remote | Decorator to run a function on the remote session |
Properties:
| Property | Type | Description |
|---|---|---|
| is_persistent | bool | True if a persistent session is active |
| api_key | str | The API key used for authentication |
| base_url | str | The API base URL |
| timeout | int | Default timeout in seconds |
ExecutionResult
ExecutionResult(
output: str, # all output (print statements + last expression)
result: Any, # value of the last line if it's an expression (None otherwise)
error: str, # error message if failed
exit_code: int, # process exit code
success: bool # True if execution succeeded
)
Difference between output and result:
- output = all stdout + last expression (like a Jupyter cell)
- result = only the last expression for programmatic use (e.g., int(result.result))
Methods:
| Method | Description |
|---|---|
| raise_for_status() | Raise exception if failed |
| to_dict() | Convert to dictionary |
| __bool__() | True if success=True (use in if result:) |
| __str__() | Returns output if success, "Error: ..." otherwise |
AsyncJob
AsyncJob(session, job_id, name=None)
Methods:
| Method | Description |
|---|---|
| status() | Get current status |
| is_done() | Check if finished |
| is_running() | Check if running |
| is_pending() | Check if pending |
| logs(tail=50, new_only=False) | Get logs |
| result() | Get final result |
| cancel() | Cancel the job |
| wait(timeout=None, show_logs=False) | Wait for completion |
| get_info() | Get detailed job info |
Exceptions
| Exception | Description |
|---|---|
| ClouditiaError | Base exception for all Clouditia errors |
| AuthenticationError | Invalid or expired API key |
| SessionError | Session not found, not running, or not accessible |
| ExecutionError | Code execution failed on the remote session |
| TimeoutError | Execution timed out |
| CommandBlockedError | Command blocked by security filters |
Hierarchy: All exceptions inherit from ClouditiaError.
Support
- Documentation: https://clouditia.com/docsapisession
- Email: support@clouditia.com
License
MIT License