Execute Python and Shell code on remote GPU sessions

Clouditia SDK

Execute Python and Shell code on remote GPU sessions.

Clouditia SDK provides a simple Python interface to run code on remote GPU-powered containers. It is suited to machine learning, deep learning, and other GPU-accelerated workloads.

Installation

pip install clouditia

# With S3 support for saving outputs
pip install clouditia[s3]

Quick Start

from clouditia import GPUSession

# Connect to your GPU session
session_live_gpu = GPUSession("ck_your_api_key")

# Execute Python code on the remote session
result = session_live_gpu.run("""
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU: {torch.cuda.get_device_name(0)}")
""")

print(result.output)

Features

Python Execution: Run Python code on remote sessions
Shell Commands: Execute shell commands on the remote session pod
Persistent Sessions: Keep variables between executions with start()/stop()
Variable Transfer: Send and retrieve variables between local and remote
File Transfer: Upload/download files and folders between local and remote
S3 Output: Save outputs directly to S3 buckets
Async Jobs: Submit long-running tasks with real-time log monitoring
Jupyter Magic: Use %%clouditia magic in notebooks
Decorator Support: Use @session_live_gpu.remote to run functions on the remote session

Getting Your API Key
Basic Usage
Persistent Sessions
Executing Python Code
Shell Commands
Variable Transfer
File Transfer
S3 Output
Remote Functions (Decorator)
Async Jobs (Long-Running Tasks)
Jupyter Magic
Error Handling
API Reference

Getting Your API Key

Log in to clouditia.com
Start a GPU session
Go to API Keys in your session dashboard
Generate a new API key (starts with ck_ or sk_)

Basic Usage

Connect to a Session

from clouditia import GPUSession

# Create a session with your API key
session_live_gpu = GPUSession("ck_your_api_key_here")

# Verify the connection
info = session_live_gpu.verify()
print(f"Connected to: {info['session_name']}")
print(f"GPU: {info['gpu_type']}")
print(f"Credit remaining: {info['user_credit']}€")

Waiting for a Session to Be Ready

A GPU session can have several intermediate states before being fully usable:

creating: the pod is being scheduled on a compute node.
running but workspace still downloading: when the session is resumed from a custom environment (venv), the workspace (models, datasets, caches…) is streamed back from S3 at pod startup. For a small workspace this takes a few seconds; for a vLLM cache with 70,000+ files and 16 GB of data it can take 10+ minutes.

The SDK exposes three fields to handle this:

ready: bool — True only when the session is fully usable (status running AND any workspace download complete).
estimated_ready_in_seconds: int | None — ETA until ready.
workspace_sync — live progress of the workspace download: {in_progress, bytes_done, bytes_total, files_done, pct, rate_bps, eta_seconds}.

Quick check

info = session_live_gpu.verify()
if info['ready']:
    print("Session is ready!")
else:
    ws = info.get('workspace_sync') or {}
    if ws.get('in_progress'):
        print(f"Workspace: {ws['pct']}% ({ws['bytes_done']}/{ws['bytes_total']} bytes)")
        print(f"ETA: {info['estimated_ready_in_seconds']} seconds")
    else:
        print(f"Waiting — status={info['status']}")
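The workspace_sync fields can be turned into a one-line progress summary. The helper below is a minimal sketch, not part of the SDK: format_progress is a hypothetical name, and it assumes bytes_done and bytes_total are byte counts, rate_bps is bytes per second, and eta_seconds is a number of seconds. The sample values are made up.

```python
def format_progress(ws: dict) -> str:
    """Render a workspace_sync dict as a single human-readable line."""
    gib = 1024 ** 3
    mib = 1024 ** 2
    done_gb = ws["bytes_done"] / gib
    total_gb = ws["bytes_total"] / gib
    rate_mb = ws["rate_bps"] / mib  # assumption: rate_bps is bytes per second
    minutes, seconds = divmod(int(ws["eta_seconds"]), 60)
    return (
        f"Workspace: {done_gb:.2f}/{total_gb:.2f} GB ({ws['pct']}%) "
        f"@ {rate_mb:.1f} MB/s, ETA {minutes}min {seconds}s"
    )


# Made-up payload shaped like the workspace_sync field described above.
sample = {
    "in_progress": True,
    "bytes_done": 2 * 1024 ** 3,
    "bytes_total": 16 * 1024 ** 3,
    "files_done": 9000,
    "pct": 12,
    "rate_bps": 20 * 1024 ** 2,
    "eta_seconds": 632,
}
print(format_progress(sample))
```

In real use the dict would come from info.get('workspace_sync') after a verify() call.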

Blocking helper: `wait_until_ready()`

The cleanest way to wait for a resumed session is to call wait_until_ready(). It polls verify() every few seconds and prints a live progress line until the session is ready (or timeout).

session_live_gpu = GPUSession("ck_your_api_key")

# Block until the workspace is fully restored and VS Code/Jupyter is up
if session_live_gpu.wait_until_ready(timeout=1200):  # wait max 20 min
    # Safe to run code now
    result = session_live_gpu.run("import torch; print(torch.cuda.is_available())")
else:
    print("Session failed to become ready in time")

Output during a typical vLLM resume:

⏳ Workspace: 2.34/16.23 GB (14%) @ 22.1 MB/s — ETA 10min 32s
⏳ Workspace: 3.12/16.23 GB (19%) @ 22.5 MB/s — ETA 9min 41s
⏳ Workspace: 4.01/16.23 GB (25%) @ 22.3 MB/s — ETA 9min 5s
...
✅ Session ready!

Parameters:

timeout: int = 1800 — max total wait time in seconds (default 30 min).
poll_interval: int = 5 — delay between polls in seconds.
verbose: bool = True — print progress updates to stdout.
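To make the polling behavior concrete, here is a rough sketch of how a loop with these parameters could work. It is not the SDK's implementation: poll_until_ready and fake_verify are hypothetical names, and the verify callable stands in for session_live_gpu.verify() so the example runs without a network connection.

```python
import time
from typing import Callable


def poll_until_ready(
    verify: Callable[[], dict],
    timeout: float = 1800,
    poll_interval: float = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call verify() until it reports ready=True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if verify().get("ready"):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval)


# Fake verify() that becomes ready on the third call (no network involved).
calls = {"count": 0}


def fake_verify() -> dict:
    calls["count"] += 1
    return {"ready": calls["count"] >= 3, "status": "running"}


became_ready = poll_until_ready(
    fake_verify, timeout=10, poll_interval=0, sleep=lambda _: None
)
print(became_ready, calls["count"])
```

Injecting the sleep function keeps the sketch testable; a real loop would simply use time.sleep.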

Persistent Sessions

By default, each run() call executes in an isolated environment - variables don't persist between calls. Use start() and stop() to enable persistent sessions where variables are preserved.

Isolated Mode (Default)

# Without start(), variables are NOT persistent
session_live_gpu.run("x = 10")
session_live_gpu.run("print(x)")  # Error: x is not defined

Persistent Mode

# Start a persistent session
session_live_gpu.start()
print(f"Session active: {session_live_gpu.is_persistent}")  # True

# Variables now persist between run() calls
session_live_gpu.run("x = 10")
session_live_gpu.run("y = 20")
session_live_gpu.run("z = x + y")
result = session_live_gpu.run("print(f'Result: {z}')")
# Output: Result: 30

# Stop the session when done
session_live_gpu.stop()
print(f"Session active: {session_live_gpu.is_persistent}")  # False
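The difference between the two modes can be pictured with a purely local analogy. The sketch below does not use the SDK at all: run_isolated and run_persistent are hypothetical helpers that execute snippets with Python's built-in exec(), either in a fresh namespace per snippet or in one shared namespace.

```python
def run_isolated(snippets):
    """Run each snippet in a fresh namespace, like run() without start()."""
    outcomes = []
    for code in snippets:
        try:
            exec(code, {})
            outcomes.append("ok")
        except NameError as exc:
            outcomes.append(f"NameError: {exc}")
    return outcomes


def run_persistent(snippets):
    """Run all snippets in one shared namespace, like run() after start()."""
    namespace = {}
    for code in snippets:
        exec(code, namespace)
    return namespace


isolated = run_isolated(["x = 10", "y = x + 1"])
shared = run_persistent(["x = 10", "y = 20", "z = x + y"])
print(isolated)
print(shared["z"])
```

How the remote session actually stores state is not documented here; the analogy only shows why the second isolated call cannot see x.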

Full Example

from clouditia import GPUSession

session_live_gpu = GPUSession("ck_your_api_key")

# Start persistent session
session_live_gpu.start()

# Build up state across multiple calls
session_live_gpu.run("import torch")
session_live_gpu.run("model = torch.nn.Linear(10, 5).cuda()")
session_live_gpu.run("data = torch.randn(32, 10).cuda()")

# Use the accumulated state
result = session_live_gpu.run("""
output = model(data)
print(f"Input shape: {data.shape}")
print(f"Output shape: {output.shape}")
""")

# Clean up
session_live_gpu.stop()

Checking Session State

# Check if a persistent session is active
if session_live_gpu.is_persistent:
    print("Persistent session is running")
else:
    print("Running in isolated mode")

Executing Python Code

Simple Execution

# Run Python code and get the output
result = session_live_gpu.run("print('Hello from the GPU!')")
print(result.output)  # "Hello from the GPU!"

# Check if execution was successful
if result.success:
    print("Code executed successfully!")
else:
    print(f"Error: {result.error}")

output vs result

result.output — contains all output from the executed code (print statements + last expression), like a Jupyter cell
result.result — contains only the value of the last line if it's an expression (for programmatic use)

# Single expression
result = session_live_gpu.run("2 + 2")
print(result.output)  # "4"
print(result.result)  # "4"

# List comprehension
result = session_live_gpu.run("[i**2 for i in range(5)]")
print(result.output)  # "[0, 1, 4, 9, 16]"
print(result.result)  # "[0, 1, 4, 9, 16]"

# print() + expression: output contains everything, result contains the last value
result = session_live_gpu.run("x = 10\nprint(f'x = {x}')\nx * 2")
print(result.output)  # "x = 10\n20"
print(result.result)  # "20"

# Statements only (no expression on the last line)
result = session_live_gpu.run("print('hello')")
print(result.output)  # "hello"
print(result.result)  # None
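One way to get this Jupyter-like behavior is to split off the final expression with the standard ast module. The following is a local sketch under that assumption, not the SDK's code: run_like_repl is a hypothetical name, and it returns the raw Python value of the last expression, whereas the remote result may be transported in another form.

```python
import ast
import contextlib
import io


def run_like_repl(code: str):
    """Return (captured output, value of the last expression or None)."""
    tree = ast.parse(code)
    namespace = {}
    last_value = None
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        if tree.body and isinstance(tree.body[-1], ast.Expr):
            # Detach the trailing expression so it can be evaluated on its own.
            last_expr = ast.Expression(tree.body.pop().value)
            exec(compile(tree, "<sketch>", "exec"), namespace)
            last_value = eval(compile(last_expr, "<sketch>", "eval"), namespace)
            if last_value is not None:
                print(repr(last_value))  # echo it, as a notebook cell would
        else:
            exec(compile(tree, "<sketch>", "exec"), namespace)
    return buffer.getvalue().rstrip("\n"), last_value


print(run_like_repl("2 + 2"))
print(run_like_repl("x = 10\nprint(f'x = {x}')\nx * 2"))
print(run_like_repl("print('hello')"))
```

A call such as print('hello') is itself an expression, but it evaluates to None, which is why the result is None in that case.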

Multi-line Code

result = session_live_gpu.run("""
import torch
import torch.nn as nn

# Create a simple model
model = nn.Linear(10, 5).cuda()
x = torch.randn(32, 10).cuda()
output = model(x)

print(f"Input shape: {x.shape}")
print(f"Output shape: {output.shape}")
print(f"Model parameters: {sum(p.numel() for p in model.parameters())}")
""")

print(result.output)

run() vs exec()

run(): returns an ExecutionResult with output, result, and success. You handle errors yourself.
exec(): returns True on success and raises ExecutionError if the code fails. A shortcut for code that doesn't need a return value.

Both execute code the same way. The only difference is error handling.
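The relationship can be pictured as exec() being a thin wrapper that turns a failed result into an exception. The sketch below is a local illustration only: run_local and exec_local are hypothetical names, the ExecutionError class is a stand-in for the SDK exception, and the code runs on your machine through Python's built-in exec().

```python
class ExecutionError(Exception):
    """Local stand-in for the SDK exception of the same name."""


def run_local(code: str) -> dict:
    """run()-style: never raises, reports failure in the returned value."""
    try:
        exec(code, {})
        return {"success": True, "error": None}
    except Exception as exc:
        return {"success": False, "error": f"{type(exc).__name__}: {exc}"}


def exec_local(code: str) -> bool:
    """exec()-style: returns True on success, raises on failure."""
    outcome = run_local(code)
    if not outcome["success"]:
        raise ExecutionError(outcome["error"])
    return True


failed = run_local("1 / 0")
print(failed["success"], failed["error"])
print(exec_local("total = 1 + 1"))
```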

Important: Each run() or exec() call is isolated — variables don't persist between calls. To persist variables, use persistent mode (see Persistent Sessions section):

# WRONG: each exec() call is isolated, so torch is not defined in the second call
session_live_gpu.exec("import torch")
session_live_gpu.exec("model = torch.nn.Linear(10, 5).cuda()")  # NameError!

# CORRECT: everything in a single call
session_live_gpu.exec("""
import torch
model = torch.nn.Linear(10, 5).cuda()
optimizer = torch.optim.Adam(model.parameters())
print(f"Model parameters: {sum(p.numel() for p in model.parameters())}")
""")

# CORRECT: or use persistent mode
session_live_gpu.start()  # Enables persistent mode
session_live_gpu.exec("import torch")
session_live_gpu.exec("model = torch.nn.Linear(10, 5).cuda()")  # torch is defined
session_live_gpu.exec("optimizer = torch.optim.Adam(model.parameters())")
session_live_gpu.stop()

Shell Commands

Execute shell commands on the remote session pod:

# Check current directory
result = session_live_gpu.shell("pwd")
print(result.output)  # /home/coder/workspace

# List files (full path or ~/workspace)
result = session_live_gpu.shell("ls -la /home/coder/workspace")
print(result.output)

result = session_live_gpu.shell("ls -la ~/workspace")
print(result.output)

# Create directories and files
result = session_live_gpu.shell("mkdir -p ~/workspace/models && ls ~/workspace")
print(result.output)

# Chain multiple commands
result = session_live_gpu.shell("cd ~/workspace && mkdir -p data && ls -la")
print(result.output)

# Check disk space
result = session_live_gpu.shell("df -h")
print(result.output)

# Check memory
result = session_live_gpu.shell("free -h")
print(result.output)

# Install packages
result = session_live_gpu.shell("pip install transformers datasets")
print(result.output)

# Download files
result = session_live_gpu.shell(
    "wget https://archive.ics.uci.edu/static/public/53/iris.zip -O ~/workspace/data.zip"
)
print(result.output)

result = session_live_gpu.shell(
    "wget https://huggingface.co/datasets/scikit-learn/iris/resolve/main/Iris.csv -O ~/workspace/data.csv"
)
print(result.output)

Checking Exit Codes

result = session_live_gpu.shell("ls /nonexistent")
print(f"Exit code: {result.exit_code}")
print(f"Success: {result.success}")
print(f"result content : {result}")
print(f"result output : {result.output}")
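For comparison, the same exit-code pattern can be reproduced locally with the standard subprocess module. This is a sketch, not the SDK's implementation: ShellResult and local_shell are hypothetical names, and it assumes that success simply mirrors an exit code of zero.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class ShellResult:
    output: str
    exit_code: int

    @property
    def success(self) -> bool:
        # Assumption: success means the process exited with code 0.
        return self.exit_code == 0


def local_shell(argv: list) -> ShellResult:
    """Run a command locally and collect its output and exit code."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    return ShellResult(output=proc.stdout + proc.stderr, exit_code=proc.returncode)


# A child Python process that exits with code 2, standing in for a failing command.
failing = local_shell([sys.executable, "-c", "import sys; sys.exit(2)"])
passing = local_shell([sys.executable, "-c", "print('ok')"])
print(failing.exit_code, failing.success)
print(passing.exit_code, passing.success)
```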

Variable Transfer

Important: set() and get() require persistent mode (start()/stop()) so that variables persist between calls.

Sending Variables to the Remote Session

# Start persistent mode (variables persist between calls)
session_live_gpu.start()

# Send local data to the remote session
data = [1, 2, 3, 4, 5, 99]
session_live_gpu.set("my_data", data)

# Use it in remote code
session_live_gpu.run("print(f'Data: {my_data}')")
session_live_gpu.run("print(f'Sum: {sum(my_data)}')")

session_live_gpu.stop()

Retrieving Variables from the Remote Session

session_live_gpu.start()

# Compute something on the remote session
session_live_gpu.run("""
import torch
tensor = torch.randn(100, 100).cuda()
result_stats = {
    'mean': tensor.mean().item(),
    'std': tensor.std().item(),
    'shape': list(tensor.shape)
}
""")

# Get the result locally
stats = session_live_gpu.get("result_stats")
print(f"Mean: {stats['mean']:.4f}")
print(f"Std: {stats['std']:.4f}")
print(f"Shape: {stats['shape']}")

session_live_gpu.stop()

Sending Complex Objects

import numpy as np

session_live_gpu.start()

# Send numpy arrays
arr = np.random.randn(100, 100)
session_live_gpu.set("numpy_array", arr)

# Send dictionaries
config = {
    "learning_rate": 0.001,
    "batch_size": 32,
    "epochs": 100
}
session_live_gpu.set("config", config)

# Use in remote code
session_live_gpu.run("""
import torch
tensor = torch.from_numpy(numpy_array).cuda()
print(f"Learning rate: {config['learning_rate']}")
""")

session_live_gpu.stop()
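The wire format used by set() and get() is not described here. As an illustration of what any such transfer has to do, the sketch below round-trips a value through pickle and base64 so it can travel as text. The encode_variable and decode_variable helpers are hypothetical, and the choice of pickle is an assumption, not a statement about the SDK.

```python
import base64
import pickle


def encode_variable(value) -> str:
    """Serialize a Python object to an ASCII-safe string."""
    return base64.b64encode(pickle.dumps(value)).decode("ascii")


def decode_variable(payload: str):
    """Rebuild the object; only unpickle payloads from a source you trust."""
    return pickle.loads(base64.b64decode(payload))


config = {"learning_rate": 0.001, "batch_size": 32, "epochs": 100}
payload = encode_variable(config)
restored = decode_variable(payload)
print(restored == config)
```

Whatever the actual mechanism, the retrieval example above converts tensor statistics to plain Python values with .item() and list() before calling get(), which keeps the transferred object simple.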

File Transfer

Transfer files and folders between your local machine and the remote session.

Uploading a Single File

# Upload a local file to the remote session
session_live_gpu.upload("./data.csv", "/home/coder/workspace/data.csv")

# Upload with custom path
session_live_gpu.upload("./model.pkl", "/home/coder/workspace/models/trained_model.pkl")

# Disable progress output
session_live_gpu.upload("./config.json", "/home/coder/workspace/config.json", show_progress=False)

Downloading a Single File

# Download a file from the remote session
session_live_gpu.download("/home/coder/workspace/results.csv", "./results.csv")

# Download trained model
session_live_gpu.download("/home/coder/workspace/checkpoints/model.pt", "./local_model.pt")

# Download silently
session_live_gpu.download("/home/coder/workspace/logs.txt", "./logs.txt", show_progress=False)

Uploading a Folder

Upload an entire directory with all its contents:

# Upload a project folder
session_live_gpu.upload_folder("./my_project", "/home/coder/workspace/project")

# Upload with exclusions (default excludes: __pycache__, .git, *.pyc, .DS_Store, node_modules)
session_live_gpu.upload_folder(
    "./my_project",
    "/home/coder/workspace/project",
    exclude=["*.log", ".env", "__pycache__", ".git"]
)

# Upload data folder
session_live_gpu.upload_folder("./datasets", "/home/coder/workspace/data")
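To preview which files a given exclude list would leave in an upload, a local dry run can help. The sketch below is an approximation under stated assumptions: files_to_upload and is_excluded are hypothetical helpers, patterns are matched glob-style against each path component with the standard fnmatch module, and a custom exclude list is assumed to replace the defaults rather than extend them. The SDK's exact matching rules may differ.

```python
import fnmatch
import os
import tempfile

DEFAULT_EXCLUDES = ["__pycache__", ".git", "*.pyc", ".DS_Store", "node_modules"]


def is_excluded(relative_path: str, patterns) -> bool:
    """True if any component of the path matches any pattern."""
    parts = relative_path.split("/")
    return any(fnmatch.fnmatch(part, pat) for part in parts for pat in patterns)


def files_to_upload(root: str, exclude=None):
    """List files under root (relative, '/'-separated) that survive the filters."""
    patterns = DEFAULT_EXCLUDES if exclude is None else exclude
    selected = []
    for current, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(current, name)
            relative = os.path.relpath(full, root).replace(os.sep, "/")
            if not is_excluded(relative, patterns):
                selected.append(relative)
    return sorted(selected)


# Build a throwaway project tree to try the filters on.
with tempfile.TemporaryDirectory() as root:
    tree = ["train.py", "debug.log", "__pycache__/train.cpython-311.pyc", "data/iris.csv"]
    for relative in tree:
        path = os.path.join(root, relative)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    default_selection = files_to_upload(root)
    custom_selection = files_to_upload(root, exclude=["*.log"])

print(default_selection)
print(custom_selection)
```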

Downloading a Folder

Download an entire directory with all its contents:

# Download results folder
session_live_gpu.download_folder("/home/coder/workspace/results", "./local_results")

# Download checkpoints
session_live_gpu.download_folder(
    "/home/coder/workspace/checkpoints",
    "./checkpoints",
    exclude=["*.tmp", "*.log"]
)

# Download trained models
session_live_gpu.download_folder("/home/coder/workspace/models", "./downloaded_models")

Listing Remote Files

# List files in a directory
files = session_live_gpu.list_files("/home/coder/workspace")
for f in files:
    icon = "📁" if f["is_dir"] else "📄"
    print(f"{icon} {f['name']} - {f['size']} bytes")

# Filter by pattern
python_files = session_live_gpu.list_files("/home/coder/workspace", pattern="*.py")
for f in python_files:
    print(f"📄 {f['name']}")

# List with full details
files = session_live_gpu.list_files("/home/coder/workspace")
for f in files:
    print(f"Name: {f['name']}")
    print(f"  Path: {f['path']}")
    print(f"  Size: {f['size']} bytes")
    print(f"  Is Directory: {f['is_dir']}")
    print(f"  Modified: {f['modified']}")

Checking if a File Exists

# Check before downloading
if session_live_gpu.file_exists("/home/coder/workspace/model.pt"):
    session_live_gpu.download("/home/coder/workspace/model.pt", "./model.pt")
    print("Model downloaded!")
else:
    print("Model not found, training required...")

# Check multiple files
files_to_check = ["config.json", "data.csv", "model.pt"]
for filename in files_to_check:
    path = f"/home/coder/workspace/{filename}"
    exists = session_live_gpu.file_exists(path)
    status = "✓" if exists else "✗"
    print(f"{status} {filename}")

Complete Workflow Example

from clouditia import GPUSession

session_live_gpu = GPUSession("ck_your_api_key")

# 1. Upload training data and code
session_live_gpu.upload_folder("./training_code", "/home/coder/workspace/code")
session_live_gpu.upload("./data/train.csv", "/home/coder/workspace/data/train.csv")
session_live_gpu.upload("./data/test.csv", "/home/coder/workspace/data/test.csv")

# 2. Run training
result = session_live_gpu.run("""
import sys
sys.path.insert(0, '/home/coder/workspace/code')
from train import train_model

model = train_model('/home/coder/workspace/data/train.csv')
model.save('/home/coder/workspace/output/model.pt')
print("Training complete!")
""")

# 3. Check and download results
if session_live_gpu.file_exists("/home/coder/workspace/output/model.pt"):
    session_live_gpu.download("/home/coder/workspace/output/model.pt", "./trained_model.pt")
    print("Model saved locally!")

# 4. Download all outputs
session_live_gpu.download_folder("/home/coder/workspace/output", "./results")
print("All results downloaded!")

# 5. List what was created
files = session_live_gpu.list_files("/home/coder/workspace/output")
print(f"Created {len(files)} files during training")

Working with Different File Types

# CSV files
session_live_gpu.upload("./data.csv", "/home/coder/workspace/data.csv")

# Pickle files (models, data)
session_live_gpu.upload("./model.pkl", "/home/coder/workspace/model.pkl")

# PyTorch models
session_live_gpu.download("/home/coder/workspace/checkpoint.pt", "./checkpoint.pt")

# JSON configuration
session_live_gpu.upload("./config.json", "/home/coder/workspace/config.json")

# Text files
session_live_gpu.upload("./requirements.txt", "/home/coder/workspace/requirements.txt")

# Binary files
session_live_gpu.upload("./image.png", "/home/coder/workspace/image.png")

# Any file type works!
session_live_gpu.upload("./data.parquet", "/home/coder/workspace/data.parquet")
session_live_gpu.upload("./weights.h5", "/home/coder/workspace/weights.h5")

S3 Output

Save your outputs directly to Amazon S3 or compatible storage (MinIO, etc.).

Installation

To use S3 features, install with the s3 extra:

pip install clouditia[s3]

Creating an S3 Connection

Two ways to create an S3 connection:

from clouditia import GPUSession, S3Connection

session_live_gpu = GPUSession("sk_live_your_api_key")

# Method 1: via session (recommended)
s3 = session_live_gpu.s3_connect(
    bucket="my-ml-outputs",
    access_key="AKIAIOSFODNN7EXAMPLE",
    secret_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    region="eu-west-1",
    prefix="experiments/run-001/"  # Optional: prefix for all uploads
)

# Method 2: create S3Connection directly
s3 = S3Connection(
    bucket="my-ml-outputs",
    access_key="AKIAIOSFODNN7EXAMPLE",
    secret_key="wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    endpoint="https://s3.amazonaws.com",  # Default: AWS S3
    region="eu-west-1",
    prefix="experiments/run-001/"
)

output() vs output_file()

Two methods to save data to S3:

output(filename, data, s3) — saves a Python object in memory (variable, array, dict) to S3. The SDK serializes it automatically based on the file extension. The object doesn't need to exist on disk.
output_file(local_path, s3) — uploads an existing file from your local disk to S3. Useful for files generated by a script or downloaded.

Saving Python Objects to S3 (output)

The serialization format is auto-detected from the file extension:

.pt, .pth: PyTorch state dict
.npy: NumPy array
.json: JSON data
.pkl, .pickle: Pickle format (default)
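The extension-to-format mapping above can be expressed as a small lookup with pickle as the fallback. This sketch is illustrative only: pick_format and serialize are hypothetical names, and only the JSON and pickle branches are implemented so the example runs without PyTorch or NumPy installed.

```python
import json
import os
import pickle

# Mapping taken from the list above; unknown extensions fall back to pickle.
FORMATS = {
    ".pt": "torch",
    ".pth": "torch",
    ".npy": "numpy",
    ".json": "json",
    ".pkl": "pickle",
    ".pickle": "pickle",
}


def pick_format(filename: str) -> str:
    """Choose a serialization format from the file extension."""
    extension = os.path.splitext(filename)[1].lower()
    return FORMATS.get(extension, "pickle")


def serialize(filename: str, data) -> bytes:
    """Serialize data for the stdlib-only formats of this sketch."""
    fmt = pick_format(filename)
    if fmt == "json":
        return json.dumps(data).encode("utf-8")
    if fmt == "pickle":
        return pickle.dumps(data)
    raise NotImplementedError(f"{fmt} serialization is outside this sketch")


print(pick_format("embeddings.npy"))
print(pick_format("results"))
print(serialize("metrics.json", {"accuracy": 0.95}))
```

Under this reading, a filename without an extension would be stored as a pickle, so giving the file an explicit extension makes the format predictable.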

# Save NumPy arrays
import numpy as np
embeddings = np.random.randn(1000, 768)
url = session_live_gpu.output("embeddings_aina_23052026.npy", embeddings, s3)

# Save JSON metrics
metrics = {"accuracy": 0.95, "loss": 0.05, "epoch": 100}
url = session_live_gpu.output("metrics.json", metrics, s3)

# Save any picklable object
results = {"predictions": [1, 2, 3], "embeddings": embeddings, "metrics": metrics}
url = session_live_gpu.output("results.pkl", results, s3)

Uploading Local Files to S3 (output_file)

# Upload a file that already exists on your local disk
url = session_live_gpu.output_file("./checkpoints/best_model.pt", s3)
print(f"Uploaded to: {url}")

# remote_filename: choose the path and name of the file on S3
# By default, the file keeps its local name (best_model.pt)
# With remote_filename, you choose the S3 path structure
url = session_live_gpu.output_file(
    "./model.pt",                                       # local file
    s3,
    remote_filename="models/production/v2.0/model.pt"   # path on S3
)
# Result on S3: s3://my-bucket/prefix/models/production/v2.0/model.pt
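The final location combines the bucket, the connection prefix, and either the local file name or remote_filename. The helper below sketches that composition; build_s3_uri is a hypothetical name, and the joining rules (slashes trimmed, prefix placed before the file name) are an assumption based on the comment above.

```python
import posixpath


def build_s3_uri(bucket: str, prefix: str, local_path: str, remote_filename=None) -> str:
    """Compose an s3:// URI from bucket, prefix, and the chosen object name."""
    name = remote_filename or posixpath.basename(local_path)
    key = posixpath.join(prefix.strip("/"), name.lstrip("/"))
    return f"s3://{bucket}/{key}"


print(build_s3_uri("my-ml-outputs", "experiments/run-001/", "./checkpoints/best_model.pt"))
print(
    build_s3_uri(
        "my-ml-outputs",
        "experiments/run-001/",
        "./model.pt",
        remote_filename="models/production/v2.0/model.pt",
    )
)
```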

Saving Remote Session Data to S3 (remote_output / remote_output_file)

The output() and output_file() methods save local objects/files to S3. The remote_output() and remote_output_file() methods save objects/files from the remote session directly to S3, without transiting through your local machine.

Method	Source	Destination	Local transit
`output()`	local Python object (in memory)	S3	yes
`output_file()`	local file (on disk)	S3	yes
`remote_output()`	Python variable on remote session	S3	no
`remote_output_file()`	file on remote session	S3	no

# remote_output() requires persistent mode (variable must stay in memory)
session_live_gpu.start()

session_live_gpu.run("""
import torch
model = torch.nn.Linear(784, 10).cuda()
optimizer = torch.optim.Adam(model.parameters())
# ... training ...
results = {"accuracy": 0.95, "loss": 0.05, "epochs": 100}
torch.save(model.state_dict(), "/home/coder/workspace/model.pt")
""")

# Save the 'results' variable from the remote session to S3 (JSON format)
url = session_live_gpu.remote_output("results.json", "results", s3)

session_live_gpu.stop()

# remote_output_file() does NOT need start()/stop()
# because the file is on the pod's disk (it persists between calls)
url = session_live_gpu.remote_output_file("/home/coder/workspace/model.pt", s3)

# With a custom name on S3
url = session_live_gpu.remote_output_file(
    "/home/coder/workspace/model.pt",
    s3,
    s3_filename="models/production/v3/model.pt"
)

Using with MinIO or Other S3-Compatible Storage

# MinIO connection
s3_minio = session_live_gpu.s3_connect(
    bucket="ml-outputs",
    access_key="minio_user",
    secret_key="minio_password",
    endpoint="https://minio.endpoint.url",  # Custom endpoint, e.g. "http://minio.local:9000"
    region="us-east-1"
)

metrics_minio = {"accuracy": 0.95, "loss": 0.05, "epoch": 100}
session_live_gpu.output("metrics_minio.json", metrics_minio, s3_minio)

Complete Training Workflow with S3 Output

from clouditia import GPUSession

session_live_gpu = GPUSession("sk_live_your_api_key")

# Configure S3 output
s3 = session_live_gpu.s3_connect(
    bucket="my-training-outputs",
    access_key="AKIA...",
    secret_key="...",
    prefix="training/experiment-001/"
)

# Start persistent session for training
session_live_gpu.start()

# Setup
session_live_gpu.run("""
import torch
import torch.nn as nn

model = nn.Linear(100, 10).cuda()
optimizer = torch.optim.Adam(model.parameters())
""")

# Training loop
session_live_gpu.run("""
for epoch in range(100):
    x = torch.randn(32, 100).cuda()
    y = model(x)
    loss = y.sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if (epoch + 1) % 10 == 0:
        print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")
""")

# Get model state and save to S3
session_live_gpu.run("final_state = model.state_dict()")
model_state = session_live_gpu.get("final_state")

url = session_live_gpu.output("trained_model.pt", model_state, s3)
print(f"Model saved to: {url}")

# Save training metrics
metrics = {"final_loss": 0.05, "epochs": 100}
session_live_gpu.output("metrics.json", metrics, s3)

session_live_gpu.stop()

Remote Functions (Decorator)

Use the @session_live_gpu.remote decorator to run functions on the remote session:

from clouditia import GPUSession

session_live_gpu = GPUSession("ck_your_api_key")

@session_live_gpu.remote
def compute_on_gpu(data, power=2):
    import torch
    tensor = torch.tensor(data, device='cuda', dtype=torch.float32)
    result = tensor ** power
    return result.cpu().tolist()

# Call the function - it runs on the remote session!
result = compute_on_gpu([1, 2, 3, 4, 5], power=2)
print(result)  # [1.0, 4.0, 9.0, 16.0, 25.0]

Remote Function with Model

@session_live_gpu.remote
def train_step(batch_data, learning_rate=0.01):
    import torch
    import torch.nn as nn

    # Create model (or load from checkpoint)
    model = nn.Sequential(
        nn.Linear(len(batch_data), 64),
        nn.ReLU(),
        nn.Linear(64, 1)
    ).cuda()

    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    # Training step
    x = torch.tensor(batch_data, dtype=torch.float32).cuda()
    output = model(x)
    loss = output.sum()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    return {"loss": loss.item()}

# Call it like a normal function
result = train_step([1.0, 2.0, 3.0, 4.0], learning_rate=0.001)
print(f"Loss: {result['loss']}")

Async Remote Functions

@session_live_gpu.remote(async_mode=True)
def long_training():
    import torch
    for epoch in range(100):
        print(f"Epoch {epoch}/100")
        # ... training code ...
    return {"status": "completed"}

# Returns an AsyncJob instead of waiting
job = long_training()
print(f"Job submitted: {job.job_id}")

# Wait for completion
result = job.wait(show_logs=True)
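The decorator is used both bare (@session_live_gpu.remote) and with options (@session_live_gpu.remote(async_mode=True)). A common way to support both forms in Python is shown below. This is a generic sketch, not the SDK's code: the standalone remote function runs the wrapped function locally, and LocalJob is a hypothetical stand-in for a job handle.

```python
import functools


class LocalJob:
    """Hypothetical stand-in for a job handle: holds a deferred call."""

    def __init__(self, call):
        self._call = call

    def wait(self):
        return self._call()


def remote(func=None, *, async_mode=False):
    """Decorator usable as @remote or @remote(async_mode=True)."""

    def decorate(target):
        @functools.wraps(target)
        def wrapper(*args, **kwargs):
            call = functools.partial(target, *args, **kwargs)
            # Stand-in behavior: run locally. The real decorator runs remotely.
            return LocalJob(call) if async_mode else call()

        return wrapper

    # Bare @remote passes the function itself; @remote(...) passes only options.
    return decorate(func) if callable(func) else decorate


@remote
def square_all(values, power=2):
    return [v ** power for v in values]


@remote(async_mode=True)
def long_task():
    return {"status": "completed"}


print(square_all([1, 2, 3]))
print(long_task().wait())
```

Note that the examples above import torch inside the function body, so the import is part of the code that runs on the remote side.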

Async Jobs (Long-Running Tasks)

For tasks that take hours or days, use async jobs:

Submitting a Job

# Submit a long-running job
job = session_live_gpu.submit("""
import torch
import time

print("Starting training...")
for epoch in range(100):
    print(f"Epoch {epoch + 1}/100")
    time.sleep(1)  # Simulate training

print("Training complete!")
torch.save({'epoch': 100}, '/home/coder/workspace/checkpoint.pt')
""", name="my_training")

print(f"Job ID: {job.job_id}")

Monitoring Progress

import time

# Poll for status
while not job.is_done():
    status = job.status()
    print(f"Status: {status}")

    # View recent logs
    if status == "running":
        logs = job.logs(tail=10)
        print(logs)

    time.sleep(30)

print("Job finished!")

Real-Time Log Streaming

# View logs as they come in
while job.is_running():
    new_logs = job.logs(new_only=True)
    if new_logs.strip():
        print(new_logs, end='')
    time.sleep(5)
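The new_only=True flag implies that something remembers how much of the log has already been returned. A minimal local model of that idea is sketched here; LogBuffer is a hypothetical class, and whether the real cursor lives in the client or on the server is not specified in this document.

```python
class LogBuffer:
    """Keeps log lines and a cursor marking what has already been read."""

    def __init__(self):
        self._lines = []
        self._cursor = 0

    def append(self, line: str) -> None:
        self._lines.append(line)

    def logs(self, tail: int = 50, new_only: bool = False) -> str:
        if new_only:
            # Return only lines added since the previous new_only call.
            fresh = self._lines[self._cursor:]
            self._cursor = len(self._lines)
            return "\n".join(fresh)
        return "\n".join(self._lines[-tail:])


buffer = LogBuffer()
buffer.append("Epoch 1/100")
buffer.append("Epoch 2/100")
first_read = buffer.logs(new_only=True)
second_read = buffer.logs(new_only=True)
buffer.append("Epoch 3/100")
third_read = buffer.logs(new_only=True)
print(repr(first_read), repr(second_read), repr(third_read))
```

This is why the streaming loop above checks new_logs.strip() before printing: a poll with nothing new returns an empty string.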

Waiting for Completion

# Wait with live log output
result = job.wait(show_logs=True)

# Or wait with timeout
try:
    result = job.wait(timeout=3600)  # 1 hour max
except TimeoutError:
    print("Job taking too long, cancelling...")
    job.cancel()

Getting Results

# Wait for the job to complete before getting the result
job.wait()  # blocks until completion

# Get the result
result = job.result()

if job.status() == "running":
    print("Job still running...")
elif result.success:
    print("Job completed successfully!")
    print(result.output)
else:
    print(f"Job failed: {result.error}")

Listing Jobs

# List all jobs
jobs = session_live_gpu.jobs()
for j in jobs:
    print(f"{j.name}: {j.status()}")

# List only running jobs
running_jobs = session_live_gpu.jobs(status="running")

# List completed jobs
completed_jobs = session_live_gpu.jobs(status="completed", limit=5)

Cancelling Jobs

if job.is_running():
    job.cancel()
    print("Job cancelled")

Shell Jobs

# Submit a shell command as an async job
job = session_live_gpu.submit(
    "pip install transformers && python /home/coder/workspace/train.py",
    name="install_and_train",
    job_type="shell"
)

Jupyter Magic

Use Clouditia directly in Jupyter notebooks with magic commands.

Loading the Extension

# In a Jupyter cell
%load_ext clouditia

# Set your API key
CLOUDITIA_API_KEY = "ck_your_api_key"

Running Code on Remote Session

%%clouditia
import torch
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU: {torch.cuda.get_device_name(0)}")

x = torch.randn(1000, 1000, device='cuda')
y = torch.randn(1000, 1000, device='cuda')
z = torch.matmul(x, y)
print(f"Result shape: {z.shape}")

Specifying API Key Directly

%%clouditia ck_your_api_key
print("Hello from GPU!")

Async Mode in Jupyter

%%clouditia --async
for epoch in range(100):
    print(f"Epoch {epoch}")
    # ... training code ...

# The job is submitted and _clouditia_job variable is set

# Check job status
_clouditia_job.status()

# View logs
print(_clouditia_job.logs())

Utility Magic Commands

# Check session status
%clouditia_status

# List recent jobs
%clouditia_jobs

# List only running jobs
%clouditia_jobs running

Error Handling

The SDK provides specific exceptions for different error types:

from clouditia import (
    GPUSession,
    ClouditiaError,
    AuthenticationError,
    SessionError,
    ExecutionError,
    TimeoutError,
    CommandBlockedError
)

session_live_gpu = GPUSession("ck_your_api_key")

try:
    result = session_live_gpu.run("import torch; print(torch.cuda.is_available())")
except AuthenticationError:
    print("Invalid API key")
except SessionError:
    print("Session not running or not accessible")
except ExecutionError as e:
    print(f"Code execution failed: {e}")
except TimeoutError:
    print("Execution timed out - consider using async jobs")
except CommandBlockedError:
    print("Command blocked by security filters")
except ClouditiaError as e:
    print(f"General error: {e}")

Using raise_for_status()

result = session_live_gpu.run("import torch; print(torch.cuda.is_available())")
result.raise_for_status()  # Raises ExecutionError if failed
print(result.output)

API Reference

GPUSession

GPUSession(
    api_key: str,
    base_url: str = "https://clouditia.com/code-editor",
    timeout: int = 120,
    poll_interval: int = 5
)

Methods:

Method	Description
Connection & Info
`verify()`	Verify API key and get session info
`wait_until_ready(timeout=1800, poll_interval=5, verbose=True)`	Wait until session is fully ready (workspace restored)
`gpu_info()`	Get GPU information (name, memory, CUDA version)
Code Execution
`run(code, timeout=None, stream=True)`	Execute Python code (REPL-like: captures last expression)
`exec(code, timeout=None)`	Execute code, raises `ExecutionError` on failure
`shell(command, timeout=None)`	Execute shell command (security-filtered)
Persistent Mode
`start()`	Start a persistent session (variables persist between calls)
`stop()`	Stop the persistent session
`set(name, value)`	Send a variable to the remote session
`get(name)`	Retrieve a variable from the remote session
File Transfer
`upload(local_path, remote_path, show_progress=True)`	Upload a file (auto-chunked for large files)
`download(remote_path, local_path, show_progress=True)`	Download a file (auto-chunked for large files)
`upload_folder(local_path, remote_path, exclude=None)`	Upload a folder (compressed + chunked)
`download_folder(remote_path, local_path, exclude=None)`	Download a folder (compressed + chunked)
`list_files(remote_path, pattern=None)`	List files in remote directory
`file_exists(remote_path)`	Check if a file exists on remote
S3 Output (local)
`s3_connect(bucket, access_key, secret_key, ...)`	Create S3 connection
`output(filename, data, s3_connection)`	Save local Python object to S3
`output_file(local_path, s3_connection, remote_filename=None)`	Upload local file to S3
S3 Output (remote — no local transit)
`remote_output(filename, variable_name, s3)`	Save remote session variable directly to S3
`remote_output_file(remote_path, s3, s3_filename=None)`	Upload remote session file directly to S3
Async Jobs
`submit(code, name=None, job_type="python")`	Submit async background job
`jobs(status=None, limit=10)`	List async jobs
Decorator
`@remote`	Decorator to run a function on the remote session

Properties:

Property	Type	Description
`is_persistent`	bool	`True` if a persistent session is active
`api_key`	str	The API key used for authentication
`base_url`	str	The API base URL
`timeout`	int	Default timeout in seconds

ExecutionResult

ExecutionResult(
    output: str,      # all output (print statements + last expression)
    result: Any,      # value of the last line if it's an expression (None otherwise)
    error: str,       # error message if failed
    exit_code: int,   # process exit code
    success: bool     # True if execution succeeded
)

Difference between output and result:

output = all stdout + last expression (like a Jupyter cell)
result = only the last expression for programmatic use (e.g., int(result.result))

Methods:

Method	Description
`raise_for_status()`	Raise exception if failed
`to_dict()`	Convert to dictionary
`__bool__()`	`True` if `success=True` (use in `if result:`)
`__str__()`	Returns `output` if success, `"Error: ..."` otherwise
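The documented fields and methods map naturally onto a dataclass. The sketch below shows one way such a result type could behave; ExecutionResultSketch is a hypothetical local class, the default field values are assumptions, and the ExecutionError here is a stand-in for the SDK exception.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


class ExecutionError(Exception):
    """Local stand-in for the SDK exception of the same name."""


@dataclass
class ExecutionResultSketch:
    output: str = ""
    result: Any = None
    error: Optional[str] = None
    exit_code: int = 0
    success: bool = True

    def raise_for_status(self) -> None:
        """Raise if the execution failed, otherwise do nothing."""
        if not self.success:
            raise ExecutionError(self.error)

    def to_dict(self) -> dict:
        return asdict(self)

    def __bool__(self) -> bool:
        return self.success

    def __str__(self) -> str:
        return self.output if self.success else f"Error: {self.error}"


ok = ExecutionResultSketch(output="4", result=4)
failed = ExecutionResultSketch(error="NameError: name 'x' is not defined",
                               exit_code=1, success=False)
print(bool(ok), str(ok))
print(bool(failed), str(failed))
```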

AsyncJob

AsyncJob(session, job_id, name=None)

Methods:

Method	Description
`status()`	Get current status
`is_done()`	Check if finished
`is_running()`	Check if running
`is_pending()`	Check if pending
`logs(tail=50, new_only=False)`	Get logs
`result()`	Get final result
`cancel()`	Cancel the job
`wait(timeout=None, show_logs=False)`	Wait for completion
`get_info()`	Get detailed job info

Exceptions

Exception	Description
`ClouditiaError`	Base exception for all Clouditia errors
`AuthenticationError`	Invalid or expired API key
`SessionError`	Session not found, not running, or not accessible
`ExecutionError`	Code execution failed on the remote session
`TimeoutError`	Execution timed out
`CommandBlockedError`	Command blocked by security filters

Hierarchy: All exceptions inherit from ClouditiaError.
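
Because every exception derives from `ClouditiaError`, `except` clauses should list the specific types first and the base class last. The sketch below illustrates that ordering with locally defined stand-in classes that copy only the documented names and hierarchy, so it runs without the SDK. The `classify()` helper and its labels are hypothetical and are not a recommended retry policy.

```python
# Stand-ins that mirror the documented hierarchy; not the SDK's own classes.
class ClouditiaError(Exception):
    pass


class AuthenticationError(ClouditiaError):
    pass


class SessionError(ClouditiaError):
    pass


class ExecutionError(ClouditiaError):
    pass


class TimeoutError(ClouditiaError):  # shadows the built-in within this module
    pass


class CommandBlockedError(ClouditiaError):
    pass


def classify(action):
    """Run `action` and map the exception type to a short label."""
    try:
        action()
    except AuthenticationError:
        return "check API key"
    except (SessionError, TimeoutError):
        return "retry later"
    except ClouditiaError as exc:
        # Catch-all for the remaining SDK errors; must come last
        return f"other SDK error: {type(exc).__name__}"
    return "ok"


def fail_with(exc):
    """Build a zero-argument callable that raises `exc`."""
    def action():
        raise exc
    return action


print(classify(fail_with(AuthenticationError("bad key"))))     # check API key
print(classify(fail_with(CommandBlockedError("blocked"))))     # other SDK error: CommandBlockedError
```

Note that the name `TimeoutError` matches Python's built-in exception. If the SDK exports it under that name, importing it directly shadows the built-in in that module, so importing it under an alias may be clearer.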

Support

Documentation: https://clouditia.com/docsapisession
Email: support@clouditia.com

License

MIT License
