Official Python SDK for Podstack GPU Notebook Platform

Podstack Python SDK

Official Python SDK for the Podstack GPU Platform. Run ML workloads on remote GPUs with simple decorators, track experiments, and manage models.

Installation

pip install podstack

With optional dependencies:

pip install podstack[torch]        # PyTorch support
pip install podstack[huggingface]  # HuggingFace Transformers
pip install podstack[all]          # All ML frameworks

Quick Start

import podstack

# Initialize the SDK
podstack.init(
    api_key="your-api-key",
    project_id="your-project-id"
)

# Run a function on a remote GPU with a single decorator
@podstack.gpu(type="L40S", fraction=100)
def train():
    import torch
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    return {"status": "done"}

result = train()  # Executes on remote GPU!

Decorators & Annotations

Podstack provides decorators that turn any Python function into a remote GPU workload with built-in experiment tracking.

@podstack.gpu - Remote GPU Execution

import podstack

# Basic GPU execution
@podstack.gpu(type="L40S")
def train_model():
    import torch
    model = torch.nn.Linear(768, 10).cuda()
    return {"params": sum(p.numel() for p in model.parameters())}

result = train_model()

# Specify GPU type, count, and fraction
@podstack.gpu(type="A100-80G", count=2, fraction=100)
def train_large_model():
    import torch
    print(f"GPUs available: {torch.cuda.device_count()}")

# Install pip packages on the fly
@podstack.gpu(type="L40S", pip=["transformers", "datasets", "accelerate"])
def finetune_llm():
    from transformers import AutoModelForCausalLM, AutoTokenizer
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    ...

# Use uv for faster package installation
@podstack.gpu(type="L40S", uv=["torch", "transformers"])
def fast_setup():
    ...

# Install from requirements.txt
@podstack.gpu(type="L40S", requirements="requirements.txt", use_uv=True)
def train_with_deps():
    ...

# Use conda packages
@podstack.gpu(type="L40S", conda="cudatoolkit=11.8")
def train_with_conda():
    ...

# Use a pre-built environment
@podstack.gpu(type="L40S", env="nlp")
def nlp_task():
    ...

# Set execution timeout (default: 3600s)
@podstack.gpu(type="L40S", timeout=7200)
def long_training():
    ...

# Disable remote execution (run locally for debugging)
@podstack.gpu(type="L40S", remote=False)
def debug_locally():
    print("This runs on your local machine")

# Use as a context manager
with podstack.gpu(type="A100-80G", count=2) as cfg:
    print(f"GPU config set: {cfg.type}")

Available GPU types: T4, L4, A10, L40S, A100-40G, A100-80G, H100

Available environments: ml, nlp, cv, audio, tabular, rl, scientific
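The examples above use podstack.gpu both as a decorator and as a context manager. How the SDK implements this is not shown here. The following is only a minimal sketch of the general Python pattern, using a hypothetical GpuConfig class that runs everything locally.

```python
import functools


class GpuConfig:
    """Hypothetical stand-in, not the Podstack class: holds GPU settings and
    can be used either as a decorator or as a context manager."""

    def __init__(self, type, count=1):
        self.type = type
        self.count = count

    def __call__(self, func):
        # Decorator use: wrap the function and attach the config to the wrapper.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # A real SDK would ship the function to a remote GPU here;
            # this sketch just runs it locally.
            return func(*args, **kwargs)

        wrapper.gpu_config = self
        return wrapper

    def __enter__(self):
        # Context-manager use: hand the config object to the `with` block.
        return self

    def __exit__(self, exc_type, exc, tb):
        # Returning False lets any exception from the block propagate.
        return False


@GpuConfig(type="L40S")
def train():
    return {"status": "done"}


with GpuConfig(type="A100-80G", count=2) as cfg:
    config_summary = f"{cfg.count} x {cfg.type}"

print(train(), train.gpu_config.type, config_summary)
```

One object serves both uses because `__call__` handles decoration while `__enter__` and `__exit__` handle the `with` statement.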

@podstack.experiment - Experiment Tracking

import podstack

# As a decorator
@podstack.experiment(name="transformer-experiments")
def run_experiment():
    ...

# As a context manager
with podstack.experiment(name="transformer-experiments") as exp:
    print(f"Experiment ID: {exp.id}")

@podstack.run - Run Tracking

Automatically tracks execution time and GPU configuration.

import podstack

# As a decorator
@podstack.experiment(name="my-experiment")
@podstack.run(name="training-v1", track_gpu=True)
def train():
    podstack.registry.log_params({"lr": 0.001, "batch_size": 32})
    for epoch in range(10):
        loss = 1.0 / (epoch + 1)
        podstack.registry.log_metrics({"loss": loss}, step=epoch)

# As a context manager
with podstack.run(name="training-v1") as run:
    podstack.registry.log_params({"lr": 0.001})
    podstack.registry.log_metrics({"loss": 0.5}, step=1)
    print(f"Run ID: {run.id}")

# With tags
@podstack.run(name="ablation-study", tags={"variant": "no-dropout"})
def ablation():
    ...
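To make "tracks execution time" concrete, here is a minimal local sketch of a timing context manager. The timed_run helper and its record fields are hypothetical and are not part of the Podstack API; the sketch only illustrates how a run wrapper could capture duration and final status.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_run(name, clock=time.perf_counter):
    """Hypothetical helper: records how long the body of the `with` block took."""
    record = {"name": name, "status": "running", "duration_seconds": None}
    start = clock()
    try:
        yield record
        record["status"] = "completed"
    except Exception:
        record["status"] = "failed"
        raise
    finally:
        # Runs on success and on failure, so the duration is always filled in.
        record["duration_seconds"] = clock() - start


with timed_run("training-v1") as run_record:
    total = sum(i * i for i in range(1000))

print(run_record)
```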

@podstack.model - Model Registration

import podstack

# Register model after function completes
@podstack.experiment(name="my-experiment")
@podstack.run(name="training-v1")
@podstack.model.register(name="my-classifier")
def train_and_save():
    import torch
    model = torch.nn.Linear(768, 10)
    torch.save(model.state_dict(), "model.pt")
    podstack.registry.log_artifact("model.pt", "model")

# Promote model to production after validation
@podstack.model.promote(name="my-classifier", version=1, stage="production")
def validate_and_promote():
    # Run validation checks
    accuracy = 0.95
    assert accuracy > 0.90, "Model doesn't meet threshold"

Combining Decorators

Stack decorators for a complete ML workflow:

import podstack

podstack.init(api_key="your-api-key", project_id="your-project-id")

@podstack.gpu(type="L40S", pip=["transformers", "datasets"])
@podstack.experiment(name="sentiment-analysis")
@podstack.run(name="bert-finetune-v1", track_gpu=True)
@podstack.model.register(name="sentiment-bert")
def full_pipeline():
    from transformers import AutoModelForSequenceClassification, Trainer

    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

    # Log hyperparameters
    podstack.registry.log_params({
        "model": "bert-base-uncased",
        "learning_rate": 2e-5,
        "epochs": 3
    })

    # Train...
    podstack.registry.log_metrics({"accuracy": 0.92, "f1": 0.89})

    return {"accuracy": 0.92}

result = full_pipeline()  # Runs on remote L40S GPU with full tracking
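Python applies stacked decorators bottom-up: the decorator closest to `def` wraps the function first, and the top one ends up outermost. In the example above, that makes @podstack.gpu the outermost wrapper. Whether the SDK depends on this order is not stated here. The sketch below uses a toy tag decorator (hypothetical, unrelated to Podstack) to show the resulting call order.

```python
calls = []


def tag(label):
    """Toy decorator factory that records when its wrapper starts and ends."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            calls.append(f"enter {label}")
            try:
                return func(*args, **kwargs)
            finally:
                calls.append(f"exit {label}")
        return wrapper
    return decorator


@tag("gpu")
@tag("experiment")
@tag("run")
def pipeline():
    calls.append("body")
    return "ok"


result = pipeline()
print(calls)
# ['enter gpu', 'enter experiment', 'enter run', 'body',
#  'exit run', 'exit experiment', 'exit gpu']
```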

Registry - Experiment Tracking & Model Management

Initialize

from podstack import registry

registry.init(
    api_key="your-api-key",
    project_id="your-project-id"
)

Track Experiments and Runs

from podstack import registry

# Set experiment
registry.set_experiment("my-experiment")

# Start a tracked run
with registry.start_run(name="training-v1") as run:
    # Log hyperparameters
    registry.log_params({
        "learning_rate": 0.001,
        "batch_size": 32,
        "epochs": 10,
        "optimizer": "adam"
    })

    # Log metrics at each step (train_epoch() and evaluate() stand in for your own code)
    for epoch in range(10):
        loss = train_epoch()
        accuracy = evaluate()
        registry.log_metrics({"loss": loss, "accuracy": accuracy}, step=epoch)

    # Set tags
    registry.set_tag("framework", "pytorch")

    # Log artifacts
    registry.log_artifact("model.pt", "model")
    registry.log_artifact("training_curves.png", "plots")

    # Log dataset provenance (first-class resource, deduped by content hash)
    registry.log_dataset("imdb-reviews", path="data/imdb.csv", context="training")

    # Or pass a DataFrame — schema and row/feature counts are auto-computed
    import pandas as pd
    df = pd.read_csv("data/imdb.csv")
    registry.log_dataset("imdb-reviews", df=df, context="training")

Log and Load Models

from podstack import registry

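# Log a model object. `model` and `run` are assumed to come from the tracked run above.
# The framework is auto-detected; here it is also passed explicitly.
registry.log_model(model, artifact_path="model", framework="pytorch")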

# Register in model registry
registry.register_model(
    name="my-classifier",
    run_id=run.id,
    description="BERT sentiment classifier"
)

# Promote to production
registry.set_model_stage("my-classifier", version=1, stage="production")

# Set aliases
registry.set_model_alias("my-classifier", alias="champion", version=1)

# Load model from registry
model = registry.load_model("my-classifier", stage="production")

Compare Runs

from podstack import registry

# Compare multiple runs
comparison = registry.compare_runs(
    run_ids=["run-id-1", "run-id-2", "run-id-3"],
    metric_keys=["loss", "accuracy"]
)

# Get metric history for a run
history = registry.get_metric_history("run-id-1", "loss")
for point in history:
    print(f"Step {point.step}: {point.value}")

# Search runs
runs = registry.search_runs(
    experiment_id="exp-id",
    status="completed",
    max_results=50
)
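Once a metric history is in hand, a common next step is to find the best step. The sketch below assumes only the `step` and `value` fields shown above. MetricPoint and best_point are hypothetical helpers, and the sample values are made up for illustration.

```python
from collections import namedtuple

# Stand-in for the objects returned by get_metric_history(); only the
# `step` and `value` fields shown above are assumed.
MetricPoint = namedtuple("MetricPoint", ["step", "value"])


def best_point(history, higher_is_better=False):
    """Return the point with the best value, or None for an empty history."""
    if not history:
        return None
    pick = max if higher_is_better else min
    return pick(history, key=lambda p: p.value)


history = [MetricPoint(0, 1.0), MetricPoint(1, 0.5), MetricPoint(2, 0.62)]
best = best_point(history)
print(f"Lowest loss {best.value} at step {best.step}")
```

Pass higher_is_better=True for metrics such as accuracy, where the largest value is the best one.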

Dataset Tracking & Lineage

Podstack tracks datasets as first-class resources, linking them to runs and model versions so you can always answer "what data was this model trained on?"

The lineage chain is:

Dataset(s) ──[logged to]──▶ Run ──[run_id]──▶ ModelVersion

log_dataset() — log a dataset to the active run

dataset = registry.log_dataset(
    name="imdb-reviews",          # required — human-readable name
    path="data/imdb.csv",         # local path or URI (s3://, gcs://, https://)
    context="training",           # "training" | "validation" | "test" (default: "training")
)

The dataset is stored as a project-level resource and linked to the current run. Subsequent calls with the same file produce the same dataset record — no duplicates.

Auto-enrichment from a local file:

# SHA-256 digest is computed automatically for files ≤ 500 MB.
# This enables deduplication across runs — if two runs use the exact
# same file, they share one Dataset record in the registry.
dataset = registry.log_dataset("imdb-reviews", path="data/imdb.csv")
print(dataset.digest)  # "a3f2c1..." — hex SHA-256
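The SDK's own hashing code is not shown here. As a rough sketch of what a hex SHA-256 content digest is, the following uses Python's standard hashlib and reads the file in chunks so that large files never need to fit in memory. The file_sha256 helper and the chunk size are illustrative choices.

```python
import hashlib
import os
import tempfile


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a small temporary file
with tempfile.NamedTemporaryFile(delete=False, suffix=".csv") as tmp:
    tmp.write(b"text,label\ngreat movie,1\n")
    demo_path = tmp.name

demo_digest = file_sha256(demo_path)
os.remove(demo_path)
print(demo_digest)
```

Because the digest depends only on the bytes, two copies of the same file under different paths hash to the same value.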

Auto-enrichment from a pandas DataFrame:

import pandas as pd

df = pd.read_csv("data/imdb.csv")

dataset = registry.log_dataset(
    name="imdb-reviews",
    df=df,
    context="training",
)
# schema and profile are computed automatically:
print(dataset.schema)   # {"text": "object", "label": "int64"}
print(dataset.profile)  # {"num_rows": 50000, "num_features": 2}

Pass both path and df to get digest-based deduplication and schema inference:

dataset = registry.log_dataset("imdb-reviews", path="data/imdb.csv", df=df)

All parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | required | Human-readable dataset name |
| path | str | None | Local file path or URI (s3://, gcs://, https://) |
| df | DataFrame | None | pandas DataFrame; schema and profile are auto-computed |
| context | str | "training" | Role of the dataset: "training", "validation", or "test" |
| digest | str | None | SHA-256 hex digest. Computed from path if not provided |
| source_type | str | "local" | Storage backend: "local", "s3", "gcs", "url" |
| tags | dict | None | Arbitrary string key-value tags |

Returns: Dataset object with fields:

| Field | Type | Description |
|---|---|---|
| id | str | UUID of the dataset record |
| name | str | Dataset name |
| digest | str | SHA-256 hex digest (empty if not computed) |
| source_type | str | Storage backend |
| source | str | File path or URI |
| schema | dict | Column → dtype mapping |
| profile | dict | num_rows, num_features, and any other stats |
| tags | dict | Tags dict |
| created_at | str | ISO 8601 timestamp |

Via the Run object (equivalent to calling registry.log_dataset()):

with registry.start_run("training-v1") as run:
    dataset = run.log_dataset("imdb-reviews", df=df, context="training")

Multiple datasets per run

Log validation and test sets alongside the training set:

with registry.start_run("bert-finetune") as run:
    run.log_dataset("imdb-train", df=train_df, context="training")
    run.log_dataset("imdb-val",   df=val_df,   context="validation")
    run.log_dataset("imdb-test",  df=test_df,  context="test")

get_run_datasets() — retrieve datasets logged to a run

Returns every Dataset object linked to a run, in the order they were logged.

datasets = registry.get_run_datasets(run_id)

Parameters:

| Parameter | Type | Description |
|---|---|---|
| run_id | str | ID of the run to query |

Returns: list[Dataset], where each item is the same Dataset type that log_dataset() returns.

Fields on each Dataset:

| Field | Type | Description |
|---|---|---|
| id | str | UUID of the dataset record |
| name | str | Human-readable name |
| digest | str | SHA-256 hex digest (empty if not computed at log time) |
| source_type | str | "local", "s3", "gcs", or "url" |
| source | str | File path or URI that was passed to log_dataset() |
| schema | dict | Column → dtype mapping (e.g. {"text": "object", "label": "int64"}) |
| profile | dict | Stats dict; contains num_rows and num_features when a DataFrame was passed |
| tags | dict | Key-value tags |
| created_at | str | ISO 8601 timestamp |

Examples:

from podstack import registry

registry.init(api_key="...", project_id="...")

datasets = registry.get_run_datasets("3a9f12c4-...")

# Inspect each dataset
for ds in datasets:
    print(ds.name)
    print(f"  source : {ds.source}")
    print(f"  digest : {ds.digest[:16]}…")
    print(f"  rows   : {ds.profile.get('num_rows', 'unknown')}")
    print(f"  schema : {ds.schema}")

Checking datasets on a run you have in hand:

with registry.start_run("training-v1") as run:
    run.log_dataset("train", df=train_df, context="training")
    run.log_dataset("val",   df=val_df,   context="validation")

# After the run completes, retrieve everything that was logged
datasets = registry.get_run_datasets(run.id)
assert len(datasets) == 2

Verifying deduplication — the same physical file logged across two runs returns the same dataset ID:

ds1 = registry.get_run_datasets(run_a.id)[0]
ds2 = registry.get_run_datasets(run_b.id)[0]

# Same file → same digest → same Dataset record
assert ds1.id == ds2.id
assert ds1.digest == ds2.digest
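The reason identical files map to one record can be illustrated with a toy in-memory index keyed by content digest. DatasetIndex is a hypothetical class written for this sketch; the real registry is a remote service whose storage is not described here.

```python
import hashlib
import uuid


class DatasetIndex:
    """Toy in-memory index (hypothetical): one record per content digest."""

    def __init__(self):
        self._by_digest = {}

    def log(self, name, content):
        digest = hashlib.sha256(content).hexdigest()
        # Reuse the existing record when the same bytes were logged before.
        if digest not in self._by_digest:
            self._by_digest[digest] = {
                "id": str(uuid.uuid4()),
                "name": name,
                "digest": digest,
            }
        return self._by_digest[digest]


index = DatasetIndex()
ds_a = index.log("imdb-reviews", b"text,label\ngreat movie,1\n")
ds_b = index.log("imdb-reviews", b"text,label\ngreat movie,1\n")
ds_c = index.log("imdb-reviews", b"text,label\nbad movie,0\n")

print(ds_a["id"] == ds_b["id"], ds_a["id"] == ds_c["id"])  # True False
```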

get_model_lineage() — trace a model back to its training data

Returns the full provenance chain for every version of a registered model: which datasets each version was trained on, via which run.

lineage = registry.get_model_lineage(model_id)

Parameters:

| Parameter | Type | Description |
|---|---|---|
| model_id | str | ID of the registered model |

Returns: dict with the following structure:

{
  "model_id": str,
  "versions": [
    {
      "version":  int,        # version number (1, 2, 3 …)
      "stage":    str,        # "development" | "staging" | "production" | "archived"
      "run_id":   str,        # ID of the linked training run (empty if none)
      "run_name": str,        # display name of the run
      "datasets": [Dataset]   # list of Dataset dicts logged to that run
    },
    …
  ]
}

Each datasets entry has the same fields as a Dataset object (id, name, digest, source_type, source, schema, profile, tags, created_at).

Examples:

Basic iteration:

from podstack import registry

registry.init(api_key="...", project_id="...")

model   = registry.get_model("sentiment-bert")
lineage = registry.get_model_lineage(model.id)

for version in lineage["versions"]:
    print(f"v{version['version']} · {version['stage']}")
    print(f"  Run: {version['run_name']} ({version['run_id'][:8]}…)")
    for ds in version["datasets"]:
        rows = ds["profile"].get("num_rows", "?")
        print(f"  └─ {ds['name']}  {rows} rows  sha256:{ds['digest'][:12]}…")

Example output:

v3 · production
  Run: bert-finetune-v3 (3a9f12c4…)
  └─ imdb-train  40000 rows  sha256:a3f2c1d8e9b0…
  └─ imdb-val     5000 rows  sha256:7e4b2f1a0c3d…
v2 · staging
  Run: bert-finetune-v2 (8b2e77d1…)
  └─ imdb-train  40000 rows  sha256:a3f2c1d8e9b0…
v1 · archived
  Run: bert-finetune-v1 (f1c3a0e2…)
  └─ imdb-train  40000 rows  sha256:a3f2c1d8e9b0…

Finding every unique dataset ever used to train any version of a model:

lineage  = registry.get_model_lineage(model.id)
seen     = {}
for version in lineage["versions"]:
    for ds in version["datasets"]:
        seen[ds["id"]] = ds  # dedup by ID

unique_datasets = list(seen.values())
print(f"{len(unique_datasets)} unique dataset(s) across all versions")

Checking whether the production version was trained on an approved dataset:

APPROVED_DIGEST = "a3f2c1d8e9b0..."

lineage = registry.get_model_lineage(model.id)
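# Default to None so a model with no production version does not raise StopIteration
prod = next((v for v in lineage["versions"] if v["stage"] == "production"), None)

approved = prod is not None and any(
    ds["digest"] == APPROVED_DIGEST for ds in prod["datasets"]
)
print("Production model trained on approved data:", approved)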
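The same lineage structure also answers the reverse question: which model versions were trained on a given dataset? The sketch below runs on a hand-written dict in the documented shape, so it needs no API access. The sample values and the versions_by_digest helper are made up for illustration.

```python
from collections import defaultdict

# Hand-written sample in the documented shape; the values are made up.
lineage = {
    "model_id": "model-123",
    "versions": [
        {"version": 2, "stage": "production", "run_id": "run-b", "run_name": "v2",
         "datasets": [{"name": "imdb-train", "digest": "aaa"},
                      {"name": "imdb-val", "digest": "bbb"}]},
        {"version": 1, "stage": "archived", "run_id": "run-a", "run_name": "v1",
         "datasets": [{"name": "imdb-train", "digest": "aaa"}]},
    ],
}


def versions_by_digest(lineage):
    """Map each dataset digest to the sorted model versions trained on it."""
    index = defaultdict(set)
    for version in lineage["versions"]:
        for ds in version["datasets"]:
            index[ds["digest"]].add(version["version"])
    return {digest: sorted(versions) for digest, versions in index.items()}


usage = versions_by_digest(lineage)
print(usage)  # {'aaa': [1, 2], 'bbb': [2]}
```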

End-to-end example

import pandas as pd
from podstack import registry

registry.init(api_key="...", project_id="...")
registry.set_experiment("sentiment-analysis")

# Load data
train_df = pd.read_csv("data/train.csv")
val_df   = pd.read_csv("data/val.csv")

with registry.start_run("bert-finetune-v3") as run:
    # Log datasets — digest is auto-computed, schema inferred
    run.log_dataset("imdb-train", path="data/train.csv", df=train_df, context="training")
    run.log_dataset("imdb-val",   path="data/val.csv",   df=val_df,   context="validation")

    # Train
    run.log_params({"lr": 2e-5, "epochs": 3})
    run.log_metrics({"accuracy": 0.93, "f1": 0.92})

# Register and promote the model (version=3 assumes two earlier versions already exist)
registry.register_model("sentiment-bert", run_id=run.id)
registry.set_model_stage("sentiment-bert", version=3, stage="production")

# Later — answer "what data trained v3?"
model = registry.get_model("sentiment-bert")
lineage = registry.get_model_lineage(model.id)

List and Browse

from podstack import registry

# List experiments
experiments = registry.list_experiments()

# List models
models = registry.list_models()

# Download artifacts
registry.download_artifact("run-id", "model/model.pt", "./downloads/")

GPU Runner - Direct Code Execution

For running code strings directly on GPUs without decorators:

import podstack

podstack.init(api_key="your-api-key", project_id="your-project-id")

# Run code on a remote GPU
result = podstack.run_on_gpu('''
import torch
print(f"GPU: {torch.cuda.get_device_name(0)}")
print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
''', gpu="L40S")

print(result.output)
print(f"Success: {result.success}")
print(f"Duration: {result.duration_seconds}s")

Client API

For direct API access to notebooks and executions:

from podstack import Client

client = Client(api_key="your-api-key")

# Create a notebook
notebook = client.sync_create_notebook(name="experiment", gpu_type="L40S")
print(f"JupyterLab: {notebook.jupyter_url}")

# Run code
result = client.sync_run("print('Hello GPU!')", gpu_type="L40S")
print(result.output)

Error Handling

from podstack import (
    PodstackError,
    AuthenticationError,
    GPUNotAvailableError,
    RateLimitError,
    ExecutionTimeoutError
)

try:
    result = train()
except AuthenticationError:
    print("Invalid API key")
except GPUNotAvailableError as e:
    print(f"GPU not available: {e}")
except RateLimitError as e:
    print(f"Rate limited, retry after {e.retry_after}s")
except ExecutionTimeoutError as e:
    print(f"Execution timed out: {e.execution_id}")
except PodstackError as e:
    print(f"Error: {e.message}")
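Since RateLimitError exposes retry_after, a caller can wait and try again instead of only printing a message. The sketch below defines a local stand-in exception so that it runs without the SDK. call_with_retry is a hypothetical helper, and the retry policy (three attempts) is an assumption rather than documented behavior.

```python
import time


class RateLimitError(Exception):
    """Stand-in for podstack.RateLimitError so this sketch runs on its own."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(func, max_attempts=3, sleep=time.sleep):
    """Call func(), waiting retry_after seconds after each rate-limit error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError as e:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller handle it
            sleep(e.retry_after)


# Demo: a function that is rate limited twice, then succeeds
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError(retry_after=0)
    return {"status": "done"}


result = call_with_retry(flaky)
print(result, attempts["count"])
```

With the real SDK, the stand-in class would be replaced by the RateLimitError imported from podstack, and func would be the decorated function, for example train.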

Configuration

import podstack

# Option 1: Initialize explicitly
podstack.init(
    api_key="your-api-key",
    project_id="your-project-id",
    api_url="https://api.podstack.ai/v1",       # optional
    registry_url="https://registry.podstack.ai"  # optional
)

# Option 2: Environment variables
# PODSTACK_API_KEY=your-api-key
# PODSTACK_PROJECT_ID=your-project-id
# PODSTACK_API_URL=https://api.podstack.ai/v1
# PODSTACK_REGISTRY_URL=https://registry.podstack.ai

# Option 3: Auto-init (set PODSTACK_AUTO_INIT=1)
# SDK auto-initializes from env vars at import time
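The three options above leave open how explicit arguments and environment variables interact. The sketch below shows one plausible resolution order, in which explicit arguments win and environment variables fill the gaps. This precedence is an assumption, and resolve_config is a hypothetical helper rather than an SDK function. Only the environment variable names are taken from the list above.

```python
import os


def resolve_config(api_key=None, project_id=None, environ=None):
    """Hypothetical resolver: explicit arguments win, env vars fill the gaps."""
    env = os.environ if environ is None else environ
    config = {
        "api_key": api_key or env.get("PODSTACK_API_KEY"),
        "project_id": project_id or env.get("PODSTACK_PROJECT_ID"),
    }
    missing = [key for key, value in config.items() if not value]
    if missing:
        raise ValueError(f"missing configuration: {', '.join(missing)}")
    return config


fake_env = {"PODSTACK_API_KEY": "env-key", "PODSTACK_PROJECT_ID": "env-project"}
config = resolve_config(api_key="explicit-key", environ=fake_env)
print(config)  # {'api_key': 'explicit-key', 'project_id': 'env-project'}
```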

License

MIT License - see LICENSE for details.
