Official Python SDK for the Podstack GPU Notebook Platform

Podstack Python SDK

Official Python SDK for the Podstack GPU Platform. Run ML workloads on remote GPUs with simple decorators, track experiments, and manage models.

Installation

pip install podstack

With optional dependencies (quote the extras so shells such as zsh do not expand the brackets):

pip install "podstack[torch]"        # PyTorch support
pip install "podstack[huggingface]"  # HuggingFace Transformers
pip install "podstack[all]"          # All ML frameworks

Quick Start

import podstack

# Initialize the SDK
podstack.init(
    api_key="your-api-key",
    project_id="your-project-id"
)

# Run a function on a remote GPU with a single decorator
@podstack.gpu(type="L40S", fraction=100)
def train():
    import torch
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    return {"status": "done"}

result = train()  # Executes on remote GPU!

Decorators & Annotations

Podstack provides decorators that turn any Python function into a remote GPU workload with built-in experiment tracking.

`@podstack.gpu` - Remote GPU Execution

import podstack

# Basic GPU execution
@podstack.gpu(type="L40S")
def train_model():
    import torch
    model = torch.nn.Linear(768, 10).cuda()
    return {"params": sum(p.numel() for p in model.parameters())}

result = train_model()

# Specify GPU type, count, and fraction
@podstack.gpu(type="A100-80G", count=2, fraction=100)
def train_large_model():
    import torch
    print(f"GPUs available: {torch.cuda.device_count()}")

# Install pip packages on the fly
@podstack.gpu(type="L40S", pip=["transformers", "datasets", "accelerate"])
def finetune_llm():
    from transformers import AutoModelForCausalLM, AutoTokenizer
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    ...

# Use uv for faster package installation
@podstack.gpu(type="L40S", uv=["torch", "transformers"])
def fast_setup():
    ...

# Install from requirements.txt
@podstack.gpu(type="L40S", requirements="requirements.txt", use_uv=True)
def train_with_deps():
    ...

# Use conda packages
@podstack.gpu(type="L40S", conda="cudatoolkit=11.8")
def train_with_conda():
    ...

# Use a pre-built environment
@podstack.gpu(type="L40S", env="nlp")
def nlp_task():
    ...

# Set execution timeout (default: 3600s)
@podstack.gpu(type="L40S", timeout=7200)
def long_training():
    ...

# Disable remote execution (run locally for debugging)
@podstack.gpu(type="L40S", remote=False)
def debug_locally():
    print("This runs on your local machine")

# Use as a context manager
with podstack.gpu(type="A100-80G", count=2) as cfg:
    print(f"GPU config set: {cfg.type}")

Available GPU types: T4, L4, A10, L40S, A100-40G, A100-80G, H100

Available environments: ml, nlp, cv, audio, tabular, rl, scientific

`@podstack.experiment` - Experiment Tracking

import podstack

# As a decorator
@podstack.experiment(name="transformer-experiments")
def run_experiment():
    ...

# As a context manager
with podstack.experiment(name="transformer-experiments") as exp:
    print(f"Experiment ID: {exp.id}")

`@podstack.run` - Run Tracking

Automatically tracks execution time and GPU configuration.

import podstack

# As a decorator
@podstack.experiment(name="my-experiment")
@podstack.run(name="training-v1", track_gpu=True)
def train():
    podstack.registry.log_params({"lr": 0.001, "batch_size": 32})
    for epoch in range(10):
        loss = 1.0 / (epoch + 1)
        podstack.registry.log_metrics({"loss": loss}, step=epoch)

# As a context manager
with podstack.run(name="training-v1") as run:
    podstack.registry.log_params({"lr": 0.001})
    podstack.registry.log_metrics({"loss": 0.5}, step=1)
    print(f"Run ID: {run.id}")

# With tags
@podstack.run(name="ablation-study", tags={"variant": "no-dropout"})
def ablation():
    ...
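
The decorators in this section can also be used as context managers. In plain Python, one common way to get that dual behaviour is `contextlib.ContextDecorator`. The sketch below only illustrates the pattern and is not Podstack's implementation; `TimedRun` and its `duration_seconds` attribute are hypothetical names.

```python
import time
from contextlib import ContextDecorator


class TimedRun(ContextDecorator):
    """Hypothetical stand-in: times a block of code or a function call."""

    def __init__(self, name):
        self.name = name
        self.duration_seconds = None

    def __enter__(self):
        self._start = time.perf_counter()
        return self  # the object bound by "with ... as run"

    def __exit__(self, exc_type, exc, tb):
        self.duration_seconds = time.perf_counter() - self._start
        return False  # never swallow exceptions


# Context-manager form
with TimedRun(name="training-v1") as run:
    total = sum(range(1000))

# Decorator form: the same object wraps every call to the function
tracker = TimedRun(name="ablation-study")


@tracker
def ablation():
    return sum(range(1000))


result = ablation()
print(f"{run.name}: {run.duration_seconds:.6f}s")
print(f"{tracker.name}: {tracker.duration_seconds:.6f}s")
```

Because `__exit__` returns `False`, an exception raised inside the block or the function still propagates after the duration is recorded.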

`@podstack.model` - Model Registration

import podstack

# Register model after function completes
@podstack.experiment(name="my-experiment")
@podstack.run(name="training-v1")
@podstack.model.register(name="my-classifier")
def train_and_save():
    import torch
    model = torch.nn.Linear(768, 10)
    torch.save(model.state_dict(), "model.pt")
    podstack.registry.log_artifact("model.pt", "model")

# Promote model to production after validation
@podstack.model.promote(name="my-classifier", version=1, stage="production")
def validate_and_promote():
    # Run validation checks
    accuracy = 0.95
    assert accuracy > 0.90, "Model doesn't meet threshold"

Combining Decorators

Stack decorators for a complete ML workflow:

import podstack

podstack.init(api_key="your-api-key", project_id="your-project-id")

@podstack.gpu(type="L40S", pip=["transformers", "datasets"])
@podstack.experiment(name="sentiment-analysis")
@podstack.run(name="bert-finetune-v1", track_gpu=True)
@podstack.model.register(name="sentiment-bert")
def full_pipeline():
    from transformers import AutoModelForSequenceClassification, Trainer

    model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

    # Log hyperparameters
    podstack.registry.log_params({
        "model": "bert-base-uncased",
        "learning_rate": 2e-5,
        "epochs": 3
    })

    # Train...
    podstack.registry.log_metrics({"accuracy": 0.92, "f1": 0.89})

    return {"accuracy": 0.92}

result = full_pipeline()  # Runs on remote L40S GPU with full tracking
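
When decorators are stacked, Python applies them bottom-up, so the top decorator becomes the outermost wrapper and is entered first. The sketch below shows that ordering with a hypothetical `layer` decorator. It assumes ordinary Python decorator semantics and says nothing about what Podstack's decorators do internally.

```python
import functools

calls = []


def layer(label):
    """Hypothetical decorator that records when its wrapper is entered and left."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            calls.append(f"enter {label}")
            try:
                return func(*args, **kwargs)
            finally:
                calls.append(f"exit {label}")
        return wrapper
    return decorate


@layer("gpu")
@layer("experiment")
@layer("run")
def pipeline():
    calls.append("body")
    return {"accuracy": 0.92}


result = pipeline()
print(calls)
# ['enter gpu', 'enter experiment', 'enter run', 'body',
#  'exit run', 'exit experiment', 'exit gpu']
```

By the same logic, the ordering in the example above places `@podstack.gpu` outermost and `@podstack.model.register` closest to the function.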

Registry - Experiment Tracking & Model Management

Initialize

from podstack import registry

registry.init(
    api_key="your-api-key",
    project_id="your-project-id"
)

Track Experiments and Runs

from podstack import registry

# Set experiment
registry.set_experiment("my-experiment")

# Start a tracked run
with registry.start_run(name="training-v1") as run:
    # Log hyperparameters
    registry.log_params({
        "learning_rate": 0.001,
        "batch_size": 32,
        "epochs": 10,
        "optimizer": "adam"
    })

    # Log metrics at each step
    # (train_epoch() and evaluate() are placeholders for your own training code)
    for epoch in range(10):
        loss = train_epoch()
        accuracy = evaluate()
        registry.log_metrics({"loss": loss, "accuracy": accuracy}, step=epoch)

    # Set tags
    registry.set_tag("framework", "pytorch")

    # Log artifacts
    registry.log_artifact("model.pt", "model")
    registry.log_artifact("training_curves.png", "plots")

    # Log dataset provenance (first-class resource, deduped by content hash)
    registry.log_dataset("imdb-reviews", path="data/imdb.csv", context="training")

    # Or pass a DataFrame — schema and row/feature counts are auto-computed
    import pandas as pd
    df = pd.read_csv("data/imdb.csv")
    registry.log_dataset("imdb-reviews", df=df, context="training")

Log and Load Models

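from podstack import registry

# `model` is your trained model object and `run` is the run object
# returned by registry.start_run() in the previous section.

# Log a model object (the framework is auto-detected; it can also be passed explicitly, as here)
registry.log_model(model, artifact_path="model", framework="pytorch")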

# Register in model registry
registry.register_model(
    name="my-classifier",
    run_id=run.id,
    description="BERT sentiment classifier"
)

# Promote to production
registry.set_model_stage("my-classifier", version=1, stage="production")

# Set aliases
registry.set_model_alias("my-classifier", alias="champion", version=1)

# Load model from registry
model = registry.load_model("my-classifier", stage="production")

Compare Runs

from podstack import registry

# Compare multiple runs
comparison = registry.compare_runs(
    run_ids=["run-id-1", "run-id-2", "run-id-3"],
    metric_keys=["loss", "accuracy"]
)

# Get metric history for a run
history = registry.get_metric_history("run-id-1", "loss")
for point in history:
    print(f"Step {point.step}: {point.value}")

# Search runs
runs = registry.search_runs(
    experiment_id="exp-id",
    status="completed",
    max_results=50
)

Dataset Tracking & Lineage

Podstack tracks datasets as first-class resources, linking them to runs and model versions so you can always answer "what data was this model trained on?"

The lineage chain is:

Dataset(s) ──[logged to]──▶ Run ──[run_id]──▶ ModelVersion

`log_dataset()` — log a dataset to the active run

dataset = registry.log_dataset(
    name="imdb-reviews",          # required — human-readable name
    path="data/imdb.csv",         # local path or URI (s3://, gcs://, https://)
    context="training",           # "training" | "validation" | "test" (default: "training")
)

The dataset is stored as a project-level resource and linked to the current run. Subsequent calls with the same file produce the same dataset record — no duplicates.

Auto-enrichment from a local file:

# SHA-256 digest is computed automatically for files ≤ 500 MB.
# This enables deduplication across runs — if two runs use the exact
# same file, they share one Dataset record in the registry.
dataset = registry.log_dataset("imdb-reviews", path="data/imdb.csv")
print(dataset.digest)  # "a3f2c1..." — hex SHA-256
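
For reference, a hex SHA-256 digest like this one can be computed with the standard library. The sketch below is not the SDK's code. The function name `file_sha256`, the chunk size, the reading of "500 MB" as 500 × 1024 × 1024 bytes, and returning an empty string for oversized files are all assumptions made for illustration.

```python
import hashlib
import os

MAX_DIGEST_BYTES = 500 * 1024 * 1024  # assumed reading of "500 MB"


def file_sha256(path, max_bytes=MAX_DIGEST_BYTES, chunk_size=1024 * 1024):
    """Return the hex SHA-256 of a file, or "" if it is larger than max_bytes."""
    if os.path.getsize(path) > max_bytes:
        return ""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Two files with identical bytes yield the same digest regardless of file name or location, which is what makes content-based deduplication possible.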

Auto-enrichment from a pandas DataFrame:

import pandas as pd

df = pd.read_csv("data/imdb.csv")

dataset = registry.log_dataset(
    name="imdb-reviews",
    df=df,
    context="training",
)
# schema and profile are computed automatically:
print(dataset.schema)   # {"text": "object", "label": "int64"}
print(dataset.profile)  # {"num_rows": 50000, "num_features": 2}

Pass both path and df to get digest dedup and schema inference:

dataset = registry.log_dataset("imdb-reviews", path="data/imdb.csv", df=df)

All parameters:

Parameter	Type	Default	Description
`name`	`str`	required	Human-readable dataset name
`path`	`str`	`None`	Local file path or URI (`s3://`, `gcs://`, `https://`)
`df`	`DataFrame`	`None`	pandas DataFrame — schema and profile auto-computed
`context`	`str`	`"training"`	Role of the dataset: `"training"`, `"validation"`, or `"test"`
`digest`	`str`	`None`	SHA-256 hex digest. Computed from `path` if not provided
`source_type`	`str`	`"local"`	Storage backend: `"local"`, `"s3"`, `"gcs"`, `"url"`
`tags`	`dict`	`None`	Arbitrary string key-value tags

Returns: Dataset object with fields:

Field	Type	Description
`id`	`str`	UUID of the dataset record
`name`	`str`	Dataset name
`digest`	`str`	SHA-256 hex digest (empty if not computed)
`source_type`	`str`	Storage backend
`source`	`str`	File path or URI
`schema`	`dict`	Column → dtype mapping
`profile`	`dict`	`num_rows`, `num_features`, and any other stats
`tags`	`dict`	Tags dict
`created_at`	`str`	ISO 8601 timestamp

Via the Run object (equivalent to calling registry.log_dataset()):

with registry.start_run("training-v1") as run:
    dataset = run.log_dataset("imdb-reviews", df=df, context="training")

Multiple datasets per run

Log validation and test sets alongside the training set:

with registry.start_run("bert-finetune") as run:
    run.log_dataset("imdb-train", df=train_df, context="training")
    run.log_dataset("imdb-val",   df=val_df,   context="validation")
    run.log_dataset("imdb-test",  df=test_df,  context="test")

`get_run_datasets()` — retrieve datasets logged to a run

Returns every Dataset object linked to a run, in the order they were logged.

datasets = registry.get_run_datasets(run_id)

Parameters:

Parameter	Type	Description
`run_id`	`str`	ID of the run to query

Returns: list[Dataset]. Each element is the same kind of Dataset object that log_dataset() returns.

Fields on each Dataset:

Field	Type	Description
`id`	`str`	UUID of the dataset record
`name`	`str`	Human-readable name
`digest`	`str`	SHA-256 hex digest (empty if not computed at log time)
`source_type`	`str`	`"local"`, `"s3"`, `"gcs"`, or `"url"`
`source`	`str`	File path or URI that was passed to `log_dataset()`
`schema`	`dict`	Column → dtype mapping (e.g. `{"text": "object", "label": "int64"}`)
`profile`	`dict`	Stats dict, always contains `num_rows` and `num_features` when a DataFrame was passed
`tags`	`dict`	Key-value tags
`created_at`	`str`	ISO 8601 timestamp

Examples:

from podstack import registry

registry.init(api_key="...", project_id="...")

datasets = registry.get_run_datasets("3a9f12c4-...")

# Inspect each dataset
for ds in datasets:
    print(ds.name)
    print(f"  source : {ds.source}")
    print(f"  digest : {ds.digest[:16]}…")
    print(f"  rows   : {ds.profile.get('num_rows', 'unknown')}")
    print(f"  schema : {ds.schema}")

Checking datasets on a run you have in hand:

with registry.start_run("training-v1") as run:
    run.log_dataset("train", df=train_df, context="training")
    run.log_dataset("val",   df=val_df,   context="validation")

# After the run completes, retrieve everything that was logged
datasets = registry.get_run_datasets(run.id)
assert len(datasets) == 2

Verifying deduplication — the same physical file logged across two runs returns the same dataset ID:

ds1 = registry.get_run_datasets(run_a.id)[0]
ds2 = registry.get_run_datasets(run_b.id)[0]

# Same file → same digest → same Dataset record
assert ds1.id == ds2.id
assert ds1.digest == ds2.digest
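
The deduplication behaviour checked above can be pictured as a lookup keyed by digest. The in-memory sketch below is a toy model only, not how the registry is implemented. `DatasetStore` is a hypothetical name, and creating a fresh record for datasets without a digest is an assumption.

```python
import uuid


class DatasetStore:
    """Hypothetical in-memory store that deduplicates records by digest."""

    def __init__(self):
        self._by_digest = {}

    def log(self, name, digest=""):
        # A known digest means the same content was logged before: reuse it.
        if digest and digest in self._by_digest:
            return self._by_digest[digest]
        record = {"id": str(uuid.uuid4()), "name": name, "digest": digest}
        # Without a digest there is no content identity to deduplicate on.
        if digest:
            self._by_digest[digest] = record
        return record


store = DatasetStore()
first = store.log("imdb-reviews", digest="a3f2c1")
second = store.log("imdb-reviews-copy", digest="a3f2c1")
other = store.log("imdb-reviews", digest="7e4b2f")

print(first["id"] == second["id"])  # True: same digest, same record
print(first["id"] == other["id"])   # False: different content
```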

`get_model_lineage()` — trace a model back to its training data

Returns the full provenance chain for every version of a registered model: which datasets each version was trained on, via which run.

lineage = registry.get_model_lineage(model_id)

Parameters:

Parameter	Type	Description
`model_id`	`str`	ID of the registered model

Returns: dict with the following structure:

{
  "model_id": str,
  "versions": [
    {
      "version":  int,        # version number (1, 2, 3 …)
      "stage":    str,        # "development" | "staging" | "production" | "archived"
      "run_id":   str,        # ID of the linked training run (empty if none)
      "run_name": str,        # display name of the run
      "datasets": [Dataset]   # list of Dataset dicts logged to that run
    },
    …
  ]
}

Each datasets entry has the same fields as a Dataset object (id, name, digest, source_type, source, schema, profile, tags, created_at).

Examples:

Basic iteration:

from podstack import registry

registry.init(api_key="...", project_id="...")

model   = registry.get_model("sentiment-bert")
lineage = registry.get_model_lineage(model.id)

for version in lineage["versions"]:
    print(f"v{version['version']} · {version['stage']}")
    print(f"  Run: {version['run_name']} ({version['run_id'][:8]}…)")
    for ds in version["datasets"]:
        rows = ds["profile"].get("num_rows", "?")
        print(f"  └─ {ds['name']}  {rows} rows  sha256:{ds['digest'][:12]}…")

Example output:

v3 · production
  Run: bert-finetune-v3 (3a9f12c4…)
  └─ imdb-train  40000 rows  sha256:a3f2c1d8e9b0…
  └─ imdb-val     5000 rows  sha256:7e4b2f1a0c3d…
v2 · staging
  Run: bert-finetune-v2 (8b2e77d1…)
  └─ imdb-train  40000 rows  sha256:a3f2c1d8e9b0…
v1 · archived
  Run: bert-finetune-v1 (f1c3a0e2…)
  └─ imdb-train  40000 rows  sha256:a3f2c1d8e9b0…

Finding every unique dataset ever used to train any version of a model:

lineage  = registry.get_model_lineage(model.id)
seen     = {}
for version in lineage["versions"]:
    for ds in version["datasets"]:
        seen[ds["id"]] = ds  # dedup by ID

unique_datasets = list(seen.values())
print(f"{len(unique_datasets)} unique dataset(s) across all versions")

Checking whether the production version was trained on an approved dataset:

APPROVED_DIGEST = "a3f2c1d8e9b0..."

lineage = registry.get_model_lineage(model.id)
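# next() with a default avoids StopIteration when no version is in production
prod = next((v for v in lineage["versions"] if v["stage"] == "production"), None)

approved = prod is not None and any(
    ds["digest"] == APPROVED_DIGEST for ds in prod["datasets"]
)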
print("Production model trained on approved data:", approved)
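
The same lineage dict can also answer "what data changed between two versions?". The helper below is a sketch that relies only on the structure documented above. `dataset_changes` is a hypothetical name, and the sample dict is hand-written with illustrative values rather than real registry output.

```python
def dataset_changes(lineage, old_version, new_version):
    """Return (added, removed) dataset names between two model versions."""
    by_version = {v["version"]: v for v in lineage["versions"]}
    old = {ds["digest"]: ds["name"] for ds in by_version[old_version]["datasets"]}
    new = {ds["digest"]: ds["name"] for ds in by_version[new_version]["datasets"]}
    added = sorted(new[d] for d in new.keys() - old.keys())
    removed = sorted(old[d] for d in old.keys() - new.keys())
    return added, removed


# Hand-written sample in the documented shape (values are illustrative only).
sample_lineage = {
    "model_id": "model-123",
    "versions": [
        {"version": 3, "stage": "production", "run_id": "r3",
         "run_name": "bert-finetune-v3",
         "datasets": [{"name": "imdb-train", "digest": "a3f2"},
                      {"name": "imdb-val", "digest": "7e4b"}]},
        {"version": 2, "stage": "staging", "run_id": "r2",
         "run_name": "bert-finetune-v2",
         "datasets": [{"name": "imdb-train", "digest": "a3f2"}]},
    ],
}

added, removed = dataset_changes(sample_lineage, old_version=2, new_version=3)
print(added, removed)  # ['imdb-val'] []
```

Datasets are compared by digest, so entries logged without a digest (an empty string) would collapse into one key; comparing by `id` instead avoids that.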

End-to-end example

import pandas as pd
from podstack import registry

registry.init(api_key="...", project_id="...")
registry.set_experiment("sentiment-analysis")

# Load data
train_df = pd.read_csv("data/train.csv")
val_df   = pd.read_csv("data/val.csv")

with registry.start_run("bert-finetune-v3") as run:
    # Log datasets — digest is auto-computed, schema inferred
    run.log_dataset("imdb-train", path="data/train.csv", df=train_df, context="training")
    run.log_dataset("imdb-val",   path="data/val.csv",   df=val_df,   context="validation")

    # Train
    run.log_params({"lr": 2e-5, "epochs": 3})
    run.log_metrics({"accuracy": 0.93, "f1": 0.92})

# Register and promote the model
registry.register_model("sentiment-bert", run_id=run.id)
registry.set_model_stage("sentiment-bert", version=3, stage="production")

# Later — answer "what data trained v3?"
model = registry.get_model("sentiment-bert")
lineage = registry.get_model_lineage(model.id)

List and Browse

from podstack import registry

# List experiments
experiments = registry.list_experiments()

# List models
models = registry.list_models()

# Download artifacts
registry.download_artifact("run-id", "model/model.pt", "./downloads/")

GPU Runner - Direct Code Execution

For running code strings directly on GPUs without decorators:

import podstack

podstack.init(api_key="your-api-key", project_id="your-project-id")

# Run code on a remote GPU
result = podstack.run_on_gpu('''
import torch
print(f"GPU: {torch.cuda.get_device_name(0)}")
print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")
''', gpu="L40S")

print(result.output)
print(f"Success: {result.success}")
print(f"Duration: {result.duration_seconds}s")

Client API

For direct API access to notebooks and executions:

from podstack import Client

client = Client(api_key="your-api-key")

# Create a notebook
notebook = client.sync_create_notebook(name="experiment", gpu_type="L40S")
print(f"JupyterLab: {notebook.jupyter_url}")

# Run code
result = client.sync_run("print('Hello GPU!')", gpu_type="L40S")
print(result.output)

Error Handling

from podstack import (
    PodstackError,
    AuthenticationError,
    GPUNotAvailableError,
    RateLimitError,
    ExecutionTimeoutError
)

try:
    result = train()
except AuthenticationError:
    print("Invalid API key")
except GPUNotAvailableError as e:
    print(f"GPU not available: {e}")
except RateLimitError as e:
    print(f"Rate limited, retry after {e.retry_after}s")
except ExecutionTimeoutError as e:
    print(f"Execution timed out: {e.execution_id}")
except PodstackError as e:
    print(f"Error: {e.message}")
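
Since `RateLimitError` carries a `retry_after` value, a caller can wait and try again instead of only printing a message. The sketch below shows one way to do that. So that it runs without the SDK installed, it defines a stand-in `RateLimitError` class; `call_with_retry` and its parameters are hypothetical names, not part of Podstack.

```python
import time


class RateLimitError(Exception):
    """Stand-in for podstack.RateLimitError so the sketch runs without the SDK."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(func, max_attempts=3, sleep=time.sleep):
    """Call func(), waiting retry_after seconds after each RateLimitError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError as err:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller handle it
            sleep(err.retry_after)


# Demo: fail twice, then succeed; waits are recorded instead of slept.
attempts = {"count": 0}
waits = []


def flaky_train():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError(retry_after=2)
    return {"status": "done"}


result = call_with_retry(flaky_train, sleep=waits.append)
print(result, waits)  # {'status': 'done'} [2, 2]
```

With the real SDK, the stand-in class would be replaced by `from podstack import RateLimitError`, and `func` would be the decorated function, for example `train`.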

Configuration

import podstack

# Option 1: Initialize explicitly
podstack.init(
    api_key="your-api-key",
    project_id="your-project-id",
    api_url="https://api.podstack.ai/v1",       # optional
    registry_url="https://registry.podstack.ai"  # optional
)

# Option 2: Environment variables
# PODSTACK_API_KEY=your-api-key
# PODSTACK_PROJECT_ID=your-project-id
# PODSTACK_API_URL=https://api.podstack.ai/v1
# PODSTACK_REGISTRY_URL=https://registry.podstack.ai

# Option 3: Auto-init (set PODSTACK_AUTO_INIT=1)
# SDK auto-initializes from env vars at import time
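
The options above can be combined: explicit arguments for some values, environment variables for the rest. The sketch below shows one plausible resolution order. It is not the SDK's logic: `resolve_config` is a hypothetical helper, and two points are assumptions, namely that explicit arguments take precedence over environment variables and that the two URLs shown above are the defaults.

```python
import os


def resolve_config(api_key=None, project_id=None, env=None):
    """Merge explicit arguments with PODSTACK_* environment variables."""
    env = os.environ if env is None else env
    config = {
        # Assumption: explicit arguments win over environment variables.
        "api_key": api_key or env.get("PODSTACK_API_KEY"),
        "project_id": project_id or env.get("PODSTACK_PROJECT_ID"),
        # Assumption: these URLs are the defaults when nothing is set.
        "api_url": env.get("PODSTACK_API_URL", "https://api.podstack.ai/v1"),
        "registry_url": env.get("PODSTACK_REGISTRY_URL", "https://registry.podstack.ai"),
    }
    missing = [key for key in ("api_key", "project_id") if not config[key]]
    if missing:
        raise ValueError(f"missing configuration: {', '.join(missing)}")
    return config


config = resolve_config(
    api_key="explicit-key",
    env={"PODSTACK_API_KEY": "env-key", "PODSTACK_PROJECT_ID": "env-project"},
)
print(config["api_key"], config["project_id"])  # explicit-key env-project
```

Failing early with a clear message when a required value is missing is usually easier to debug than an authentication error from the first API call.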

License

MIT License - see LICENSE for details.

This version: 1.3.13 (released Feb 18, 2026)
