Python SDK and CLI for logging InstantML training runs, metrics, artifacts, and rich objects.

InstantML Python SDK

InstantML is a training-loop observability SDK for logging runs, scalar metrics, rank-aware distributed metrics, configs, tags, notes, artifacts, checkpoints, tables, histograms, media, and source context to the InstantML platform.

Install

pip install --pre instantml

(The --pre flag opts in to the current alpha. Drop it once 0.1.0 ships.)

Log in

instantml login

This opens your browser, completes a device-code flow against the InstantML platform, and stores the resulting credential at ~/.instantml/credentials. The SDK reads it automatically, so there are no environment variables to manage. The flow mirrors wandb login, gh auth login, and gcloud auth login.

instantml whoami    # confirm who you're logged in as
instantml logout    # clear the cached credential
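
A script can check up front whether a cached credential is present before starting a long job. The following is a minimal sketch that assumes only the path mentioned above; has_cached_credential is a hypothetical helper, not part of the SDK, and it does not parse the file because its format is not described here.

```python
from pathlib import Path

# Path taken from the text above. The file's contents and format are not
# documented here, so this sketch only checks that a non-empty file exists.
CREDENTIALS_PATH = Path.home() / ".instantml" / "credentials"


def has_cached_credential(path: Path = CREDENTIALS_PATH) -> bool:
    """Return True if a non-empty credential file exists at `path`."""
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    if not has_cached_credential():
        print("No cached credential found; run `instantml login` first.")
```

`instantml whoami` remains the authoritative check, since a file can exist and still hold an expired credential.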

Log a run

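import os
import instantml as im

# cfg, loader, train_step, and save_model are placeholders for your own
# config, data loader, training step, and checkpoint writer.
run = im.init(project="llm-7b-sft", config=cfg)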
checkpoint_policy = im.CheckpointPolicy(every_steps=500)
rank = int(os.environ.get("RANK", "0"))
world_size = int(os.environ.get("WORLD_SIZE", "1"))

for step, batch in enumerate(loader):
    loss = train_step(batch)
    run.log({"loss": loss}, step=step)
    # Optional: distributed workers can log per-rank values for reducer,
    # coverage, heatmap, and outlier dashboards.
    run.log_rank_metrics(
        {"loss": loss},
        step=step,
        rank=rank,
        world_size=world_size,
        weight=len(batch),
    )
    if checkpoint_policy.should_save(step):
        save_model("./ckpt/model.pt")
        run.log_checkpoint_file("./ckpt/model.pt", step=step)

run.finish()
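
The weight argument to log_rank_metrics suggests that per-rank values can be combined using per-rank weights. How the platform's reducers actually work is not described here. As an illustration only, the sketch below computes a sample-weighted mean over (value, weight) pairs, one pair per rank; weighted_mean is a hypothetical helper and not part of the SDK.

```python
from typing import Iterable, Tuple


def weighted_mean(per_rank: Iterable[Tuple[float, float]]) -> float:
    """Combine (value, weight) pairs, one per rank, into a weighted mean."""
    pairs = list(per_rank)
    total_weight = sum(weight for _, weight in pairs)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(value * weight for value, weight in pairs) / total_weight


if __name__ == "__main__":
    # Rank 0 saw 32 samples with loss 0.50; rank 1 saw 16 samples with loss 0.80.
    print(weighted_mean([(0.50, 32), (0.80, 16)]))
```

Weighting by batch size keeps a rank that processed a short final batch from skewing the combined value.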

Fork and attach to a checkpoint retry

When the dashboard or API creates a linked fork from a checkpoint, attach the SDK to that existing run record and continue logging:

api = im.Api(base_url="http://127.0.0.1:8000", api_key="instantml_...")
child = api.fork_run("source-run-id", checkpoint_artifact_id="artifact-id", step=500)
run = im.attach_run(child["id"], base_url="http://127.0.0.1:8000", api_key="instantml_...")
run.log({"loss": 0.12}, step=501)
run.finish()

Forking only creates a linked run record in the same project; it does not start training or copy metrics or artifacts. By default, the SDK derives a stable fork idempotency key from the request body, so retrying the same fork call returns the same child run instead of creating duplicates.

attach_run() validates the target run by default and uses async uploads. Use validate=False only for write-only credentials or intentionally offline attach flows, and call finish() or wait_for_processing() before short scripts exit.
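
To make the idea of a body-derived idempotency key concrete, here is a minimal sketch. The SDK's real derivation is not documented in this README; hashing a canonical JSON encoding with SHA-256 is an assumption, and fork_idempotency_key is a hypothetical name.

```python
import hashlib
import json


def fork_idempotency_key(body: dict) -> str:
    """Hash a canonical JSON encoding of the request body (assumed scheme)."""
    # Sorting keys and fixing separators makes the encoding independent of
    # dict insertion order, so equal bodies always hash to the same key.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


first = fork_idempotency_key({"checkpoint_artifact_id": "artifact-id", "step": 500})
retry = fork_idempotency_key({"step": 500, "checkpoint_artifact_id": "artifact-id"})
other = fork_idempotency_key({"checkpoint_artifact_id": "artifact-id", "step": 501})
```

Under this scheme, first and retry are identical while other differs, which is the property a server needs to deduplicate retried fork calls.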

Running on a remote server or CI

Skip instantml login and pass credentials explicitly, either through an environment variable in the shell:

export INSTANTML_API_KEY=instantml_...

or through the api_key argument in Python:

run = im.init(project="cartpole", api_key="instantml_...")

Get a key from Settings → API Keys in the dashboard.
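
In CI it is often useful to fail fast with a clear message when the key is missing, rather than discovering it partway through a job. The sketch below is user-side code, not SDK behavior; require_api_key is a hypothetical helper that only reads the INSTANTML_API_KEY variable named above.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, or raise with a clear message."""
    key = env.get("INSTANTML_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "INSTANTML_API_KEY is not set; create a key under "
            "Settings → API Keys and export it in the CI environment."
        )
    return key
```

The returned value can then be passed as api_key to im.init, as in the example above.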

Self-hosted / local development

Override the API base URL with an environment variable in the shell:

export INSTANTML_API_BASE_URL=http://127.0.0.1:8000

or with the base_url keyword argument in Python:

run = im.init(
    project="cartpole",
    base_url="http://127.0.0.1:8000",
    api_key="instantml_...",
)
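
When the base URL comes from configuration, a small check before calling init can catch a missing scheme or host early. This is a sketch of user-side hygiene only; normalize_base_url is a hypothetical helper, and whether the SDK itself tolerates trailing slashes or performs similar validation is not stated here.

```python
from urllib.parse import urlsplit


def normalize_base_url(url: str) -> str:
    """Validate an http(s) base URL and strip any trailing slash."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an http(s) URL with a host, got {url!r}")
    return f"{parts.scheme}://{parts.netloc}{parts.path.rstrip('/')}"
```

For example, normalize_base_url("http://127.0.0.1:8000/") yields a value suitable for the base_url argument shown above.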

Shadow Weights & Biases

If you're migrating from W&B and want to compare numbers side-by-side, pass shadow_wandb=True to init. Every log, finish, and log_artifact call is mirrored to a parallel wandb.Run, using your existing WANDB_API_KEY / WANDB_ENTITY env vars. wandb.init runs on a background thread so InstantML's init stays sub-millisecond.

run = im.init(project="llm-7b-sft", config=cfg, shadow_wandb=True)

Override the W&B project or entity independently:

run = im.init(
    project="llm-7b-sft",
    shadow_wandb={"project": "llm-experiments", "entity": "my-team"},
)

Attach to an already-initialized wandb.Run:

import wandb
wb_run = wandb.init(project="llm-7b-sft")
run = im.init(project="llm-7b-sft", shadow_wandb=wb_run)

If wandb is not installed or wandb.init fails, shadow logging is disabled with a warning and InstantML logging continues unaffected.
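
The warn-and-continue behavior described above is a common pattern for optional dependencies. The sketch below shows the general technique using only the standard library; it is not the SDK's implementation, and optional_import is a hypothetical helper.

```python
import importlib
import warnings
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Import `name` if available; otherwise warn and return None."""
    try:
        return importlib.import_module(name)
    except ImportError:
        warnings.warn(
            f"{name} is not installed; continuing without it", RuntimeWarning
        )
        return None


# The caller branches on None instead of failing when the package is missing.
wandb_module = optional_import("wandb")
shadow_enabled = wandb_module is not None
```

Keeping the failure inside one helper means the primary logging path never has to handle an import error.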

Optional extras

The core package has no required third-party runtime dependencies. Install extras for richer local conversions and system metrics:

pip install "instantml[media]"     # Pillow, imageio, moviepy, soundfile
pip install "instantml[system]"    # psutil, pynvml
pip install "instantml[all]"

source_tracking=True uses privacy-safe defaults: entrypoint basename, git availability/commit/dirty state, Python version, and platform. Pass im.SourceTracking(...) to opt into argv, cwd/repo root, branch, host/pid, and safe git diff summary/digest capture; raw patch text is not stored in run metadata.
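
For a sense of what the privacy-safe defaults cover, the sketch below gathers comparable fields with the standard library. It is an approximation under assumptions: the dictionary keys and the safe_source_context helper are hypothetical, and the SDK's actual field names and collection logic are not documented here.

```python
import platform
import subprocess
import sys
from pathlib import Path
from typing import Optional


def _git(*args: str) -> Optional[str]:
    """Run a git command; return stripped stdout, or None if it cannot run."""
    try:
        result = subprocess.run(
            ["git", *args], capture_output=True, text=True, check=True, timeout=5
        )
    except (OSError, subprocess.SubprocessError):
        # git is missing, timed out, or the directory is not a repository.
        return None
    return result.stdout.strip()


def safe_source_context() -> dict:
    """Collect basename, git state, Python version, and platform only."""
    status = _git("status", "--porcelain")
    return {
        "entrypoint": Path(sys.argv[0]).name,  # basename only, no directories
        "git_available": status is not None,   # git usable in this directory
        "git_commit": _git("rev-parse", "HEAD"),
        "git_dirty": bool(status) if status is not None else None,
        "python_version": platform.python_version(),
        "platform": platform.platform(),
    }
```

Note that nothing here records argv, the working directory, the hostname, or diff contents, which matches the opt-in split described above.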

The SDK also ships a process-isolated spool uploader for high-throughput offline replay:

instantml-uploader --spool-dir .instantml/spool

By default, instantml.init() uses durable async metric/log uploads:

import instantml

run = instantml.init(project="cartpole")
run.log_metrics({"train/reward": 100.0}, step=1)
run.log_stdout("step=1 reward=100.0")
run.wait_for_submission(timeout=30)
run.finish(timeout=30)

Async mode stores scalar metrics, rank metrics, console logs, and final status in a per-run SQLite WAL queue, then drains that queue in a background uploader process. Network and delivery errors are surfaced through run.upload_status() and warnings instead of raising from the hot logging path.

Pass upload_mode="sync" when a script or CI check needs immediate foreground API errors from metric/log calls. Pass queue_dir="..." to move the default .instantml/async local queue. Orphaned queues can be recovered with the same environment or instantml login credentials:

instantml-uploader --queue-dir .instantml/async
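
The durable-queue idea can be sketched with the standard library's sqlite3 module. This is a simplified illustration, not the SDK's queue: the table schema and the open_queue, enqueue, and drain helpers are hypothetical, and a real uploader would run drain in a separate process with retries.

```python
import json
import sqlite3
from typing import Callable


def open_queue(path: str) -> sqlite3.Connection:
    """Open (or create) a queue database in write-ahead-log mode."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, record: dict) -> None:
    """Persist one record; the hot path only pays for a local write."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT INTO queue (payload) VALUES (?)", (json.dumps(record),))


def drain(conn: sqlite3.Connection, send: Callable[[dict], None], batch_size: int = 100) -> int:
    """Send queued records in order, deleting each only after `send` succeeds."""
    sent = 0
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM queue ORDER BY id LIMIT ?", (batch_size,)
        ).fetchall()
        if not rows:
            return sent
        for row_id, payload in rows:
            send(json.loads(payload))  # an exception here leaves the row queued
            with conn:
                conn.execute("DELETE FROM queue WHERE id = ?", (row_id,))
            sent += 1
```

Because a row is deleted only after it is sent, a crash or network failure leaves unsent records on disk, which is what makes recovering an orphaned queue possible.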

License

Apache 2.0 — see LICENSE. The InstantML hosted backend (dashboard, API, storage) is a separate commercial offering; the SDK in this package is open source.
