Rundial Python SDK (non-blocking ingest with bounded spool and ergonomic run API)

rundial

pip install rundial

Phase 4 introduces a non-blocking metrics transport with:

- bounded in-memory queue on the training thread
- background flush worker
- bounded disk spool (enabled by default)
- gzip compression in the worker transport (threshold-based)
- retry with exponential backoff and jitter
- diagnostics counters for dropped, accepted, and retried points
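
The sketch below illustrates two of the ideas in this list using only the standard library: a bounded queue that drops and counts instead of blocking, and an exponential backoff delay with jitter. It is not the SDK's implementation; `BoundedIngest`, `backoff_delay`, the `send` callable, and the `base`/`cap` values are hypothetical names and numbers chosen for illustration.

```python
import queue
import random
import threading


class BoundedIngest:
    """Illustrative only: a bounded queue drained by a background worker."""

    def __init__(self, max_points, send):
        self._queue = queue.Queue(maxsize=max_points)
        self._send = send  # hypothetical callable that delivers one batch
        self.accepted = 0
        self.dropped = 0
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._drain, daemon=True)

    def start(self):
        self._worker.start()

    def log(self, point):
        """Never blocks: a full queue drops the point and counts it."""
        try:
            self._queue.put_nowait(point)
        except queue.Full:
            self.dropped += 1
            return False
        self.accepted += 1
        return True

    def _drain(self):
        # Keep draining until close() was requested and the queue is empty.
        while not (self._stop.is_set() and self._queue.empty()):
            try:
                point = self._queue.get(timeout=0.05)
            except queue.Empty:
                continue
            self._send([point])

    def close(self, timeout=1.0):
        self._stop.set()
        if self._worker.is_alive():
            self._worker.join(timeout)


def backoff_delay(attempt, base=0.5, cap=30.0, rng=random):
    """Exponential backoff with full jitter for a 0-based retry attempt."""
    return rng.uniform(0.0, min(cap, base * (2 ** attempt)))
```

The key design point is that `log()` only touches the in-memory queue, so the caller's cost stays constant no matter how slow the network is; everything that can stall lives in the worker thread.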

CLI quickstart

Installing rundial also installs the rundial CLI:

rundial init --endpoint http://127.0.0.1:8787
rundial auth whoami
rundial target ls
rundial doctor

Operational commands:

rundial workspace ls
rundial project ls --workspace default-workspace

rundial run start --workspace default-workspace --project default-project --name baseline-001 --kind training
rundial run list --workspace default-workspace --project default-project --status running
rundial run status run_...
rundial run finish run_... --state completed

rundial metrics tail run_... --workspace default-workspace --project default-project --keys train/loss
rundial metrics export run_... \
  --workspace default-workspace \
  --project default-project \
  --keys train/loss \
  --format csv \
  --out metrics.csv
rundial logs tail run_... --workspace default-workspace --project default-project --min-level info
rundial logs export run_... \
  --workspace default-workspace \
  --project default-project \
  --format json \
  --out logs.json

All commands accept the global --json flag. Exit codes are stable: 0 for success, 1 for a command or transport error, and 2 for an authentication or authorization failure.
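
Because the exit codes are stable, a wrapper script can branch on them. The following is a minimal sketch of that idea; `describe_exit` and `run_cli` are hypothetical helpers written for this example and are not part of the rundial package.

```python
import subprocess

# Exit codes as documented above; the labels are this sketch's own wording.
EXIT_MEANINGS = {
    0: "success",
    1: "command or transport error",
    2: "authentication/authorization failure",
}


def describe_exit(code):
    """Translate an exit code into a human-readable label."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_cli(args):
    """Run a command and return (exit code, label, captured stdout)."""
    proc = subprocess.run(args, capture_output=True, text=True)
    return proc.returncode, describe_exit(proc.returncode), proc.stdout
```

For example, `run_cli(["rundial", "doctor"])` would return the exit code of the doctor command together with its label, which a CI job could use to distinguish credential problems from transport problems.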

To run the CLI operational smoke test, with the open-core stack running:

RUNDIAL_API_KEY=rdk_... python user_tests/cli_operational_parity_smoke.py \
  --workspace default-workspace \
  --project default-project

Start a run with workspace/project strings only:

rundial run start \
  --endpoint http://127.0.0.1:8787 \
  --workspace default-workspace \
  --project default-project \
  --name baseline-001 \
  --kind training

Config precedence:

- CLI flags
- env vars (RUNDIAL_API_KEY, RUNDIAL_ENDPOINT, RUNDIAL_WORKSPACE, RUNDIAL_PROJECT)
- ~/.config/rundial/config.toml
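
A layered lookup like this can be sketched as follows. The sketch assumes the list is ordered from highest to lowest precedence and that the config file uses the key names shown; both are assumptions, and `resolve_config` is a hypothetical helper, not an SDK function.

```python
# Setting name -> environment variable, taken from the list above.
ENV_VARS = {
    "api_key": "RUNDIAL_API_KEY",
    "endpoint": "RUNDIAL_ENDPOINT",
    "workspace": "RUNDIAL_WORKSPACE",
    "project": "RUNDIAL_PROJECT",
}


def resolve_config(cli_flags, env, file_config):
    """Pick each setting from the first source that defines it.

    Assumed order: CLI flags, then environment variables, then the
    parsed config file (passed in as a plain dict).
    """
    resolved = {}
    for key, env_name in ENV_VARS.items():
        candidates = (cli_flags.get(key), env.get(env_name), file_config.get(key))
        for value in candidates:
            if value is not None:
                resolved[key] = value
                break
    return resolved
```

Passing `os.environ` as `env` and the parsed TOML file as `file_config` would mirror the three layers described above.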

Quick start (recommended)

import rundial as rd

with rd.init(
    workspace="team-alpha",
    project="mnist-demo",
    name="baseline-001",
    kind="training",
    endpoint="http://127.0.0.1:8787",
    api_key="rdk_...",
    mode="online",
) as run:
    run.log({"train/loss": 0.42, "train/acc": 0.91}, step=1)
    run.log_metric("eval/loss", value=0.31, time_ms=1_760_000_000_000)
    run.log_text("starting eval loop", level="info")
    run.checkpoint("checkpoints/model.pt", step=1)

# `run.finish()` / `run.close()` finalize the run as `completed`.
# Use `run.fail(...)` or `run.abort(...)` for explicit terminal outcomes.

This slug-first mode resolves workspace/project to the canonical internal run target before run start. Use kind="agent" or kind="eval" for agent and evaluation runs; the default is kind="training".

Logs and console capture

run.log_text(message, level="info") shares the same bounded, non-blocking queue as metric logging. Messages are capped at 8 KiB, truncated lines are flagged, and queue drops are visible through run.diagnostics().
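
To make the 8 KiB cap concrete, here is a minimal sketch of truncating a message and flagging it. It assumes the cap is measured in UTF-8 bytes, which the text above does not state; `cap_message` is a hypothetical helper.

```python
MAX_LOG_BYTES = 8 * 1024  # 8 KiB cap from the text above


def cap_message(message, limit=MAX_LOG_BYTES):
    """Return (text, truncated) with text at most `limit` UTF-8 bytes."""
    raw = message.encode("utf-8")
    if len(raw) <= limit:
        return message, False
    # errors="ignore" discards a multi-byte character split by the cut.
    return raw[:limit].decode("utf-8", errors="ignore"), True
```

Cutting on bytes rather than characters keeps the stored size bounded even for non-ASCII text, at the cost of possibly dropping one partial character at the boundary.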

import rundial as rd

with rd.init(
    workspace="team-alpha",
    project="mnist-demo",
    name="logs-demo",
    endpoint="http://127.0.0.1:8787",
    api_key="rdk_...",
    capture_console=True,
) as run:
    print("stdout is mirrored into Rundial logs")
    run.log_text("manual warning", level="warn")

capture_console=True tees stdout as info and stderr as error. The caller still writes to the original stream. When the bounded queue is full, Rundial drops the line and counts the drop instead of blocking the training process.
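
The tee behavior can be pictured with a small stream wrapper. This is a sketch of the general technique, not the SDK's capture code; `TeeStream` and its `dropped` counter are hypothetical, and it treats each `write` call as whole lines, which a production implementation would refine.

```python
import io
import queue


class TeeStream(io.TextIOBase):
    """Write to the original stream, then offer each line to a bounded queue."""

    def __init__(self, original, sink, level):
        self._original = original
        self._sink = sink  # a queue.Queue created with a maxsize
        self._level = level
        self.dropped = 0

    def writable(self):
        return True

    def write(self, text):
        written = self._original.write(text)  # caller's output is unchanged
        for line in text.splitlines():
            if not line:
                continue
            try:
                self._sink.put_nowait((self._level, line))
            except queue.Full:
                self.dropped += 1  # drop and count instead of blocking
        return written

    def flush(self):
        self._original.flush()
```

Wrapping `sys.stdout` with `contextlib.redirect_stdout(TeeStream(sys.stdout, sink, "info"))` would mirror printed lines into `sink` while still printing them.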

Lightweight traces

Trace spans use the same non-blocking ingest worker and disk spool as metrics and logs. Attributes and events are normalized in the worker; large prompt, completion, or tool-output values above 16 KiB are uploaded through the artifact pipe and replaced on the span with a small evidence reference.

import rundial as rd

with rd.init(
    workspace="team-alpha",
    project="mnist-demo",
    name="agent-demo",
    kind="agent",
    endpoint="http://127.0.0.1:8787",
    api_key="rdk_...",
) as run:
    with run.trace("planner.step", attrs={"phase": "plan"}) as span:
        span.event("prompt.ready", {"tokens": 128})
        span.set_attrs({"model": "example-model"})

    run.tool_call("search", input={"q": "Ada Lovelace"}, output="large tool output...")
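
The replacement of large values with a reference can be sketched as below. The 16 KiB threshold comes from the text above; the shape of the reference (`evidence_ref`, `bytes`), the use of SHA-256 as an identifier, and the `upload` callable are assumptions made for illustration only.

```python
import hashlib

LARGE_VALUE_BYTES = 16 * 1024  # threshold from the text above


def offload_large_attrs(attrs, upload, limit=LARGE_VALUE_BYTES):
    """Replace oversized string values with a small reference dict."""
    normalized = {}
    for key, value in attrs.items():
        raw = value.encode("utf-8") if isinstance(value, str) else None
        if raw is not None and len(raw) > limit:
            digest = hashlib.sha256(raw).hexdigest()
            upload(digest, raw)  # hypothetical hand-off to an uploader
            normalized[key] = {"evidence_ref": digest, "bytes": len(raw)}
        else:
            normalized[key] = value
    return normalized
```

Keeping only a short reference on the span bounds the size of each span record while the full payload stays retrievable through the artifact path.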

Artifacts and checkpoints

run.log_artifact(path_or_dir, name="checkpoint") enqueues artifact work and returns before hashing or uploading files. A dedicated background uploader handles manifest hashing, pre-signed upload URLs, multipart uploads for large files, and finalization without sharing the metrics/log worker.

import rundial as rd

with rd.init(workspace="team-alpha", project="mnist-demo", api_key="rdk_...") as run:
    run.log_artifact("outputs/eval-report", name="eval-report")
    run.checkpoint("checkpoints/model.pt", step=100, keep_last=5)
    rd.checkpoint("checkpoints/model.pt", step=101)

Artifact upload jobs are journaled in the SDK spool directory and retried by the next client process if an upload is interrupted. run.checkpoint(...) and the current-run convenience rd.checkpoint(...) use artifact type checkpoint, alias latest, and a server-enforced keep-last retention policy. The API default is to keep the latest 5 finalized checkpoints per run/name when a client omits the hint; pass keep_last=K to tune it for a checkpoint call.
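
A keep-last policy amounts to sorting and slicing. The sketch below assumes "latest" means highest step; the server may instead order by finalization time, which the text does not specify. `apply_keep_last` is a hypothetical helper, and the default of 5 mirrors the API default mentioned above.

```python
def apply_keep_last(checkpoints, keep_last=5):
    """Split finalized checkpoints into (kept, pruned), newest step first."""
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    ordered = sorted(checkpoints, key=lambda c: c["step"], reverse=True)
    return ordered[:keep_last], ordered[keep_last:]
```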

To consume an artifact from another run, record lineage and download through a blocking handle:

import rundial as rd

with rd.init(workspace="team-alpha", project="mnist-demo", api_key="rdk_...") as run:
    artifact = run.use_artifact("checkpoint:latest")
    artifact.download("inputs/checkpoint")

Lineage UI is still in progress for the v1 artifact milestone.

Media

run.log(...) accepts image and table helper values for common visual inspection workflows. Media bytes ride the artifact uploader, while Rundial stores only a bounded manifest row for querying and display.

import rundial as rd

with rd.init(workspace="team-alpha", project="mnist-demo", api_key="rdk_...") as run:
    run.log({"samples": rd.Image("outputs/sample-grid.png", caption="validation samples")}, step=10)
    run.log(
        {
            "predictions": rd.Table(
                columns=["id", "label", "score"],
                rows=[["img-1", "cat", 0.91], ["img-2", "dog", 0.87]],
            )
        },
        step=10,
    )

rd.Image(...) accepts filesystem paths, PIL-like objects with save(...), and uint8 numpy-like arrays shaped (height, width), (height, width, 1), (height, width, 3), or (height, width, 4). Array and PIL-like serialization happens in the artifact worker, not inside run.log(...). File-backed media jobs are replayable through the artifact journal; generated media is best-effort until the worker materializes the generated file.
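
The accepted array shapes can be checked before logging with a few lines of plain Python. This is a sketch of the shape rule only (it does not check the uint8 dtype), `is_supported_image_shape` is a hypothetical helper, and requiring positive dimensions is an added assumption.

```python
ALLOWED_CHANNELS = {1, 3, 4}  # trailing channel counts listed above


def is_supported_image_shape(shape):
    """True for (H, W) or (H, W, C) with C in {1, 3, 4} and positive H, W."""
    if len(shape) == 2:
        return shape[0] > 0 and shape[1] > 0
    if len(shape) == 3:
        return shape[0] > 0 and shape[1] > 0 and shape[2] in ALLOWED_CHANNELS
    return False
```

With a numpy-like array, `is_supported_image_shape(array.shape)` would be the call to make before handing the array to the image helper.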

Framework Integrations

Install optional framework adapters only when you need them:

pip install "rundial[integrations]"

Framework	Import	What it maps
PyTorch Lightning	`from rundial.integrations import RundialLogger`	hyperparams to run config, metrics to `run.log(...)`, checkpoints to artifacts
Hugging Face Transformers	`from rundial.integrations import RundialCallback`	Trainer args/model config to run config, logs/eval metrics to `run.log(...)`, saved checkpoints to artifacts
Keras	`from rundial.integrations import RundialKerasCallback`	fit/optimizer params to run config, epoch/batch metrics to `run.log(...)`, checkpoint paths to artifacts

The base rundial install has no hard framework dependencies. Adapter imports remain safe without Lightning, Transformers, or Keras installed; installing the extra provides the native callback base classes for framework type checks.
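
One common way to keep adapter imports safe without the framework installed is to fall back to a plain base class. The sketch below shows that general pattern; it is not taken from the rundial source, and `optional_base` and `SketchCallback` are hypothetical names.

```python
import importlib


def optional_base(module_name, class_name):
    """Return the framework class if importable, else plain `object`."""
    try:
        return getattr(importlib.import_module(module_name), class_name)
    except (ImportError, AttributeError):
        return object


# The module name here is deliberately fictional, so the fallback is used.
class SketchCallback(optional_base("some_missing_framework", "Callback")):
    def on_log(self, metrics):
        return dict(metrics)
```

When the framework is present, the adapter inherits from the real callback class and passes the framework's type checks; when it is absent, the module still imports cleanly.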

W&B Compatibility

For common W&B-style training scripts, swap only the import line:

import rundial.compat.wandb as wandb

The shim supports wandb.init, wandb.log, wandb.config, wandb.finish, run.summary, wandb.Image, wandb.Table, wandb.watch, wandb.define_metric, and wandb.login. Unsupported symbols raise NotImplementedError with a pointer to the compatibility table in docs/wandb-compat.md.

Resume existing runs

Use run_id with an explicit resume mode when restarting a crashed or interrupted job:

import rundial as rd

with rd.init(
    workspace="team-alpha",
    project="mnist-demo",
    run_id="run_abc123",
    resume="allow",
    endpoint="http://127.0.0.1:8787",
    api_key="rdk_...",
) as run:
    run.log({"train/loss": 0.38}, step=50)

Resume modes:

- resume="never" (default): create run_id only if it does not already exist.
- resume="allow": attach to a running run or create it if missing; terminal runs are not reopened.
- resume="must": require an existing run; terminal runs are explicitly reopened as running.
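
The three modes can be read as a small decision table. The sketch below encodes it; the action labels are this example's own, and "refuse" stands in for cases where the text says the run is not created or not reopened without stating the exact client-side behavior. `resume_action` is a hypothetical helper.

```python
def resume_action(mode, existing_state):
    """Map a resume mode and the run's current state to an action.

    existing_state is None when no run with that run_id exists,
    otherwise "running" or "terminal".
    """
    if mode not in ("never", "allow", "must"):
        raise ValueError(f"unknown resume mode: {mode!r}")
    if existing_state is None:
        # "must" requires an existing run; the other modes create it.
        return "refuse" if mode == "must" else "create"
    if mode == "never":
        return "refuse"  # the run_id already exists
    if existing_state == "running":
        return "attach"
    # Terminal run: only "must" reopens it as running.
    return "reopen" if mode == "must" else "refuse"
```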

Duplicate steps are resolved at query time. Rundial keeps raw metric rows append-only, but series queries show the latest accepted value per (runId, metricKey, step) using ingest time, with a stable row-id tie breaker. This keeps training-loop ingest fast while resumed curves remain monotonic by step.
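
The query-time rule can be sketched in a few lines. The field names (`run_id`, `metric_key`, `step`, `ingest_ms`, `row_id`) are hypothetical, and the sketch assumes the tie breaker prefers the larger row id, which the text does not specify.

```python
def latest_per_step(rows):
    """Keep one row per (run_id, metric_key, step): latest ingest wins."""
    winners = {}
    for row in rows:
        key = (row["run_id"], row["metric_key"], row["step"])
        rank = (row["ingest_ms"], row["row_id"])  # row_id breaks ingest ties
        if key not in winners or rank > winners[key][0]:
            winners[key] = (rank, row)
    chosen = [row for _, row in winners.values()]
    return sorted(chosen, key=lambda r: (r["run_id"], r["metric_key"], r["step"]))
```

Because the raw rows are never rewritten, a resumed job that re-logs step 50 simply adds a newer row, and the series query returns that newer value.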

Discovery helpers

import rundial as rd

client = rd.Client(
    endpoint="http://127.0.0.1:8787",
    api_key="rdk_...",
    spool_enabled=False,
    start_worker_on_init=False,
)
print(client.whoami())
print(client.list_workspaces())
print(client.list_projects("default-workspace"))
client.close(timeout_seconds=0.1, drain=False)

If the server does not expose /api/v1/runs/resolve-target, slug-first run start fails with an actionable stale-build error. Rebuild or restart the API server and retry.

Runtime notes

- run.log() and run.log_metric() are non-blocking and never perform network or disk I/O on the calling thread.
- run.log_text() and opt-in console capture use the same non-blocking queue and expose the log_lines_truncated, dropped_log_lines_queue_full, and dropped_log_lines_invalid diagnostics.
- System metrics are sampled by a background thread by default and logged as ordinary system/* metrics; pass system_metrics=False to rd.init(...) to opt out, or system_metrics_interval_seconds=... to tune the cadence (minimum 2 seconds).
- run.finish() and run.close() flush and finalize the run; use client.close(...) when you only want to release the client transport.
- NaN and infinite metric values are dropped without raising, counted in run.diagnostics().non_finite_dropped, and trigger one warning per metric key.
- The disk spool is enabled by default at .rundial_spool and is bounded by size and age.
- If disk spool writes fail, the fallback memory buffer stays bounded and drops the oldest points.
- close() returns within the requested timeout plus a bounded transport wait; when it cannot send all pending points before the deadline, unsent points are handed to the disk spool and re-sent by the next process.
- run.diagnostics().pending_spooled_batches reports durable batches waiting for delivery.
- The worker transport can gzip large payloads (gzip_enabled, gzip_min_bytes).
- Use run.diagnostics() to inspect queue pressure, retries, and drop counters.
- Modes:
  - online (default): upload in the background with retries and spool fallback
  - offline: buffer to spool only (no upload attempts)
  - disabled: safe no-op logging for tests and dry runs
- Distributed policy:
  - distributed="rank0" (default): only rank 0 emits logs
  - distributed="all": all ranks emit logs (use with caution for cardinality and volume)
- Rank detection uses common env vars (RANK, LOCAL_RANK, SLURM_PROCID, etc.); override explicitly with distributed_rank=<int>.
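
Two of the notes above, rank-based emission and non-finite filtering, can be sketched with the standard library. The lookup order of the environment variables, the fallback to rank 0 when none is set, and the helper names (`detect_rank`, `should_emit`, `split_finite`) are assumptions made for this example; the warn-once behavior is left out for brevity.

```python
import math

# Variables named in the text; the order checked here is an assumption.
RANK_ENV_VARS = ("RANK", "LOCAL_RANK", "SLURM_PROCID")


def detect_rank(env, override=None):
    """Return the explicit override, else the first numeric rank variable."""
    if override is not None:
        return override
    for name in RANK_ENV_VARS:
        value = env.get(name, "")
        if value.isdigit():
            return int(value)
    return 0  # assumed fallback for single-process runs


def should_emit(policy, rank):
    """policy is "rank0" (only rank 0 logs) or "all" (every rank logs)."""
    return policy == "all" or rank == 0


def split_finite(metrics):
    """Separate finite values from NaN/inf ones without raising."""
    finite, dropped = {}, []
    for key, value in metrics.items():
        if isinstance(value, float) and not math.isfinite(value):
            dropped.append(key)
        else:
            finite[key] = value
    return finite, dropped
```

Calling `detect_rank(os.environ)` in each process and gating on `should_emit("rank0", rank)` reproduces the default policy, where only one process per job sends data.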

Backward-compatible low-level API

from rundial_sdk import RundialClient

RundialClient remains supported for advanced/manual lifecycle control.

Benchmark guardrail

Run the Phase 4 benchmark/guardrail script:

bun run test:phase4:sdk:benchmark

The command validates hot-path latency and bounded spool behavior under sustained retryable failures.

Project details

Release history

This version

1.0.0rc1 pre-release

Jun 17, 2026

Download files

Source Distribution

rundial-1.0.0rc1.tar.gz (88.0 kB)

Uploaded Jun 17, 2026 Source

Built Distribution

rundial-1.0.0rc1-py3-none-any.whl (94.2 kB)

Uploaded Jun 17, 2026 Python 3

File details

Details for the file rundial-1.0.0rc1.tar.gz.

File metadata

Download URL: rundial-1.0.0rc1.tar.gz
Upload date: Jun 17, 2026
Size: 88.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rundial-1.0.0rc1.tar.gz
Algorithm	Hash digest
SHA256	`875a95342f39b7543a3b8a95c706e0eb527152c651065f1e544a0d49cc1de571`
MD5	`53973703b49ba86623e892f0760ff19e`
BLAKE2b-256	`4908c8e3775834417e04369e9b13be7edc176288d3d9f287a26fdbf985eda81b`

Provenance

The following attestation bundles were made for rundial-1.0.0rc1.tar.gz:

Publisher: release.yml on rundial-dev/rundial

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rundial-1.0.0rc1.tar.gz
- Subject digest: 875a95342f39b7543a3b8a95c706e0eb527152c651065f1e544a0d49cc1de571
- Sigstore transparency entry: 1853111744
- Sigstore integration time: Jun 17, 2026
Source repository:
- Permalink: rundial-dev/rundial@607f84e55e3753aba58bd48e2674c2b0b29d62b2
- Branch / Tag: refs/tags/v1.0.0-rc.1
- Owner: https://github.com/rundial-dev
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: self-hosted
- Publication workflow: release.yml@607f84e55e3753aba58bd48e2674c2b0b29d62b2
- Trigger Event: push

File details

Details for the file rundial-1.0.0rc1-py3-none-any.whl.

File metadata

Download URL: rundial-1.0.0rc1-py3-none-any.whl
Upload date: Jun 17, 2026
Size: 94.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rundial-1.0.0rc1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`42dd3d2eb6ca9903d8bd0149de1c2ae8bd5048778995989193e9840854b45468`
MD5	`e74d282690261261f66bca6927feb0c1`
BLAKE2b-256	`4e90260b06dfffc559adbce777ac202b211215104e6ffd52b5d2ab4f892dd581`

Provenance

The following attestation bundles were made for rundial-1.0.0rc1-py3-none-any.whl:

Publisher: release.yml on rundial-dev/rundial

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rundial-1.0.0rc1-py3-none-any.whl
- Subject digest: 42dd3d2eb6ca9903d8bd0149de1c2ae8bd5048778995989193e9840854b45468
- Sigstore transparency entry: 1853111870
- Sigstore integration time: Jun 17, 2026
Source repository:
- Permalink: rundial-dev/rundial@607f84e55e3753aba58bd48e2674c2b0b29d62b2
- Branch / Tag: refs/tags/v1.0.0-rc.1
- Owner: https://github.com/rundial-dev
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: self-hosted
- Publication workflow: release.yml@607f84e55e3753aba58bd48e2674c2b0b29d62b2
- Trigger Event: push
