dr-wandb

Unified export tooling for Weights & Biases.

Installation

uv tool install dr-wandb

Or as a library:

uv add dr-wandb

Authentication

wandb login

Design Goals

  • Export W&B runs into a durable local layout that downstream repos can load without talking to the W&B API directly.
  • Keep export behavior deterministic so repeated syncs and diffs stay readable.
  • Separate current run snapshots from optional history rows while letting both live under one named export.
  • Hide raw W&B SDK object quirks behind JSON-safe stored models.

CLI

There is one public command:

wandb-export ENTITY PROJECT --name EXPORT_NAME [OPTIONS]

The command always writes into data_root / name, where name identifies one logical export. That directory is self-contained and can hold:

  • manifest.json: what was exported and where its artifacts live
  • state.json: incremental tracking state used for future syncs
  • runs.jsonl: the latest snapshot for each run
  • history.jsonl: optional history rows when --mode history is used
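
Because the layout is plain JSON and JSONL, an export directory can be inspected with the standard library alone. The following is a minimal sketch, assuming each line of runs.jsonl is one JSON object; iter_jsonl and describe_export are hypothetical helper names, not part of dr-wandb.

```python
import json
from pathlib import Path
from typing import Any, Iterator


def iter_jsonl(path: Path) -> Iterator[dict[str, Any]]:
    """Yield one decoded JSON object per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)


def describe_export(data_root: Path, name: str) -> dict[str, Any]:
    """Report which of the documented artifacts exist under data_root / name."""
    export_dir = data_root / name
    runs_path = export_dir / "runs.jsonl"
    run_count = sum(1 for _ in iter_jsonl(runs_path)) if runs_path.is_file() else 0
    return {
        "has_manifest": (export_dir / "manifest.json").is_file(),
        "has_state": (export_dir / "state.json").is_file(),
        "run_count": run_count,
        "has_history": (export_dir / "history.jsonl").is_file(),
    }
```

For example, describe_export(Path("data"), "moe_runs") summarizes the directory without importing dr-wandb or contacting the W&B API.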

Example metadata-only export:

wandb-export ml-moe moe --name moe_runs --data-root data

This writes:

data/moe_runs/
  manifest.json
  state.json
  runs.jsonl

Metadata mode writes one current snapshot per run. It does not export history rows.

Example history export:

wandb-export ml-moe moe \
  --name moe_history \
  --data-root data \
  --mode history

This writes:

data/moe_history/
  manifest.json
  state.json
  runs.jsonl
  history.jsonl

History mode writes the same current snapshots plus normalized history rows read through W&B's scan_history.

History export can be limited with selection flags:

wandb-export ml-moe moe \
  --name moe_eval_tail \
  --data-root data \
  --mode history \
  --history-key eval/loss \
  --history-key eval/accuracy \
  --max-records 100

History selection rules to rely on:

  • history selection flags only apply in --mode history
  • --history-key narrows the metric keys requested from W&B
  • --min-step and --max-step constrain the scan window when supported by the underlying W&B API call
  • --max-records trims the final result to the most recent rows after scanning
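
The interaction between the step window and --max-records can be sketched as a pure function. This is an illustration under stated assumptions, not the dr-wandb implementation: it filters on the client side, treats both step bounds as inclusive, reads "most recent" as "highest step", and expects every row to carry a "step" key. select_history_rows is a hypothetical name.

```python
from typing import Any, Iterable, Optional


def select_history_rows(
    rows: Iterable[dict[str, Any]],
    *,
    min_step: Optional[int] = None,
    max_step: Optional[int] = None,
    max_records: Optional[int] = None,
) -> list[dict[str, Any]]:
    """Apply the step window first, then keep only the most recent rows."""
    # Assumption: both bounds are inclusive.
    selected = [
        row
        for row in rows
        if (min_step is None or row["step"] >= min_step)
        and (max_step is None or row["step"] <= max_step)
    ]
    # Assumption: "most recent" means the highest step values.
    selected.sort(key=lambda row: row["step"])
    if max_records is not None:
        # A slice of [-0:] would return everything, so handle zero explicitly.
        selected = selected[-max_records:] if max_records > 0 else []
    return selected
```

The trimming happens last, so a narrow step window combined with a small record limit returns the tail of the window rather than the tail of the whole run.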

The CLI logs progress to stderr by default. To change the verbosity for batch runs or CI, set DR_WANDB_LOG_LEVEL to a standard Python logging level name: DEBUG, INFO, WARNING, ERROR, or CRITICAL. Invalid values fall back to INFO.
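
The fallback rule can be sketched in a few lines. This is a hedged illustration rather than the package's code: resolve_log_level is a hypothetical name, and treating the value case-insensitively and defaulting to INFO when the variable is unset are both assumptions.

```python
import logging
import os
from typing import Mapping, Optional

_VALID_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"}


def resolve_log_level(environ: Optional[Mapping[str, str]] = None) -> int:
    """Map DR_WANDB_LOG_LEVEL to a logging level, falling back to INFO."""
    env = os.environ if environ is None else environ
    # Assumption: the value is matched case-insensitively.
    raw = env.get("DR_WANDB_LOG_LEVEL", "").strip().upper()
    if raw in _VALID_LEVELS:
        return getattr(logging, raw)
    # Invalid values (and, by assumption, an unset variable) resolve to INFO.
    return logging.INFO
```

A caller could pass the result to logging.basicConfig(level=resolve_log_level()), whose default handler writes to stderr.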

Sync Semantics

dr-wandb supports two sync modes:

  • SyncMode.INCREMENTAL
    • fetches newly created runs since the export's tracked max_created_at
    • refreshes previously seen runs that are not yet terminal
    • reuses prior snapshots and history rows where possible
  • SyncMode.FULL_RECONCILE
    • ignores prior incremental state
    • rebuilds the named export from scratch using the current W&B project state

Runs in a terminal state are treated as complete and are not refreshed again unless a full reconcile is requested.
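
The selection logic behind the two modes can be sketched as follows. Everything here is illustrative: RemoteRun, runs_to_fetch, and the known_states mapping are hypothetical, and the set of terminal state names is an assumption rather than something taken from dr-wandb.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional

# Assumption: these state names count as terminal.
TERMINAL_STATES = frozenset({"finished", "failed", "crashed", "killed"})


@dataclass(frozen=True)
class RemoteRun:
    run_id: str
    state: str
    created_at: datetime


def runs_to_fetch(
    remote_runs: Iterable[RemoteRun],
    known_states: dict[str, str],
    max_created_at: Optional[datetime],
    full_reconcile: bool = False,
) -> list[str]:
    """Return the run ids a sync would fetch, in a stable order."""
    if full_reconcile:
        # Full reconcile ignores prior state and takes every run.
        return sorted(run.run_id for run in remote_runs)
    selected = []
    for run in remote_runs:
        # New: created after the newest run seen by the previous sync.
        is_new = max_created_at is None or run.created_at > max_created_at
        # Refresh: seen before, but last recorded in a non-terminal state.
        last_state = known_states.get(run.run_id)
        needs_refresh = last_state is not None and last_state not in TERMINAL_STATES
        if is_new or needs_refresh:
            selected.append(run.run_id)
    return sorted(selected)
```

Under these assumptions, a run last recorded as finished is skipped by an incremental sync even if it changed remotely, which is the case a full reconcile exists to cover.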

Data Contracts

Run snapshots

runs.jsonl stores RunSnapshot records, each of which contains:

  • one normalized WandbRun
  • the exported_at timestamp for that snapshot

Raw W&B SDK objects are normalized into JSON-safe data before being written. That includes nested config-like payloads and object-valued fields such as user.
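
To make "JSON-safe" concrete, here is a minimal sketch of a recursive normalizer. It is an assumption about the general technique, not the package's actual rules: to_json_safe and dump_line are hypothetical names, and the choices to turn non-finite floats into null and to expand arbitrary objects through their public attributes are illustrative.

```python
import json
import math
from collections.abc import Mapping
from datetime import date, datetime
from typing import Any


def to_json_safe(value: Any) -> Any:
    """Recursively convert a value into plain JSON-compatible data."""
    if value is None or isinstance(value, (bool, int, str)):
        return value
    if isinstance(value, float):
        # NaN and infinity are not valid JSON; store null instead (assumption).
        return value if math.isfinite(value) else None
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, Mapping):
        return {str(key): to_json_safe(item) for key, item in value.items()}
    if isinstance(value, (set, frozenset)):
        # Sets have no stable order; sort by repr to stay deterministic.
        return [to_json_safe(item) for item in sorted(value, key=repr)]
    if isinstance(value, (list, tuple)):
        return [to_json_safe(item) for item in value]
    if hasattr(value, "__dict__"):
        # Object-valued fields: keep public attributes only.
        return {
            name: to_json_safe(item)
            for name, item in vars(value).items()
            if not name.startswith("_")
        }
    return str(value)  # last resort: a readable string


def dump_line(record: Any) -> str:
    """Serialize one record as a deterministic JSONL line (without newline)."""
    return json.dumps(to_json_safe(record), sort_keys=True, allow_nan=False)
```

Sorting keys in dump_line keeps repeated exports byte-identical for unchanged records, which is what makes diffs between syncs readable.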

History rows

history.jsonl stores HistoryRow records. Each row keeps:

  • run_id
  • step, timestamp, and runtime when available
  • wandb_metadata from _wandb
  • metrics for non-underscore keys
  • extra for underscore-prefixed fields that are not promoted into the typed top-level columns
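
The field split above can be sketched as a function over one raw history dictionary. W&B history rows carry _step, _timestamp, and _runtime keys; mapping them onto step, timestamp, and runtime is an assumption about how dr-wandb promotes fields, and split_history_row is a hypothetical name.

```python
from collections.abc import Mapping
from typing import Any

# Assumption: these underscore keys are promoted to typed top-level columns.
_PROMOTED = {"_step": "step", "_timestamp": "timestamp", "_runtime": "runtime"}


def split_history_row(run_id: str, raw: Mapping[str, Any]) -> dict[str, Any]:
    """Split one raw history row into the documented HistoryRow fields."""
    row: dict[str, Any] = {
        "run_id": run_id,
        "step": None,
        "timestamp": None,
        "runtime": None,
        "wandb_metadata": {},
        "metrics": {},
        "extra": {},
    }
    for key, value in raw.items():
        if key in _PROMOTED:
            row[_PROMOTED[key]] = value
        elif key == "_wandb":
            # Keep the metadata only when it is a mapping.
            row["wandb_metadata"] = dict(value) if isinstance(value, Mapping) else {}
        elif key.startswith("_"):
            row["extra"][key] = value  # underscore key that was not promoted
        else:
            row["metrics"][key] = value  # ordinary logged metric
    return row
```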

History rows are merged by a stable deduplication key so repeated incremental exports do not blindly append duplicates.
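
A keyed merge of this kind might look like the sketch below. The actual deduplication key is not documented here, so using (run_id, step) is an assumption, as is letting the newer row win on a collision; merge_history is a hypothetical name.

```python
from typing import Any, Iterable


def _dedup_key(row: dict[str, Any]) -> tuple:
    # Assumption: a row is identified by its run and step.
    return (row["run_id"], row.get("step"))


def _sort_key(row: dict[str, Any]) -> tuple:
    step = row.get("step")
    # Rows without a step sort after rows with one, within each run.
    return (row["run_id"], step is None, step if step is not None else 0)


def merge_history(
    existing: Iterable[dict[str, Any]],
    incoming: Iterable[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Merge two batches of rows so each key appears once; incoming rows win."""
    merged: dict[tuple, dict[str, Any]] = {}
    for row in [*existing, *incoming]:
        merged[_dedup_key(row)] = row  # later rows overwrite earlier ones
    return sorted(merged.values(), key=_sort_key)
```

Merging the same incoming batch twice yields the same result as merging it once, which is the property that keeps repeated incremental exports from growing the file.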

Library Usage

from pathlib import Path

from dr_wandb import ExportEngine, ExportMode, ExportRequest

summary = ExportEngine(
    ExportRequest(
        entity="ml-moe",
        project="moe",
        name="moe_runs",
        data_root=Path("data"),
        mode=ExportMode.METADATA,
    )
).export()

print(summary.run_count)

ExportEngine.export() is the high-level entrypoint: it runs the same workflow as the CLI and returns a typed ExportSummary describing the artifacts it wrote.

Loaders

Read an existing export back in with the top-level helper functions:

from pathlib import Path

from dr_wandb import load_manifest, load_run_snapshots

manifest = load_manifest("moe_runs", Path("data"))
snapshots = load_run_snapshots("moe_runs", Path("data"))

print(manifest.mode, len(snapshots))

If the export includes history rows:

from pathlib import Path

from dr_wandb import iter_history_rows

rows = list(iter_history_rows("moe_history", Path("data")))
print(len(rows))

ExportStore is also available if you want direct access to the per-export directory layout and I/O. The top-level helpers are the better default when you just want to read a completed export.

Core Concepts

  • ExportMode.METADATA exports current run snapshots only.
  • ExportMode.HISTORY exports current run snapshots plus history rows.
  • SyncMode.INCREMENTAL fetches newly created runs and refreshes tracked non-terminal runs.
  • SyncMode.FULL_RECONCILE rebuilds the export from scratch.
  • Each named export is self-contained inside data_root / name.

Development

For local iteration, run the smallest targeted test command that covers the code you changed.

Before committing:

uv run ruff format .
uv run ruff check .
uv run ty check
uv run pytest

Before publishing:

uv build

License

MIT
