Policy-driven sync and export toolkit for W&B runs

dr-wandb

Policy-driven sync, export, and update tooling for Weights & Biases.

Installation

# CLI tool
uv tool install dr-wandb

# Or as a library
uv add dr-wandb

Authentication

wandb login
# or
export WANDB_API_KEY=your_api_key_here

CLI

Export canonical project data

wandb-export ENTITY PROJECT OUTPUT_DIR [OPTIONS]

Options:
  --output-format  [parquet|jsonl]  Output format (default: parquet)
  --fetch-mode     [incremental|full_reconcile]
                                      Run selection mode (default: incremental)
  --runs-per-page  INTEGER          Runs fetched per API call (default: 500)
  --state-path     TEXT             Optional explicit sync state path
  --save-every     INTEGER          Persist state every N runs (default: 25)
  --checkpoint-every-runs INTEGER   Write checkpoint chunk every N runs (default: 25)
  --no-incremental                  Disable checkpointed export (legacy single-shot output)
  --no-finalize-compact             Keep checkpoint chunks only (skip compact final tables)
  --inspection-sample-rows INTEGER  Sample size for per-checkpoint inspection stats (default: 5)
  --policy-module  TEXT             Policy module (default: dr_wandb.sync_policy)
  --policy-class   TEXT             Policy class (default: NoopPolicy)
  --output-json    TEXT             Optional summary output path

wandb-export now uses incremental checkpointing by default. During export it writes:

OUTPUT_DIR/_checkpoints/runs/chunk-*.parquet
OUTPUT_DIR/_checkpoints/history/chunk-*.parquet
OUTPUT_DIR/_checkpoints/manifest.json
OUTPUT_DIR/_checkpoints/inspection.jsonl

The job can resume after interruption using the same --state-path, and final compact outputs are deduplicated from checkpoint chunks.
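
As a rough illustration of the layout above, the following sketch lists the checkpoint files under an output directory using only the standard library. The helper name summarize_checkpoints is hypothetical and not part of dr_wandb. The manifest is loaded as an opaque JSON object because its schema is not described here, and chunk files are matched by the parquet names shown above.

```python
import json
from pathlib import Path


def summarize_checkpoints(output_dir: Path) -> dict:
    """Summarize the files under OUTPUT_DIR/_checkpoints.

    Hypothetical helper, not part of dr_wandb. It relies only on the
    directory layout listed above and never opens the chunk files.
    """
    ckpt = Path(output_dir) / "_checkpoints"
    manifest_path = ckpt / "manifest.json"
    inspection_path = ckpt / "inspection.jsonl"

    # Sort by file name so chunks are reported in a stable order.
    run_chunks = sorted(p.name for p in (ckpt / "runs").glob("chunk-*.parquet"))
    history_chunks = sorted(p.name for p in (ckpt / "history").glob("chunk-*.parquet"))

    # The manifest schema is not assumed; it is returned as parsed JSON.
    manifest = None
    if manifest_path.exists():
        manifest = json.loads(manifest_path.read_text())

    # Count non-empty lines in the inspection log (one JSON record per line).
    inspection_records = 0
    if inspection_path.exists():
        with inspection_path.open() as handle:
            inspection_records = sum(1 for line in handle if line.strip())

    return {
        "run_chunks": run_chunks,
        "history_chunks": history_chunks,
        "manifest": manifest,
        "inspection_records": inspection_records,
    }
```

Calling summarize_checkpoints(Path("./data")) during or after an export gives a quick view of how many chunks have been written so far.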

--fetch-mode incremental is now the default for wandb-export, wandb-sync, and wandb-plan-patches. In that mode dr_wandb:

fetches newly created runs with a createdAt >= last_seen_created_at filter;
revisits only runs that are still marked non-terminal in the saved state;
avoids history scans unless the active policy explicitly requests them.

Use --fetch-mode full_reconcile to force a full project rescan.
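
The run selection described above can be sketched in plain Python. This is an illustration of the rule only, not dr_wandb's implementation: SavedState, RemoteRun, and select_runs are hypothetical names, and the createdAt filter is applied client-side here for simplicity, whereas the description above suggests it is sent as a filter to the W&B API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Set


@dataclass
class SavedState:
    """Hypothetical stand-in for the saved sync state."""

    last_seen_created_at: Optional[datetime] = None
    non_terminal_run_ids: Set[str] = field(default_factory=set)


@dataclass
class RemoteRun:
    """Hypothetical minimal view of a run as listed by the API."""

    run_id: str
    created_at: datetime


def select_runs(
    state: SavedState,
    remote_runs: List[RemoteRun],
    fetch_mode: str = "incremental",
) -> List[str]:
    """Return the run ids to process for the given fetch mode."""
    if fetch_mode == "full_reconcile":
        # A full reconcile rescans every run in the project.
        return [run.run_id for run in remote_runs]
    if fetch_mode != "incremental":
        raise ValueError(f"unknown fetch mode: {fetch_mode!r}")

    selected = []
    for run in remote_runs:
        # New runs: createdAt >= last_seen_created_at (everything on first sync).
        is_new = (
            state.last_seen_created_at is None
            or run.created_at >= state.last_seen_created_at
        )
        # Known runs are revisited only while still non-terminal.
        needs_revisit = run.run_id in state.non_terminal_run_ids
        if is_new or needs_revisit:
            selected.append(run.run_id)
    return selected
```

Because the comparison is inclusive, a run created exactly at last_seen_created_at is selected again, which is harmless when the outputs are deduplicated afterwards.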

To incrementally update an existing export, rerun wandb-export with the same output directory and --state-path. For example, for the ml-moe/moe export on the author's machine:

uv run wandb-export \
  ml-moe moe \
  /Users/daniellerothermel/drotherm/repos/ml-moe/data/wandb_export \
  --state-path /Users/daniellerothermel/drotherm/repos/ml-moe/data/.sync/ml_moe_moe_state.json \
  --output-json /Users/daniellerothermel/drotherm/repos/ml-moe/data/.sync/last_export_summary.json

Sync + patch workflows

wandb-sync ENTITY PROJECT --policy-module my_pkg.my_policy --policy-class MyPolicy
wandb-bootstrap-export ENTITY PROJECT ./old_export ./new_export --policy-module my_pkg.my_policy --policy-class MyPolicy
wandb-inspect-state ENTITY PROJECT --state-path ./state.json
wandb-plan-patches ENTITY PROJECT ./patches.jsonl --policy-module my_pkg.my_policy --policy-class MyPolicy
wandb-apply-patches ./patches.jsonl            # dry-run
wandb-apply-patches ./patches.jsonl --apply    # writes updates

wandb-sync and wandb-plan-patches also default to --fetch-mode incremental. Pass --fetch-mode full_reconcile when you need a full project rescan.

wandb-bootstrap-export reads an existing compact export (*_runs.*, *_history.*), rebuilds sync state locally, reapplies the active policy, and seeds a fresh output directory with a single merged checkpoint baseline. It now streams large history tables instead of materializing the whole history export in memory. Use --overwrite-output when you want to replace an existing bootstrap target directory or state file.

wandb-inspect-state reads the saved sync state and reports tracked run counts by status, including terminal, ignore, and non-terminal runs. Use --show-runs non_terminal|ignore|terminal --limit N when you want a small sample of the matching runs.

Library usage

from pathlib import Path

from dr_wandb.sync_engine import SyncEngine
from dr_wandb.sync_policy import NoopPolicy
from dr_wandb.sync_types import ExportConfig

engine = SyncEngine(policy=NoopPolicy())
summary = engine.export_project(
    ExportConfig(
        entity="my-team",
        project="my-project",
        output_dir=Path("./data"),
        output_format="parquet",
    )
)

print(summary.run_count, summary.history_count)

Loading extracted run metadata

Use the loader functions in dr_wandb.run_metadata when reading runs_raw.jsonl exports back in. Both loaders return only the latest raw snapshot per run_id, ordered by createdAt descending.

from pathlib import Path

from dr_wandb.run_metadata import (
    load_canonical_run_metadata,
    load_raw_run_record_dicts,
)

data_root = Path("./data")

canonical_runs = load_canonical_run_metadata(
    export_name="moe_runs",
    data_root=data_root,
)

raw_run_dicts = load_raw_run_record_dicts(
    export_name="moe_runs",
    data_root=data_root,
)

Choose the loader based on the shape you want:

load_canonical_run_metadata(...) returns CanonicalRunMetadata models with promoted fields like name, config, summaryMetrics, and historyKeys.
load_raw_run_record_dicts(...) returns RawRunRecord-shaped dictionaries with the original outer record fields: run_id, entity, project, exported_at, raw_run_hash, and raw_run.

These are the supported read paths for run metadata exports. They intentionally deduplicate multiple raw rows for the same run and should be preferred over line-by-line access to runs_raw.jsonl.
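
To make the deduplication concrete, here is a minimal sketch of "latest snapshot per run_id, ordered by createdAt descending" over RawRunRecord-shaped dictionaries. It is not the loaders' actual code. It assumes that the latest snapshot is the one with the greatest exported_at, that createdAt lives inside raw_run, and that both are ISO-8601 strings in a consistent format so they compare correctly as text. The function name is hypothetical.

```python
from typing import Any, Dict, Iterable, List


def latest_snapshot_per_run(records: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep one record per run_id and sort the result newest-created first.

    Hypothetical helper. Assumes exported_at and raw_run["createdAt"] are
    ISO-8601 strings that sort chronologically when compared as text.
    """
    latest: Dict[str, Dict[str, Any]] = {}
    for record in records:
        current = latest.get(record["run_id"])
        # Replace the stored snapshot only when this one was exported later.
        if current is None or record["exported_at"] > current["exported_at"]:
            latest[record["run_id"]] = record

    # Order the surviving snapshots by run creation time, newest first.
    return sorted(
        latest.values(),
        key=lambda record: record["raw_run"]["createdAt"],
        reverse=True,
    )
```

Reading runs_raw.jsonl line by line skips this step, which is why the same run can appear several times in that file.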

Core concepts

`SyncPolicy`

A policy controls data retrieval and decision logic:

select_history_keys(ctx)
select_history_window(ctx)
classify_run(ctx, history_tail)
infer_patch(ctx, history_tail)
should_update(ctx, patch)
is_terminal(ctx, decision)
on_error(ctx, exc)
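
The sketch below shows how these hooks might fit together in a custom policy. Only the method names and argument names come from the list above. Everything else is an assumption made for illustration: the FakeRunContext fields, every return type, the patch shape, and the "skip" error action. The class deliberately does not import or subclass anything from dr_wandb, so check the real base class in dr_wandb.sync_policy before adapting it.

```python
import math
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class FakeRunContext:
    """Hypothetical stand-in for ctx; the real object may look different."""

    run_id: str
    state: str  # a W&B run state such as "running" or "finished"


class TagDivergedRunsPolicy:
    """Illustrative policy: tag runs whose recent loss became NaN."""

    def select_history_keys(self, ctx: FakeRunContext) -> List[str]:
        # Assumed: the history keys the policy wants fetched.
        return ["loss"]

    def select_history_window(self, ctx: FakeRunContext) -> int:
        # Assumed: how many trailing history rows to fetch.
        return 20

    def classify_run(self, ctx: FakeRunContext, history_tail: List[Dict[str, Any]]) -> str:
        losses = [row["loss"] for row in history_tail if "loss" in row]
        return "diverged" if any(math.isnan(x) for x in losses) else "ok"

    def infer_patch(
        self, ctx: FakeRunContext, history_tail: List[Dict[str, Any]]
    ) -> Optional[Dict[str, Any]]:
        # Assumed patch shape: a plain dict describing the update.
        if self.classify_run(ctx, history_tail) == "diverged":
            return {"add_tags": ["diverged"]}
        return None

    def should_update(self, ctx: FakeRunContext, patch: Optional[Dict[str, Any]]) -> bool:
        return patch is not None

    def is_terminal(self, ctx: FakeRunContext, decision: str) -> bool:
        # Assumed: a run that is no longer running never needs a revisit.
        return ctx.state != "running"

    def on_error(self, ctx: FakeRunContext, exc: Exception) -> str:
        # Assumed: returning "skip" would mean "continue with the next run".
        return "skip"
```

A real policy would be selected on the command line with --policy-module and --policy-class, as in the examples above.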

Canonical export outputs

wandb-export writes:

runs table: one row per run with run payload + policy/cursor fields
history table: one row per history event with _step/_timestamp/_runtime/_wandb + metric payload
manifest JSON: schema/version, policy identity, counts, and file paths
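
For the history table, a small sketch can separate the W&B bookkeeping fields named above from the metric payload of one row. It assumes a row is available as a dictionary, for example one parsed line of a jsonl export. The helper name is hypothetical, and any other columns the table may carry (such as run identifiers) would end up in the second dictionary.

```python
import json
from typing import Any, Dict, Tuple

# Bookkeeping fields listed for the history table above.
RESERVED_HISTORY_FIELDS = ("_step", "_timestamp", "_runtime", "_wandb")


def split_history_row(row: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split one history row into (bookkeeping fields, remaining payload).

    Hypothetical helper, not part of dr_wandb.
    """
    bookkeeping = {key: row[key] for key in RESERVED_HISTORY_FIELDS if key in row}
    payload = {key: value for key, value in row.items() if key not in RESERVED_HISTORY_FIELDS}
    return bookkeeping, payload


# Example with a made-up jsonl line.
line = '{"_step": 3, "_timestamp": 1700000000.0, "_runtime": 12.5, "loss": 0.42}'
bookkeeping, payload = split_history_row(json.loads(line))
```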

License

MIT

Release history

3.0.3 (Apr 14, 2026)
3.0.2 (Apr 14, 2026)
3.0.1 (Apr 14, 2026)
3.0.0 (Apr 12, 2026)
1.1.1 (Mar 22, 2026), this version
1.1.0 (Mar 12, 2026)
1.0.0 (Mar 11, 2026)
0.2.0 (Feb 2, 2026)
0.1.2 (Nov 11, 2025)
0.1.1 (Sep 25, 2025)
0.1.0 (Sep 15, 2025)

Download files

Source Distribution

dr_wandb-1.1.1.tar.gz (99.3 kB)

Uploaded Mar 22, 2026 Source

Built Distribution

dr_wandb-1.1.1-py3-none-any.whl (48.5 kB)

Uploaded Mar 22, 2026 Python 3

File details

Details for the file dr_wandb-1.1.1.tar.gz.

File metadata

Download URL: dr_wandb-1.1.1.tar.gz
Upload date: Mar 22, 2026
Size: 99.3 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.0

File hashes

Hashes for dr_wandb-1.1.1.tar.gz
Algorithm	Hash digest
SHA256	`97f6af724a7d4d8cd7487289c5862e5589854e30c6cf3c1ba8e310257b95f8ce`
MD5	`c87b25b2263ce105e3c30237f423ce2b`
BLAKE2b-256	`9f8e07eb2c95df7da7246842f2d728c7d08b69c0f9e08ca27dc3c141e5f51fc0`

File details

Details for the file dr_wandb-1.1.1-py3-none-any.whl.

File metadata

Download URL: dr_wandb-1.1.1-py3-none-any.whl
Upload date: Mar 22, 2026
Size: 48.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.0

File hashes

Hashes for dr_wandb-1.1.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`fa6e9be69e6face6f17f91e42d359ddc4b8d0a1305522eea3e8d0adffe7a7f07`
MD5	`9d3b26d783e3f759766b1e23ad7c7461`
BLAKE2b-256	`ecc2bab5c9d01e35fed0981f4465d7a7281246c717fa018e35b8158cb4627658`
