Capture a reproducibility manifest for a run and diff two manifests.

repro-manifest

Wrap a run to capture a portable manifest of its environment, code, config and seeds, then diff two manifests to explain why two runs differed.

Most ad-hoc runs leave no reproducibility receipt, and they are routinely launched from a dirty working tree, so the commit hash alone does not reproduce them. Working out by hand why two runs disagree can waste an afternoon. repro-manifest writes one JSON file per run and gives you a diff built to answer exactly that question.

$ repro-manifest run -o run.json -- python train.py --lr 0.1
# ... train.py runs; run.json now describes the launch

$ repro-manifest diff good.json run.json
risk    section  key                change
high    git      commit             a1b2c3d -> e4f5a6b
high    git      uncommitted_patch  none -> 38 lines
high    seeds    PYTHONHASHSEED     0 -> 1
medium  command  argv               ... --lr 0.05 -> ... --lr 0.1

Install

$ pip install repro-manifest                 # from PyPI, once released
$ pip install git+https://github.com/jmweb-org/repro-manifest   # latest, available now

No services, no account, no SDK to weave into your training loop. One command, one JSON file.

What a manifest records

Section   Contents
runtime   Python version, implementation, platform, interpreter path
git       Commit, branch, dirty flag, changed-file count, and a patch of uncommitted changes
command   The exact argv and working directory
config    SHA-256 of each config file you point it at
seeds     PYTHONHASHSEED, any *_SEED variable, and seeds you pass in
packages  Installed distributions and versions, read from the live environment
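
To make the table concrete, here is a minimal sketch of how the runtime, config, seeds and packages sections could be collected using only the standard library. It is an illustration, not the tool's implementation: the function name capture_manifest, its parameters, and the dictionary layout are all assumptions, and the real manifest schema may differ.

```python
import hashlib
import os
import platform
import sys
from importlib import metadata


def capture_manifest(config_paths=(), extra_seeds=None, environ=None):
    """Collect a manifest-like dict (hypothetical layout, not the tool's schema)."""
    environ = os.environ if environ is None else environ

    # Seeds: PYTHONHASHSEED plus any variable whose name ends in _SEED.
    seeds = {
        name: value
        for name, value in environ.items()
        if name == "PYTHONHASHSEED" or name.endswith("_SEED")
    }
    # Seeds passed in explicitly are stored as strings, like the env values.
    seeds.update({key: str(value) for key, value in (extra_seeds or {}).items()})

    # Config: hash file contents so a silent edit shows up as a new digest.
    config = {}
    for path in config_paths:
        with open(path, "rb") as handle:
            config[str(path)] = hashlib.sha256(handle.read()).hexdigest()

    # Packages: read what is actually installed, not what a lockfile claims.
    packages = {}
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        if name:
            packages[name] = dist.version

    return {
        "runtime": {
            "python": platform.python_version(),
            "implementation": platform.python_implementation(),
            "platform": platform.platform(),
            "executable": sys.executable,
        },
        "config": config,
        "seeds": seeds,
        "packages": dict(sorted(packages.items())),
    }
```

Passing environ explicitly keeps the function easy to test; by default it reads the environment of the running process.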

Capturing from the live environment catches drift a lockfile missed, and the patch makes a dirty-tree run reproducible when the commit hash is not enough.

Commands

$ repro-manifest run -o run.json -- python train.py   # capture, then run
$ repro-manifest capture -o run.json                  # capture without running
$ repro-manifest capture --config configs/train.yaml  # hash a config file
$ repro-manifest show                                 # pretty-print the current manifest
$ repro-manifest diff a.json b.json                   # explain the difference
$ repro-manifest diff a.json b.json --json            # machine-readable
$ repro-manifest diff a.json b.json --check           # exit non-zero on a high-risk change
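
Conceptually, a diff walks both manifests section by section and reports every key whose value changed. The sketch below assumes each manifest is a dictionary of sections that map keys to values. That layout and the name diff_manifests are illustrative assumptions, not the tool's actual schema or API.

```python
def diff_manifests(old, new):
    """Return (section, key, old_value, new_value) for every differing key."""
    rows = []
    for section in sorted(set(old) | set(new)):
        before = old.get(section, {})
        after = new.get(section, {})
        for key in sorted(set(before) | set(after)):
            # A key present on only one side shows up with None on the other.
            if before.get(key) != after.get(key):
                rows.append((section, key, before.get(key), after.get(key)))
    return rows


good = {"git": {"commit": "a1b2c3d"}, "seeds": {"PYTHONHASHSEED": "0"}}
run = {"git": {"commit": "e4f5a6b"}, "seeds": {"PYTHONHASHSEED": "1"}}

for row in diff_manifests(good, run):
    print(row)
```

Sorting sections and keys keeps the output stable between runs, which matters when the diff itself is checked into a log or compared in CI.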

Risk levels

A change is high risk when it routinely breaks reproducibility: a different commit, a dirty or differing uncommitted patch, a changed seed, a changed config hash, a different interpreter, or a Python minor-version bump. A major-version package bump is high; other package and command changes are medium. Only high-risk changes trip --check.
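
The rules above can be sketched as a small classifier. The function name classify, the section and key names, and the version parsing are assumptions made for illustration. Changes the text does not mention (for example a branch rename or a Python patch release) fall back to medium here, which is also an assumption.

```python
HIGH_GIT_KEYS = {"commit", "dirty", "uncommitted_patch"}  # hypothetical key names


def _major(version):
    """Leading numeric component of a version string, or None if absent."""
    head = str(version).split(".")[0]
    return int(head) if head.isdigit() else None


def _major_minor(version):
    """First two dot-separated components, e.g. '3.12.1' -> ('3', '12')."""
    return tuple(str(version).split(".")[:2])


def classify(section, key, old, new):
    """Return 'high' or 'medium' for one changed manifest entry."""
    if section == "git" and key in HIGH_GIT_KEYS:
        return "high"
    if section in ("seeds", "config"):
        return "high"  # any changed seed or config hash
    if section == "runtime":
        if key == "executable":
            return "high"  # a different interpreter
        if key == "python" and _major_minor(old) != _major_minor(new):
            return "high"  # minor-version (or larger) Python bump
    if section == "packages" and old is not None and new is not None:
        if _major(old) != _major(new):
            return "high"  # major-version package bump
    return "medium"  # other package and command changes; also the assumed default
```

For example, classify("packages", "numpy", "1.26.4", "2.0.0") yields high, while a patch-level bump of the same package yields medium.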

This tool owns per-run provenance and seeds. To compare the full software and hardware stack on its own, see the sibling tool mlenv.

Exit codes

Code  Meaning
0     Success: no high-risk change, or --check not set. run returns the wrapped command's own exit code instead
1     --check found a high-risk change
2     A manifest file was missing or invalid
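
The exit codes for diff can be read as a simple decision order: an unreadable manifest wins, then a high-risk change under --check, otherwise success. The sketch below follows that reading. The names load_manifest and diff_exit_code and the risk_levels callback are hypothetical, and the order of the checks is an assumption rather than documented behaviour.

```python
import json

EXIT_OK = 0
EXIT_HIGH_RISK = 1
EXIT_BAD_MANIFEST = 2


def load_manifest(path):
    """Return the parsed manifest, or None if the file is missing or invalid."""
    try:
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
    except (OSError, ValueError):  # missing file, bad encoding, or bad JSON
        return None
    return data if isinstance(data, dict) else None


def diff_exit_code(path_a, path_b, risk_levels, check=False):
    """Pick an exit code; risk_levels(old, new) returns an iterable of risk strings."""
    old, new = load_manifest(path_a), load_manifest(path_b)
    if old is None or new is None:
        return EXIT_BAD_MANIFEST
    if check and "high" in set(risk_levels(old, new)):
        return EXIT_HIGH_RISK
    return EXIT_OK
```

In a CI job, a non-zero code from a --check run is what stops the pipeline, so medium-risk changes are reported without failing the build.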

License

MIT. See LICENSE.
