Get your research artifact reviewer-ready before you submit: reproducibility audits, dynamic verification, auto-fix, an agent-ready fix plan, and an ACM/NeurIPS Artifact Appendix generator.
Research Repo Doctor
Find the reproducibility traps in your research repo in seconds, then auto-fix the boring gaps before reviewers hit them.
▶ Try it on any public repo (no install): https://research-repo-doctor-bckncrcwwmg6jrbsrd6btj.streamlit.app/
rrdoctor is a local CLI and GitHub Action for research code. It audits whether a repo is
reproducible, reviewable, citable, and release-ready; scaffolds safe mechanical fixes; and
turns the rest into a checklist any coding agent or human can finish.
What it catches
- "Your `--seed` flag does nothing." `RRD052` spots code that declares a seed option but never calls `random.seed`, `np.random.seed`, `torch.manual_seed`, `tf.random.set_seed`, or `random_state=seed`.
- "This worked on my laptop." Local-only data paths, missing data provenance, and undocumented retrieval steps.
- "The environment silently changed." Unpinned dependencies, missing runtime versions, undeclared imports, and absent dependency manifests.
- "The notebook lies." Stale outputs, out-of-order execution, checkpoint artifacts, and secret-like notebook output.
- "Reviewers cannot tell how to cite or rerun this." Missing license, citation, CI, tests, changelog, results provenance, or experiment entrypoint.
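To make the first trap concrete, here is a minimal sketch of a script whose `--seed` flag is actually wired up. The script, its `set_seed` helper, and its `sample` function are hypothetical illustrations and not part of rrdoctor; only the standard-library `random.seed` call is shown, and a real project would add the NumPy, PyTorch, or TensorFlow equivalents for the libraries it uses.

```python
import argparse
import random
from typing import List, Optional


def set_seed(seed: int) -> None:
    """Apply the seed to every random number generator the script uses."""
    random.seed(seed)
    # A real project would also seed the libraries it imports, for example
    # np.random.seed(seed) or torch.manual_seed(seed).


def sample(n: int) -> List[float]:
    """Draw n pseudo-random numbers (a stand-in for an experiment)."""
    return [random.random() for _ in range(n)]


def main(argv: Optional[List[str]] = None) -> List[float]:
    parser = argparse.ArgumentParser()
    parser.add_argument("--seed", type=int, default=0)
    args = parser.parse_args(argv)
    set_seed(args.seed)  # without this call the flag would do nothing
    return sample(3)


# Two runs with the same seed produce the same numbers.
print(main(["--seed", "42"]) == main(["--seed", "42"]))  # True
```

If the `set_seed(args.seed)` line were deleted, the script would still accept `--seed` but every run would differ, which is the situation the bullet above describes.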
Install
Run once, without installing:
uvx rrdoctor scan .
Alternatives:
pipx run rrdoctor scan .
pip install rrdoctor
rrdoctor scan .
Developer install from source:
git clone https://github.com/Tom409114/research-repo-doctor.git
cd research-repo-doctor
python -m pip install -e ".[dev]"
rrdoctor scan .
Fix the easy gaps
Let rrdoctor create the safe scaffolding for you. It is deterministic, idempotent, and
never overwrites existing files.
rrdoctor fix . --write
It can scaffold missing governance docs, citation metadata, data/results provenance notes,
changelog entries, and common research .gitignore entries. The hard parts become a
reviewable plan:
rrdoctor plan . --output plan.md
Use with your coding agent
Paste this into Claude Code, Cursor, GitHub Copilot, or any other coding agent:
Use rrdoctor as the deterministic, offline, no-API-key grader for this research repo.
Run:
rrdoctor scan . --format json --output baseline.json
rrdoctor plan . --output plan.md
Work through plan.md without weakening rrdoctor checks.
Definition of done:
rrdoctor scan . --baseline baseline.json --fail-on-new error
The final command is the objective gate: it verifies the agent's work against the starting baseline and fails only on newly introduced errors.
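The gate's logic can be pictured as a set difference between two reports. The sketch below illustrates that idea only: the finding fields (`rule`, `path`, `severity`), the sample severities, and the `new_errors` helper are assumptions made for the example and do not reflect rrdoctor's actual JSON schema or implementation.

```python
from typing import Dict, Iterable, List, Tuple

Finding = Dict[str, str]  # hypothetical shape: {"rule", "path", "severity"}


def finding_key(finding: Finding) -> Tuple[str, str]:
    """Identify a finding by rule and location (an assumed identity)."""
    return (finding["rule"], finding["path"])


def new_errors(baseline: Iterable[Finding], current: Iterable[Finding]) -> List[Finding]:
    """Return error-level findings in `current` that are absent from `baseline`."""
    known = {finding_key(f) for f in baseline}
    return [
        f for f in current
        if f["severity"] == "error" and finding_key(f) not in known
    ]


# Illustrative data only; severities are made up for the example.
baseline = [{"rule": "RRD030", "path": ".", "severity": "error"}]
current = baseline + [
    {"rule": "RRD040", "path": ".", "severity": "error"},
    {"rule": "RRD052", "path": "train.py", "severity": "warning"},
]

regressions = new_errors(baseline, current)
print([f["rule"] for f in regressions])  # ['RRD040']
exit_code = 1 if regressions else 0  # pre-existing errors do not fail the gate
```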
Keywords: research software, reproducibility, artifact evaluation, repository audit, auto-fix, coding agents, AGENTS.md, GitHub Action, notebooks, data availability, citation metadata.
Why this matters
Research code often lands on GitHub under deadline pressure. A reviewer or future lab member finds a promising repository and then loses hours because the environment is underspecified, data paths are local, notebooks contain stale outputs, dependencies are unpinned, or the citation is unclear.
Research Repo Doctor turns those recurring release blockers into deterministic checks with concrete remediation - and, where it is safe to do so, fixes them for you. It is built to sit in the ordinary maintenance path: run locally while preparing a release, then run automatically on pull requests through GitHub Actions.
The audit runs without an AI API key, network access, or hosted service. That same determinism makes it an honest grader: it can verify fixes made by a person or a coding agent.
audit -> fix -> plan -> (your coding agent / you) -> verify -> PR
|        |      |                                    |
|        |      rrdoctor plan                        rrdoctor scan --baseline
|        rrdoctor fix --write                          --fail-on-new error
rrdoctor scan
What's new in 0.3.0
- `rrdoctor appendix` generates an ACM-style Artifact Appendix skeleton and maps findings to ACM badge tiers and the NeurIPS reproducibility checklist, so you can fill in the artifact paperwork before a deadline.
- `rrdoctor verify` adds an L1 (static) / L2 (environment build) / L3 (entrypoint run) reproducibility ladder. With `--run` it actually resolves dependencies (uv/pip/conda/Rscript) and executes a declared entrypoint under a timeout. Only use `--run` on repositories you trust.
- Submission profiles include `acm`, `neurips`, `icml`, `ml-paper`, `fair4rs`, and `joss`, with tag-based inheritance from the base tiers.
- Deeper static checks include `RRD034` import/manifest cross-checks (deptry-style) and `RRD054` hardcoded GPU/CUDA assumptions without a documented requirement.
- More ecosystems: dependency/runtime checks now understand R (`DESCRIPTION`, `renv.lock`) and Julia (`Project.toml`) in addition to Python and JavaScript.
- `rrdoctor mcp` exposes `scan`/`verify`/`appendix` as tools for coding agents (`pip install 'rrdoctor[mcp]'`).
What's new in 0.2.0
- `rrdoctor fix` provides deterministic, idempotent auto-fix for common gaps (governance docs, citation metadata, data/results provenance, changelog, ignore entries). It never overwrites existing files.
- `rrdoctor plan` emits a tool-agnostic fix plan you can hand to any coding agent; every task names the deterministic check that verifies it.
- Baseline gating: `rrdoctor scan --baseline report.json --fail-on-new error` fails only on newly introduced findings, so large repos can adopt the audit incrementally.
- `rrdoctor badge` emits a Shields.io endpoint or SVG reproducibility-score badge.
- First-class PR automation: the Action posts a sticky PR comment, writes a job summary, and can attach the fix plan, using only the built-in `GITHUB_TOKEN`.
- New rules include unpinned dependencies, committed notebook checkpoints, pre-commit config, and an AGENTS.md task guide for agent and human contributors.
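As an illustration of what an unpinned-dependency check can look like, the sketch below flags lines of a `requirements.txt` that lack an exact version. It is a simplified assumption about the idea, not rrdoctor's rule: the `unpinned_requirements` helper is hypothetical, it only recognises `==` and `===` pins, and it ignores URL and editable requirements.

```python
import re
from typing import List

# name, optional extras, "==" or "===", a version without wildcards,
# then an optional environment marker.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*===?\s*[^\s;*]+\s*(;.*)?$")


def unpinned_requirements(text: str) -> List[str]:
    """Return requirement lines that do not pin an exact version."""
    flagged = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            flagged.append(line)
    return flagged


reqs = """\
numpy==1.26.4
pandas>=2.0   # lower bound only
torch
scipy==1.*
"""
print(unpinned_requirements(reqs))  # ['pandas>=2.0', 'torch', 'scipy==1.*']
```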
Quickstart
rrdoctor scan . # deterministic audit (Markdown report)
rrdoctor fix . --write # apply safe scaffolding for easy gaps
rrdoctor plan . --output plan.md # tool-agnostic work order for the rest
rrdoctor scan . --format json --output baseline.json --fail-on none
rrdoctor scan . --baseline baseline.json --fail-on-new error # gate regressions
Stricter gate and report file:
rrdoctor scan . --profile strict --fail-on warning --output rrdoctor-report.md
Machine-readable and agent output:
rrdoctor scan . --format sarif --output rrdoctor.sarif --fail-on none
rrdoctor scan . --format agent --output fix-plan.md
Before a submission deadline:
rrdoctor appendix . --profile acm --output ARTIFACT_APPENDIX.md # appendix + checklist mapping
rrdoctor verify . --profile neurips # L1/L2/L3 ladder (static)
rrdoctor verify . --run --timeout 600 # actually build + run (trusted repos)
Submission profiles: acm, neurips, icml, ml-paper, fair4rs, joss (alongside the
general minimal/standard/strict/ml tiers). Dependency and runtime checks also understand
R (DESCRIPTION, renv.lock) and Julia (Project.toml), not just Python and JavaScript.
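Running an entrypoint under a timeout, as the `--run` mode is described as doing, is a standard pattern. The following sketch shows one way to do it with `subprocess.run`; the `run_entrypoint` helper and its return shape are hypothetical and are not taken from rrdoctor's source.

```python
import subprocess
import sys
from typing import List, Optional, Tuple


def run_entrypoint(cmd: List[str], timeout_s: float) -> Tuple[str, Optional[int]]:
    """Run cmd and return ("ok" | "failed" | "timeout", exit code or None)."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return ("timeout", None)
    return ("ok" if proc.returncode == 0 else "failed", proc.returncode)


# A fast command succeeds; a slow one is cut off by the timeout.
print(run_entrypoint([sys.executable, "-c", "print('trained')"], timeout_s=30))
print(run_entrypoint([sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5))
```

A timeout bounds how long the command runs, not what it does, so the advice to use `--run` only on trusted repositories still applies.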
The audit -> fix -> verify loop
A deterministic checker is reproducible and trustworthy but cannot write prose or judge intent. A coding agent edits well but needs a precise specification and an objective definition of done. Research Repo Doctor gives you both:
- Audit: `rrdoctor scan` produces deterministic findings.
- Fix the easy ones: `rrdoctor fix --write` scaffolds governance docs, citation metadata, provenance notes, a changelog, and ignore entries (idempotent, never overwriting).
- Plan the rest: `rrdoctor plan` emits a tool-agnostic work order. Paste it into the coding agent of your choice, attach it to an issue, or work it by hand.
- Verify: re-run the audit against a baseline. Because verification is deterministic and key-free, it works as an honest grader for changes from any source.
See docs/agent-workflows.md and docs/autofix.md.
GitHub Action
Add one workflow to many repositories and get consistent reproducibility reports on pull requests and pushes. The Action requires no API key.
name: Reproducibility audit
on:
  pull_request:
permissions:
  contents: read
  pull-requests: write
jobs:
  rrdoctor:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: Tom409114/research-repo-doctor@v0.2.0
        with:
          profile: standard
          fail-on: none
          comment-pr: "true"    # sticky PR comment with the report
          step-summary: "true"  # report in the job summary
          plan: "true"          # attach an agent-ready fix plan
For new-finding gating and a committed baseline, see docs/pull-request-automation.md.
Example output
Research Repo Doctor Summary
Profile: standard
Score: 76/100
Errors: 1
Warnings: 5
Rules evaluated: 32
How to fix first:
- RRD030 No dependency manifest found: Add pyproject.toml, requirements.txt, or another manifest.
- RRD040 Data availability documentation missing: Add DATA.md, docs/data.md, or a README section.
Worked examples live in examples/reports/, including a fix plan and a self-scan report.
Commands
| Command | Purpose |
|---|---|
| `rrdoctor scan` | Run the deterministic audit; supports `--baseline` and `--fail-on-new`. |
| `rrdoctor fix` | Apply safe, idempotent scaffolding for common gaps (`--write` to apply). |
| `rrdoctor plan` | Emit a tool-agnostic fix plan (Markdown or JSON). |
| `rrdoctor verify` | Reproducibility ladder L1/L2/L3; `--run` actually builds and executes. |
| `rrdoctor appendix` | Generate an ACM Artifact Appendix + ACM/NeurIPS checklist mapping. |
| `rrdoctor badge` | Emit a reproducibility-score badge (Shields.io endpoint or SVG). |
| `rrdoctor mcp` | Run the MCP server (scan/verify/appendix as agent tools). |
| `rrdoctor init` | Write a documented `.rrdoctor.yml`. |
| `rrdoctor list-rules` | List all registered rules. |
| `rrdoctor explain RRD0xx` | Explain a rule and how to remediate it. |
| `rrdoctor doctor` | Self-diagnostics. |
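For the badge command, a Shields.io endpoint badge is driven by a small JSON document with `schemaVersion`, `label`, `message`, and optionally `color`. The sketch below builds such a payload from a score; the `badge_payload` helper, the label text, and the color thresholds are assumptions for illustration and may differ from what `rrdoctor badge` emits.

```python
import json
from typing import Dict, Union


def badge_payload(score: int) -> Dict[str, Union[int, str]]:
    """Build a Shields.io endpoint payload for a 0-100 score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Thresholds are arbitrary choices for this example.
    color = "brightgreen" if score >= 90 else "yellow" if score >= 70 else "red"
    return {
        "schemaVersion": 1,
        "label": "reproducibility",
        "message": f"{score}/100",
        "color": color,
    }


print(json.dumps(badge_payload(76)))
```

Serving that JSON from a public URL and passing the URL to the Shields.io endpoint badge renders the score without any custom image generation.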
Rule categories
Documentation, environment, data, experiments, notebooks, citation, governance, testing, CI, security, release, and metadata. The full table is in docs/checks.md; auto-fixable rules are marked there.
Reproducibility stance
Research Repo Doctor does not claim to prove a paper is reproducible. It checks release hygiene that makes reproduction possible to attempt. Reports are heuristic and should be reviewed by maintainers. Generated fixes are starting points and contain placeholders to complete before release.
Philosophy
Deterministic first. The scanner is understandable, testable, and useful with no network access. The core scanner will not add network calls, require a hosted-service API key, or fabricate adoption metrics. AI is something you bring to act on the output - never a dependency of the audit itself, and never tied to a single tool.
Configuration
version: 1
profile: standard
paths:
  exclude: [".git", ".venv", "node_modules", "__pycache__"]
thresholds:
  large_file_mb: 50
  large_notebook_output_kb: 1024
rules:
  RRD032:
    enabled: false
  RRD042:
    severity: warning
fail_on: error
Contributing
Contributions are welcome. Start with CONTRIBUTING.md and AGENTS.md, open a rule request or false-positive report, and include a minimal fixture when possible.
Security
Do not report suspected credential exposure in a public issue. See SECURITY.md.
Citation
Use the included CITATION.cff.
License
MIT. See LICENSE.