
Audit research codebases for reproducibility issues

repro-check

Audit any research codebase for reproducibility issues in seconds.


The problem

A simulation that ran last year won't run today because a package changed. A new PhD student can't reproduce results from a former student's thesis. A collaborating lab can't run your code because their Python or MATLAB versions differ.

This is endemic to academic computing. Most labs have no one responsible for fixing it.

repro-check gives you a reproducibility score and a prioritised fix list in seconds.


Install

pip install repro-check
# or
pipx install repro-check

Requires Python 3.10+.


Quick start

# Audit the current directory
repro-check

# Audit a specific repo
repro-check --path /path/to/research/code

# Write a Markdown report
repro-check --path /path/to/code --output report.md

# Write a JSON report (for tooling / CI integration)
repro-check --path /path/to/code --output report.json

# Auto-generate missing environment files
repro-check --path /path/to/code --fix
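
The JSON report can be read by a small script to fail a CI job when the score is too low. The sketch below is not part of repro-check: `gate` and `min_score` are hypothetical names, and it assumes the report exposes the overall score under a top-level `"score"` key. The report schema is not documented here, so inspect a real report.json before relying on that key.

```python
import json
from pathlib import Path


def gate(report_path: str, min_score: int = 70) -> int:
    """Return 0 if the reported score meets min_score, 1 otherwise."""
    report = json.loads(Path(report_path).read_text(encoding="utf-8"))
    # ASSUMPTION: the overall score lives under a top-level "score" key.
    score = report["score"]
    if score < min_score:
        print(f"Reproducibility score {score} is below the threshold {min_score}")
        return 1
    print(f"Reproducibility score {score} meets the threshold {min_score}")
    return 0
```

In a CI step, the return value would be passed to `sys.exit()` so that a low score fails the job.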

Example output

╭──────────────────────────────────────────╮
│ Reproducibility Audit                    │
│ Path: /home/alice/cfd-solver             │
│ Score: 52/100                            │
╰──────────────────────────────────────────╯

✗ CRITICAL (1)
  Environment — /
    No environment specification found
    Fix: Run: pip freeze > requirements.txt

⚠ HIGH (2)
  Environment — requirements.txt
    Unpinned packages: numpy, scipy, fenics
    Fix: Pin versions: pip freeze > requirements.txt
  Portability — solver/mesh.py
    Hardcoded absolute path: '/home/alice/data/mesh.msh'
    Fix: Replace with Path(__file__).parent / 'relative/path'

~ MEDIUM (3)
  Version Control — /
    No git repository found
    Fix: git init && git add . && git commit -m 'Initial commit'
  MATLAB — startup.m missing
    addpath() calls found but no startup.m to document toolbox paths
    Fix: Create startup.m listing all required toolbox paths
  Reproducibility — simulate.py
    Random operations without seed
    Fix: Add np.random.seed(42) at script start
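
Two of the fixes suggested above, the relative path and the explicit seed, might look like this in a script. This is a minimal sketch using only the standard library: the `data/mesh.msh` location and the `sample_noise` helper are hypothetical, and NumPy code would use `np.random.seed(42)` or `np.random.default_rng(42)` in place of `random.Random`.

```python
import random
from pathlib import Path

# Resolve data files relative to this script, not a user's home directory.
# __file__ is undefined in some interactive sessions, so fall back to the CWD.
try:
    BASE_DIR = Path(__file__).resolve().parent
except NameError:
    BASE_DIR = Path.cwd()

MESH_PATH = BASE_DIR / "data" / "mesh.msh"  # hypothetical relative location


def sample_noise(n: int, seed: int = 42) -> list[float]:
    """Draw n pseudo-random values from an explicitly seeded generator."""
    rng = random.Random(seed)  # local generator, no hidden global state
    return [rng.random() for _ in range(n)]
```

Passing the seed as a parameter keeps runs repeatable by default while still allowing a different seed per experiment.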

What it checks

| Check | Severity |
| --- | --- |
| No requirements.txt / environment.yml / pyproject.toml | Critical |
| Unpinned package versions | High |
| Hardcoded absolute paths (/home/user/..., C:\Users\...) | High |
| MATLAB addpath() without startup.m | High |
| Hardcoded paths in MATLAB .m files | High |
| No README | Medium |
| No setup/install documentation | Medium |
| Data file references but no data/ directory | Medium |
| Wildcard imports (from X import *) | Medium |
| No git version control | Medium |
| MATLAB version not documented | Medium |
| Unseeded random operations (Python: numpy/random/torch) | Medium |
| Unseeded random operations (MATLAB: rand/randn/randperm) | Medium |
| Python version not specified in README | Low |
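
To make one of these checks concrete, here is a simplified sketch of how unpinned packages could be found in a requirements.txt. It illustrates the idea and is not the code in checker.py. `unpinned_packages` is a hypothetical helper that treats only `==` as a pin and does not handle every requirement syntax (URLs, environment markers, and so on).

```python
import re

# Matches the package name at the start of a requirement line.
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def unpinned_packages(requirements_text: str) -> list[str]:
    """Return names of requirements that are not pinned with '=='."""
    names = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e)
            continue
        if "==" in line:  # exact pin, nothing to report
            continue
        match = _NAME.match(line)
        if match:
            names.append(match.group(0))
    return names
```

A range such as `scipy>=1.10` is still reported, because it allows the installed version to drift between machines.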

--fix: auto-generate missing files

repro-check --path /your/code --fix

If critical files are missing, --fix generates them:

  • requirements.txt — from pip freeze
  • environment.yml — conda environment template with your Python version
  • SETUP.md — step-by-step setup guide template

Review each file before committing — they are starting points, not finished documents.
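
As an illustration of the environment.yml template idea, the sketch below renders a minimal conda environment file that records the running interpreter's major.minor version. The function name, the default environment name, and the template contents are assumptions for this example, not the output of fixer.py.

```python
import platform


def render_environment_yml(env_name: str = "research-env") -> str:
    """Render a minimal conda environment file for the current Python version."""
    major, minor, _patch = platform.python_version_tuple()
    return (
        f"name: {env_name}\n"
        "channels:\n"
        "  - conda-forge\n"
        "dependencies:\n"
        f"  - python={major}.{minor}\n"
        "  - pip\n"
    )
```

A generated file like this still needs the project's actual dependencies added by hand, which is why reviewing it before committing matters.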


Who is this for?

  • Academic research labs running simulation, FEA, or CFD code (FEniCS, OpenFOAM, Abaqus, SU2)
  • Engineering R&D teams with legacy Python or MATLAB workflows
  • Research groups doing data analysis who inherited undocumented codebases
  • PhD students trying to hand over working code to their supervisor or a collaborator

Roadmap

| Version | Feature |
| --- | --- |
| v0.1 | 14 checks, --fix flag, JSON/Markdown output |
| v0.2 | R / renv checks, Snakemake checks, --min-score exit code for CI |
| v1.0 | Web UI: drop a GitHub URL, get a report |

Project structure

src/repro/
├── checker.py    ← all check logic; returns ReproReport
├── cli.py        ← click CLI entry point
├── fixer.py      ← --fix: generates missing files
├── report.py     ← terminal, Markdown, JSON, PDF output
└── templates/    ← SETUP.md and environment.yml templates
tests/            ← pytest suite

Contributing

See CONTRIBUTING.md for how to add a new check.

License

MIT — see LICENSE.
