
ML Systems Reproducibility Auditor

Project description

ML Reproducibility Auditor

A systems-oriented CLI tool to evaluate machine learning repositories for reproducibility, engineering quality, and ML infrastructure signals.

Quick Demo

ml-audit https://github.com/pytorch/pytorch

Motivation

Many machine learning repositories:

  • Cannot be reliably reproduced
  • Lack dependency and environment clarity
  • Provide no benchmark guarantees
  • Hide system-level bottlenecks

This tool evaluates repositories through a systems lens, focusing on:

  • Reproducibility signals
  • Engineering maturity
  • ML systems design patterns

Installation

From a local clone of the repository:

pip install -e .
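
The published package can also be installed directly from PyPI with pip install ml-repro-audit (assuming the project name matches the distributions listed under Download files below).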

Usage

ml-audit https://github.com/user/repo

GitHub Action Usage

name: ML Reproducibility Audit
on:
  pull_request:
  workflow_dispatch:
jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: OmprakashSahani/ml-repro-audit/.github/actions/ml-audit@v1
        with:
          repo-url: https://github.com/user/repo

Example Script

Run a quick audit using:

./examples/run_audit.sh

JSON Output

ml-audit https://github.com/user/repo --json

Example Output

Repository: user/repo

Structure Analysis:
- has_readme: YES
- has_license: YES
- has_ci: NO
- has_benchmarks: YES

Reproducibility Score: 7.5/10
Risk Level: MEDIUM

Code Quality Signals:
- has_pinned_dependencies: YES
- has_seed_control: NO
- has_training_loop: YES

ML Systems Detection:
- uses_pytorch: YES
- uses_distributed: YES
- uses_all_reduce: YES

Insights:
- No CI/CD detected → changes are not automatically validated
- Missing seed control → results may not be reproducible
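
The insight lines are derived directly from the detected signals. A minimal sketch of that mapping, reusing the two messages shown above (the real tool's rule set and wording may differ):

INSIGHT_RULES = {
    "has_ci": "No CI/CD detected → changes are not automatically validated",
    "has_seed_control": "Missing seed control → results may not be reproducible",
}

def generate_insights(signals):
    # Emit a message for every expected signal that is missing or false.
    return [message for key, message in INSIGHT_RULES.items() if not signals.get(key)]

print(generate_insights({"has_ci": False, "has_seed_control": False, "has_readme": True}))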

JSON Output Example

{
  "repository": "user/repo",
  "score": 7.5,
  "risk": "MEDIUM",
  "analysis": {},
  "quality": {},
  "patterns": {},
  "insights": []
}
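
Because the report is plain JSON, it can drive automation. A minimal sketch that gates a pipeline step on the audit result, assuming --json prints only the JSON document to stdout (the 6.0 threshold is an arbitrary choice for illustration):

import json
import subprocess
import sys

# Run the audit and parse its JSON report (field names match the example above).
result = subprocess.run(
    ["ml-audit", "https://github.com/user/repo", "--json"],
    capture_output=True, text=True, check=True,
)
report = json.loads(result.stdout)

print(f"{report['repository']}: score={report['score']} risk={report['risk']}")
for insight in report["insights"]:
    print(f"- {insight}")

# Fail the step when the score falls below the chosen threshold.
if report["score"] < 6.0:
    sys.exit(1)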

Features

  • GitHub API integration (with authentication support)
  • Repository structure analysis (CI/CD, benchmarks, datasets)
  • Code quality analysis (dependencies, determinism, training loops)
  • Reproducibility scoring with weighted signals
  • Risk classification (LOW / MEDIUM / HIGH)
  • ML systems pattern detection (PyTorch, distributed training, all-reduce)
  • Code-level inspection via GitHub API
  • Insight generation based on system signals
  • JSON output for automation and pipelines
  • Rich CLI interface (tables, colors)

GitHub Integration

This tool integrates directly with the GitHub API to:

  • Fetch repository metadata and file structure
  • Inspect source code for ML system patterns
  • Analyze engineering signals across repositories

It is designed as a developer tool to audit and improve repository quality within the GitHub ecosystem.
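
As an illustration of the kind of call this involves, the sketch below lists a repository's top-level contents through the GitHub REST API and derives two of the structure signals shown earlier; it mirrors the idea, not the tool's actual implementation.

import json
import os
import urllib.request

def list_repo_contents(owner, repo):
    # GET /repos/{owner}/{repo}/contents/ returns the top-level file listing.
    url = f"https://api.github.com/repos/{owner}/{repo}/contents/"
    request = urllib.request.Request(url)
    token = os.environ.get("GITHUB_TOKEN")  # optional; authenticated calls get a higher rate limit
    if token:
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())

entries = list_repo_contents("user", "repo")
names = {entry["name"].lower() for entry in entries}
print("has_readme:", any(name.startswith("readme") for name in names))
print("has_license:", any(name.startswith("license") for name in names))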


Use Cases

  • Evaluate reproducibility of ML repositories before use
  • Audit open-source projects for engineering quality
  • Compare ML infrastructure practices across repositories
  • Integrate into CI pipelines for repository validation

Architecture

flowchart TD
    A[CLI Input] --> B[GitHub API]
    B --> C[File Fetcher]
    C --> D[Structure Analyzer]
    C --> E[Code Quality Analyzer]
    C --> F[ML Pattern Detector]
    D --> G[Scoring Engine]
    E --> G
    G --> H[Risk Classifier]
    D --> I[Insights Generator]
    E --> I
    F --> I
    H --> J[Report Output]
    I --> J

Design Principles

  • Reproducibility-first — treat environment and determinism as first-class concerns
  • Signal over noise — focus on high-impact engineering indicators
  • System-aware analysis — go beyond files into behavior and patterns
  • Composable design — CLI + JSON for integration into workflows

Evaluation Dimensions

The scoring system considers the following dimensions (an illustrative scoring sketch follows the list):

  • Environment setup (dependencies, packaging)
  • Determinism (seed control)
  • Documentation
  • Testing and validation
  • CI/CD pipelines
  • Benchmarking practices
  • Dataset reproducibility
  • Configuration-driven experimentation
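
An illustrative weighted-scoring sketch over these dimensions is shown below. The signal names partly mirror the example output earlier; the weights, the extra keys (has_tests, has_dataset_docs, has_config_files), and the risk thresholds are invented for the example and are not the tool's actual values.

# Hypothetical weights per dimension; they sum to 10 so the score is out of 10.
WEIGHTS = {
    "has_pinned_dependencies": 2.0,  # environment setup
    "has_seed_control": 2.0,         # determinism
    "has_readme": 1.0,               # documentation
    "has_tests": 1.5,                # testing and validation
    "has_ci": 1.5,                   # CI/CD pipelines
    "has_benchmarks": 1.0,           # benchmarking practices
    "has_dataset_docs": 0.5,         # dataset reproducibility
    "has_config_files": 0.5,         # configuration-driven experimentation
}

def reproducibility_score(signals):
    earned = sum(weight for key, weight in WEIGHTS.items() if signals.get(key))
    return round(10 * earned / sum(WEIGHTS.values()), 1)

def risk_level(score):
    if score >= 8.0:
        return "LOW"
    if score >= 5.0:
        return "MEDIUM"
    return "HIGH"

score = reproducibility_score({"has_readme": True, "has_pinned_dependencies": True, "has_ci": False})
print(score, risk_level(score))  # 3.0 HIGH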

Roadmap

  • AST-based static analysis (deeper code understanding)
  • Dataset pipeline validation
  • Training loop structure detection
  • Performance bottleneck hints
  • Multi-repo comparison
  • Web dashboard (FastAPI)

Limitations

  • Heuristic-based detection (not full static analysis yet); a minimal example of such a heuristic is sketched after this list
  • Partial file sampling for performance
  • GitHub API rate limits without authentication
  • Static analysis only (does not execute code)
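
A minimal sketch of the kind of keyword heuristic this refers to, applied to a stand-in source string (the patterns are simplified illustrations, not the tool's exact rules):

import re

SEED_PATTERNS = [r"torch\.manual_seed\(", r"np\.random\.seed\(", r"random\.seed\("]
DISTRIBUTED_PATTERNS = [r"torch\.distributed", r"all_reduce\(", r"DistributedDataParallel"]

def has_any(source, patterns):
    return any(re.search(pattern, source) for pattern in patterns)

# Stand-in for a file sampled from the audited repository.
source = """
import torch
import torch.distributed as dist
torch.manual_seed(42)
dist.all_reduce(tensor)
"""

print("has_seed_control:", has_any(source, SEED_PATTERNS))
print("uses_distributed:", has_any(source, DISTRIBUTED_PATTERNS))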

Why This Matters

Reproducibility is a major gap in real-world ML systems.

This project explores how:

  • System design decisions affect reproducibility
  • Engineering practices impact reliability
  • Scalability constraints influence outcomes

Omprakash Sahani — ML Systems Engineer (Distributed Training · Optimization · Systems)

Download files

Download the file for your platform.

Source Distribution

ml_repro_audit-0.1.0.tar.gz (10.9 kB)

Built Distribution

ml_repro_audit-0.1.0-py3-none-any.whl (10.4 kB)

File details

Details for the file ml_repro_audit-0.1.0.tar.gz.

File metadata

  • Download URL: ml_repro_audit-0.1.0.tar.gz
  • Upload date:
  • Size: 10.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for ml_repro_audit-0.1.0.tar.gz:

  • SHA256: 778f5471391c5643c3d5d7432cdf1757e68e86e2a4679452917f43906d9d4c28
  • MD5: 519575a309403eed51a2d839acb66e69
  • BLAKE2b-256: bd0ec3845729aeee51e14592b68b64c1a5ac7ab1db38a00a467d13d64998465a


File details

Details for the file ml_repro_audit-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: ml_repro_audit-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 10.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.1

File hashes

Hashes for ml_repro_audit-0.1.0-py3-none-any.whl:

  • SHA256: 0517df0fc707c6ae513523156ada093c1cb2b6e6631323bcc542e797ee71d462
  • MD5: 92dd75294b48c3f4efddb2795ddaa2d6
  • BLAKE2b-256: 6579b4417c3e138fb15008b4255b232817a062bb37536be545623f8f6b94e5af

