Scientific integrity linter for academic code

Demyst

de·mys·ti·fy /dēˈmistəˌfī/ — to make less obscure or confusing

A scientific linter for research code. Like black for formatting and mypy for types, demyst checks scientific logic.

pip install demyst
demyst analyze ./src

Status

  • Stability: Alpha (actively developed, API may change)
  • Python: 3.8, 3.9, 3.10, 3.11, 3.12
  • Ecosystems: NumPy, pandas, scikit-learn, PyTorch, JAX, SciPy
  • Philosophy: Prefer false positives over silent failures; use # demyst: ignore to suppress

What It Catches

| Check | What It Detects | Example |
|---|---|---|
| leakage | Train/test contamination | fit_transform() before train_test_split() |
| mirage | Variance-destroying reductions | np.mean() hiding outliers in your data |
| hypothesis | P-hacking, multiple comparisons | 20 t-tests without Bonferroni correction |
| tensor | Gradient death, normalization issues | Deep sigmoid chains, disabled BatchNorm stats |
| units | Dimensional mismatches | Adding meters to seconds |

Try It in 30 Seconds

git clone https://github.com/Hmbown/demyst.git
cd demyst
pip install -e .
demyst leakage examples/ml_data_leakage.py

You'll see demyst catch the classic ML mistake: preprocessing data before splitting it.

Sample Output

$ demyst leakage examples/ml_data_leakage.py

──────────────────────────── Data Leakage Detected ─────────────────────────────

CRITICAL Line 47 in examples/ml_data_leakage.py
  fit_transform() called BEFORE train_test_split.
  Preprocessing learns from test data — your benchmark is invalid.

  45   X, y = load_medical_data()
  46   scaler = StandardScaler()
❱ 47   X_scaled = scaler.fit_transform(X)  # LEAKS TEST INFO
  48   X_train, X_test, y_train, y_test = train_test_split(X_scaled, y)

  Fix: Split first, then fit on train only:
       X_train, X_test = train_test_split(X)
       X_train = scaler.fit_transform(X_train)
       X_test = scaler.transform(X_test)

Summary: 1 critical issue

Quick Examples

Leakage — the #1 ML benchmarking error:

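from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()

# WRONG: the scaler is fit on all rows, so test-set statistics leak into training
X_scaled = scaler.fit_transform(X)
X_train, X_test = train_test_split(X_scaled)

# CORRECT: split first, then fit the scaler on the training rows only
X_train, X_test = train_test_split(X)
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)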
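
To see why the order matters, here is a minimal stdlib-only sketch. The toy numbers and the helper name standardize are made up for illustration and are not part of demyst or scikit-learn. It compares scaling statistics learned from all rows with statistics learned from the training rows only.

```python
import statistics


def standardize(values, mean, std):
    """Scale values using a mean and standard deviation chosen by the caller."""
    return [(v - mean) / std for v in values]


# Toy feature column (made-up numbers): the last two rows are the held-out test set.
feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 50.0, 60.0]
train, test = feature[:6], feature[6:]

# Leaky: statistics computed on all rows, so the test rows influence the scaling.
leaky_mean = statistics.mean(feature)
leaky_std = statistics.pstdev(feature)

# Clean: statistics computed on the training rows only.
clean_mean = statistics.mean(train)
clean_std = statistics.pstdev(train)

leaky_train = standardize(train, leaky_mean, leaky_std)
clean_train = standardize(train, clean_mean, clean_std)
clean_test = standardize(test, clean_mean, clean_std)

print(f"mean seen by leaky scaler: {leaky_mean:.3f}")
print(f"mean seen by clean scaler: {clean_mean:.3f}")
```

In the leaky version the two large test rows shift the mean that is applied to the training data, which is the kind of contamination the leakage check is meant to flag.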

P-Hacking — uncorrected multiple comparisons:

from scipy.stats import ttest_ind

# 20 tests at α=0.05 yield about 1 false positive on average, even when no real effect exists
for condition in conditions:
    if ttest_ind(a[condition], b[condition]).pvalue < 0.05:
        print(f"{condition} significant!")  # No correction applied
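
One common remedy is the Bonferroni correction, which divides the significance threshold by the number of tests. The sketch below uses only the standard library; the p-values are made-up numbers and the function name bonferroni_significant is hypothetical, not a demyst API.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return booleans that are True where p < alpha / number_of_tests."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical p-values from 20 independent tests (made-up numbers).
p_values = [0.001, 0.04, 0.03, 0.20, 0.50] + [0.60] * 15

uncorrected = [p < 0.05 for p in p_values]
corrected = bonferroni_significant(p_values)

# Chance of at least one false positive across 20 independent tests
# when every null hypothesis is true.
family_wise_error = 1 - (1 - 0.05) ** len(p_values)

print(sum(uncorrected), "significant without correction")
print(sum(corrected), "significant with Bonferroni correction")
print(f"family-wise error rate without correction: {family_wise_error:.2f}")
```

Bonferroni is conservative; less strict procedures such as Holm or Benjamini-Hochberg exist, and the choice depends on the study design.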

Usage

# Full analysis
demyst analyze your_code.py

# Individual guards
demyst mirage model.py
demyst leakage train.py
demyst hypothesis stats.py
demyst units physics.py
demyst tensor network.py

# Auto-fix mirages
demyst mirage model.py --fix

# CI mode
demyst ci . --strict

Why Mirages Matter

These documented cases show how a single average, such as the result of np.mean(), can hide critical information:

| Phenomenon | What Happened |
|---|---|
| Anscombe's Quartet (1973) | Four datasets whose y-values share the same mean (7.50) and nearly the same variance, but have completely different distributions |
| Simpson's Paradox (Berkeley 1973) | 44% male vs 35% female admission overall, but women were admitted at higher rates in 4 of 6 departments |
| Fat Tails in Finance | An average daily return of ~0.04% hides Black Monday's -22.6% single-day crash |
| Outlier Masking | Multiple outliers pull the mean toward them, causing outlier detection tests to fail |

Run demyst mirage examples/real_world_mirages.py to see detection in action.
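
The effect is easy to reproduce by hand. The following sketch uses two made-up samples and the standard library only (no demyst or NumPy calls) to show a mean that is identical for both while the spread and maximum differ sharply.

```python
import statistics

# Two made-up samples with the same mean but very different spread.
steady = [9, 10, 11, 10, 10]
spiky = [0, 0, 0, 0, 50]

for name, sample in [("steady", steady), ("spiky", spiky)]:
    print(
        f"{name}: mean={statistics.mean(sample)}, "
        f"stdev={statistics.stdev(sample):.2f}, max={max(sample)}"
    )
```

Reporting a spread measure or the extremes next to the mean is usually enough to make the difference visible.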

CI/CD

GitHub Actions:

name: Demyst
on: [push, pull_request]
jobs:
  demyst:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install demyst
      - run: demyst ci . --strict

Pre-commit:

If you're already using black + mypy + ruff, drop this in next to them:

repos:
  - repo: https://github.com/Hmbown/demyst
    rev: v0.1.0a1
    hooks:
      - id: demyst

See examples/configs/ for more templates.

Suppressing False Positives

Use inline comments to suppress specific warnings:

# Suppress all demyst warnings on this line
mean_value = np.mean(data)  # demyst: ignore

# Suppress only mirage warnings
dashboard_avg = np.mean(daily_views)  # demyst: ignore-mirage

# Suppress only leakage warnings  
scaler.fit_transform(X)  # demyst: ignore-leakage

Available suppressions: ignore, ignore-mirage, ignore-leakage, ignore-hypothesis, ignore-tensor, ignore-unit, ignore-all
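
For intuition only, here is a rough sketch of how a linter could recognize comments of this shape. It is not demyst's parser: the regular expression, the function name suppressed_checks, and the choice to report a bare ignore as "all" are assumptions made for this example.

```python
import re

# Hypothetical pattern: "# demyst: ignore", optionally followed by "-<check>".
SUPPRESSION = re.compile(r"#\s*demyst:\s*ignore(?:-(?P<check>[a-z]+))?")


def suppressed_checks(line):
    """Return the check names suppressed on a line; 'all' means every check."""
    found = set()
    for match in SUPPRESSION.finditer(line):
        # A bare "ignore" has no suffix, so treat it as suppressing everything.
        found.add(match.group("check") or "all")
    return found


print(suppressed_checks("mean_value = np.mean(data)  # demyst: ignore"))
print(suppressed_checks("avg = np.mean(views)  # demyst: ignore-mirage"))
```

A line-based regex like this would also match the marker inside a string literal, so a real tool would more likely work from the token stream.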

Configuration

Create .demystrc.yaml:

profile: default  # Or: biology, physics, chemistry, economics

rules:
  mirage:
    enabled: true
    severity: critical
  leakage:
    enabled: true
    severity: critical

ignore_patterns:
  - "**/tests/**"

Programmatic API

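from demyst import TensorGuard, LeakageHunter, HypothesisGuard, UnitGuard

# Read the file to analyze; the context manager closes it promptly
with open('model.py') as f:
    source = f.read()

result = LeakageHunter().analyze(source)

# The summary reports how many critical findings were raised
if result['summary']['critical_count'] > 0:
    print("DATA LEAKAGE DETECTED")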
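
To run the same check over several files, a small wrapper along these lines may help. It is a sketch: critical_counts is a hypothetical helper, and it assumes only what the snippet above shows, namely that the analyzer takes source text and returns a dict with result['summary']['critical_count'].

```python
from pathlib import Path


def critical_counts(paths, analyze):
    """Map each path to the critical-issue count reported by `analyze`.

    `analyze` is any callable that takes source text and returns a dict
    with a ['summary']['critical_count'] entry.
    """
    counts = {}
    for path in paths:
        source = Path(path).read_text(encoding="utf-8")
        result = analyze(source)
        counts[str(path)] = result["summary"]["critical_count"]
    return counts
```

With demyst installed, this could be called as critical_counts(["train.py", "model.py"], LeakageHunter().analyze). Passing the analyzer in as an argument keeps the helper independent of any one guard.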

Design Principles

Silent failures in research code don't crash — they produce wrong numbers that look right. A model trains, metrics look good, paper gets submitted... then someone discovers the test set leaked into training. Demyst catches these before they become retractions.

| Principle | What It Means |
|---|---|
| Yell early | Prefers false positives over silent failures. Use # demyst: ignore to suppress. |
| Static analysis | AST-based heuristics plus light dataflow. No runtime overhead; works on any Python source without running it. |
| Actionable output | Every warning includes the why and a concrete fix suggestion. |
| Escape hatches | Inline suppression (# demyst: ignore-mirage), config files, CI thresholds. |

Detection capabilities:

  • Mirage: Detects 80+ NumPy array creators, tracks variable flow, checks for nearby variance operations
  • Leakage: Tracks fit/fit_transform calls relative to train_test_split/cross_val_score
  • Hypothesis: Counts statistical tests, checks for correction methods, detects p-value conditionals
  • Tensor: Analyzes layer sequences for gradient death patterns, normalization misuse
  • Units: Dimensional analysis via variable naming conventions and explicit annotations
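
As an illustration of the AST-based approach, the sketch below flags fit or fit_transform calls that appear on a line before the first train_test_split call. It is a simplified stand-in written for this README, not demyst's implementation; the function name find_fit_before_split is hypothetical.

```python
import ast


def find_fit_before_split(source):
    """Return line numbers of fit/fit_transform calls before the first train_test_split."""
    calls = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Method calls look like obj.name(...); plain calls look like name(...).
            if isinstance(func, ast.Attribute):
                name = func.attr
            else:
                name = getattr(func, "id", None)
            if name:
                calls.append((node.lineno, name))
    calls.sort()

    split_lines = [line for line, name in calls if name == "train_test_split"]
    if not split_lines:
        return []  # No split in this file, so there is nothing to compare against.
    first_split = split_lines[0]
    return [
        line
        for line, name in calls
        if name in {"fit", "fit_transform"} and line < first_split
    ]


leaky = "X_scaled = scaler.fit_transform(X)\nX_train, X_test = train_test_split(X_scaled)\n"
print(find_fit_before_split(leaky))
```

This line-order heuristic ignores control flow and does not check which variable is passed to each call, so it would need dataflow tracking to avoid false positives and misses.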

References

| Phenomenon | Finding | Source |
|---|---|---|
| Anscombe's Quartet | Identical means hide different distributions | Anscombe (1973) |
| Simpson's Paradox | Trends reverse when aggregated | UC Berkeley admissions study, Bickel et al. (1975) |
| Fat Tails | Normal assumptions hide crashes | Mandelbrot (1963) |
| Retraction Stats | 18.9% from computational errors | PMC5395722 |

License

MIT — See LICENSE


"The first principle is that you must not fool yourself—and you are the easiest person to fool." — Richard Feynman
