Anomaly-Driven Correction Discovery: Physics-Constrained Symbolic Regression for Evolutionary Scientific Discovery

ADCD — Anomaly-Driven Correction Discovery

Physics-Constrained Symbolic Regression for Evolutionary Scientific Discovery

ADCD is a symbolic regression framework that discovers physical correction terms rather than learning equations from scratch. Given a known classical law and anomalous observations, ADCD recovers the dimensionless correction Δ that reconciles theory with experiment — mirroring how physics actually evolves.
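To make the correction-first idea concrete, here is a minimal NumPy sketch that does not use ADCD itself. It assumes a multiplicative convention, y_obs = y_classical · (1 + Δ), which is only one possible reading of "dimensionless correction"; the function name and the test speeds are illustrative. For relativistic kinetic energy the exact Δ can be written down and compared with its leading-order term, (3/4)(v/c)².

```python
import numpy as np

def relativistic_ke_correction(beta):
    """Return delta = KE_rel / KE_classical - 1 for beta = v/c.

    Assumes y_obs = y_classical * (1 + delta). Mass and c cancel in the
    ratio, so delta depends on beta alone.
    """
    beta = np.asarray(beta, dtype=float)
    gamma = 1.0 / np.sqrt(1.0 - beta**2)
    ke_rel = gamma - 1.0          # relativistic KE in units of m c^2
    ke_classical = 0.5 * beta**2  # classical KE in units of m c^2
    return ke_rel / ke_classical - 1.0

beta = np.array([0.01, 0.05, 0.1])
delta = relativistic_ke_correction(beta)
leading_order = 0.75 * beta**2    # first term of the expansion in beta
for b, d, lead in zip(beta, delta, leading_order):
    print(f"beta={b:.2f}  delta={d:.6e}  (3/4)beta^2={lead:.6e}")
```

The classical law stays in place as the baseline; only the small dimensionless term has to be discovered, and it vanishes as v/c goes to 0.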

82.8% (±7.7%) mean structural recovery across 5 random seeds, with peak 94.4% at the reference seed.
4/4 real-world structural class matches (Mercury, Lamb Shift, Muon g-2, Blackbody).
77 automated unit tests passing on Python 3.10 and 3.11.

Key Features

Correction-first paradigm — starts from a known classical law, not a blank slate; designed for anomaly-driven theory refinement where the baseline is structurally correct
Physics-gated search cascade — AST complexity, dimensional homogeneity + transcendental guardrails, and asymptotic consistency (ARC) gates screen unphysical candidates before optimization
JAX-traced L-BFGS-B optimizer — parameter-scaled differentiable fitting with multi-restart log-uniform initialization
BIC reranking — selects the most parsimonious correction over purely numerical fits
Residual feature intelligence — statistical priors (monotonicity, curvature, oscillation, decay rate, symmetry) bias the template sampler toward the correct mathematical family
Coarse empirical evaluation — data-driven pre-filter ranks gate survivors before full JAX optimization
Noise-robust — 93.3% mean at 0% noise, 91.1% at 1%, 71.1% at 5%, 68.9% at 10%
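The dimensional homogeneity and transcendental guardrail gate listed above can be illustrated with a small stand-alone sketch. This is not the code in dimensional_checker.py; the `Dim` class and both helper functions are hypothetical names chosen for illustration, and only mass, length, and time are tracked.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dim:
    """Exponents of (mass, length, time) carried by a quantity."""
    mass: int = 0
    length: int = 0
    time: int = 0

    def __mul__(self, other):
        return Dim(self.mass + other.mass,
                   self.length + other.length,
                   self.time + other.time)

    def __truediv__(self, other):
        return Dim(self.mass - other.mass,
                   self.length - other.length,
                   self.time - other.time)

DIMENSIONLESS = Dim()
LENGTH = Dim(length=1)

def transcendental_ok(arg_dim):
    """exp, log, sin, ... are only meaningful for dimensionless arguments."""
    return arg_dim == DIMENSIONLESS

def addition_ok(left, right):
    """Dimensional homogeneity: summed terms must share the same dimensions."""
    return left == right

print(transcendental_ok(LENGTH / LENGTH))  # exp(-r / lam): True
print(transcendental_ok(LENGTH))           # exp(-r): False
```

A check of this kind is cheap because it inspects only the structure of the candidate expression, which is why it can run before any numerical optimization.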

Quick Start

Installation

pip install adcd

Or install from source:

git clone https://github.com/apiprdt/PhysicsPaper.git
cd PhysicsPaper
pip install -e ".[dev]"

Usage

The high-level API runs a built-in benchmark scenario in a few lines:

import adcd

# 1. Load a pre-defined benchmark scenario
scenarios = adcd.get_all_scenarios()
scenario = scenarios[0]  # Relativistic Kinetic Energy

# 2. Run discovery with the mock proposer
result = adcd.discover_correction(scenario, max_iterations=5, proposer="mock")

print(f"Discovered correction: {result.best_expr}")
print(f"Residual NMSE: {result.best_nmse_residual:.2e}")
print(f"Parameters: {result.best_theta}")

# 3. Export LaTeX or plot residuals
print(result.export_latex())
result.plot_residuals()

For custom experimental data, use adcd.fit(...):

import numpy as np
import adcd

x = np.linspace(1.0, 5.0, 100)
X = {"x": x}
y_classical = 2.0 * x
y_observed  = 2.0 * x + 0.5 * x**2   # hidden x² correction

result = adcd.fit(
    X=X,
    y_obs=y_observed,
    y_classical=y_classical,
    limit_variable="x",
    limit_direction="0",
    correction_mode="additive"
)

result.summary()
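As an independent sanity check on the same toy data, the additive residual can be fitted with plain NumPy. This sketch assumes the x² template is already known, which is exactly the part ADCD is meant to search for; it only confirms what coefficient a correct run should report.

```python
import numpy as np

x = np.linspace(1.0, 5.0, 100)
y_classical = 2.0 * x
y_observed = 2.0 * x + 0.5 * x**2

# Additive anomaly that the correction term has to explain
residual = y_observed - y_classical

# Closed-form least squares for one coefficient: theta = <basis, r> / <basis, basis>
basis = x**2
theta = float(basis @ residual / (basis @ basis))
print(f"fitted coefficient of x^2: {theta:.6f}")
```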

Benchmark Results

Standard Benchmark (seed=42, Mock Proposer)

Results from run_correction_discovery.py --proposer mock (reference seed=42, 4 iterations per scenario).

Scenario	Tier	0% Noise	1% Noise	5% Noise	10% Noise
Relativistic KE	Textbook	✓	✓	✓	✓
Yukawa Gravity	Textbook	✓	✓	✓	✓
Anharmonic Spring	Textbook	✓	✓	✓	✓
Screened Coulomb	Cross-Domain	✓	✓	✗	✗
Net Radiation	Cross-Domain	✓	✓	✓	✓
Nonlinear Drag	Cross-Domain	✓	✓	✓	✓
Mystery-A (tanh²)	Synthetic	✓	✓	✓	✓
Mystery-B (sinc)	Synthetic	✓	✓	✓	✓
Mystery-C (log-quotient)	Synthetic	✓	✓	✓	✓
Overall		100%	100%	88.9%	88.9%

Note: Screened Coulomb fails at ≥5% noise because exponential decay ($e^{-r/\lambda}$) and rational saturation ($r/(r+\lambda)$) are numerically indistinguishable at the tested SNR over the limited dynamic range of the data. The failure reflects a limit on the information in the data rather than a deficiency of the framework.

Multi-Seed Reproducibility

All results are reported across 5 independent random seeds (0, 7, 21, 42, 99):

Seed	Class Match Rate
0	86.1% (31/36)
7	75.0% (27/36)
21	77.8% (28/36)
42	94.4% (34/36)
99	80.6% (29/36)
Mean	82.8% ± 7.7%
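The summary row can be reproduced from the per-seed counts with the standard library. The sketch below assumes the ± figure is a sample standard deviation (n − 1 denominator), which is the convention that matches the reported value; 36 cases per seed is consistent with 9 scenarios at 4 noise levels.

```python
import statistics

# Class matches per seed, copied from the table above
matches_per_seed = {0: 31, 7: 27, 21: 28, 42: 34, 99: 29}
cases_per_seed = 36

rates = [100.0 * m / cases_per_seed for m in matches_per_seed.values()]
mean_rate = statistics.mean(rates)
std_rate = statistics.stdev(rates)  # sample standard deviation
print(f"{mean_rate:.1f}% ± {std_rate:.1f}%")  # 82.8% ± 7.7%
```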

Performance variation reflects stochastic template sampling in the MockProposer. Physics gates ensure that when the correct functional family is sampled, it consistently survives filtering and is selected by BIC reranking.
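BIC reranking can be sketched on synthetic data as follows. This is an illustration under assumptions, not the implementation in metrics.py: it uses the Gaussian-likelihood form BIC = n·ln(RSS/n) + k·ln(n), and the two candidate models, the noise level, and the seed are all made up for the example.

```python
import numpy as np

def bic(y, y_hat, n_params):
    """Gaussian-likelihood BIC up to an additive constant."""
    n = y.size
    rss = float(np.sum((y - y_hat) ** 2))
    return n * np.log(rss / n) + n_params * np.log(n)

rng = np.random.default_rng(0)
x = np.linspace(1.0, 5.0, 200)
y = 0.5 * x**2 + rng.normal(scale=0.05, size=x.size)  # one true parameter

# Candidate 1: theta * x^2, fitted in closed form (1 parameter)
theta = float((x**2) @ y / ((x**2) @ (x**2)))
fit_quadratic = theta * x**2

# Candidate 2: degree-6 polynomial (7 parameters), flexible but over-parameterized
coeffs = np.polyfit(x, y, deg=6)
fit_poly = np.polyval(coeffs, x)

scores = {
    "theta*x^2": bic(y, fit_quadratic, n_params=1),
    "degree-6 polynomial": bic(y, fit_poly, n_params=7),
}
best = min(scores, key=scores.get)
print(scores)
print("selected:", best)
```

The polynomial reaches a slightly lower residual sum of squares, but the k·ln(n) penalty outweighs that gain, so the simpler form ranks first. That is the sense in which reranking prefers parsimony over a purely numerical fit.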

Real-World Physical Constants Benchmark

These scenarios use synthetic-real hybrid data generated from experimentally validated constants (JPL DE440, NIST, and CODATA):

Physical Scenario	Discovered Correction	Converged	Class Match	NMSE
Mercury Perihelion (GR)	`θ₀·vc²`	—	✓ polynomial	1.11e-05
Hydrogen Lamb Shift (QED)	`θ₀(n/θ₁)^(-θ₂)`	✓	✓ power_law	1.82e-18
Muon g-2 (Schwinger)	`θ₀(α/π)^θ₁`	✓	✓ polynomial	7.94e-07
Blackbody (Planck)	`-1 + e^(-f/θ₁)`	—	✓ exponential	2.59e-02

All 4 scenarios achieve correct structural class identification. 2 scenarios (Lamb Shift, Muon g-2) achieve full convergence with NMSE < 10⁻⁶. Mercury and Blackbody achieve correct structural identification but quantitative convergence is limited by parametrization sensitivity and dynamic range, respectively.
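For readers unfamiliar with the NMSE column, the sketch below shows one common definition: mean squared error divided by the variance of the target. Whether ADCD normalizes in exactly this way is an assumption; the function is illustrative and is not taken from metrics.py.

```python
import numpy as np

def nmse(y_true, y_pred):
    """Mean squared error divided by the variance of the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2) / np.var(y_true))

y = np.array([1.0, 2.0, 3.0, 4.0])
print(nmse(y, y))                          # 0.0 for a perfect fit
print(nmse(y, np.full_like(y, y.mean())))  # 1.0 when predicting the mean
```

Under this convention a value near 1 means the model does no better than a constant, so the 2.59e-02 reported for Blackbody is a far weaker fit than the 1.82e-18 reported for the Lamb Shift.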

PySR Comparison (fair profile: 100 iterations, maxsize 30, 60s timeout)

Method	0% Noise	1% Noise	5% Noise	10% Noise
ADCD (ours, seed=42)	9/9 (100%)	9/9 (100%)	8/9 (88.9%)	8/9 (88.9%)
PySR fair	4/9 (44.4%)	5/9 (55.6%)	1/9 (11.1%)	5/9 (55.6%)

ADCD outperforms PySR fair by 77.8 percentage points at 5% noise (88.9% vs 11.1%). A legacy fast profile (wall-clock matched) is retained in pysr_baseline_results.json for historical comparison only.
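The percentage-point gap at every noise level follows directly from the counts in the table. A short sketch, using only those counts:

```python
noise_levels = ["0%", "1%", "5%", "10%"]
adcd_hits = [9, 9, 8, 8]   # ADCD, seed=42
pysr_hits = [4, 5, 1, 5]   # PySR fair profile
n_scenarios = 9

# Gap in percentage points = difference in recovery rates
gaps = {
    noise: round(100.0 * (a - p) / n_scenarios, 1)
    for noise, a, p in zip(noise_levels, adcd_hits, pysr_hits)
}
print(gaps)  # {'0%': 55.6, '1%': 44.4, '5%': 77.8, '10%': 33.3}
```

The gap ranges from 33.3 to 77.8 points depending on the noise level, so the 5% figure is the largest of the four.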

Project Structure

PhysicsPaper/
├── src/adcd/                       # Installable package
│   ├── __init__.py                 # Public API (adcd.fit, adcd.discover_correction)
│   ├── anomaly_scenarios.py        # 9 standard + 3 blind benchmark scenarios
│   ├── arc_scorer.py               # Asymptotic consistency gate (ARC)
│   ├── coarse_evaluator.py         # Coarse numerical pre-filter
│   ├── correction_orchestrator.py  # Main multi-iteration discovery loop
│   ├── dimensional_checker.py      # Dimensional homogeneity + transcendental guardrail
│   ├── jax_optimizer.py            # JAX L-BFGS-B optimizer (parameter-scaled)
│   ├── llm_proposer.py             # Mock + Gemini + OpenAI-compatible proposers
│   ├── metrics.py                  # NMSE, BIC, structural classification
│   ├── pipeline.py                 # Stage 1 filter cascade
│   ├── real_data_loader.py         # Real-world data loading (JPL, NIST, CODATA)
│   ├── real_scenarios.py           # Real-world validation scenarios
│   ├── residual_analyzer.py        # Statistical residual feature extraction
│   └── result.py                   # CorrectionResult: summary, LaTeX, plot
├── tests/                          # 77 unit + integration tests
├── paper/                          # LaTeX source (main.tex) + figures
├── run_correction_discovery.py     # Standard 9-scenario benchmark runner
├── run_real_data_benchmark.py      # Real-world physical constants benchmark
├── run_reproducibility.py          # Multi-seed reproducibility study (5 seeds)
├── run_ablation.py                 # Gate ablation study
├── run_pysr_baseline.py            # PySR comparison baseline
├── run_mlp_baseline.py             # MLP comparison baseline
├── run_misspecification_benchmark.py  # Baseline misspecification fail-safe test
├── generate_figures.py             # Paper figure generator
├── .github/workflows/              # CI (test + lint + LaTeX) and PyPI publish
├── pyproject.toml                  # PEP 517/518 build configuration
└── README.md                       # This file

Running Tests

pip install -e ".[dev]"
pytest --cov=adcd

All 77 tests pass on Python 3.10 and 3.11 (Ubuntu and Windows).

Submission & Release

Paper submission guide (GitHub Release → Zenodo → arXiv): docs/SUBMISSION_CHECKLIST_v2.1.2.md

Current release tag: v2.1.2 | Package version: 2.1.2

Reproducing Paper Results

Verify claims before citing numbers:

python scripts/verify_paper_claims.py   # expect [ALL OK]

One-command reproduction (Windows):

.\reproduce_all.ps1

Or step-by-step:

python run_correction_discovery.py --proposer mock   # Main benchmark + gate telemetry
python run_real_data_benchmark.py                    # Real-world (4 scenarios)
python run_pysr_baseline.py --profile fair           # Fair PySR comparison
python run_ablation.py                               # Gate ablation study
python run_oracle_ablation.py                        # Oracle ground-truth injection test
python run_correction_scaling.py                     # Correction magnitude sweep
python scripts/generate_experiment_report.py         # Sync experiment_results.md
python scripts/generate_efficiency_table.py          # ADCD vs PySR efficiency table
python scripts/validate_results.py                   # Consistency checks
python generate_figures.py                           # All paper figures

Proposer regimes: Mock Proposer = template-assisted recovery; Hybrid/Gemini = zero-shot discovery. Report both separately (see paper Section 4).

# LLM benchmark (requires GEMINI_API_KEY) — writes results/llm_benchmark.json
python run_llm_benchmark.py --proposer hybrid

Citing This Work

If you use ADCD in your research, please cite:

@software{erdita2026adcd,
  author    = {Erdita, Muhammad Afif},
  title     = {{Anomaly-Driven Correction Discovery (ADCD): Physics-Constrained
                Symbolic Regression for Evolutionary Scientific Discovery}},
  year      = {2026},
  publisher = {Zenodo},
  version   = {2.1.2},
  doi       = {10.5281/zenodo.20534940},
  url       = {https://doi.org/10.5281/zenodo.20534940}
}

AI Disclosure

This project was developed with assistance from Google DeepMind's Antigravity AI assistant. AI was used as a pair-programming and writing tool. All scientific content, experimental design decisions, and intellectual contributions are the author's own.

License

MIT
