
Comprehensive Long-Range Dependence Benchmarking Framework with 20 Estimators (13 Classical, 3 ML, 4 Neural Networks) + 5 Demonstration Notebooks


lrdbenchmark

Modern, reproducible benchmarking for long-range dependence (LRD) estimation across classical statistics, machine learning, and neural approaches.

License: MIT | Python 3.10–3.12 | Version 2.4.1 | DOI


Why lrdbenchmark?

  • One interface, twenty estimators – 13 classical, 3 machine learning, and 4 neural estimators share a unified API with consistent metadata.
  • Deterministic by construction – global RNG coordination, stratified summaries, significance testing, and provenance capture are built in.
  • Runtime profiles – choose quick for smoke tests or CI, or full for exhaustive diagnostics, bootstraps, and robustness panels.
  • Production-aware workflows – supports CPU-only deployments by default with optional JAX/Numba/Torch acceleration.
  • Documentation-first tutorials – the tutorial series now ships directly in docs/tutorials/, mirrored by lightweight Markdown notebooks for interactive sessions.

Getting Started

Installation

pip install lrdbenchmark

Acceleration backends (JAX, Numba, PyTorch) are optional and auto-detected at runtime. The library falls back to NumPy when accelerators are unavailable. Install accelerators only if you need them:

pip install lrdbenchmark[accel-jax]      # JAX for GPU/TPU
pip install lrdbenchmark[accel-numba]    # Numba JIT compilation
pip install lrdbenchmark[accel-pytorch]  # PyTorch for neural estimators
pip install lrdbenchmark[accel-all]      # All accelerators
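
To see which optional accelerators are importable in the current environment, a small standard-library check like the one below can help. It is only a sketch: the function names and the preference order are illustrative assumptions, and it is not the detection logic lrdbenchmark itself uses.

```python
import importlib.util

# Import names of the optional accelerators mentioned above.
# "pytorch" is installed under the module name "torch".
OPTIONAL_BACKENDS = {"jax": "jax", "numba": "numba", "pytorch": "torch"}


def available_backends():
    """Return {label: bool} telling whether each optional backend is importable."""
    return {
        label: importlib.util.find_spec(module) is not None
        for label, module in OPTIONAL_BACKENDS.items()
    }


def pick_backend(preferred=("jax", "numba")):
    """Pick the first installed backend in `preferred`, else fall back to NumPy.

    The preference order is an assumption made for this example.
    """
    found = available_backends()
    for label in preferred:
        if found.get(label):
            return label
    return "numpy"


if __name__ == "__main__":
    print(available_backends())
    print("selected:", pick_backend())
```

Running this before and after installing an extra is a quick way to confirm that the extra actually landed in the active environment.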

Pretrained models

Large pretrained estimators (joblib/pth files) live in a separate download channel to keep the repository lightweight. Fetch them with checksum verification whenever you need deterministic ML/NN baselines:

python tools/fetch_pretrained_models.py          # download every published artifact
python tools/fetch_pretrained_models.py --list   # inspect available keys
python tools/fetch_pretrained_models.py --models random_forest_estimator svr_estimator

By default the artefacts are cached under ~/.cache/lrdbenchmark/models. Override the location with LRDBENCHMARK_MODELS_DIR=/path/to/artifacts if you need a project-local cache (e.g., artifacts/models/ inside the repo, which is Git-ignored).
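
The cache lookup and checksum idea described above can be sketched in a few lines of standard-library Python. The helper names are hypothetical, and the use of SHA-256 is an assumption; the fetch script may resolve paths and verify files differently.

```python
import hashlib
import os
from pathlib import Path

# Default cache location stated in this README.
DEFAULT_MODELS_DIR = Path.home() / ".cache" / "lrdbenchmark" / "models"


def resolve_models_dir(environ=os.environ):
    """Return the override from LRDBENCHMARK_MODELS_DIR, else the default cache."""
    override = environ.get("LRDBENCHMARK_MODELS_DIR")
    return Path(override).expanduser() if override else DEFAULT_MODELS_DIR


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_sha256):
    """Return True when the file's SHA-256 digest matches the expected hex string."""
    return sha256_of(path) == expected_sha256.strip().lower()


if __name__ == "__main__":
    print("models directory:", resolve_models_dir())
```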

Supported environments

  • Python 3.10–3.12 across Linux, macOS, and Windows (covered in CI).
  • NumPy 2.x is the preferred runtime and receives full testing; NumPy 1.26.x remains available for legacy stacks but only receives best-effort support.
  • GPU/acceleration extras require the latest compatible backends (JAX ≥ 0.4.28, PyTorch ≥ 2.2, Numba ≥ 0.60) to ensure Python 3.12 and NumPy 2 compatibility.

Runtime configuration

  • CPU-only mode: set LRDBENCHMARK_AUTO_CPU=1 (or true / yes / on) before importing lrdbenchmark to restrict JAX/CUDA device visibility to the CPU and avoid GPU plugin noise. When the variable is unset or 0, the package may configure JAX for CUDA if PyTorch detects a GPU.
  • Asset cache: set LRDBENCHMARK_MODELS_DIR=/path/to/cache if you want pretrained weights downloaded into a custom directory (default is ~/.cache/lrdbenchmark/models).
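
Because the CPU-only flag is read at import time, it has to be in the environment before the package is imported. The sketch below sets it from Python and shows one way such a flag could be interpreted. The helper cpu_only_requested is hypothetical, and the case-insensitive matching is an assumption rather than documented behaviour.

```python
import os

# Values this README lists as enabling CPU-only mode.
TRUTHY = {"1", "true", "yes", "on"}


def cpu_only_requested(environ=os.environ):
    """Return True when LRDBENCHMARK_AUTO_CPU holds one of the truthy values."""
    value = environ.get("LRDBENCHMARK_AUTO_CPU", "")
    return value.strip().lower() in TRUTHY


# Configure the environment first ...
os.environ["LRDBENCHMARK_AUTO_CPU"] = "1"
print("CPU-only requested:", cpu_only_requested())

# ... and import the package only afterwards, for example:
# import lrdbenchmark
```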

Command-Line Benchmarks

Run classical estimator failure analysis from the CLI:

# Quick screening (~5 min) - 3 H values, 2 lengths, 10 realizations
python scripts/benchmarks/run_classical_failure_benchmark.py --profile quick

# Standard analysis (~1 hour) - 7 H values, 3 lengths, 100 realizations
python scripts/benchmarks/run_classical_failure_benchmark.py --profile standard

# Full publication run (~8-10 hours) - 17 H values, 7 lengths, 500 realizations
python scripts/benchmarks/run_classical_failure_benchmark.py --profile full
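
To get a feel for how the three profiles scale, the grid sizes quoted in the comments above can be multiplied out. This assumes each profile simulates every combination of H value, series length, and realization, which the README does not state explicitly; the counts are simulated series before any estimator is applied.

```python
# Grid sizes copied from the profile comments above.
PROFILES = {
    "quick": {"h_values": 3, "lengths": 2, "realizations": 10},
    "standard": {"h_values": 7, "lengths": 3, "realizations": 100},
    "full": {"h_values": 17, "lengths": 7, "realizations": 500},
}


def series_count(profile):
    """Number of simulated series, assuming a full Cartesian grid."""
    grid = PROFILES[profile]
    return grid["h_values"] * grid["lengths"] * grid["realizations"]


for name in PROFILES:
    print(f"{name}: {series_count(name)} simulated series")
# quick: 60 simulated series
# standard: 2100 simulated series
# full: 59500 simulated series
```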

CLI Options

Option | Default | Description
--profile | standard | quick, standard, or full
--output | auto | Custom output directory
--seed | 42 | Random seed for reproducibility
--realizations | per-profile | Override realization count
--checkpoint-every | 100 | Checkpoint frequency
--no-resume | false | Start fresh, ignore checkpoints
--dry-run | false | Show config without running
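
When the benchmark is launched from another Python program (a scheduler or a notebook, say), the options above can be assembled into an argument list. The sketch below uses only the flags listed in the table. build_command is a hypothetical helper, and the script is assumed to be run from the repository root.

```python
import shlex
import sys

SCRIPT = "scripts/benchmarks/run_classical_failure_benchmark.py"
VALID_PROFILES = {"quick", "standard", "full"}


def build_command(profile="standard", output=None, seed=42, realizations=None,
                  checkpoint_every=None, no_resume=False, dry_run=False):
    """Build an argv list for the benchmark script from the documented options."""
    if profile not in VALID_PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    cmd = [sys.executable, SCRIPT, "--profile", profile, "--seed", str(seed)]
    if output is not None:
        cmd += ["--output", str(output)]
    if realizations is not None:
        cmd += ["--realizations", str(realizations)]
    if checkpoint_every is not None:
        cmd += ["--checkpoint-every", str(checkpoint_every)]
    if no_resume:
        cmd.append("--no-resume")
    if dry_run:
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(build_command(profile="full", dry_run=True)))
```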

Example Workflows

# Dry-run to see configuration
python scripts/benchmarks/run_classical_failure_benchmark.py --dry-run --profile full

# Custom output and more realizations
python scripts/benchmarks/run_classical_failure_benchmark.py --profile standard \
    --output results/my_experiment --realizations 200

# Resume an interrupted run: rerun the same command (checkpoints are used unless --no-resume is given)
python scripts/benchmarks/run_classical_failure_benchmark.py --profile full

Results are saved as results.csv, summary.json, and config.json in the output directory.
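
The three output files can be inspected with the standard library alone. The sketch below makes no assumption about column names or JSON keys beyond both JSON files holding a top-level object; load_run and describe_run are hypothetical helpers, not part of lrdbenchmark.

```python
import csv
import json
from pathlib import Path


def load_run(output_dir):
    """Load results.csv, summary.json, and config.json from one output directory."""
    root = Path(output_dir)
    with open(root / "results.csv", newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    summary = json.loads((root / "summary.json").read_text(encoding="utf-8"))
    config = json.loads((root / "config.json").read_text(encoding="utf-8"))
    return rows, summary, config


def describe_run(output_dir):
    """Report row count, CSV columns, and the top-level JSON keys of a run."""
    rows, summary, config = load_run(output_dir)
    return {
        "n_rows": len(rows),
        "columns": list(rows[0].keys()) if rows else [],
        "summary_keys": sorted(summary),
        "config_keys": sorted(config),
    }


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(describe_run(sys.argv[1]))
```

Listing the columns and keys first is a safe starting point before writing analysis code against a particular results schema.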


First Benchmark (Python API)

from lrdbenchmark import ComprehensiveBenchmark

# Quick profile skips heavy diagnostics – perfect for tests and CI
benchmark = ComprehensiveBenchmark(runtime_profile="quick")
summary = benchmark.run_comprehensive_benchmark(
    data_length=256,
    benchmark_type="classical",
    save_results=False,
)

print(summary["random_state"])
print(summary["stratified_metrics"]["hurst_bands"])

Want the full analysis (bootstrap confidence intervals, robustness panels, influence diagnostics)? Simply drop the profile override:

benchmark = ComprehensiveBenchmark()   # runtime_profile defaults to "auto"/"full"

Runtime Profiles at a Glance

Profile | How to enable | Designed for | What is disabled
quick | ComprehensiveBenchmark(runtime_profile="quick") or export LRDBENCHMARK_RUNTIME_PROFILE=quick | Unit tests, CI, exploratory work | Advanced metrics, bootstraps, robustness panels, heavy diagnostics
full | Default when running outside pytest/quick mode | End-to-end studies, publications | Nothing – full diagnostics and provenance

Core Capabilities

  • Estimator families – temporal (R/S, DFA, DMA, GHE, Higuchi), spectral (Periodogram, GPH, Whittle), wavelet (CWT, variance, log-variance, wavelet Whittle), multifractal (MFDFA, wavelet leaders), machine-learning (Random Forest, SVR, Gradient Boosting), and neural (CNN, LSTM, GRU, Transformer).
  • Robust benchmarking – contamination models, adaptive preprocessing, stratified reporting, non-parametric significance tests, and provenance bundles per result.
  • Nonstationarity testing – time-varying H generators (regime switching, continuous drift, structural breaks), critical regime models (OU, fractional Lévy, SOC), and structural break detection (CUSUM, Chow test, ICSS).
  • Surrogate data testing – IAAFT, phase randomization, and AR surrogates for hypothesis testing of LRD and nonlinearity.
  • Analytics tooling – convergence analysis, bias estimation, stress panels, uncertainty calibration (including studentized bootstrap with coverage analysis), scale influence diagnostics.
  • GPU-aware execution – intelligent fallbacks (JAX ▶ Numba ▶ NumPy) with automatic CPU mode unless the user explicitly opts into GPU acceleration.
  • Containerized experiments – Docker support for reproducible cloud/HPC benchmarking.

For the full catalogue see the API reference.


Documentation & Learning Path

  • Full documentation: https://lrdbenchmark.readthedocs.io/
  • Extra guides (WSL setup, domain notes): docs/guides/
  • Tutorial sequence: docs/tutorials/ (rendered on Read the Docs, aligned with the original notebook curriculum)
  • Interactive notebooks: Markdown sources in notebooks/markdown/, easily opened via Jupytext or any Markdown-friendly notebook environment
  • Examples & scripts: runnable patterns in examples/ and scripts/

Working with the Markdown notebooks

pip install jupytext
jupytext --to notebook notebooks/markdown/02_estimation_and_validation.md
jupyter notebook notebooks/markdown/

This keeps the repository light while preserving the original interactive walkthroughs.


Project Layout

lrdbenchmark/
├── lrdbenchmark/            # Package modules
│   ├── analysis/            # Estimators, benchmarking, diagnostics
│   ├── analytics/           # Provenance, reporting, dashboards
│   ├── models/              # Data generators & contamination models
│   └── robustness/          # Adaptive preprocessing & stress tests
├── artifacts/               # (Ignored) downloaded pretrained weights
├── docs/                    # Sphinx documentation, tutorials, and extra guides
├── notebooks/               # Markdown notebooks + supporting artefacts
├── examples/                # Minimal usage examples
├── scripts/                 # Reproducible benchmarking pipelines
└── tests/                   # Pytest suite (quick profile by default)

Testing

python -m pytest                       # quick profile exercises
python -m pytest --cov=lrdbenchmark    # add coverage

After pip install -e ".[dev]", install pre-commit hooks (same checks as CI):

pre-commit install
pre-commit run --all-files   # optional

Contributing

We welcome improvements to estimators, diagnostics, documentation, and tutorials.

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/improvement
  3. Run the test suite and pre-commit (see above)
  4. Submit a pull request describing the change and relevant use-cases

See CONTRIBUTING.md for setup, Git hooks (including Cursor hooksPath), and review expectations.


Citation

@software{chin2024lrdbenchmark,
  author  = {Chin, Davian R.},
  title   = {lrdbenchmark: A Comprehensive Framework for Long-Range Dependence Estimation},
  version = {2.4.1},
  year    = {2026},
  doi     = {10.5281/zenodo.18331354},
  url     = {https://github.com/dave2k77/lrdbenchmark}
}

Licence & Support

Made with care for the time-series community. If you publish results using lrdbenchmark, please share them – the benchmarking suite evolves with real-world feedback.
