Fast parallel GSADF bubble detection (PSY 2015) with wild bootstrap critical values

pygsadf

The first Python package to deliver production-grade, parallelised Generalised Sup ADF testing. Detects explosive bubbles in financial time series at 90%, 95%, and 99% confidence levels.

Why pygsadf?

Feature	pygsadf	R `exuber`	Stata `gsadf`	EViews
Parallel bootstrap	Yes (all cores)	Limited	No	No
99% critical values	Yes	Yes	No	No
Numba JIT kernel	Yes	C++ (Rcpp)	Mata	Proprietary
Pointwise BSADF CVs	Yes	Yes	No	No
CLI tool	Yes	No	No	No
One-line API	Yes	No	No	No
Free & open source	MIT	GPL	$$	$$$

Performance: 1499 bootstrap replications on a series of length T=3500 take ~12 minutes on 192 cores, versus ~8 hours when run sequentially.
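As a rough reading of those figures, the arithmetic below converts the two quoted wall-clock times into a speedup and a per-core efficiency. Both inputs are the approximate numbers quoted above, so the outputs are only indicative.

```python
# Approximate wall-clock times quoted in the performance note above.
sequential_min = 8 * 60   # ~8 hours, in minutes
parallel_min = 12         # ~12 minutes
cores = 192

speedup = sequential_min / parallel_min   # how many times faster
efficiency = speedup / cores              # fraction of ideal linear scaling

print(f"speedup ~ {speedup:.0f}x, parallel efficiency ~ {efficiency:.0%}")
```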

Installation

pip install pygsadf                # core (NumPy + joblib)
pip install "pygsadf[fast]"        # + Numba JIT (10-50x faster)
pip install "pygsadf[full]"        # + Numba + matplotlib + statsmodels + tqdm

Quick Start

import numpy as np
import pandas as pd
import pygsadf

# Load prices and convert them to a log-price series
prices = pd.read_csv("eth_prices.csv", index_col=0, parse_dates=True)
log_prices = np.log(prices["close"])

# Run GSADF test (one line)
result = pygsadf.gsadf(log_prices)

# Results
print(result)                    # Full summary
result.reject_h0(0.95)          # True = bubble detected at 95%
result.reject_h0(0.99)          # True = bubble detected at 99%
result.bubbles                   # List of (start, end) episodes
result.plot()                    # Publication-ready figure

# Save / load
result.to_pickle("gsadf_result.pkl")
loaded = pygsadf.GSADFResult.from_pickle("gsadf_result.pkl")

Command Line

# Full run with 1499 bootstrap replications
pygsadf --csv data.csv --col log_price --B 1499 --out result.pkl --plot bsadf.png

# Quick test (B=199)
pygsadf --csv data.csv --col close --log --B 199

# Use fewer cores
pygsadf --csv data.csv --col log_price --n-jobs 8

API Reference

`pygsadf.gsadf(y, B=1499, ...)`

Main entry point. Accepts a NumPy array or a pandas Series.

Parameters:

y — Log-price series (array or pd.Series with DatetimeIndex)
B — Bootstrap replications (default 1499; use 199 for quick tests)
max_lag — Maximum ADF augmentation lags (BIC selects optimal)
quantiles — Confidence levels, default (0.90, 0.95, 0.99)
seed — RNG seed for reproducibility
n_jobs — Parallel workers (-1 = all cores)
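To make the max_lag parameter concrete, here is a minimal NumPy sketch of an ADF regression with BIC lag selection. It is an illustration only, not pygsadf's Numba kernel: the names `adf_fit` and `adf_bic` are hypothetical, and the choice to fit every candidate lag on the same trimmed sample (so the BIC values are comparable) is an assumption of this sketch.

```python
import numpy as np

def adf_fit(y, lag, trim):
    """Fit dy[t] = a + b*y[t-1] + sum_j c_j*dy[t-j] + e[t].

    The first `trim` differences are dropped (trim >= lag) so that every
    candidate lag is fitted on the same sample. Returns (t-stat on b, BIC).
    """
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    n = dy.size - trim
    cols = [np.ones(n), y[trim:-1]]                       # intercept, lagged level
    cols += [dy[trim - j:dy.size - j] for j in range(1, lag + 1)]  # lagged differences
    X = np.column_stack(cols)
    target = dy[trim:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    ssr = resid @ resid
    sigma2 = ssr / (n - X.shape[1])
    se_b = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    bic = n * np.log(ssr / n) + X.shape[1] * np.log(n)
    return beta[1] / se_b, bic

def adf_bic(y, max_lag):
    """Return (ADF t-statistic, chosen lag) for the lag that minimises BIC."""
    fits = [adf_fit(y, k, max_lag) for k in range(max_lag + 1)]
    best = min(range(max_lag + 1), key=lambda k: fits[k][1])
    return fits[best][0], best

rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.standard_normal(200))
tstat, lag = adf_bic(random_walk, max_lag=4)
print(f"ADF t-stat = {tstat:.3f} at BIC-selected lag {lag}")
```

For bubble detection the statistic is compared against right-tail critical values, so large positive values count as evidence against the unit-root null.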

Returns: GSADFResult with:

.gsadf_stat — Scalar GSADF statistic
.cv — Dict of critical values {"90%": ..., "95%": ..., "99%": ...}
.bsadf — Full BSADF sequence (ndarray)
.bsadf_cv — Pointwise CV sequences
.bubbles — List of BubbleEpisode objects
.reject_h0(confidence) — Boolean hypothesis test
.plot() — Matplotlib figure
.summary() — Formatted text output
.to_pickle() / .from_pickle() — Serialisation

`pygsadf.wild_bootstrap_cv(y, r0, B=1499, ...)`

Low-level bootstrap function for custom workflows.
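The idea behind a wild bootstrap for critical values can be sketched in a few lines of NumPy. This is a simplified illustration, not the signature or internals of `pygsadf.wild_bootstrap_cv`: the function name `wild_bootstrap_quantiles`, the `stat_fn` callback, and the use of raw first differences as null residuals are all assumptions made for the example.

```python
import numpy as np

def wild_bootstrap_quantiles(y, stat_fn, B=199, quantiles=(0.90, 0.95, 0.99), seed=0):
    """Empirical quantiles of stat_fn over B wild-bootstrap unit-root series."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                                   # residuals under a driftless unit-root null
    children = np.random.SeedSequence(seed).spawn(B)  # one independent stream per replication
    stats = np.empty(B)
    for b, child in enumerate(children):
        rng = np.random.default_rng(child)
        w = rng.choice([-1.0, 1.0], size=dy.size)     # Rademacher weights
        # Rebuild a synthetic series by cumulating the sign-flipped differences.
        y_star = y[0] + np.concatenate([[0.0], np.cumsum(w * dy)])
        stats[b] = stat_fn(y_star)
    return {f"{q:.0%}": float(np.quantile(stats, q)) for q in quantiles}

# Placeholder statistic for the demo; a GSADF function would be passed in practice.
def series_range(s):
    return float(s.max() - s.min())

rng = np.random.default_rng(1)
y = np.cumsum(rng.standard_normal(300))
cv = wild_bootstrap_quantiles(y, series_range, B=199, seed=42)
print(cv)
```

Multiplying each difference by an independent ±1 weight keeps its magnitude at that date, which is why the scheme preserves time-varying volatility in the synthetic series.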

`pygsadf.date_stamp_bubbles(bsadf, cv, dates, min_duration=5)`

Date-stamp explosive episodes from BSADF vs critical value sequences.
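The date-stamping logic amounts to finding runs where the BSADF sequence stays above its critical value for long enough. The sketch below is a hypothetical stand-in (`stamp_episodes` is not part of pygsadf, and it returns plain tuples rather than BubbleEpisode objects); it assumes an episode counts only if it lasts at least `min_duration` observations.

```python
import numpy as np

def stamp_episodes(bsadf, cv, dates, min_duration=5):
    """Return (start_date, end_date) for each run with bsadf > cv lasting
    at least min_duration observations."""
    above = np.asarray(bsadf) > np.asarray(cv)   # NaN entries compare as False
    episodes, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i                            # a run begins
        elif not flag and start is not None:
            if i - start >= min_duration:        # the run ended at i - 1
                episodes.append((dates[start], dates[i - 1]))
            start = None
    # A run that is still open at the end of the sample.
    if start is not None and len(above) - start >= min_duration:
        episodes.append((dates[start], dates[-1]))
    return episodes

bsadf = np.array([0, 2, 2, 2, 0, 2, 2, 2, 2, 2, 2], dtype=float)
cv = np.ones_like(bsadf)
dates = list(range(bsadf.size))                  # integer positions stand in for dates
print(stamp_episodes(bsadf, cv, dates, min_duration=5))   # short first run is dropped
print(stamp_episodes(bsadf, cv, dates, min_duration=3))
```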

How It Works

BSADF Computation — For each endpoint, compute the supremum of right-tailed ADF statistics over all valid start points (PSY 2015, Section 3)
GSADF — The overall supremum of the BSADF sequence
Wild Bootstrap — Generate synthetic unit-root series using Rademacher weights, compute GSADF on each, take empirical quantiles as critical values
Date-Stamping — Episodes where BSADF exceeds the pointwise 95% CV for at least log(T) consecutive days
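The first two steps can be sketched as a plain double loop over window endpoints and start points. This is an unoptimised illustration under stated assumptions, not the package's implementation: it uses a lag-0 ADF regression for brevity, the function names are hypothetical, and the minimum window follows the r0 = 0.01 + 1.8/sqrt(T) rule suggested in PSY (2015), which may differ from pygsadf's default.

```python
import numpy as np

def adf_t(y):
    """Lag-0 ADF t-statistic from dy[t] = a + b*y[t-1] + e[t]."""
    x, dy = y[:-1], np.diff(y)
    X = np.column_stack([np.ones(x.size), x])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (x.size - 2)
    return beta[1] / np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])

def bsadf_sequence(y, min_window):
    """For each endpoint, the sup of ADF statistics over all valid start points."""
    y = np.asarray(y, dtype=float)
    out = np.full(y.size, np.nan)                # undefined before the first full window
    for end in range(min_window, y.size + 1):    # window is y[start:end]
        out[end - 1] = max(
            adf_t(y[start:end]) for start in range(0, end - min_window + 1)
        )
    return out

rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_normal(60))
T = y.size
min_window = int(np.floor(T * (0.01 + 1.8 / np.sqrt(T))))   # PSY (2015) rule of thumb
bsadf = bsadf_sequence(y, min_window)
gsadf = np.nanmax(bsadf)                         # GSADF is the sup of the BSADF sequence
print(f"min_window={min_window}, GSADF={gsadf:.3f}")
```

The number of regressions grows quadratically in T, which is why a compiled kernel and parallel bootstrap matter for long series.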

The bootstrap is embarrassingly parallel — each replication is independent with its own deterministic RNG seed, giving identical results whether run on 1 core or 192.
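One way to obtain this kind of worker-count invariance is to derive a child seed per replication up front, so that a replication's random draws never depend on which worker runs it. The sketch below illustrates the idea with the standard library's thread pool rather than joblib; the function names and the placeholder statistic are assumptions for the example, not pygsadf internals.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def one_replication(args):
    """Run a single replication using only its own child seed."""
    dy, child_seed = args
    rng = np.random.default_rng(child_seed)
    w = rng.choice([-1.0, 1.0], size=dy.size)    # Rademacher weights
    return float(np.cumsum(w * dy).max())        # placeholder statistic

def run_bootstrap(dy, B, seed, workers):
    # Child seeds are fixed before any work is scheduled.
    children = np.random.SeedSequence(seed).spawn(B)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in submission order, whatever the scheduling.
        return list(pool.map(one_replication, [(dy, c) for c in children]))

rng = np.random.default_rng(7)
dy = rng.standard_normal(250)
serial = run_bootstrap(dy, B=50, seed=123, workers=1)
parallel = run_bootstrap(dy, B=50, seed=123, workers=4)
print(serial == parallel)
```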

Architecture

pygsadf/
├── __init__.py          # Public API
├── core.py              # gsadf() + GSADFResult
├── adf.py               # Numba-JIT ADF kernel with BIC lag selection
├── bsadf.py             # GSADF + BSADF computation
├── bootstrap.py         # Parallel wild bootstrap
├── datestamp.py         # Bubble episode detection
└── cli.py               # Command-line interface

Validation Against R `exuber`

pygsadf has been validated against the R exuber package (v0.4.2+) on a synthetic series with a known embedded explosive regime (T=500, AR coefficient 1.05 at t=200–299).

Apples-to-apples comparison (both fixed lag=1, B=999):

Metric	pygsadf	R exuber	Difference
GSADF statistic	16.370363	16.370400	0.0002%
BSADF correlation	—	—	0.9987
BSADF MAE	—	—	0.033
CV 90% (wild bootstrap)	9.016	9.425	4.3%
CV 95% (wild bootstrap)	10.270	10.760	4.6%
CV 99% (wild bootstrap)	12.628	14.096	10.4%
Reject H₀ at 95%	Yes	Yes	Match
Reject H₀ at 99%	Yes	Yes	Match
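The Difference column can be re-derived from the two value columns. The snippet below assumes the percentage is the absolute difference relative to the R exuber value; under that assumption it reproduces the tabulated figures.

```python
# (pygsadf value, R exuber value) pairs copied from the table above.
pairs = {
    "GSADF statistic": (16.370363, 16.370400),
    "CV 90%": (9.016, 9.425),
    "CV 95%": (10.270, 10.760),
    "CV 99%": (12.628, 14.096),
}

def pct_diff(py_value, r_value):
    """Absolute difference as a percentage of the R value."""
    return abs(py_value - r_value) / abs(r_value) * 100

for name, (py_value, r_value) in pairs.items():
    print(f"{name}: {pct_diff(py_value, r_value):.4f}%")
```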

Key findings:

GSADF statistic agrees to four decimal places after rounding (16.370363 vs 16.370400, a 0.0002% difference)
BSADF sequence correlation: 0.9987, with a mean absolute error of 0.033, so the full time-varying sequence tracks exuber closely
All rejection decisions agree at every confidence level
CV differences (4–10%) are expected — Python and R use different RNG implementations for bootstrap Rademacher draws; the underlying distributions converge as B → ∞
pygsadf's default BIC lag selection produces higher GSADF values than exuber's fixed lag=1 default, because BIC can select lag=0 for some windows, yielding sharper test statistics. This is a methodological choice, not a discrepancy — both are valid implementations of PSY (2015)

The full validation suite is in validation/, including the synthetic dataset, both Python and R scripts, and an automated comparison tool. To reproduce:

cd validation/
python generate_test_data.py        # create validation_series.csv
python run_pygsadf_lag1.py           # pygsadf with fixed lag=1
Rscript run_exuber.R                 # R exuber (requires R + exuber package)
python compare_results.py            # side-by-side comparison

Citation

If you use pygsadf in academic work, please cite:

@software{pygsadf,
  title  = {pygsadf: Fast Parallel GSADF Bubble Detection},
  author = {Madkhali, Ali},
  year   = {2025},
  url    = {https://github.com/alixecon/pygsadf},
}

And the original methodology:

@article{psy2015,
  title   = {Testing for Multiple Bubbles: Historical Episodes of
             Exuberance and Collapse in the {S\&P} 500},
  author  = {Phillips, Peter C.B. and Shi, Shuping and Yu, Jun},
  journal = {International Economic Review},
  volume  = {56},
  number  = {4},
  pages   = {1043--1078},
  year    = {2015},
}

License

MIT

Release history

2.0.2 (this version): Mar 19, 2026
2.0.1: Mar 19, 2026
2.0.0: Mar 19, 2026

File metadata

Download URL: pygsadf-2.0.2.tar.gz
Upload date: Mar 19, 2026
Size: 9.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for pygsadf-2.0.2.tar.gz
Algorithm	Hash digest
SHA256	`d7b0cdfb883354382b7807a2d48f58c5bf43187567aee5a59fb6473e182a64ba`
MD5	`e750491c7ae456af0383797b4083e679`
BLAKE2b-256	`102c3237e63e1fd3cfaeddf79655088f15e52221441f0745cfc81299b8b4da61`
