Fast parallel GSADF bubble detection (PSY 2015) with wild bootstrap critical values
pygsadf
The first Python package to deliver production-grade, parallelised Generalised Sup ADF testing. Detects explosive bubbles in financial time series at 90%, 95%, and 99% confidence levels.
Why pygsadf?
| Feature | pygsadf | R exuber | Stata gsadf | EViews |
|---|---|---|---|---|
| Parallel bootstrap | Yes (all cores) | Limited | No | No |
| CV 99% | Yes | Yes | No | No |
| Numba JIT kernel | Yes | C++ (Rcpp) | Mata | Proprietary |
| Pointwise BSADF CVs | Yes | Yes | No | No |
| CLI tool | Yes | No | No | No |
| One-line API | Yes | No | No | No |
| Free & open source | MIT | GPL | $$ | $$$ |
Performance: 1499 bootstrap iterations on a series of length T=3500 take ~12 minutes on 192 cores, versus ~8 hours sequentially.
Installation
pip install pygsadf # core (NumPy + joblib)
pip install pygsadf[fast] # + Numba JIT (10-50x faster)
pip install pygsadf[full] # + Numba + matplotlib + statsmodels + tqdm
Quick Start
import numpy as np
import pandas as pd
import pygsadf
# Load prices and convert them to a log-price series
prices = pd.read_csv("eth_prices.csv", index_col=0, parse_dates=True)
log_prices = np.log(prices["close"])
# Run GSADF test (one line)
result = pygsadf.gsadf(log_prices)
# Results
print(result) # Full summary
result.reject_h0(0.95) # True = bubble detected at 95%
result.reject_h0(0.99) # True = bubble detected at 99%
result.bubbles # List of (start, end) episodes
result.plot() # Publication-ready figure
# Save / load
result.to_pickle("gsadf_result.pkl")
loaded = pygsadf.GSADFResult.from_pickle("gsadf_result.pkl")
Command Line
# Full run with 1499 bootstrap replications
pygsadf --csv data.csv --col log_price --B 1499 --out result.pkl --plot bsadf.png
# Quick test (B=199)
pygsadf --csv data.csv --col close --log --B 199
# Use fewer cores
pygsadf --csv data.csv --col log_price --n-jobs 8
API Reference
pygsadf.gsadf(y, B=1499, ...)
Main entry point. Accepts a NumPy array or a pandas Series.
Parameters:
- y: Log-price series (array or pd.Series with DatetimeIndex)
- B: Bootstrap replications (default 1499; use 199 for quick tests)
- max_lag: Maximum ADF augmentation lags (BIC selects optimal)
- quantiles: Confidence levels, default (0.90, 0.95, 0.99)
- seed: RNG seed for reproducibility
- n_jobs: Parallel workers (-1 = all cores)
Returns: GSADFResult with:
- .gsadf_stat: Scalar GSADF statistic
- .cv: Dict of critical values {"90%": ..., "95%": ..., "99%": ...}
- .bsadf: Full BSADF sequence (ndarray)
- .bsadf_cv: Pointwise CV sequences
- .bubbles: List of BubbleEpisode objects
- .reject_h0(confidence): Boolean hypothesis test
- .plot(): Matplotlib figure
- .summary(): Formatted text output
- .to_pickle() / .from_pickle(): Serialisation
pygsadf.wild_bootstrap_cv(y, r0, B=1499, ...)
Low-level bootstrap function for custom workflows.
pygsadf.date_stamp_bubbles(bsadf, cv, dates, min_duration=5)
Date-stamp explosive episodes from BSADF vs critical value sequences.
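As an illustration of the idea behind date-stamping, here is a minimal pure-Python sketch. It is not the package's implementation: `stamp_episodes` is a hypothetical helper, it returns integer index pairs rather than dates or BubbleEpisode objects, and rounding log(T) up to an integer is an assumption.

```python
import math


def stamp_episodes(bsadf, cv, min_duration):
    """Return inclusive (start, end) index pairs of runs where bsadf > cv
    that last at least min_duration consecutive observations.

    Hypothetical helper for illustration only.
    """
    if len(bsadf) != len(cv):
        raise ValueError("bsadf and cv must have the same length")
    episodes = []
    start = None  # index where the current exceedance run began
    for i, (stat, crit) in enumerate(zip(bsadf, cv)):
        if stat > crit:
            if start is None:
                start = i
        elif start is not None:
            # The run ended at i - 1; keep it only if it was long enough.
            if i - start >= min_duration:
                episodes.append((start, i - 1))
            start = None
    # A run that is still open at the end of the sample.
    if start is not None and len(bsadf) - start >= min_duration:
        episodes.append((start, len(bsadf) - 1))
    return episodes


# Toy inputs: a constant critical value and a short BSADF sequence.
bsadf = [0.1, 1.5, 1.7, 0.2, 1.6, 1.8, 1.9, 2.1, 0.3, 1.4]
cv = [1.0] * len(bsadf)
min_duration = math.ceil(math.log(len(bsadf)))  # assumption: round log(T) up
episodes = stamp_episodes(bsadf, cv, min_duration)
print(min_duration, episodes)  # 3 [(4, 7)]
```

The two-observation run at indices 1 to 2 and the single exceedance at index 9 are dropped because they are shorter than the minimum duration; only the run from index 4 to 7 survives.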
How It Works
- BSADF Computation: For each endpoint, compute the supremum of right-tailed ADF statistics over all valid start points (PSY 2015, Section 3)
- GSADF: The overall supremum of the BSADF sequence
- Wild Bootstrap: Generate synthetic unit-root series using Rademacher weights, compute GSADF on each, take empirical quantiles as critical values
- Date-Stamping: Episodes where BSADF exceeds the pointwise 95% CV for at least log(T) consecutive days
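The BSADF and GSADF steps above can be sketched in a few lines of plain Python. This is a deliberately simplified illustration, not the package's Numba kernel: the ADF regression here has an intercept and no lag augmentation (the package selects lags by BIC), the function names are hypothetical, and the minimum window uses the r0 = 0.01 + 1.8/sqrt(T) rule recommended in PSY (2015).

```python
import math
import random


def adf_stat(y):
    """t-statistic on rho in  dy_t = a + rho * y_{t-1} + e_t  (zero lags)."""
    x = y[:-1]
    dy = [y[i + 1] - y[i] for i in range(len(y) - 1)]
    n = len(x)
    mx = sum(x) / n
    md = sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (di - md) for xi, di in zip(x, dy))
    rho = sxy / sxx
    a = md - rho * mx
    ssr = sum((di - a - rho * xi) ** 2 for xi, di in zip(x, dy))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return rho / se


def bsadf_sequence(y, min_window):
    """For each endpoint, the sup of ADF statistics over all valid start points."""
    out = []
    for end in range(min_window, len(y) + 1):
        starts = range(0, end - min_window + 1)
        out.append(max(adf_stat(y[start:end]) for start in starts))
    return out


def gsadf_stat(y, min_window):
    """GSADF is the supremum of the BSADF sequence."""
    return max(bsadf_sequence(y, min_window))


# Toy data: a Gaussian random walk of length 60.
rng = random.Random(0)
y = [0.0]
for _ in range(59):
    y.append(y[-1] + rng.gauss(0.0, 1.0))

T = len(y)
min_window = math.floor((0.01 + 1.8 / math.sqrt(T)) * T)
bsadf = bsadf_sequence(y, min_window)
print(min_window, len(bsadf), gsadf_stat(y, min_window))
```

This naive version re-runs every regression from scratch, so its cost grows roughly with the cube of T. That cost is what motivates a compiled kernel and a parallel bootstrap for series with thousands of observations.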
The bootstrap is embarrassingly parallel — each replication is independent with its own deterministic RNG seed, giving identical results whether run on 1 core or 192.
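A minimal sketch of why per-replication seeding makes the result independent of the worker count. Everything here is an assumption made for illustration: the synthetic path cumulates Rademacher-weighted first differences, `path_range` is a cheap stand-in statistic (not GSADF), seeds are simply `base_seed + b`, and the standard library's `ThreadPoolExecutor` stands in for joblib.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def wild_bootstrap_path(y, seed):
    """One synthetic unit-root path built from Rademacher-weighted differences."""
    rng = random.Random(seed)  # private generator, no shared global state
    path = [y[0]]
    for prev, cur in zip(y, y[1:]):
        weight = rng.choice((-1.0, 1.0))  # Rademacher weight
        path.append(path[-1] + weight * (cur - prev))
    return path


def path_range(path):
    """Stand-in statistic for the demo; the real test would compute GSADF."""
    return max(path) - min(path)


def replicate(task):
    y, seed = task
    return path_range(wild_bootstrap_path(y, seed))


def bootstrap_stats(y, B, base_seed=0, workers=1):
    """Replication b always uses seed base_seed + b, whatever the worker count."""
    tasks = [(y, base_seed + b) for b in range(B)]
    if workers == 1:
        return [replicate(task) for task in tasks]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(replicate, tasks))  # map preserves task order


def empirical_quantile(values, q):
    """Simple order-statistic quantile (one of several possible conventions)."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, math.ceil(q * len(ordered)) - 1)]


# Toy data: a Gaussian random walk of length 100.
data_rng = random.Random(42)
y = [0.0]
for _ in range(99):
    y.append(y[-1] + data_rng.gauss(0.0, 1.0))

sequential = bootstrap_stats(y, B=199, base_seed=7, workers=1)
parallel = bootstrap_stats(y, B=199, base_seed=7, workers=4)
print(sequential == parallel)  # True

cvs = {q: empirical_quantile(sequential, q) for q in (0.90, 0.95, 0.99)}
```

Because each replication derives its randomness only from its own seed, scheduling order cannot change the output. In NumPy-based code, `numpy.random.SeedSequence.spawn` is a more robust way to derive independent child seeds than adding an offset.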
Architecture
pygsadf/
├── __init__.py # Public API
├── core.py # gsadf() + GSADFResult
├── adf.py # Numba-JIT ADF kernel with BIC lag selection
├── bsadf.py # GSADF + BSADF computation
├── bootstrap.py # Parallel wild bootstrap
├── datestamp.py # Bubble episode detection
└── cli.py # Command-line interface
Citation
If you use pygsadf in academic work, please cite:
@software{pygsadf,
title = {pygsadf: Fast Parallel GSADF Bubble Detection},
author = {Madkhali, Ali},
year = {2025},
url = {https://github.com/alixecon/pygsadf},
}
And the original methodology:
@article{psy2015,
title = {Testing for Multiple Bubbles: Historical Episodes of
Exuberance and Collapse in the {S\&P} 500},
author = {Phillips, Peter C.B. and Shi, Shuping and Yu, Jun},
journal = {International Economic Review},
volume = {56},
number = {4},
pages = {1043--1078},
year = {2015},
}
License
MIT