Deflated Sharpe Ratio and statistical gates for quantitative strategy validation
deflated-sharpe
You tested 1,000 parameter combinations and found a Sharpe 2.0 strategy. Is it real — or did you just test enough times to get lucky?
deflated-sharpe implements the Deflated Sharpe Ratio (Bailey & Lopez de Prado, 2014) and related statistical gates. Pure Python, zero required dependencies, designed to be the last check before you deploy a strategy.
Install
pip install deflated-sharpe
Quick Start
Deflated Sharpe Ratio
from deflated_sharpe import deflated_sharpe_ratio
dsr, p_value = deflated_sharpe_ratio(observed_sr=2.0, num_trials=1000, num_obs=252)
print(f"DSR={dsr:.2f}, p={p_value:.4f}")
A positive DSR with p < 0.05 means your strategy likely has real alpha after accounting for the number of trials you ran. A negative DSR means the expected maximum Sharpe from random chance alone exceeds your observed Sharpe.
Minimum Backtest Length
from deflated_sharpe import min_backtest_length
min_obs = min_backtest_length(target_sr=1.5, num_trials=500)
print(f"Need at least {min_obs} observations")
Before running a large search, compute how many observations you need. If your backtest window is shorter than min_obs, a strategy whose Sharpe is at or below target_sr cannot pass the DSR gate for that number of trials, however good the backtest looks.
Regime Decay Detection
from deflated_sharpe import RegimeDecayDetector, StrategyBaseline, TradeResult
baseline = StrategyBaseline(win_rate=0.55, trade_count=200, max_drawdown_pct=12.0)
detector = RegimeDecayDetector(baseline=baseline)
detector.fit_market_baseline(training_features) # list of (atr_ratio, trend_pct, atr_percentile)
for trade in live_trades:
    detector.add_trade(trade)
assessment = detector.assess()
print(f"Decay confirmed: {assessment.decay_confirmed} ({assessment.signals_fired}/3 signals)")
Triple-confirmation system: Bayesian win rate decay, drawdown exceedance (1.5x backtest MDD), and Mahalanobis out-of-distribution detection. Two of three signals must fire simultaneously to confirm decay.
How DSR Saved Us
In March 2026, we ran a grid search over 19,200 parameter combinations on BTCUSDT 1H walk-forward data (23 periods, 3-month IS + 3-month OOS). Multiple strategies showed Sharpe ratios above 1.5 in-sample. The DSR gate rejected every single one — correctly preventing deployment of overfitted strategies.
The math was simple: with M=19,200 trials and only 30-50 trades per window, the expected maximum Sharpe from pure chance exceeded every observed value. We then tested LLM-guided search (M~30 per period, 640x fewer trials) and found the same result: trade count was the binding constraint, not search method. DSR saved us from deploying strategies that looked profitable but had zero statistical significance.
Full analysis: Phase 15 Case Study
Tools
deflated_sharpe_ratio(observed_sr, num_trials, num_obs, skewness, kurtosis)
Computes the Deflated Sharpe Ratio per Bailey & Lopez de Prado (2014). Adjusts the observed Sharpe for selection bias from multiple testing, accounting for return non-normality via skewness and kurtosis corrections.
The key insight: when you test M strategies, the maximum Sharpe you expect from pure luck
grows as O(sqrt(ln(M))). DSR subtracts this expected maximum from your observed Sharpe
and normalizes by the standard error.
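To make the growth rate concrete, here is a minimal sketch of the expected maximum of M independent standard normal draws, using the two-quantile approximation with the Euler-Mascheroni constant from Bailey & Lopez de Prado (2014). The function name `expected_max_z` is hypothetical and this is not the library's internal code; it only illustrates how slowly the luck threshold grows with M and that it stays below the sqrt(2 ln M) bound.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_z(num_trials: int) -> float:
    """Approximate E[max] of num_trials independent standard normal draws.

    Hypothetical helper for illustration; not part of deflated-sharpe.
    """
    if num_trials < 1:
        raise ValueError("num_trials must be >= 1")
    if num_trials == 1:
        return 0.0  # a single standard normal draw has mean zero
    inv = NormalDist().inv_cdf
    return ((1.0 - EULER_GAMMA) * inv(1.0 - 1.0 / num_trials)
            + EULER_GAMMA * inv(1.0 - 1.0 / (num_trials * math.e)))


# Compare the approximation with the sqrt(2 ln M) upper bound.
for m in (10, 100, 1_000, 10_000):
    print(m, round(expected_max_z(m), 3), round(math.sqrt(2 * math.log(m)), 3))
```

To turn this into an expected maximum Sharpe, the value is scaled by the standard deviation of Sharpe estimates across trials; how the library estimates that scale is not shown here.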
| Parameter | Type | Default | Description |
|---|---|---|---|
| observed_sr | float | required | Observed Sharpe ratio |
| num_trials | int | required | Number of strategies tested (M) |
| num_obs | int | required | Number of observations (T) |
| skewness | float | 0.0 | Return skewness (0 = normal) |
| kurtosis | float | 3.0 | Return kurtosis (3 = normal) |
Returns (dsr, p_value). DSR > 0 with p < 0.05 indicates statistical significance.
Reference: Bailey, D.H. & Lopez de Prado, M. (2014), Eq. 2-4.
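The "normalizes by the standard error" step can be sketched as a one-sided z-test on the Sharpe ratio with the skewness and kurtosis correction from the paper. This is an illustration under stated assumptions, not the library's code: `sharpe_z_test` and `benchmark_sr` are hypothetical names, the Sharpe ratios are per-observation (not annualized), and the benchmark (the expected maximum Sharpe under the null) is passed in rather than derived from num_trials.

```python
import math
from statistics import NormalDist


def sharpe_z_test(observed_sr, benchmark_sr, num_obs, skewness=0.0, kurtosis=3.0):
    """Return (z, p_value) for H0: true Sharpe <= benchmark_sr.

    Hypothetical helper; Sharpe ratios are assumed to be per-observation.
    """
    if num_obs < 2:
        raise ValueError("num_obs must be >= 2")
    # Non-normality correction to the variance of the Sharpe estimator.
    variance_term = (1.0 - skewness * observed_sr
                     + (kurtosis - 1.0) / 4.0 * observed_sr ** 2)
    if variance_term <= 0.0:
        raise ValueError("moments imply a non-positive variance term")
    std_err = math.sqrt(variance_term / (num_obs - 1))
    z = (observed_sr - benchmark_sr) / std_err
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


z, p = sharpe_z_test(observed_sr=0.1, benchmark_sr=0.0, num_obs=253)
print(f"z={z:.3f}, p={p:.4f}")
```

Negative skewness and fat tails both enlarge the standard error, so the same observed Sharpe yields a smaller z and a larger p-value.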
min_backtest_length(target_sr, num_trials, alpha, skewness, kurtosis)
Binary search for the minimum number of observations T such that DSR > 0 at the given significance level. Use this to determine if your backtest window is long enough before running a parameter search.
from deflated_sharpe import min_backtest_length
# "I want Sharpe 1.5 after testing 200 strategies. How much data do I need?"
min_obs = min_backtest_length(target_sr=1.5, num_trials=200, alpha=0.05)
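The binary search idea can be sketched as follows. This is a simplified illustration, not the library's implementation: it assumes a fixed per-observation `benchmark_sr` (hypothetical; the library derives its threshold from num_trials) and relies on the test statistic increasing monotonically with the number of observations, which is what makes bisection valid.

```python
import math
from statistics import NormalDist


def passes_gate(sr, benchmark_sr, num_obs, alpha, skewness=0.0, kurtosis=3.0):
    """True if the Sharpe z-statistic clears the one-sided alpha threshold."""
    variance_term = 1.0 - skewness * sr + (kurtosis - 1.0) / 4.0 * sr ** 2
    if variance_term <= 0.0:
        raise ValueError("moments imply a non-positive variance term")
    z = (sr - benchmark_sr) * math.sqrt(num_obs - 1) / math.sqrt(variance_term)
    return z >= NormalDist().inv_cdf(1.0 - alpha)


def min_obs_binary_search(sr, benchmark_sr, alpha=0.05, max_obs=1_000_000,
                          skewness=0.0, kurtosis=3.0):
    """Smallest T in [2, max_obs] that passes the gate, or None if none does."""
    if sr <= benchmark_sr:
        return None  # no amount of data helps if the edge is not positive
    if not passes_gate(sr, benchmark_sr, max_obs, alpha, skewness, kurtosis):
        return None
    lo, hi = 2, max_obs
    while lo < hi:
        mid = (lo + hi) // 2
        if passes_gate(sr, benchmark_sr, mid, alpha, skewness, kurtosis):
            hi = mid      # mid is enough data; try fewer observations
        else:
            lo = mid + 1  # mid is too short; need more observations
    return lo


print(min_obs_binary_search(sr=0.1, benchmark_sr=0.0))
```

With a fixed benchmark a closed form also exists, but bisection keeps working when the threshold itself depends on T.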
benjamini_hochberg(p_values, alpha)
Benjamini-Hochberg FDR correction for evaluating multiple strategies simultaneously. When you have N candidate strategies each with a DSR p-value, BH controls the false discovery rate at level alpha.
from deflated_sharpe import deflated_sharpe_ratio, benjamini_hochberg
p_values = [
    deflated_sharpe_ratio(sr, num_trials=50, num_obs=500)[1]
    for sr in [1.2, 0.8, 1.5, 0.3]
]
results = benjamini_hochberg(p_values, alpha=0.05)
for idx, p, sig in results:
    print(f"Strategy {idx}: p={p:.4f}, significant={sig}")
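For readers unfamiliar with the procedure, here is a minimal sketch of the standard Benjamini-Hochberg step-up rule. The function `bh_reject` is a hypothetical stand-in written for illustration and returns a plain list of booleans; it is not the library's `benjamini_hochberg`, whose return shape is the one shown in the example above.

```python
def bh_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Standard BH step-up: find the largest rank k such that the k-th smallest
    p-value is <= (k / m) * alpha, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank  # keep the largest rank that qualifies
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# The third-smallest p-value clears its threshold, so the two smaller ones
# are rejected too even though they miss their own rank thresholds.
print(bh_reject([0.03, 0.02, 0.035, 0.9], alpha=0.05))
```

The step-up behaviour is the point of the example: rejection depends on the largest qualifying rank, not on each p-value individually.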
RegimeDecayDetector
Live monitoring for strategy regime decay. Three independent signals with 2/3 majority vote:
- S1 Win Rate Decay: Bayesian Beta updating with backtest prior. Fires when P(win_rate < breakeven) exceeds threshold.
- S2 Drawdown Exceedance: Fires when current drawdown exceeds dd_multiplier (default 1.5x) times backtest maximum drawdown.
- S3 Market OOD: Mahalanobis distance on market features (ATR ratio, trend, ATR percentile). Fires when recent trades are beyond the training distribution's 95th percentile.
Anti-false-positive measures: minimum 20 trades before assessment, cooling period of 5 trades after trigger, Bonferroni correction for multiple strategies.
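The idea behind S1 can be sketched with a Beta posterior over the win rate. Everything specific here is an assumption made for illustration: the prior is built by treating backtest wins and losses as pseudo-counts, the breakeven win rate is set to 0.5, and the tail probability is estimated by Monte Carlo sampling with the standard library instead of an exact Beta CDF. The name `prob_win_rate_below` is hypothetical and not part of the library.

```python
import random


def prob_win_rate_below(prior_wins, prior_losses, live_wins, live_losses,
                        breakeven=0.5, samples=20_000, seed=0):
    """Estimate P(win_rate < breakeven) under a Beta posterior.

    Posterior is Beta(prior_wins + live_wins, prior_losses + live_losses);
    the probability is approximated by sampling, so it carries Monte Carlo
    error on the order of 1 / sqrt(samples).
    """
    rng = random.Random(seed)
    a = prior_wins + live_wins
    b = prior_losses + live_losses
    below = sum(rng.betavariate(a, b) < breakeven for _ in range(samples))
    return below / samples


# Backtest: 55% win rate over 200 trades -> 110 wins, 90 losses as the prior.
healthy = prob_win_rate_below(110, 90, live_wins=60, live_losses=40)
decayed = prob_win_rate_below(110, 90, live_wins=0, live_losses=100)
print(f"healthy={healthy:.3f}, decayed={decayed:.3f}")
```

Under these assumptions the signal would fire when the estimated probability exceeds win_rate_decay_prob_threshold.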
| Config Parameter | Default | Description |
|---|---|---|
| min_trades | 20 | Minimum trades before assessment |
| cooling_period | 5 | Trades to skip after trigger |
| win_rate_decay_prob_threshold | 0.80 | P(wr < breakeven) threshold |
| dd_multiplier | 1.5 | Drawdown exceedance multiplier |
| ood_percentile | 95.0 | Mahalanobis percentile threshold |
| num_strategies | 1 | N for Bonferroni correction |
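As an illustration of how dd_multiplier and the 2/3 vote interact, here is a minimal sketch of a drawdown exceedance check and a majority vote over three boolean signals. The helper names and the equity-curve input are hypothetical; the detector itself consumes trades through add_trade, and its internal bookkeeping may differ.

```python
def current_drawdown_pct(equity_curve):
    """Percentage drop of the latest equity value from the running peak."""
    if not equity_curve:
        raise ValueError("equity_curve must not be empty")
    peak = max(equity_curve)
    return (peak - equity_curve[-1]) / peak * 100.0


def drawdown_signal(equity_curve, backtest_mdd_pct, dd_multiplier=1.5):
    """Fire when live drawdown exceeds dd_multiplier times the backtest MDD."""
    return current_drawdown_pct(equity_curve) > dd_multiplier * backtest_mdd_pct


def decay_confirmed(signals, required=2):
    """Majority vote: at least `required` of the signals must be True."""
    return sum(bool(s) for s in signals) >= required


# Backtest MDD of 12% gives an 18% trigger level with the default multiplier.
s2 = drawdown_signal([100.0, 120.0, 96.0], backtest_mdd_pct=12.0)
print(s2, decay_confirmed([True, s2, False]))
```

Requiring two signals means a deep but isolated drawdown does not confirm decay on its own.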
Paper Verification
The DSR implementation is verified against the original paper's mathematics:
Gumbel approximation for E[Z_max], standard error with non-normality correction,
and the full DSR test statistic. See tests/test_paper_verification.py
for numerical checks against known values from Bailey & Lopez de Prado (2014).
Zero Dependencies
The core library uses only Python standard library (math, dataclasses).
The _math.py module implements norm_cdf, matrix inversion, and Mahalanobis distance
from scratch to avoid pulling in NumPy/SciPy for basic usage.
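To show that these pieces need nothing beyond the standard library, here is a minimal sketch of a normal CDF and a Mahalanobis distance in pure Python. It is not the contents of _math.py: the function names other than norm_cdf are hypothetical, and the sketch solves a linear system by Gaussian elimination rather than forming an explicit matrix inverse.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def mahalanobis(point, mean, cov):
    """sqrt((x - mu)^T cov^{-1} (x - mu)) without forming the inverse."""
    diff = [p - m for p, m in zip(point, mean)]
    solved = solve_linear(cov, diff)
    return math.sqrt(sum(d * s for d, s in zip(diff, solved)))


print(norm_cdf(1.96), mahalanobis([1.0, 1.0], [0.0, 0.0], [[2.0, 1.0], [1.0, 2.0]]))
```

For the three market features used by S3, the covariance matrix is 3x3, so the cost of elimination is negligible.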
For the regime detector's Bayesian win rate signal (S1), scipy.stats.beta is used if
available; otherwise a point-estimate fallback is used. Install the optional dependency:
pip install "deflated-sharpe[scipy]"
References
Bailey, D. H., & Lopez de Prado, M. (2014). "The Deflated Sharpe Ratio: Correcting for Selection Bias, Backtest Overfitting, and Non-Normality." Journal of Portfolio Management, 40(5), 94-107. DOI: 10.3905/jpm.2014.40.5.094
License
Apache-2.0