🔥 pathforge
Simulate realistic financial markets from historical price data — for strategy testing, research, and risk analysis.
Why pathforge?
Testing a trading strategy on a single historical price series tells you how it performed on one specific path the market happened to take. That's not enough. A robust strategy should work across the full range of outcomes the market could have produced.
pathforge learns the statistical behaviour of any asset from its historical prices and generates hundreds of realistic alternative price paths. Test your strategy across all of them and you'll know how robust it really is.
Installation
pip install pathforge
pip install numba # required for markov_egarch model
To use the built-in plot functionality:
pip install "pathforge[examples]" # quotes keep shells such as zsh from expanding the brackets
Quick Start
import pathforge as pf
import yfinance as yf
# Download historical price data
ticker = yf.Ticker("AAPL")
prices = ticker.history(period="5y")["Close"]
# Create a forge and fit a model
forge = pf.PathForge(prices)
forge.fit(model="garch")
# Simulate one year of trading days across 100 paths
sim = forge.simulate(days=252, n_paths=100, seed=42)
# Explore the results
sim.summary()
sim.plot()
# Get the paths as a DataFrame for your own analysis
df = sim.to_dataframe() # shape: (253, 100)
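Once you have the paths as an array or DataFrame, testing a strategy across them is a matter of applying it column by column. The sketch below is illustrative only: `rolling_mean` and `crossover_total_returns` are hypothetical helpers, not part of pathforge, and the `paths` array is a synthetic stand-in with the same `(days + 1, n_paths)` layout as the simulation output.

```python
import numpy as np

def rolling_mean(paths, window):
    """Trailing mean over `window` rows; the first window - 1 rows are NaN."""
    csum = np.cumsum(np.vstack([np.zeros((1, paths.shape[1])), paths]), axis=0)
    out = np.full(paths.shape, np.nan)
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out

def crossover_total_returns(paths, fast=10, slow=50):
    """Total return of a long/flat moving-average crossover on every path."""
    fast_ma = rolling_mean(paths, fast)
    slow_ma = rolling_mean(paths, slow)
    # Long (1.0) when the fast average is above the slow one, otherwise flat (0.0).
    # Rows where the slow average is not yet defined stay flat.
    signal = np.zeros(paths.shape)
    signal[slow - 1:] = fast_ma[slow - 1:] > slow_ma[slow - 1:]
    daily = paths[1:] / paths[:-1] - 1.0
    # The signal computed at the close of day t earns the return of day t + 1.
    strategy_daily = signal[:-1] * daily
    return np.prod(1.0 + strategy_daily, axis=0) - 1.0

# Synthetic stand-in for the simulated paths: shape (253, 100).
rng = np.random.default_rng(42)
log_returns = rng.normal(0.0003, 0.01, size=(252, 100))
paths = 100.0 * np.exp(np.vstack([np.zeros((1, 100)), np.cumsum(log_returns, axis=0)]))

totals = crossover_total_returns(paths)
p5, p50, p95 = np.percentile(totals, [5, 50, 95])
print(f"5th pct: {p5:.2%}  median: {p50:.2%}  95th pct: {p95:.2%}")
print(f"share of profitable paths: {np.mean(totals > 0):.0%}")
```

The spread between the 5th and 95th percentiles is the point of the exercise: a single backtest gives one number, while a set of paths gives a distribution.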
Models
| Model | model= | Best for |
|---|---|---|
| Markov-switching EGARCH | "markov_egarch" | Research-grade: hidden regimes + volatility clustering + fat tails |
| Geometric Brownian Motion | "gbm" | Fast baseline, simple assumptions |
| GARCH(1,1) | "garch" | Realistic volatility clustering |
| Block Bootstrap | "bootstrap" | Non-parametric, no distributional assumptions |
| Merton Jump Diffusion | "jump_diffusion" | Capturing sudden crashes and spikes |
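To make the differences between the parametric models concrete, here is a minimal NumPy sketch of textbook GBM, GARCH(1,1), and Merton jump-diffusion return generators. It does not reproduce pathforge's internals; the function names and all parameter values are assumptions chosen for illustration, and the Merton drift is left uncompensated for jumps to keep the code short.

```python
import numpy as np

def simulate_gbm(mu, sigma, days, n_paths, rng):
    """Daily log returns with constant drift and volatility."""
    return rng.normal(mu - 0.5 * sigma**2, sigma, size=(days, n_paths))

def simulate_garch11(omega, alpha, beta, days, n_paths, rng):
    """Zero-mean GARCH(1,1): var_t = omega + alpha * r_{t-1}^2 + beta * var_{t-1}."""
    returns = np.empty((days, n_paths))
    var = np.full(n_paths, omega / (1.0 - alpha - beta))  # start at long-run variance
    for t in range(days):
        returns[t] = np.sqrt(var) * rng.standard_normal(n_paths)
        var = omega + alpha * returns[t] ** 2 + beta * var
    return returns

def simulate_merton(mu, sigma, jump_rate, jump_mean, jump_std, days, n_paths, rng):
    """Diffusion plus a Poisson number of normally distributed log-jumps per day."""
    diffusion = rng.normal(mu - 0.5 * sigma**2, sigma, size=(days, n_paths))
    n_jumps = rng.poisson(jump_rate, size=(days, n_paths))
    # The sum of k independent N(m, s^2) jumps is N(k * m, k * s^2).
    jumps = rng.normal(n_jumps * jump_mean, np.sqrt(n_jumps) * jump_std)
    return diffusion + jumps

def to_price_paths(log_returns, start_price=100.0):
    """Cumulate log returns into prices of shape (days + 1, n_paths)."""
    zeros = np.zeros((1, log_returns.shape[1]))
    return start_price * np.exp(np.vstack([zeros, np.cumsum(log_returns, axis=0)]))

def excess_kurtosis(x):
    """Pooled excess kurtosis; roughly 0 for normal data, positive for fat tails."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z**4) - 3.0)

rng = np.random.default_rng(0)
days, n_paths = 1000, 200
gbm = simulate_gbm(0.0003, 0.01, days, n_paths, rng)
garch = simulate_garch11(5e-6, 0.10, 0.85, days, n_paths, rng)
merton = simulate_merton(0.0003, 0.01, 0.02, -0.02, 0.04, days, n_paths, rng)

kurt = {name: excess_kurtosis(r) for name, r in
        [("gbm", gbm), ("garch", garch), ("merton", merton)]}
print(kurt)

price_paths = to_price_paths(garch)
```

Comparing the excess kurtosis of the three return sets shows why GBM is treated as a baseline: its returns are normal by construction, while the other two produce heavier tails.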
Which model should I use?
- GBM: a fast sanity check, but its constant volatility and normally distributed returns underestimate tail risk
- GARCH: a sensible default for most use cases; captures the volatility clustering seen in real markets
- Bootstrap: resamples real historical behaviour directly instead of fitting a distribution, which makes it a strong choice for strategy testing
- Jump Diffusion: best when your data contains sudden large moves you want to preserve
- Markov-switching EGARCH: the most sophisticated model. It identifies hidden market regimes (calm, stressed, crisis), each with its own EGARCH volatility dynamics and Student-t innovations, and captures regime persistence, volatility clustering, leverage effects, and fat tails simultaneously. Requires at least 2 years of daily data and Numba, which is used to speed up fitting.
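The bootstrap idea can be sketched in a few lines of NumPy. This is a plain moving-block bootstrap under assumed settings (the `block_size` default and the function name are hypothetical), not necessarily the variant pathforge implements.

```python
import numpy as np

def block_bootstrap_paths(prices, days, n_paths, block_size=20, seed=None):
    """Build price paths by stitching together random blocks of historical log returns."""
    rng = np.random.default_rng(seed)
    prices = np.asarray(prices, dtype=float)
    log_ret = np.diff(np.log(prices))
    n_blocks = -(-days // block_size)  # ceiling division
    # Random start index for every block of every path (upper bound is exclusive).
    starts = rng.integers(0, len(log_ret) - block_size + 1, size=(n_blocks, n_paths))
    offsets = np.arange(block_size)
    idx = starts[:, None, :] + offsets[None, :, None]  # (n_blocks, block_size, n_paths)
    # Flatten blocks back into a time axis and trim to the requested length.
    sampled = log_ret[idx].reshape(n_blocks * block_size, n_paths)[:days]
    cum = np.vstack([np.zeros((1, n_paths)), np.cumsum(sampled, axis=0)])
    return prices[-1] * np.exp(cum)

# Synthetic price history standing in for real data (500 observations).
rng = np.random.default_rng(1)
history = 100.0 * np.exp(np.cumsum(rng.normal(0.0002, 0.012, size=500)))

boot_paths = block_bootstrap_paths(history, days=252, n_paths=50, seed=7)
print(boot_paths.shape)
```

Sampling contiguous blocks rather than single days keeps short-range structure such as volatility clustering intact within each block, at the cost of artificial breaks where blocks join.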
Usage Notes
Markov-switching EGARCH
The markov_egarch model has specific requirements and options:
- Minimum data: 2 years of daily prices (500+ observations recommended)
- Fitting time: ~1 minute on a modern machine (first call longer due to Numba JIT warmup)
- Dependencies: requires numba (pip install numba)
forge = pf.PathForge(prices)
forge.fit(
model="markov_egarch",
n_states=3, # number of hidden regimes
n_starts=3, # random restarts for EM algorithm
verbose=True, # print fitting progress
random_state=42, # for reproducibility
min_persistence=0.7 # minimum regime persistence (set to None to disable)
)
sim = forge.simulate(days=252, n_paths=100)
Note: This model uses a generalised EM algorithm rather than an exact closed-form M-step. Volatility dynamics are modelled using state-specific, uncentred EGARCH filters, resulting in an approximate likelihood. This approach is designed for practical simulation and backtesting rather than exact state-space inference. See the GitHub repository for full technical details.
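For intuition about what such a model generates, the following is a simplified simulation sketch, not pathforge's fitting or simulation code. It assumes that "uncentred" means the |z| term in the EGARCH recursion is not demeaned, drives every state's volatility filter with the same standardised shock, and uses made-up parameter values; the function name is hypothetical.

```python
import numpy as np

def simulate_regime_egarch(trans, omega, alpha, gamma, beta, dof, days, seed=None):
    """Simulate returns from a Markov chain of regimes, each with an EGARCH(1,1) filter.

    Per state k: log_var_k[t+1] = omega_k + beta * log_var_k[t] + alpha * |z_t| + gamma * z_t
    """
    rng = np.random.default_rng(seed)
    n_states = trans.shape[0]
    t_scale = np.sqrt((dof - 2.0) / dof)  # rescale Student-t draws to unit variance
    # Monte Carlo estimate of E|z|, used only to start the filters near their long-run level.
    mean_abs_z = np.abs(rng.standard_t(dof, size=100_000) * t_scale).mean()
    log_var = (omega + alpha * mean_abs_z) / (1.0 - beta)  # one value per state
    states = np.empty(days, dtype=int)
    returns = np.empty(days)
    state = 0
    for t in range(days):
        state = rng.choice(n_states, p=trans[state])  # regime transition
        z = rng.standard_t(dof) * t_scale             # fat-tailed shock
        returns[t] = np.exp(0.5 * log_var[state]) * z
        states[t] = state
        # Simplification: all state filters are updated with the same shock.
        log_var = omega + beta * log_var + alpha * abs(z) + gamma * z
    return states, returns

# Illustrative parameters: calm, stressed, crisis. Diagonal entries are the
# regime persistences, all at or above 0.7.
trans = np.array([[0.95, 0.04, 0.01],
                  [0.05, 0.90, 0.05],
                  [0.05, 0.15, 0.80]])
omega = np.array([-0.57, -0.49, -0.42])
states, returns = simulate_regime_egarch(
    trans, omega, alpha=0.10, gamma=-0.05, beta=0.95, dof=5.0, days=5000, seed=42
)
persistence = np.mean(states[1:] == states[:-1])
vol_by_state = [np.abs(returns[states == k]).mean() for k in range(3)]
print(f"share of days staying in the same regime: {persistence:.2f}")
print("mean absolute return by regime:", vol_by_state)
```

The negative `gamma` produces the leverage effect (negative shocks raise volatility more than positive ones), and the high diagonal of `trans` is what regime persistence means in practice.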
API Reference
PathForge(data)
The main class. Pass a pd.Series or pd.DataFrame of daily closing prices.
| Method | Description |
|---|---|
| .fit(model="garch") | Fit a simulation model to the historical data |
| .simulate(days=252, n_paths=100, start_price=None, seed=None) | Generate simulated price paths |
SimulationResult
Returned by .simulate().
| Attribute / Method | Description |
|---|---|
| .paths | np.ndarray of shape (days+1, n_paths) |
| .to_dataframe() | Paths as a pd.DataFrame, one column per path |
| .summary() | Print statistical summary of the simulation |
| .plot(max_paths=50) | Plot simulated paths with historical context |
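Because the paths are exposed as a plain array of shape (days+1, n_paths), common risk measures can be computed directly with NumPy. The sketch below assumes that layout and uses a synthetic array in place of real simulation output; `risk_summary` is a hypothetical helper, not a pathforge function.

```python
import numpy as np

def risk_summary(paths, var_level=0.95):
    """Horizon VaR / CVaR and drawdown statistics from an array of price paths."""
    terminal_ret = paths[-1] / paths[0] - 1.0
    # Worst peak-to-trough decline along each path (values are <= 0).
    running_max = np.maximum.accumulate(paths, axis=0)
    max_drawdown = np.min(paths / running_max - 1.0, axis=0)
    # VaR: loss exceeded on only (1 - var_level) of the paths; CVaR: mean loss beyond it.
    var = -np.quantile(terminal_ret, 1.0 - var_level)
    cvar = -terminal_ret[terminal_ret <= -var].mean()
    return {
        "var": float(var),
        "cvar": float(cvar),
        "prob_loss": float(np.mean(terminal_ret < 0)),
        "median_max_drawdown": float(np.median(max_drawdown)),
        "worst_max_drawdown": float(max_drawdown.min()),
    }

# Synthetic stand-in for the simulated paths: shape (253, 100).
rng = np.random.default_rng(3)
log_returns = rng.normal(0.0003, 0.015, size=(252, 100))
paths = 100.0 * np.exp(np.vstack([np.zeros((1, 100)), np.cumsum(log_returns, axis=0)]))

summary = risk_summary(paths)
for name, value in summary.items():
    print(f"{name}: {value:.2%}")
```

With only 100 paths the tail estimates rest on a handful of observations, so a larger `n_paths` is worth the extra simulation time when VaR or CVaR is the quantity of interest.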
Roadmap
- Merton Jump Diffusion
- Markov-switching EGARCH with Student-t innovations
- Intraday timeframes (1m, 5m, 15m, 1h)
- Multi-asset correlated simulation
- Centred EGARCH specification
- CLI: pathforge simulate AAPL --days 252 --paths 500
Contributing
PRs and issues welcome at github.com/franmanz/pathforge.
License
MIT © 2026 franmanz