
Stochastic volatility models fit to historical time-series data.

Project description

svolfit

This is a package that I cobbled together that fits a selection of stochastic volatility models to historical data, mainly through the use of a trinomial tree to represent the variance process. Despite that (:0) it does quite a good job of fitting parameters. (There is a document out there somewhere that has evidence of this, but it is still a wip and I will link/include it when appropriate.)

Model parameters are produced by brute force optimization of the log-likelihood calculated using the tree, using a standard minimizer from a python package. Once the parameters are known the most likely variance path for the latent (unobserved) variance is generated by working backwards through the tree (as a Viterbi algorithm). Note that you need to use at least a few years (4-5) of daily asset observations for the resulting parameters to be reasonably converged.
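For intuition, here is a minimal, generic sketch of the backward (Viterbi-style) pass that recovers the most likely node sequence on a lattice once per-node likelihoods and transition probabilities are known. It is not the package's internal code; the array names and shapes are purely illustrative:

    import numpy as np

    def most_likely_path(log_emit, log_trans):
        # log_emit:  (T, K) log-likelihood of each observation given each lattice
        #            node (variance level).
        # log_trans: (K, K) log transition probabilities between nodes (for a
        #            trinomial tree only three entries per row are finite).
        # Returns T node indices, one per observation.
        T, K = log_emit.shape
        score = np.empty((T, K))
        back = np.zeros((T, K), dtype=int)
        score[0] = log_emit[0]
        for t in range(1, T):
            cand = score[t - 1][:, None] + log_trans   # from-node x to-node
            back[t] = np.argmax(cand, axis=0)          # best predecessor per node
            score[t] = cand[back[t], np.arange(K)] + log_emit[t]
        path = np.empty(T, dtype=int)
        path[-1] = int(np.argmax(score[-1]))
        for t in range(T - 2, -1, -1):                 # backtrack to the start
            path[t] = back[t + 1, path[t + 1]]
        return path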

There are also algorithms (to come at a later date) that estimate correlations between the asset, its latent volatility, and other assets -- both with and without stochastic volatility. The idea is that one then has a complete suite of tools to estimate the parameters needed to (for example) include stochastic volatility models consistently within a derivative counterparty credit risk simulation model.

Usage

The idea is to keep things very simple so that one has access to model parameters quite easily:

(pars, sdict) = svolfit( series, dt, model='Heston', method = 'grid', ... )

where:

  • series: A numpy array holding the time series for the asset that you want to fit the model to, with daily observations increasing in time from the start to the end of the array.
  • dt: The year fraction to assign to the time between two observations (dt=1/252).
  • model: The stochastic volatility model (more below).
  • method: The approach to fitting (more below).
  • pars: The estimated model parameters in a dictionary.
  • sdict: A dictionary containing a lot of other stuff, including the most likely variance path.
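
A fuller sketch of a call, assuming the fit function is importable from the top-level package as the signature above suggests (the price series here is synthetic, purely for illustration):

    import numpy as np
    from svolfit import svolfit

    # Roughly five years of synthetic daily 'prices', oldest first.
    rng = np.random.default_rng(0)
    series = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal(5 * 252)))

    (pars, sdict) = svolfit(series, dt=1.0 / 252, model='Heston', method='grid')

    print(pars)   # fitted parameters, keyed by name
    # sdict contains diagnostics, including the most likely variance path.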

Note that when you run this there may well be some noise produced about 'divide by zero in log', and some optimizer messages. These will be cleaned up once I figure out my strategy... Also, this call is much slower than it needs to be since it is running in a single process (the gradient calculation can easily be parallelized). TODO item.
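
The 'divide by zero in log' messages are ordinary numpy runtime warnings; until they are cleaned up, one user-side workaround (not something the package does itself) is to silence them around the call, continuing the example above:

    import numpy as np

    # Suppress numpy's 'divide by zero encountered in log' warnings for this call only.
    with np.errstate(divide='ignore'):
        (pars, sdict) = svolfit(series, dt=1.0 / 252, model='Heston', method='grid')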

The downside for 'simple' is that this approach does not work very well (i.e., not at all) for extreme parameters. In particular:

  1. Where the mean reversion timescale is of the order of the grid spacing or smaller, or where it is longer than the observation window supplied for calibration. This is not really a limitation of the model but of the data, since no approach will be able to fit the parameter accurately. Here we simply put bounds on the mean reversion parameter (hidden and undocumented for the moment) and it is up to the user to deal with cases where it is expected to be outside the range.
  2. Where correlations become large (larger than ~80% in magnitude) then correlation estimates have been observed to be biased towards zero. This appears to be due to grid effects, resulting from the fact that the time discretization of the 'tree' matches that of the historical asset observations. The 'treeX2' method doubles the frequency of the variance grid, with the result that the bias is materially reduced--at the expense of significantly increased computational time.
  3. Where the volatility of volatility is very large the variance grid becomes very coarse, and parameter estimates can become biased and noisy; the impact of this is also partly mitigated by use of the 'treeX2' method. Currently the volatility of volatility parameter is bounded so that it cannot become 'excessively large' (the bound is currently hidden and undocumented). Fits to real financial time series data suggest that most series are well handled by the tree approach, although not all -- this will need to be documented a bit more carefully at a later time.

models:

  • 'GBM': Geometric Brownian Motion.
  • 'PureJump': Lognormal jump process, no diffusion.
  • 'MertonJD': Merton Jump Diffusion model.
  • 'HestonNandi': Heston model with perfect correlation.
  • 'Heston': The Heston model.
  • 'Bates': The Heston model with lognormal jumps added to the asset process.
  • 'H32': The '3/2' model, H32='Heston 3/2 model'.
  • 'B32': The 3/2 model with lognormal jumps added, B32 = 'Bates 3/2 model'.
  • 'GARCHdiff': The GARCH diffusion model.
  • 'GARCHjdiff': GARCH diffusion with lognormal jumps.

Yes, this model naming convention sucks and will likely change at some point.

methods:

  • 'analytic': currently only available for GBM.
  • 'tree': A trinomial tree that explicitly fits the initial value of the variance.
  • 'treeX2': As above, but with the frequency of timesteps for the variance tree doubled. Beware that this is SLOW.
  • 'grid': Does not explicitly fit the initial value of the variance, instead inferring it from the estimate of the most likely variance path. Currently this is the fastest and most stable method.
  • 'v': only defined for HestonNandi, likely to vanish over time...

Current combinations (model,method) available with status:

  • (Heston,grid): Reliable.
  • (Heston,tree): Reliable, needs cleanup and optimization.
  • (Heston,treeX2): Reliable, needs cleanup and optimization.
  • (Bates,grid): Seems correct, calibration of jump component not investigated in detail.
  • (Bates,tree): See comments above on jumps.
  • (Bates,treeX2): See comments above on jumps.
  • (H32,grid): Good shape, but not as extensively tested as Heston.
  • (B32,grid): As H32; see comments above on jumps.
  • (GARCHdiff,grid): Seems correct.
  • (GARCHjdiff,grid): Seems correct; see comments above on jumps.
  • (GBM,analytic): ML optimization, used for testing.
  • (PureJump,grid): Seems correct; see comments above on jumps.
  • (MertonJD,grid): Seems correct; see comments above on jumps.
  • (HestonNandi,v): Unreliable fit.

Also included is a utility that simulates paths from a model, estimates parameters using the simulated paths, and provides some statistics:

 estimationstats(NAME,Npaths,horizons,stride,NumProcesses, dt, model, method, modeloptions )

where:

  • NAME: Just a string that will be tacked on to results to help identify them.
  • Npaths: Number of paths to be simulated, estimated and used to calculate fit statistics.
  • horizons: A list containing the estimation horizons that the paths will be fit to.
  • stride: set equal to zero--TO BE REMOVED!
  • NumProcesses: The number of processes to use for estimation; negative/zero for a single process, will use at most number_cpus()-1.
  • dt: The year fraction to assign to the time between two observations (dt=1/252).
  • model: One of the above models (may not be implemented for all).
  • method: One of the above methods (may not be implemented for all).
  • modeloptions: A dictionary of model options that must contain a dictionary of the model parameters used to simulate paths, for example: init = {'mu': 0.05, 'sigma': 0.2}, modeloptions = {'init': init}.

This can run for a long time depending on the model/method, so start with something like a GBM model, with short horizons, to get a feel for it before committing.
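
Following that advice, here is an illustrative sketch of a quick GBM check, assuming the utility is importable from the top-level package, that the (GBM,analytic) combination is supported, and using positional arguments in the order documented above (the parameter values are placeholders only):

    from svolfit import estimationstats

    # Parameters used to simulate the GBM paths (illustrative values).
    init = {'mu': 0.05, 'sigma': 0.2}
    modeloptions = {'init': init}

    # NAME, Npaths, horizons, stride (0), NumProcesses (0 = single process),
    # dt, model, method, modeloptions -- per the argument list above.
    estimationstats('gbm_check', 100, [252, 504], 0, 0, 1.0 / 252,
                    'GBM', 'analytic', modeloptions)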

The output is some csv files with stats (to be described later), as well as gnuplot scripts and latex to help produce a simple but readable pdf of results. So run "gnuplot *.plt", then pdflatex on the generated .tex file. All results appear in the resulting pdf file. It's not intended to be pretty, but it should show whether the models do a reasonable job of fitting parameters or not.

Unit Tests

Unit tests are in the tests folder (currently these are slow, taking roughly half an hour):

pytest

No github repo (or equivalent) at the moment, please email me directly for comments/complaints/requests, etc.

mike.
