Stochastic copula models with Ornstein-Uhlenbeck latent process
pyscarcopula
Stochastic copula models with Ornstein-Uhlenbeck latent process in Python.
About
pyscarcopula fits multivariate distributions using the copula approach with time-varying dependence. The classical constant-parameter model is extended to a stochastic model where the copula parameter follows an Ornstein-Uhlenbeck process.
For parameter estimation we provide five methods:
| Method | Key | Description |
|---|---|---|
| Maximum likelihood | mle | Classical fit with constant copula parameter |
| MC p-sampler | scar-p-ou | Monte Carlo without importance sampling |
| MC m-sampler | scar-m-ou | Monte Carlo with efficient importance sampling (EIS) |
| Transfer matrix | scar-tm-ou | Deterministic quadrature on a grid, with no MC variance or bias |
| GAS | gas | Generalized autoregressive score (observation-driven, deterministic) |
The transfer matrix method exploits the Markov structure and known Gaussian transition density of the OU process to evaluate the likelihood function as a sequence of matrix-vector products. The implementation automatically selects between dense and sparse transfer matrices depending on the kernel bandwidth, and adaptively refines the grid to resolve the transition kernel.
Install
pip install pyscarcopula
For development (includes data files and tests):
git clone https://github.com/AANovokhatskiy/pyscarcopula
cd pyscarcopula
pip install -e ".[test]"
pytest tests/
Dependencies: numpy, numba, scipy, joblib, tqdm.
Features
Copula families
- Archimedean: Gumbel, Frank, Clayton, Joe (with rotations 0°/90°/180°/270°)
- Elliptical: Gaussian, Student-t (MLE only)
- Independence copula (null model for automatic vine pruning)
- Equicorrelation Gaussian (single dynamic correlation for d assets)
- C-vine pair copula construction
Estimation methods
- MLE — constant copula parameter
- SCAR-TM-OU — transfer matrix with analytical gradient (recommended)
- GAS — observation-driven score model
- SCAR-P-OU / SCAR-M-OU — Monte Carlo alternatives
Vine copulas
- Automatic copula family and rotation selection per edge (AIC/BIC)
- Automatic pruning of weak edges via independence copula baseline
- Tree-level and edge-level truncation for scalability
Diagnostics and risk
- Goodness-of-fit via Rosenblatt transform + Cramér–von Mises test
- Mixture Rosenblatt transform for stochastic models
- Smoothed time-varying copula parameter
- VaR / CVaR via pyscarcopula.contrib (rolling window, marginals, portfolio optimization)
Mathematical background
Copula models
By Sklar's theorem, any joint distribution can be decomposed as
$$F(x_1, \ldots, x_d) = C(F_1(x_1), \ldots, F_d(x_d))$$
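In practice the marginals F_i are unknown, so the arguments of C are usually replaced by rank-based pseudo-observations. The sketch below shows one common convention, rank / (n + 1). The name `pobs_sketch` is hypothetical, and the package's own `pobs` may differ in details such as tie handling.

```python
def pobs_sketch(rows):
    """Map each column of `rows` into (0, 1) via rank / (n + 1).

    Assumes no ties within a column (continuous data).
    """
    n = len(rows)
    d = len(rows[0])
    u = [[0.0] * d for _ in range(n)]
    for j in range(d):
        # Row indices sorted by the j-th coordinate
        order = sorted(range(n), key=lambda i: rows[i][j])
        for rank, i in enumerate(order, start=1):
            u[i][j] = rank / (n + 1)
    return u


# Three observations of a two-dimensional variable
u_demo = pobs_sketch([[0.1, 5.0], [0.3, 2.0], [0.2, 9.0]])
```

Dividing by n + 1 rather than n keeps every value strictly inside (0, 1), which matters because copula densities can diverge at the boundary.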
We focus on single-parameter Archimedean copulas defined via a generator φ(t; θ):
$$C(u_1, \ldots, u_d) = \phi^{-1}(\phi(u_1; \theta) + \cdots + \phi(u_d; \theta))$$
| Copula | Generator | Inverse generator | Domain |
|---|---|---|---|
| Gumbel | (-log t)^θ | exp(-t^(1/θ)) | θ ∈ [1, ∞) |
| Frank | -log((e^(-θt) - 1)/(e^(-θ) - 1)) | -(1/θ)log(1 + e^(-t)(e^(-θ) - 1)) | θ ∈ (0, ∞) |
| Joe | -log(1 - (1-t)^θ) | 1 - (1 - e^(-t))^(1/θ) | θ ∈ [1, ∞) |
| Clayton | (1/θ)(t^(-θ) - 1) | (1 + tθ)^(-1/θ) | θ ∈ (0, ∞) |
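As a minimal illustration of the construction above, the following sketch evaluates an Archimedean copula CDF directly from a generator and its inverse, using the Clayton and Gumbel rows of the table. The function names are hypothetical and are not part of the pyscarcopula API.

```python
import math


def clayton_phi(t, theta):
    """Clayton generator: (1/theta) * (t^(-theta) - 1)."""
    return (t ** (-theta) - 1.0) / theta


def clayton_phi_inv(s, theta):
    """Inverse Clayton generator: (1 + theta*s)^(-1/theta)."""
    return (1.0 + theta * s) ** (-1.0 / theta)


def gumbel_phi(t, theta):
    """Gumbel generator: (-log t)^theta."""
    return (-math.log(t)) ** theta


def gumbel_phi_inv(s, theta):
    """Inverse Gumbel generator: exp(-s^(1/theta))."""
    return math.exp(-(s ** (1.0 / theta)))


def archimedean_cdf(u, phi, phi_inv, theta):
    """C(u_1, ..., u_d) = phi^{-1}(phi(u_1) + ... + phi(u_d))."""
    return phi_inv(sum(phi(ui, theta) for ui in u), theta)


# Clayton with theta = 2 at (0.3, 0.7)
c_clayton = archimedean_cdf([0.3, 0.7], clayton_phi, clayton_phi_inv, 2.0)
# Gumbel with theta = 1 reduces to the independence copula u1 * u2
c_gumbel_indep = archimedean_cdf([0.3, 0.7], gumbel_phi, gumbel_phi_inv, 1.0)
```

Two quick sanity checks follow from the formulas: setting one argument to 1 returns the other argument (uniform margins), and the Gumbel copula at θ = 1 equals the product of its arguments.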
Stochastic copula (SCAR)
In the stochastic model the copula parameter is driven by a latent Ornstein-Uhlenbeck process:
$$\theta_t = \Psi(x_t), \qquad dx_t = \theta_{\text{OU}}(\mu - x_t)\,dt + \nu\,dW_t$$
where Ψ maps the OU state to the copula parameter domain. The likelihood function is an integral over all latent paths:
$$L = \int \prod_{t} c(u_{1t}, u_{2t}; \Psi(x_t))\;p(x_t \mid x_{t-1})\;dx_0 \cdots dx_T$$
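The transition density p(x_t | x_{t-1}) of an OU process is Gaussian with a known mean and variance, so a latent path can be simulated exactly without discretization error. The sketch below draws one path and maps it to a copula parameter. The link `psi_gumbel` is an assumption chosen for illustration (the link used by the package is not stated here), and the path is assumed to start from the stationary distribution.

```python
import math
import random


def ou_transition(x, theta_ou, mu, nu, dt):
    """Mean and std of x_{t+dt} given x_t = x (exact Gaussian transition)."""
    rho = math.exp(-theta_ou * dt)
    mean = mu + (x - mu) * rho
    std = nu * math.sqrt((1.0 - rho * rho) / (2.0 * theta_ou))
    return mean, std


def simulate_ou(n_steps, theta_ou, mu, nu, dt, rng):
    """One OU path of n_steps + 1 points, started from N(mu, nu^2 / (2 theta_ou))."""
    x = rng.gauss(mu, nu / math.sqrt(2.0 * theta_ou))
    path = [x]
    for _ in range(n_steps):
        mean, std = ou_transition(x, theta_ou, mu, nu, dt)
        x = rng.gauss(mean, std)
        path.append(x)
    return path


def psi_gumbel(x):
    """Hypothetical link: maps the real line into (1, inf), inside the Gumbel domain."""
    return 1.0 + math.exp(x)


rng = random.Random(0)
path = simulate_ou(250, theta_ou=1.0, mu=0.5, nu=1.0, dt=1.0 / 250, rng=rng)
theta_path = [psi_gumbel(x) for x in path]
```

Because the transition is exact, the step size dt affects only how strongly consecutive states are correlated, not the accuracy of the simulated law.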
Transfer matrix method
The Markov property allows this high-dimensional integral to be factored into a chain of one-dimensional integrals, each computed as a matrix-vector product on a discretized grid. For T observations and K grid points, total complexity is O(TK²) with dense matrices or O(TKb) with sparse ones, where b is the kernel bandwidth.
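The recursion can be sketched in a few lines. The version below is a simplified illustration, not the package's implementation: it uses a fixed uniform grid, a dense matrix, a rectangle quadrature rule, and only the standard library. It assumes a stationary initial distribution for x_0, and the link Ψ(x) = tanh(x) in the Gaussian copula example is hypothetical.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def tm_loglik(obs, density, theta_ou, mu, nu, dt, K=121, width=6.0):
    """Log-likelihood of a latent-OU model by forward recursion on a grid.

    density(u, x) is the observation density given latent state x,
    e.g. c(u1, u2; Psi(x)) for a copula model.
    """
    rho = math.exp(-theta_ou * dt)
    sd_stat = nu / math.sqrt(2.0 * theta_ou)          # stationary std
    sd_cond = sd_stat * math.sqrt(1.0 - rho * rho)    # one-step conditional std
    h = 2.0 * width * sd_stat / (K - 1)               # grid spacing
    grid = [mu - width * sd_stat + k * h for k in range(K)]

    # Transfer matrix: M[j][i] = p(x_j | x_i) * h (quadrature weight folded in)
    M = [[normal_pdf(xj, mu + (xi - mu) * rho, sd_cond) * h for xi in grid]
         for xj in grid]

    alpha = [normal_pdf(x, mu, sd_stat) for x in grid]  # assumed stationary start
    loglik = 0.0
    for u in obs:
        # Predict: one matrix-vector product, O(K^2)
        pred = [sum(m * a for m, a in zip(row, alpha)) for row in M]
        # Update: multiply by the observation density on the grid
        alpha = [density(u, x) * p for x, p in zip(grid, pred)]
        scale = sum(alpha) * h        # p(u_t | u_1, ..., u_{t-1})
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]  # renormalize to avoid underflow
    return loglik


def gaussian_copula_density(u, x):
    r = math.tanh(x)  # hypothetical link Psi
    a, b = (NormalDist().inv_cdf(v) for v in u)
    q = r * r * (a * a + b * b) - 2.0 * r * a * b
    return math.exp(-q / (2.0 * (1.0 - r * r))) / math.sqrt(1.0 - r * r)


obs = [(0.2, 0.3), (0.8, 0.7), (0.5, 0.55)]
loglik_copula = tm_loglik(obs, gaussian_copula_density, 1.0, 0.0, 1.0, 0.1)

# Sanity check: a density that ignores x (independence copula) gives logL close to 0
loglik_indep = tm_loglik([None] * 20, lambda u, x: 1.0, 1.0, 0.0, 1.0, 0.1)
```

Renormalizing `alpha` at every step and accumulating `log(scale)` keeps the recursion numerically stable for long series. A sparse variant would store only the entries of `M` within a few conditional standard deviations of the diagonal.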
Goodness of fit
Model quality is assessed via the Rosenblatt transform. For the stochastic model we use the mixture Rosenblatt transform, which integrates the h-function over the predictive distribution of the latent state, avoiding the Jensen bias from plugging in a point estimate. Uniformity of the transformed sample is tested with the Cramér–von Mises statistic.
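For a constant-parameter bivariate copula the procedure can be sketched as follows, here with a Clayton copula. The Rosenblatt transform maps (u, v) to (u, h(v | u)), which are independent uniforms under the true model. The two components are then aggregated into a single uniform variable through the chi-square CDF, which is one common choice and not necessarily the aggregation used by the package. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def clayton_h(v, u, theta):
    """h(v | u) = dC(u, v)/du for the bivariate Clayton copula."""
    return u ** (-theta - 1.0) * (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta - 1.0)


def clayton_h_inv(w, u, theta):
    """Solve h(v | u) = w for v (conditional-inversion sampling)."""
    return ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** -theta + 1.0) ** (-1.0 / theta)


def rosenblatt_to_uniform(pairs, theta):
    """Rosenblatt transform (e1, e2) = (u, h(v | u)), aggregated to one uniform
    via the chi-square(2) CDF of Phi^{-1}(e1)^2 + Phi^{-1}(e2)^2."""
    inv = NormalDist().inv_cdf
    eps = 1e-12  # keep arguments strictly inside (0, 1)
    out = []
    for u, v in pairs:
        e1 = min(max(u, eps), 1.0 - eps)
        e2 = min(max(clayton_h(v, u, theta), eps), 1.0 - eps)
        s = inv(e1) ** 2 + inv(e2) ** 2
        out.append(1.0 - math.exp(-0.5 * s))
    return out


def cvm_uniform(z):
    """Cramer-von Mises statistic W^2 of the sample z against U(0, 1)."""
    n = len(z)
    return 1.0 / (12 * n) + sum(
        (zi - (2 * i - 1) / (2 * n)) ** 2
        for i, zi in enumerate(sorted(z), start=1)
    )


# Simulate from Clayton(theta = 2), then test a correct and a misspecified model
rng = random.Random(0)
theta_true = 2.0
sample = []
for _ in range(500):
    u = 1.0 - rng.random()  # in (0, 1]
    w = 1.0 - rng.random()
    sample.append((u, clayton_h_inv(w, u, theta_true)))

w2_true = cvm_uniform(rosenblatt_to_uniform(sample, theta_true))
w2_wrong = cvm_uniform(rosenblatt_to_uniform(sample, 0.2))
```

For a fully specified null hypothesis the asymptotic 5% critical value of W² is about 0.461. When the parameters are estimated from the same data, the null distribution changes and is usually obtained by bootstrap.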
Examples
1. Read dataset
import pandas as pd
import numpy as np
from pyscarcopula._utils import pobs
from pyscarcopula import GumbelCopula, CVineCopula, GaussianCopula, StudentCopula
from pyscarcopula.api import fit, smoothed_params
from pyscarcopula.stattests import gof_test
crypto_prices = pd.read_csv("data/crypto_prices.csv", index_col=0, sep=';')
tickers = ['BTC-USD', 'ETH-USD']
returns = np.log(crypto_prices[tickers] / crypto_prices[tickers].shift(1))[1:].values
u = pobs(returns)
2. Fit a bivariate copula
copula = GumbelCopula(rotate=180)
result_mle = fit(copula, u, method='mle')
result_tm = fit(copula, u, method='scar-tm-ou')
result_gas = fit(copula, u, method='gas')
# Typed results — access parameters directly
print(f"MLE: logL={result_mle.log_likelihood:.2f}, r={result_mle.copula_param:.4f}")
print(f"SCAR-TM: logL={result_tm.log_likelihood:.2f}, theta={result_tm.params.theta:.2f}")
print(f"GAS: logL={result_gas.log_likelihood:.2f}, beta={result_gas.beta:.4f}")
# GoF test
gof = gof_test(copula, u, fit_result=result_tm, to_pobs=False)
print(f"GoF p-value: {gof.pvalue:.4f}")
Results on daily BTC-ETH data (T = 1460):
| Model | logL | GoF p-value |
|---|---|---|
| MLE | 955.63 | 0.0087 |
| GAS | 1031.42 | 0.5282 |
| SCAR-TM | 1042.47 | 0.6201 |
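Since the models have different numbers of parameters, the log-likelihoods are easier to compare after penalizing complexity. The sketch below computes AIC and BIC from the table values. The parameter counts are assumptions: one for MLE, three for SCAR-TM (θ_OU, μ, ν), and three for GAS (assuming the usual three-parameter recursion).

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 logL (lower is better)."""
    return 2.0 * n_params - 2.0 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 logL (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_lik


n_obs = 1460
# name -> (logL from the table, ASSUMED number of free parameters)
models = {
    "MLE": (955.63, 1),       # one constant copula parameter
    "GAS": (1031.42, 3),      # assumption: three recursion parameters
    "SCAR-TM": (1042.47, 3),  # theta_OU, mu, nu
}

aic_scores = {name: aic(ll, k) for name, (ll, k) in models.items()}
bic_scores = {name: bic(ll, k, n_obs) for name, (ll, k) in models.items()}
best_by_aic = min(aic_scores, key=aic_scores.get)
best_by_bic = min(bic_scores, key=bic_scores.get)
```

Under these assumed counts the penalty is small relative to the gaps in logL, so the ranking matches the raw log-likelihoods.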
3. Smoothed copula parameter
r_t = smoothed_params(copula, u, result_tm)
# r_t[k] = E[Psi(x_k) | u_{1:k-1}]
4. Sample from copula
samples = copula.sample(n=1000, r=result_mle.copula_param)
5. Fit a multivariate C-vine copula
tickers = ['BTC-USD', 'ETH-USD', 'BNB-USD', 'ADA-USD', 'XRP-USD', 'DOGE-USD']
returns_6d = np.log(crypto_prices[tickers] / crypto_prices[tickers].shift(1))[1:251].values
u6 = pobs(returns_6d)
vine = CVineCopula()
vine.fit(u6, method='scar-tm-ou',
         truncation_level=2, min_edge_logL=10)
vine.summary()
gof_vine = gof_test(vine, u6, to_pobs=False)
Results on 6-dimensional crypto data (T = 250):
| Model | logL | GoF p-value |
|---|---|---|
| C-vine SCAR-TM | 921.9 | 0.90 |
| C-vine MLE | 869.2 | 0.21 |
| Student-t | 764.4 | 0.00 |
6. Risk metrics (VaR / CVaR)
from pyscarcopula.contrib.risk_metrics import risk_metrics
result = risk_metrics(
    GumbelCopula(rotate=180),
    returns_6d, window_len=100,
    gamma=[0.95], N_mc=[100_000],
    marginals_method='johnsonsu',
    method='mle',
    optimize_portfolio=False,
    portfolio_weight=np.ones(6) / 6,
    n_jobs=-1,
)
var = result[0.95][100_000]['var']
cvar = result[0.95][100_000]['cvar']
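To make the two quantities concrete, the following sketch computes empirical VaR and CVaR at level gamma from a set of simulated portfolio losses. It is independent of pyscarcopula: the function name, the sign convention (losses positive), and the quantile rule are assumptions, and the scenarios are plain independent Gaussian draws standing in for copula-based simulations.

```python
import math
import random


def var_cvar(losses, gamma):
    """Empirical VaR (gamma-quantile of losses) and CVaR (mean loss at or beyond VaR)."""
    ordered = sorted(losses)
    k = min(len(ordered) - 1, max(0, math.ceil(gamma * len(ordered)) - 1))
    var = ordered[k]
    tail = ordered[k:]
    return var, sum(tail) / len(tail)


rng = random.Random(1)
weights = [1.0 / 6] * 6
# Stand-in scenarios: independent Gaussian returns, NOT a copula simulation
scenarios = [[rng.gauss(0.0, 0.02) for _ in range(6)] for _ in range(10_000)]
# Portfolio loss = minus the weighted portfolio return
losses = [-sum(w * r for w, r in zip(weights, row)) for row in scenarios]
var95, cvar95 = var_cvar(losses, 0.95)
```

By construction CVaR is never smaller than VaR at the same level, since it averages the losses at or beyond the VaR threshold.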
See example_new_api.ipynb for a complete walkthrough with plots.
Performance tuning
Bivariate copula
copula = GumbelCopula(rotate=180)
result = fit(copula, u, method='scar-tm-ou')
# Relaxed tolerance (faster, slight logL loss)
result = fit(copula, u, method='scar-tm-ou', tol=5e-2)
| Parameter | Default | Effect |
|---|---|---|
| analytical_grad | True | Analytical gradient. ~3-4x fewer function evaluations. |
| smart_init | True | Heuristic initial point. Up to 5x speedup on long series. |
| tol | 1e-2 | Gradient tolerance. 5e-2 is ~2x faster with negligible logL loss. |
| K | 300 | Minimum grid size. The adaptive rule may increase it. |
| pts_per_sigma | 2 | Grid density: points per conditional standard deviation. Higher values improve quadrature accuracy at the cost of larger K. |
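The interaction between K and pts_per_sigma can be illustrated with a rough sizing rule. This is a hypothetical sketch, not the package's adaptive rule: it assumes a grid spanning ±width stationary standard deviations, with spacing equal to the conditional standard deviation divided by pts_per_sigma.

```python
import math


def grid_size_sketch(theta_ou, dt, K_min=300, pts_per_sigma=2, width=5.0):
    """Hypothetical grid-size rule for a transfer-matrix grid.

    The ratio sd_cond / sd_stat depends only on theta_ou * dt,
    so the OU volatility nu does not enter.
    """
    rho = math.exp(-theta_ou * dt)
    cond_over_stat = math.sqrt(1.0 - rho * rho)   # sd_cond / sd_stat
    spacing = cond_over_stat / pts_per_sigma      # in units of sd_stat
    K_needed = math.ceil(2.0 * width / spacing) + 1
    return max(K_min, K_needed)


K_weak_persistence = grid_size_sketch(theta_ou=1.0, dt=0.1)
K_strong_persistence = grid_size_sketch(theta_ou=1.0, dt=1e-4)
```

Under these assumptions a strongly persistent latent process (small theta_ou * dt) has a narrow transition kernel and needs a finer grid, which is why an adaptive rule may raise K above its minimum.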
Vine copula
vine = CVineCopula()
vine.fit(u, method='scar-tm-ou', truncation_level=2, min_edge_logL=10)
| Parameter | Default | Effect |
|---|---|---|
| truncation_level | None | Trees >= this level stay MLE. Recommended: 2-3 for d > 10. |
| min_edge_logL | None | Edges with MLE logL below threshold stay MLE. Recommended: 5-10. |
Architecture
The codebase is organized in layers with top-down dependencies:
| Layer | Directory | Responsibility |
|---|---|---|
| API | api.py | Entry points: fit(), smoothed_params(), mixture_h() |
| Strategy | strategy/ | Estimation methods: MLE, SCAR-TM, GAS |
| Copula | copula/ | Pure math: PDF, h-functions, transforms |
| Numerical | numerical/ | TM grid, gradient, MC samplers, OU kernels |
| Types | _types.py, _utils.py | Typed results, config, shared utilities |
See ARCHITECTURE.md for the full module map.
License
MIT License. See LICENSE.txt.