pyscarcopula
Stochastic copula models with Ornstein-Uhlenbeck latent process in Python.
About
pyscarcopula fits multivariate distributions using the copula approach with time-varying dependence. The classical constant-parameter model is extended to a stochastic model in which the copula parameter is driven by a latent Ornstein-Uhlenbeck (OU) process.
For parameter estimation we provide five methods:
| Method | Key | Description |
|---|---|---|
| Maximum likelihood | mle | Classical fit with constant copula parameter |
| MC p-sampler | scar-p-ou | Monte Carlo without importance sampling |
| MC m-sampler | scar-m-ou | Monte Carlo with efficient importance sampling (EIS) |
| Transfer matrix | scar-tm-ou | Deterministic quadrature on a grid; no Monte Carlo variance, accuracy depends on grid resolution and truncation settings |
| GAS | gas | Generalized autoregressive score (observation-driven, deterministic) |
The transfer matrix method exploits the Markov structure and known Gaussian transition density of the OU process to evaluate the likelihood function as a sequence of matrix-vector products. The implementation automatically selects between dense and sparse transfer matrices depending on the kernel bandwidth, and adaptively refines the grid to resolve the transition kernel.
Install

```shell
pip install pyscarcopula
```

For development (includes data files and tests):

```shell
git clone https://github.com/AANovokhatskiy/pyscarcopula
cd pyscarcopula
pip install -e ".[test]"
pytest tests/
```

Dependencies: numpy, numba, scipy, joblib, tqdm.
Features
Copula families
- Archimedean: Gumbel, Frank, Clayton, Joe (with rotations 0°/90°/180°/270°)
- Elliptical: Gaussian, Student-t (MLE only)
- Stochastic Student-t: d-dimensional t-copula with OU-driven degrees of freedom
- Independence copula (null model for automatic vine pruning)
- Equicorrelation Gaussian (single dynamic correlation for d assets)
Vine copulas
- C-vine pair copula construction (fixed star structure)
- R-vine pair copula construction (data-driven structure via Dissmann's MST algorithm)
- Automatic copula family and rotation selection per edge (AIC/BIC)
- Automatic pruning of weak edges via independence copula baseline
- Tree-level and edge-level truncation for scalability
- Mixed SCAR/MLE edges within a single vine
Estimation methods
- MLE — constant copula parameter
- SCAR-TM-OU — transfer matrix with analytical gradient (recommended)
- GAS — observation-driven score model
- SCAR-P-OU / SCAR-M-OU — Monte Carlo alternatives
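The GAS option is observation-driven: the parameter path follows a deterministic score recursion rather than a latent diffusion. A minimal generic GAS(1,1) sketch in plain numpy, with an illustrative toy score and an exponential link; the package's actual score scaling and link functions may differ:

```python
import numpy as np

def gas_filter(u, omega, alpha, beta, score_fn, link=np.exp):
    """Generic GAS(1,1) recursion sketch: f_{t+1} = omega + alpha * s_t + beta * f_t,
    where s_t is a score signal at theta_t = link(f_t)."""
    T = len(u)
    f = np.empty(T)
    f[0] = omega / (1.0 - beta)          # start at the unconditional mean
    theta = np.empty(T)
    for t in range(T):
        theta[t] = link(f[t])
        if t + 1 < T:
            f[t + 1] = omega + alpha * score_fn(u[t], theta[t]) + beta * f[t]
    return theta

# Toy score: push theta up when the pair is concordant
# (both pseudo-observations on the same side of 0.5)
score = lambda ut, th: np.sign((ut[0] - 0.5) * (ut[1] - 0.5))
rng = np.random.default_rng(0)
u = rng.uniform(size=(200, 2))
theta_path = gas_filter(u, omega=0.02, alpha=0.05, beta=0.95, score_fn=score)
```

With the exponential link the filtered parameter stays in the positive domain, which is why score models commonly filter a transformed parameter.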
Sampling and prediction
- sample: generate synthetic data reproducing the fitted model. For SCAR models an OU trajectory is simulated with the correct time discretization; for GAS a recursive score-driven simulation is used.
- predict: generate samples for next-step forecasting, conditional on the observed data. For SCAR-TM this uses mixture sampling from the posterior distribution p(x_T | data), accounting for latent-state uncertainty. For GAS it uses the last filtered value.
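"Correct time discretization" here means using the exact Gaussian OU transition rather than an Euler step, which is unbiased for any step size. A self-contained numpy sketch (the function name is illustrative, not part of the package API):

```python
import numpy as np

def simulate_ou_exact(x0, theta_ou, mu, nu, dt, T, rng):
    """Simulate an OU latent state with the exact Gaussian transition:
    x_{t+1} | x_t ~ N(mu + (x_t - mu) e^{-theta dt}, nu^2 (1 - e^{-2 theta dt}) / (2 theta))."""
    a = np.exp(-theta_ou * dt)
    sd = nu * np.sqrt((1.0 - a**2) / (2.0 * theta_ou))
    x = np.empty(T)
    x[0] = x0
    for t in range(1, T):
        x[t] = mu + (x[t - 1] - mu) * a + sd * rng.standard_normal()
    return x

rng = np.random.default_rng(42)
x = simulate_ou_exact(x0=0.0, theta_ou=2.0, mu=1.0, nu=0.5, dt=1.0, T=5000, rng=rng)
# Long-run mean should be close to mu = 1.0,
# stationary variance close to nu^2 / (2 theta) = 0.0625
```

The copula parameter path is then obtained by mapping the state through the link, theta_t = Psi(x_t).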
Diagnostics and risk
- Goodness-of-fit via Rosenblatt transform + Cramér–von Mises test
- Mixture Rosenblatt transform for stochastic models
- Predictive time-varying copula parameter path
- VaR / CVaR via pyscarcopula.contrib (rolling window, marginals, portfolio optimization)
Mathematical background
Copula models
By Sklar's theorem, any joint distribution can be decomposed as
$$F(x_1, \ldots, x_d) = C(F_1(x_1), \ldots, F_d(x_d))$$
We focus on single-parameter Archimedean copulas defined via a generator φ(t; θ):
$$C(u_1, \ldots, u_d) = \phi^{-1}(\phi(u_1; \theta) + \cdots + \phi(u_d; \theta))$$
| Copula | Generator | Inverse generator | Domain |
|---|---|---|---|
| Gumbel | (-log t)^θ | exp(-t^(1/θ)) | θ ∈ [1, ∞) |
| Frank | -log((e^(-θt) - 1)/(e^(-θ) - 1)) | -(1/θ)log(1 + e^(-t)(e^(-θ) - 1)) | θ ∈ (0, ∞) |
| Joe | -log(1 - (1-t)^θ) | 1 - (1 - e^(-t))^(1/θ) | θ ∈ [1, ∞) |
| Clayton | (1/θ)(t^(-θ) - 1) | (1 + tθ)^(-1/θ) | θ ∈ (0, ∞) |
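The generator and inverse-generator columns can be verified numerically; the sketch below also shows how the Archimedean formula builds a bivariate copula from any generator pair (theta = 2.5 is an arbitrary illustrative value):

```python
import numpy as np

theta = 2.5
t = np.linspace(0.05, 0.95, 19)

# Generator / inverse-generator pairs from the table above
gumbel  = (lambda t: (-np.log(t))**theta,
           lambda s: np.exp(-s**(1.0 / theta)))
frank   = (lambda t: -np.log((np.exp(-theta * t) - 1.0) / (np.exp(-theta) - 1.0)),
           lambda s: -np.log1p(np.exp(-s) * (np.exp(-theta) - 1.0)) / theta)
joe     = (lambda t: -np.log(1.0 - (1.0 - t)**theta),
           lambda s: 1.0 - (1.0 - np.exp(-s))**(1.0 / theta))
clayton = (lambda t: (t**(-theta) - 1.0) / theta,
           lambda s: (1.0 + theta * s)**(-1.0 / theta))

for phi, phi_inv in (gumbel, frank, joe, clayton):
    assert np.allclose(phi_inv(phi(t)), t)   # phi_inv really inverts phi

# Bivariate Archimedean copula from a generator pair
C = lambda u, v, pair: pair[1](pair[0](u) + pair[0](v))
```

The uniform-margin boundary condition C(u, 1) = u follows because every generator satisfies phi(1) = 0.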
Stochastic copula (SCAR)
In the stochastic model the copula parameter is driven by a latent Ornstein-Uhlenbeck process:
$$\theta_t = \Psi(x_t), \qquad dx_t = \theta_{\text{OU}}(\mu - x_t)\,dt + \nu\,dW_t$$
where Ψ maps the OU state to the copula parameter domain. The likelihood function is an integral over latent paths:
$$L = \int p(x_0)\,\prod_t c(u_{1t}, u_{2t}; \Psi(x_t))\,\prod_{t\ge 1} p(x_t \mid x_{t-1})\,dx_{0:T}$$
In the current implementation, the latent process is initialized from the stationary OU distribution.
Transfer matrix method
The Markov property allows this high-dimensional integral to be factored into a chain of one-dimensional integrals, each computed as a matrix-vector product on a discretized grid. Total complexity is O(TK²) in the dense case and O(TKb) in the sparse case, where b is the effective kernel bandwidth.
The method is deterministic, but it is still a numerical approximation: accuracy depends on the grid range, adaptive grid size, and transition-kernel truncation.
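A self-contained numpy/scipy sketch of the idea, not the package's implementation: it uses a fixed uniform grid with rectangle-rule weights and per-step renormalization for stability, whereas the actual code uses adaptive grids and dense/sparse kernels. With the independence copula (log density zero) the log-likelihood must come out near zero, which serves as a built-in check:

```python
import numpy as np
from scipy.stats import norm

def tm_loglik(u, log_c, psi, theta_ou, mu, nu, dt=1.0, K=200, width=5.0):
    """Transfer-matrix log-likelihood sketch for a SCAR model: discretize the
    OU state, then fold the path integral into T matrix-vector products."""
    s_stat = nu / np.sqrt(2.0 * theta_ou)            # stationary std dev
    x = np.linspace(mu - width * s_stat, mu + width * s_stat, K)
    w = np.full(K, x[1] - x[0])                      # quadrature weights
    a = np.exp(-theta_ou * dt)
    s_cond = nu * np.sqrt((1.0 - a**2) / (2.0 * theta_ou))
    # K x K Gaussian OU transition kernel P[i, j] = p(x_j | x_i)
    P = norm.pdf(x[None, :], loc=mu + (x[:, None] - mu) * a, scale=s_cond)
    f = norm.pdf(x, loc=mu, scale=s_stat) * w        # stationary initial density
    logL = 0.0
    for ut in u:
        f = f * np.exp(log_c(ut, psi(x)))            # multiply in copula density
        c = f.sum()                                  # normalize against underflow
        logL += np.log(c)
        f = (f / c) @ (P * w[None, :])               # propagate one step
    return logL

# Check with the independence copula (log c = 0): logL should be ~0
u_toy = np.random.default_rng(1).uniform(size=(50, 2))
ll = tm_loglik(u_toy, log_c=lambda ut, th: 0.0 * th, psi=np.exp,
               theta_ou=1.0, mu=0.0, nu=0.5)
```

The O(TK^2) cost is visible in the single vector-matrix product per observation; a banded (sparse) kernel reduces it to O(TKb).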
Vine copulas
Vine copulas decompose a d-dimensional dependence structure into d(d-1)/2 bivariate copulas arranged in a tree hierarchy. Two types are supported:
- C-vine: fixed star structure where one variable is the root of each tree.
- R-vine: data-driven structure selected by Dissmann's algorithm: at each tree level a maximum spanning tree is built on |Kendall's τ|, subject to the proximity condition.
Each edge in the vine can use a different copula family and estimation method (MLE, SCAR-TM, or GAS). Weak edges can be replaced with independence copulas.
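The first tree of Dissmann's selection can be sketched with scipy's spanning-tree routine. This covers only tree 1; deeper trees additionally require the proximity condition and pseudo-observations produced by h-functions:

```python
import numpy as np
from scipy.stats import kendalltau
from scipy.sparse.csgraph import minimum_spanning_tree

def first_tree_edges(u):
    """Tree 1 of R-vine structure selection (sketch):
    maximum spanning tree on |Kendall's tau| over all variable pairs."""
    d = u.shape[1]
    W = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            tau, _ = kendalltau(u[:, i], u[:, j])
            W[i, j] = abs(tau)
    # minimum_spanning_tree minimizes, so negate the weights to maximize |tau|
    mst = minimum_spanning_tree(-W).toarray()
    return [(i, j) for i in range(d) for j in range(d) if mst[i, j] != 0.0]

# Toy data: 4 columns sharing one common factor, so all pairs are dependent
rng = np.random.default_rng(3)
z = rng.standard_normal((300, 1))
u = z + 0.3 * rng.standard_normal((300, 4))
edges = first_tree_edges(u)   # d - 1 = 3 edges spanning the 4 variables
```

Maximizing |τ| rather than τ lets strongly negative dependence (handled by copula rotations) compete for edges on equal terms.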
Goodness of fit
Model quality is assessed via the Rosenblatt transform. For stochastic models the implementation uses a mixture Rosenblatt transform, which integrates the h-function over the predictive distribution of the latent state, reducing the Jensen bias that appears when plugging in a point estimate.
Uniformity of the transformed sample is tested with the Cramér–von Mises statistic. As with other plug-in goodness-of-fit procedures, finite-sample calibration can vary, especially after refitting on simulated samples.
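A minimal sketch of the Rosenblatt/Cramér-von Mises machinery for a constant-parameter bivariate Clayton copula with analytic h-functions; the package's mixture transform additionally integrates over the latent state:

```python
import numpy as np
from scipy.stats import cramervonmises

theta = 2.0   # illustrative Clayton parameter

def clayton_h(u2, u1, theta):
    """Conditional CDF (h-function) h(u2 | u1) = dC/du1 for the Clayton copula."""
    return (1.0 + u1**theta * (u2**(-theta) - 1.0))**(-1.0 - 1.0 / theta)

# Sample from the Clayton copula by inverting the h-function in u2
rng = np.random.default_rng(7)
u1 = rng.uniform(size=2000)
w = rng.uniform(size=2000)
u2 = (1.0 + u1**(-theta) * (w**(-theta / (1.0 + theta)) - 1.0))**(-1.0 / theta)

# Rosenblatt transform: (u1, h(u2 | u1)) is i.i.d. uniform under the model
r1, r2 = u1, clayton_h(u2, u1, theta)
res = cramervonmises(r2, 'uniform')
```

By construction r2 recovers the uniform draws w exactly, so the Cramér-von Mises statistic here tests a genuinely uniform sample; on real data a small p-value signals model misfit.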
Examples
1. Read dataset
```python
import pandas as pd
import numpy as np
from pyscarcopula._utils import pobs
from pyscarcopula import (
    GumbelCopula, CVineCopula, RVineCopula,
    GaussianCopula, StudentCopula, StochasticStudentCopula,
)
from pyscarcopula.api import fit, sample, predict, smoothed_params
from pyscarcopula.stattests import gof_test

crypto_prices = pd.read_csv("data/crypto_prices.csv", index_col=0, sep=';')
tickers = ['BTC-USD', 'ETH-USD']
returns = np.log(crypto_prices[tickers] / crypto_prices[tickers].shift(1))[1:].values
u = pobs(returns)
```
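For reference, pseudo-observations are column-wise ranks rescaled into (0, 1), so no parametric marginal is assumed. A common convention is rank/(n + 1), sketched below; the package's pobs may handle ties differently:

```python
import numpy as np
from scipy.stats import rankdata

def pobs_sketch(x):
    """Pseudo-observations: column-wise ranks divided by (n + 1),
    mapping each margin into the open interval (0, 1)."""
    n = x.shape[0]
    return rankdata(x, axis=0) / (n + 1.0)

x = np.array([[3.0, 10.0],
              [1.0, 30.0],
              [2.0, 20.0]])
u = pobs_sketch(x)
# column 0 ranks [3, 1, 2] -> [0.75, 0.25, 0.5]
# column 1 ranks [1, 3, 2] -> [0.25, 0.75, 0.5]
```

Dividing by n + 1 rather than n keeps the largest observation strictly below 1, which copula densities with poles at the boundary require.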
2. Fit a bivariate copula
```python
copula = GumbelCopula(rotate=180)

result_mle = fit(copula, u, method='mle')
result_tm = fit(copula, u, method='scar-tm-ou')
result_gas = fit(copula, u, method='gas')

print(f"MLE: logL={result_mle.log_likelihood:.2f}, r={result_mle.copula_param:.4f}")
print(f"SCAR-TM: logL={result_tm.log_likelihood:.2f}, theta={result_tm.params.theta:.2f}")
print(f"GAS: logL={result_gas.log_likelihood:.2f}, beta={result_gas.beta:.4f}")

# GoF test
gof = gof_test(copula, u, fit_result=result_tm, to_pobs=False)
print(f"GoF p-value: {gof.pvalue:.4f}")
```
Results on daily BTC-ETH data (T = 1460):
| Model | logL | GoF p-value |
|---|---|---|
| MLE | 955.63 | 0.0087 |
| GAS | 1031.42 | 0.5282 |
| SCAR-TM | 1042.47 | 0.6201 |
These numbers are dataset-specific and provided as an illustration, not as a benchmark guarantee.
3. Predictive copula parameter path
```python
r_t = smoothed_params(copula, u, result_tm)
# r_t[k] = E[Psi(x_k) | u_{1:k-1}]
```
This quantity is predictive: it uses information up to time k-1, not a full smoother using all observations.
4. Sampling and prediction
The API provides two distinct sampling functions:
- sample generates synthetic data that reproduces the fitted model. Useful for validation: fit(copula, sample(...)) should recover similar parameters on average.
- predict generates samples for next-step forecasting, conditional on the observed data. Used for risk metrics.
```python
from pyscarcopula.api import sample, predict

# Model validation: sample -> refit -> compare parameters
v = sample(copula, u, result_tm, n=2000)
result_refit = fit(copula, pobs(v), method='scar-tm-ou')

# GoF on sampled data: should not show systematic rejection across repeated runs
gof_v = gof_test(copula, pobs(v), fit_result=result_refit, to_pobs=False)
print(f"GoF on sample: p={gof_v.pvalue:.4f}")

# Prediction: conditional on current market state
u_pred = predict(copula, u, result_tm, n=100_000)
```
How sample and predict work for each method:
| Method | sample(n) | predict(n) |
|---|---|---|
| MLE | n observations with constant r | same as sample |
| SCAR-TM | OU trajectory r(t) → copula.sample(n, r) | mixture sampling from posterior p(x_T \| data) |
| GAS | recursive simulation: sample → score → update f | copula.sample(n, r=Ψ(f_T)) |
| SCAR-MC | OU trajectory (same as SCAR-TM) | sample from stationary OU |
For SCAR-TM, predict accounts for latent-state uncertainty by sampling the copula parameter from the posterior distribution over the latent state rather than using a point estimate.
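Schematically, mixture sampling can be sketched as below, with a toy grid posterior and a placeholder pair sampler standing in for the fitted copula; every name here is illustrative, not package API:

```python
import numpy as np

def mixture_predict(x_grid, posterior_w, psi, sample_given_theta, n, rng):
    """Mixture sampling sketch: draw latent states from a filtered posterior
    p(x_T | data) on a grid, then sample at theta = psi(x). This propagates
    latent-state uncertainty instead of plugging in a single point estimate."""
    idx = rng.choice(len(x_grid), size=n, p=posterior_w)
    thetas = psi(x_grid[idx])
    return np.stack([sample_given_theta(th, rng) for th in thetas])

rng = np.random.default_rng(5)
x_grid = np.linspace(-1.0, 1.0, 11)
w = np.exp(-x_grid**2)
w /= w.sum()                                  # normalized posterior weights
pairs = mixture_predict(x_grid, w, np.exp,
                        lambda th, rng: rng.uniform(size=2), n=500, rng=rng)
```

The resulting sample is a posterior-weighted mixture over copula parameters, which fattens the predictive tails relative to a plug-in forecast.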
5. Fit a vine copula
C-vine (fixed star structure):
```python
tickers = ['BTC-USD', 'ETH-USD', 'BNB-USD', 'ADA-USD', 'XRP-USD', 'DOGE-USD']
returns_6d = np.log(crypto_prices[tickers] / crypto_prices[tickers].shift(1))[1:251].values
u6 = pobs(returns_6d)

vine = CVineCopula()
vine.fit(u6, method='scar-tm-ou', truncation_level=2, min_edge_logL=10)
vine.summary()
```
R-vine (data-driven structure via Dissmann's MST algorithm):
```python
vine = RVineCopula()
vine.fit(u6, method='scar-tm-ou', truncation_level=2, min_edge_logL=10)
vine.summary()
```
Results on 6-dimensional crypto data (T = 250):
| Model | logL | GoF p-value |
|---|---|---|
| C-vine SCAR-TM | 924.8 | 0.63 |
| R-vine SCAR-TM | 919.1 | 0.62 |
| R-vine MLE | 873.0 | 0.19 |
| C-vine MLE | 869.2 | 0.21 |
| Student-t | 764.4 | 0.00 |
These numbers are dataset-specific and seed-dependent.
Sampling and prediction are implemented for both vine types:
```python
# Model validation
v6 = vine.sample(2000)
gof_v6 = gof_test(vine, pobs(v6), to_pobs=False)

# Prediction (for risk metrics)
u_pred_6d = vine.predict(100_000, u=u6)
```
6. Stochastic Student-t copula
A d-dimensional t-copula where the degrees-of-freedom parameter follows an OU process:
```python
cop = StochasticStudentCopula(d=6)
cop.fit(returns_6d, method='scar-tm-ou', to_pobs=True)

# Predictive df(t) path
df_t = cop.smoothed_params()

# Predict with time-varying df
pred = cop.predict(10000)

# GoF
gof = gof_test(cop, returns_6d, to_pobs=True)
```
7. Risk metrics (VaR / CVaR)
Rolling VaR and CVaR estimation with copula models. Supports bivariate, vine, and elliptical copulas (Gaussian, Student-t).
```python
from pyscarcopula.contrib.risk_metrics import risk_metrics

result = risk_metrics(
    GumbelCopula(rotate=180),
    returns_6d, window_len=100,
    gamma=[0.95], N_mc=[100_000],
    marginals_method='johnsonsu',
    method='mle',
    optimize_portfolio=False,
    portfolio_weight=np.ones(6) / 6,
    n_jobs=-1,
)

var = result[0.95][100_000]['var']
cvar = result[0.95][100_000]['cvar']
```
For elliptical copulas:
```python
result = risk_metrics(
    GaussianCopula(), returns_6d, window_len=100,
    gamma=[0.95], N_mc=[100_000],
    marginals_method='johnsonsu',
    method='mle',  # ignored for Gaussian/Student (uses its own MLE)
)
```
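Once portfolio returns have been simulated, VaR and CVaR reduce to tail statistics of the sample. A minimal sketch with Gaussian toy returns in place of the copula simulation (the library instead draws from the fitted copula plus marginal models):

```python
import numpy as np

def var_cvar(portfolio_returns, gamma=0.95):
    """VaR/CVaR at level gamma from a Monte Carlo sample of portfolio
    returns, with losses reported as positive numbers."""
    losses = -np.asarray(portfolio_returns)
    var = np.quantile(losses, gamma)
    cvar = losses[losses >= var].mean()   # expected loss beyond VaR
    return var, cvar

# Toy example: 100k draws of an equal-weight 6-asset portfolio
rng = np.random.default_rng(11)
r = rng.normal(0.0, 0.02, size=(100_000, 6)) @ (np.ones(6) / 6)
var, cvar = var_cvar(r, gamma=0.95)
# Gaussian check: VaR ~ 1.645 * sigma with sigma = 0.02 / sqrt(6)
```

CVaR is always at least as large as VaR at the same level, since it averages the losses beyond the quantile.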
See example_new_api.ipynb for a complete walkthrough with plots.
Performance tuning
Bivariate copula
```python
copula = GumbelCopula(rotate=180)
result = fit(copula, u, method='scar-tm-ou')

# Relaxed tolerance (faster, slight logL loss)
result = fit(copula, u, method='scar-tm-ou', tol=5e-2)
```
| Parameter | Default | Effect |
|---|---|---|
| analytical_grad | True | Analytical gradient. ~3-4x fewer function evaluations. |
| smart_init | True | Heuristic initial point. Up to 5x speedup on long series. |
| tol | 1e-2 | Gradient tolerance. 5e-2 is often faster with small logL loss. |
| K | 300 | Minimum grid size. The adaptive rule may increase it. |
| pts_per_sigma | 2 | Grid density: points per conditional standard deviation. Higher values improve quadrature accuracy at the cost of larger K. |
Vine copula
```python
vine = RVineCopula()
vine.fit(u, method='scar-tm-ou', truncation_level=2, min_edge_logL=10)
```
| Parameter | Default | Effect |
|---|---|---|
| truncation_level | None | Trees at or above this level are not fitted with dynamic SCAR updates. |
| truncation_fill | 'mle' | Policy for edges above truncation_level: static MLE selection or forced independence. |
| min_edge_logL | None | Edges with low MLE log-likelihood can be kept static rather than upgraded to a dynamic model. |
Sampling performance
For vine copulas with GAS edges, sample uses step-by-step simulation, which is slower than the more vectorized paths used for MLE and SCAR models.
Architecture
The codebase is organized in layers with top-down dependencies:
| Layer | Directory | Responsibility |
|---|---|---|
| API | api.py | Entry points: fit(), sample(), predict(), smoothed_params(), mixture_h() |
| Strategy | strategy/ | Estimation methods: MLE, SCAR-TM, GAS. Each implements fit, sample, predict. |
| Copula | copula/ | Pure math: PDF, h-functions, transforms, base sampling |
| Vine | vine/ | Vine copula models: C-vine, R-vine, edge/selection/structure utilities |
| Numerical | numerical/ | TM grid, gradient, MC samplers, GAS filter, OU kernels |
| Types | _types.py, _utils.py | Typed results, config, shared utilities |
| Contrib | contrib/ | Risk metrics, marginal distributions |
All functions in api.py are stateless: they accept a copula object, data, and a result, and return new values without mutation. Convenience methods on copula classes (copula.fit(), copula.predict()) delegate to the API internally and store state for chained calls.
See ARCHITECTURE.md for the full module map.
License
MIT License. See LICENSE.txt.