Rigorous validation for synthetic financial time series

finval

finval is a Python library for assessing the quality of synthetic market data against real data. It was built because no existing library covers the financial stylized facts that matter: fat tails, volatility clustering, leverage effect, crash co-movement, and probabilistic forecast calibration.

finval is the scoring backend behind FinBench, the public leaderboard for multivariate financial time-series generation.

⚠️ Beta. The library is in active use (57 tests, FinBench production scoring) but the public API may still evolve in minor versions. Pin to finval==0.1.* (equivalently, finval~=0.1.0) if you need API stability.

Why finval?

General-purpose synthetic data libraries (sdmetrics, synthcity, tsgm) treat time series as generic sequences. They don't know what "leverage effect" is, don't check PIT uniformity, and don't compute tail dependence coefficients. For financial applications — risk management, backtesting, derivatives — you need a suite that tests the things that actually matter for market data.

finval ships 17 metric functions that produce 20 numeric scores across 5 categories (a single calibration function returns coverage_50, coverage_90, and coverage_95). Each metric has a threshold calibrated against real financial data and justified by the statistical literature.

Installation

pip install finval

Quickstart

import numpy as np
import finval

# 2D data: (n_samples, n_features) returns
real = np.random.randn(1000, 3) * 0.01
synthetic = np.random.randn(1000, 3) * 0.01

report = finval.validate(synthetic, real)
print(report.summary())
print(f"Overall quality: {report.overall_quality}")
print(f"Pass rate: {report.pass_rate:.0%}")

# 3D data: (n_paths, horizon, n_features) for path-level validation
real_paths = np.random.randn(100, 60, 3) * 0.01
syn_paths = np.random.randn(100, 60, 3) * 0.01

report = finval.validate_paths(syn_paths, real_paths)
print(report.summary())

Metrics

Distribution (15% of overall score)

  • marginal_ks — Kolmogorov-Smirnov test on each feature's marginal
  • energy_distance — multivariate distribution difference
  • tail_quantiles — 1st/5th/95th/99th percentile comparison (robust alternative to kurtosis)
  • tail_heaviness — excess kurtosis error (diagnostic only)
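
To make the distribution checks concrete, here is a minimal NumPy sketch of a two-sample KS statistic and a tail-quantile comparison on one feature. It is an illustration under assumptions, not finval's implementation: the helper names ks_statistic and tail_quantile_error are hypothetical, and scaling the quantile gap by the real standard deviation is one possible normalization.

```python
import numpy as np

def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    grid = np.concatenate([x, y])  # the gap can only change at sample points
    cdf_x = np.searchsorted(x, grid, side="right") / x.size
    cdf_y = np.searchsorted(y, grid, side="right") / y.size
    return float(np.max(np.abs(cdf_x - cdf_y)))

def tail_quantile_error(synthetic, real, probs=(0.01, 0.05, 0.95, 0.99)):
    """Mean absolute gap between tail quantiles, in units of the real std (assumed scaling)."""
    q_syn = np.quantile(synthetic, probs)
    q_real = np.quantile(real, probs)
    return float(np.mean(np.abs(q_syn - q_real)) / np.std(real))

rng = np.random.default_rng(0)
real = rng.standard_t(df=4, size=5000) * 0.01      # fat-tailed stand-in for real returns
gauss = rng.normal(scale=np.std(real), size=5000)  # Gaussian with matched scale

ks_same = ks_statistic(real, real)    # identical samples
ks_gauss = ks_statistic(gauss, real)  # Gaussian vs fat-tailed
tq_gauss = tail_quantile_error(gauss, real)
print(ks_same, ks_gauss, tq_gauss)
```

Both quantities follow the "lower is better" convention: identical samples give a KS statistic of zero.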

Dependence (25%)

  • pearson_corr — linear correlation matrix error
  • spearman_corr — rank correlation matrix error
  • copula_distance — Cramér-von Mises distance between empirical copulas
  • tail_dependence_upper — rally co-movement (λ_U)
  • tail_dependence_lower — crash co-movement (λ_L)
  • correlation_breakdown — stress vs calm regime correlation shift
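
The tail dependence coefficients can be estimated empirically from ranks. The sketch below shows one common finite-threshold estimator, λ_L(q) = P(U ≤ q, V ≤ q) / q and its upper-tail mirror. The function names and the threshold q = 0.05 are assumptions made for illustration; finval's own estimator and threshold may differ.

```python
import numpy as np

def pseudo_obs(x):
    """Rank-transform a 1D sample to pseudo-observations in (0, 1)."""
    x = np.asarray(x)
    ranks = np.argsort(np.argsort(x)) + 1
    return ranks / (x.size + 1)

def tail_dependence(x, y, q=0.05):
    """Empirical (lambda_L, lambda_U) at threshold q (hypothetical helper)."""
    u, v = pseudo_obs(x), pseudo_obs(y)
    lower = np.mean((u <= q) & (v <= q)) / q          # joint crashes
    upper = np.mean((u >= 1 - q) & (v >= 1 - q)) / q  # joint rallies
    return float(lower), float(upper)

rng = np.random.default_rng(1)
n = 20000
common = rng.standard_normal(n)
x = common + 0.1 * rng.standard_normal(n)  # two assets driven by one factor
y = common + 0.1 * rng.standard_normal(n)
indep = rng.standard_normal(n)             # unrelated asset

lam_dep = tail_dependence(x, y)
lam_ind = tail_dependence(x, indep)
print(lam_dep, lam_ind)
```

For independent series the estimate is close to q itself, so a synthetic generator that loses crash co-movement shows up as a drop in λ_L relative to the real data.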

Temporal (20%)

  • acf_returns — autocorrelation of returns (should be ~0)
  • volatility_clustering — autocorrelation of squared returns
  • leverage_effect — corr(r_t, |r_{t+k}|) (negative for equities)
  • cross_correlation — contemporaneous cross-asset correlation
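
The temporal statistics above are simple to compute by hand. The following sketch evaluates them on a simulated GARCH(1,1)-style series, where returns are roughly uncorrelated but squared returns are not. The helpers autocorr and leverage are hypothetical names, and the GARCH parameters are arbitrary illustration values. The simulated model is symmetric, so it carries no leverage effect by construction.

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation of a 1D series at a positive lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

def leverage(returns, lag):
    """corr(r_t, |r_{t+lag}|); negative values indicate a leverage effect."""
    r = np.asarray(returns, dtype=float)
    return float(np.corrcoef(r[:-lag], np.abs(r[lag:]))[0, 1])

# Simulate returns whose variance depends on past squared returns.
rng = np.random.default_rng(2)
n = 5000
r = np.empty(n)
var = 1e-4
for t in range(n):
    r[t] = np.sqrt(var) * rng.standard_normal()
    var = 1e-6 + 0.10 * r[t] ** 2 + 0.85 * var

acf_r = autocorr(r, 1)        # autocorrelation of returns
acf_r2 = autocorr(r ** 2, 1)  # volatility clustering
lev = leverage(r, 1)          # leverage effect proxy
print(acf_r, acf_r2, lev)
```

A validation metric would then compare each statistic on the synthetic series with the same statistic on the real series, typically over several lags.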

Calibration (30%)

  • pit_uniformity — KS test on probability integral transforms
  • crps — continuous ranked probability score
  • coverage_50 / 90 / 95 — empirical vs nominal interval coverage
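
As an illustration of the calibration metrics, the sketch below computes PIT values, central-interval coverage, and the ensemble form of CRPS, E|X − y| − ½·E|X − X′|, from a set of forecast samples. The function names and the (n_obs, n_members) layout are assumptions for this example only and are not taken from finval's API.

```python
import numpy as np

def pit_values(samples, observed):
    """Share of ensemble members at or below each observation.

    samples: (n_obs, n_members), observed: (n_obs,)
    """
    return np.mean(samples <= observed[:, None], axis=1)

def interval_coverage(samples, observed, level):
    """Fraction of observations inside the central `level` interval."""
    lo = np.quantile(samples, (1 - level) / 2, axis=1)
    hi = np.quantile(samples, 1 - (1 - level) / 2, axis=1)
    return float(np.mean((observed >= lo) & (observed <= hi)))

def crps_ensemble(samples, observed):
    """Mean ensemble CRPS: E|X - y| - 0.5 * E|X - X'|."""
    term1 = np.mean(np.abs(samples - observed[:, None]), axis=1)
    pairwise = np.abs(samples[:, :, None] - samples[:, None, :])
    term2 = np.mean(pairwise, axis=(1, 2))
    return float(np.mean(term1 - 0.5 * term2))

rng = np.random.default_rng(3)
observed = rng.standard_normal(300) * 0.01
good = rng.standard_normal((300, 100)) * 0.01  # forecast matches the truth
narrow = good * 0.3                            # overconfident forecast

pit = pit_values(good, observed)
cov_good = interval_coverage(good, observed, 0.90)
cov_narrow = interval_coverage(narrow, observed, 0.90)
crps_good = crps_ensemble(good, observed)
crps_narrow = crps_ensemble(narrow, observed)
print(cov_good, cov_narrow, crps_good, crps_narrow)
```

The overconfident forecast covers far fewer observations than its nominal 90% and receives a worse (higher) CRPS, which is the behavior a calibration score should reward and penalize.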

Path-level (10%)

  • drawdown_distribution — KS test on max drawdown distribution
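
A path-level metric needs one number per path. The sketch below computes the maximum drawdown of each path from a (n_paths, horizon) array; the resulting real and synthetic samples could then be compared with a two-sample KS test. It assumes simple (not log) returns and a starting wealth of 1, and max_drawdown is a hypothetical helper rather than a finval function.

```python
import numpy as np

def max_drawdown(returns):
    """Largest peak-to-trough loss per path.

    returns: (n_paths, horizon) array of simple returns.
    """
    wealth = np.cumprod(1.0 + returns, axis=1)
    start = np.ones((wealth.shape[0], 1))  # wealth before the first return
    wealth = np.concatenate([start, wealth], axis=1)
    running_peak = np.maximum.accumulate(wealth, axis=1)
    return np.max(1.0 - wealth / running_peak, axis=1)

rng = np.random.default_rng(4)
paths = rng.standard_normal((100, 60)) * 0.01  # one feature, 100 paths
mdd = max_drawdown(paths)
print(mdd.mean(), mdd.max())
```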

Baselines

Compare your model against simple reference generators to calibrate what "good" means for your data:

import finval
from finval.baselines import gaussian_baseline, historical_bootstrap, block_bootstrap

# `real` is the 2D (n_samples, n_features) array from the Quickstart

# Gaussian: matches mean+cov, no temporal structure
gauss = gaussian_baseline(real, n_samples=1000)

# i.i.d. bootstrap: resamples whole rows, so it keeps the cross-sectional
# joint distribution but destroys all temporal structure
boot = historical_bootstrap(real, n_samples=1000)

# Block bootstrap: preserves short-range temporal structure
# (generated as paths, so it is not part of the 2D loop below)
blocks = block_bootstrap(real, n_paths=100, path_length=60, block_size=20)

# Validate the 2D baselines
for name, syn in [("gaussian", gauss), ("iid", boot)]:
    r = finval.validate(syn, real)
    print(f"{name}: {r.overall_quality} ({r.overall_score:.0%})")
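
To show why a block bootstrap keeps short-range structure, here is a minimal sketch of the idea: draw random start indices and copy runs of consecutive rows. The function block_bootstrap_sketch is hypothetical and is not finval's implementation; it only mirrors the parameter names used above and assumes the output shape is (n_paths, path_length, n_features).

```python
import numpy as np

def block_bootstrap_sketch(data, n_paths, path_length, block_size, seed=None):
    """Build paths by concatenating randomly chosen blocks of consecutive rows.

    data: (n_samples, n_features). Returns (n_paths, path_length, n_features).
    """
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]
    n_blocks = -(-path_length // block_size)  # ceiling division
    starts = rng.integers(0, n_samples - block_size + 1, size=(n_paths, n_blocks))
    offsets = np.arange(block_size)
    # Row indices: each block is start, start+1, ..., start+block_size-1
    idx = (starts[:, :, None] + offsets).reshape(n_paths, -1)[:, :path_length]
    return data[idx]

rng = np.random.default_rng(5)
real = rng.standard_normal((1000, 3)) * 0.01
paths = block_bootstrap_sketch(real, n_paths=100, path_length=60, block_size=20, seed=0)
print(paths.shape)
```

Within a block the original ordering is intact, so dependence at lags shorter than block_size survives; dependence across block boundaries does not.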

Design principles

  1. Reliable over comprehensive. Each metric is chosen because it's robust and informative, not because it's impressive.

  2. Mean over max for pairwise metrics. Max over n(n-1)/2 feature pairs is dominated by sampling noise. finval uses mean error, which is harder to fool and more stable run-to-run.

  3. Lower is always better. Every metric is normalized so that zero is perfect and higher is worse. No flipped signs to remember.

  4. Financial stylized facts first. Leverage effect, vol clustering, fat tails, crash co-movement — these aren't optional for financial data.

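  5. Proper scoring rules. CRPS is a proper scoring rule: its expected value is minimized by the true predictive distribution, so a generator cannot gain by being over- or under-confident. PIT uniformity is the standard companion diagnostic for calibration. Both come from the probabilistic forecasting literature and go beyond rank-order checks.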
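
The second principle (mean over max) can be checked with a small simulation. The sketch below draws two independent samples from the same model many times and records the mean and the max absolute difference between their correlation matrices. Since both samples come from the same distribution, any nonzero error is pure sampling noise. The setup (20 features, 500 samples, 200 repetitions) is an arbitrary choice for illustration.

```python
import numpy as np

def corr_errors(n_features, n_samples, rng):
    """Mean and max absolute correlation error between two same-model samples."""
    a = rng.standard_normal((n_samples, n_features))
    b = rng.standard_normal((n_samples, n_features))
    diff = np.abs(np.corrcoef(a, rowvar=False) - np.corrcoef(b, rowvar=False))
    pairs = diff[np.triu_indices(n_features, k=1)]  # the n(n-1)/2 distinct pairs
    return pairs.mean(), pairs.max()

rng = np.random.default_rng(0)
runs = np.array([corr_errors(20, 500, rng) for _ in range(200)])

mean_err_avg, max_err_avg = runs.mean(axis=0)  # typical size of each aggregate
mean_err_sd, max_err_sd = runs.std(axis=0)     # run-to-run variability
print(mean_err_avg, max_err_avg, mean_err_sd, max_err_sd)
```

In this setup the max is both larger and more variable across repetitions than the mean, even though the true error is zero, which is the instability the principle describes.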

License

MIT
