TimeCausalVAE

Time-causal financial generative models: refactored TC-VAE baselines with causal VQ/RVQ tokenizers, token priors, S&P500/VIX, Hawkes/SVMHJD, multi-dimensional benchmarks, and path-risk diagnostics.

time-causal-vae is a research package for time-causal financial generative modelling across synthetic and empirical market time series.

The Python distribution is time-causal-vae; the import package is time_causal_vae. The GitHub repository remains TimeCausalVQVAE because it also hosts the discrete VQ extension work.

Release notes: 0.1.1.

Discrete time-causal VQ-VAE architecture. The diagram shows the S&P 500/VIX input window, causal convolutional encoder and decoder stacks, vector quantization, the VIX conditioning branch, and the receptive-field structure used to preserve no-anticipation behaviour.

Installation

Install the package from PyPI:

pip install time-causal-vae

Wheel installs include the runtime package only. From a source checkout, use Poetry groups for development tools, local empirical data access, notebooks, and optional tracking:

poetry install --only main
poetry install --with dev
poetry install --with notebooks
poetry install --with data
poetry install --with tracking

The docs URL currently points to the repository documentation directory. No hosted Sphinx documentation is published yet.

Quickstart

Check the installed package:

python - <<'PY'
import time_causal_vae

print(time_causal_vae.__version__)
PY

Inspect installed command-line entry points:

tcvae-train --help
tcvae-train-tokenizer --help
tcvae-train-token-prior --help
tcvae-evaluate --help
tcvae-select-model --help
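The same check can be done from Python. The sketch below lists installed console scripts whose names start with a given prefix, using only the standard library. The helper name `console_scripts_with_prefix` is hypothetical and not part of `time_causal_vae`; the `tcvae-` prefix is taken from the commands above.

```python
from __future__ import annotations

from importlib.metadata import entry_points


def console_scripts_with_prefix(prefix: str) -> list[str]:
    """Return sorted names of installed console scripts that start with `prefix`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selection API on the returned object.
        scripts = eps.select(group="console_scripts")
    else:
        # Older versions return a plain mapping from group name to entry points.
        scripts = eps.get("console_scripts", [])
    return sorted({ep.name for ep in scripts if ep.name.startswith(prefix)})


# Hypothetical usage: an empty list means no matching script is installed
# in the current environment.
found = console_scripts_with_prefix("tcvae-")
```

If `found` is empty after a successful `pip install`, the scripts were probably installed into a different environment from the one running Python.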

Repository examples use configs, scripts, and registry files from the source tree. Clone the repository when running the public workflows:

git clone https://github.com/GVourvachakis/TimeCausalVQVAE.git
cd TimeCausalVQVAE
poetry install --with dev,data

Inspect the public S&P500/VIX registry entry:

poetry run python scripts/select_registered_model.py \
  --experiment sp500_vix \
  --family discrete

Run a dry-run continuous S&P500/VIX smoke command:

poetry run tcvae-train \
  --config configs/experiments/sp500_vix_beta_cvae.yaml \
  --output-dir outputs/sp500_vix_continuous \
  --epochs 1 \
  --no-wandb \
  --dry-run

Remove --dry-run only when you intentionally want to train locally.

Public Status

S&P500/VIX is the stable public default one-dimensional workflow. Hawkes/SVMHJD is an optional research benchmark with research-candidate metadata. Multidimensional benchmarks are experimental infrastructure, and no multidimensional model is selected in trained_models/model_registry.yaml. Experimental multidimensional profile metadata is kept in trained_models/multidim_profiles.yaml.

No downloaded data, trained weights, checkpoints, token tensors, generated paths, W&B runs, notebooks with outputs, or local result summaries are shipped with the package.

Stable Benchmarks

| Benchmark | Role | Public status |
| --- | --- | --- |
| S&P500/VIX | Empirical one-dimensional market workflow with VIX conditioning and a local processed data convention. | Public default. Uses local-only processed data and selected continuous/discrete registry metadata. |
| Black-Scholes | Synthetic geometric Brownian motion baseline for smoke tests and one-dimensional generation checks. | Stable baseline config and registry metadata. |
| Heston | Synthetic stochastic-volatility baseline with a latent variance channel. | Stable baseline config and registry metadata. |
| Path-dependent volatility | Conditional synthetic volatility baseline with a prefix volatility feature. | Stable baseline config and registry metadata. |
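For orientation, the Black-Scholes baseline refers to geometric Brownian motion. The following is a minimal, standard-library sketch of simulating one such path with the exact log-normal step. It is not the package's simulator; the function name `simulate_gbm_path` and all parameter values are illustrative assumptions.

```python
import math
import random


def simulate_gbm_path(s0, mu, sigma, n_steps, dt, rng):
    """Simulate one geometric Brownian motion path of length n_steps + 1.

    Uses the exact step S_{t+dt} = S_t * exp((mu - sigma^2 / 2) dt + sigma sqrt(dt) Z),
    with Z standard normal, so prices stay strictly positive.
    """
    drift = (mu - 0.5 * sigma**2) * dt
    vol = sigma * math.sqrt(dt)
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        path.append(path[-1] * math.exp(drift + vol * z))
    return path


# Hypothetical parameters: one year of daily steps, 5% drift, 20% volatility.
rng = random.Random(0)
path = simulate_gbm_path(s0=1.0, mu=0.05, sigma=0.2, n_steps=252, dt=1 / 252, rng=rng)
```

Setting `sigma=0.0` collapses the path to deterministic exponential growth, which is a quick sanity check on the drift term.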

The selected public S&P500/VIX discrete baseline is a standard causal VQ tokenizer plus an additive scalar-conditioned causal autoregressive token prior:

configs/experiments/sp500_vix_causal_vq_tokenizer.yaml
configs/experiments/sp500_vix_causal_token_prior_additive.yaml

Optional Research Benchmark

| Benchmark | Description | Public status |
| --- | --- | --- |
| Hawkes/SVMHJD | Marked Hawkes jump-diffusion benchmark with Ogata event simulation and fixed-grid observation. | Optional rare-event research benchmark with public_default: false. No weights or generated outputs are committed. |
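Ogata event simulation is commonly implemented by thinning. The sketch below simulates event times of a univariate Hawkes process with an exponential kernel, assuming intensity `mu + sum(alpha * exp(-beta * (t - t_i)))` over past events. It omits marks, the diffusion component, and fixed-grid observation, and it is not the benchmark's simulator; the function name and parameter values are assumptions for illustration.

```python
import math
import random


def simulate_hawkes_ogata(mu, alpha, beta, horizon, rng):
    """Simulate event times on (0, horizon) by Ogata's thinning algorithm.

    Assumes mu > 0 and alpha >= 0. Between events the intensity only decays,
    so its current value is a valid upper bound for the next candidate time.
    """
    t = 0.0
    excitation = 0.0  # self-exciting part of the intensity at the current time
    events = []
    while True:
        bound = mu + excitation
        wait = rng.expovariate(bound)  # candidate gap from the dominating Poisson process
        t += wait
        if t >= horizon:
            return events
        excitation *= math.exp(-beta * wait)  # decay the excitation to the candidate time
        if rng.random() * bound <= mu + excitation:
            events.append(t)  # accept the candidate as an event
            excitation += alpha  # each accepted event raises the intensity


# Hypothetical parameters with branching ratio alpha / beta < 1.
events = simulate_hawkes_ogata(mu=0.5, alpha=0.8, beta=1.2, horizon=50.0, rng=random.Random(0))
```

With `alpha=0.0` every candidate is accepted and the process reduces to a homogeneous Poisson process of rate `mu`.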

Experimental Benchmarks

| Benchmark | Description | Public status |
| --- | --- | --- |
| Multifactor market | 50-dimensional low-rank factor market with sector structure and optional common/sector jumps. | Experimental infrastructure for shape, covariance, and no-leakage checks. |
| S&P500 50-stock panel | Local-only yfinance/Yahoo-backed daily 50-stock equity panel. | Experimental infrastructure. Downloaded Yahoo-backed data must remain local and is not redistributed. |

The benchmark notes live under docs/benchmarks. They document the synthetic SDE or simulator specification, empirical data source conventions, tensor and condition layouts, preprocessing rules, and local-data boundaries for each workflow.

Benchmark Data Conventions

| Benchmark | Data convention |
| --- | --- |
| S&P500/VIX | Local processed benchmark data is expected at data/processed/sp500vix/sp500vix_normalized.npy. |
| Hawkes/SVMHJD | Synthetic paths are generated locally from the marked Hawkes jump-diffusion simulator. |
| Multifactor market | Synthetic 50D panels are generated locally from the low-rank sector-factor simulator. |
| S&P500 50-stock panel | Daily panels are downloaded locally through optional yfinance access and must not be redistributed. |

Models And Features

| Area | Included | Release status |
| --- | --- | --- |
| Continuous TC-VAE | No-anticipation continuous VAE baseline, RealNVP-compatible prior paths, and financial dataset conventions. | Stable baseline surface. |
| Causal VQ tokenizers | Causal convolutional tokenizers with vector-quantized latent codes. | Public S&P500/VIX discrete baseline. |
| RVQ and multi-code tokenizers | Residual and multi-code tokenizer infrastructure. | Experimental. No multidimensional model is registry-selected. |
| Token priors | Additive autoregressive priors and causal conv-transformer research variants. | Additive prior is the public S&P500/VIX default; conv-transformer variants are research candidates. |
| Registry metadata | Selected configs, local checkpoint conventions, metrics, caveats, and no-leakage status. | Metadata only. It does not contain weights. |
| Notebook demos | Output-stripped notebooks that print guarded commands and read local outputs when available. | Demonstration only. They should not train or evaluate by default. |
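To make "vector-quantized latent codes" concrete, here is a minimal sketch of the quantization step and of code perplexity, the exponential of the entropy of empirical code usage. It uses plain Python lists and is not the package's tokenizer; `quantize`, `code_perplexity`, and the toy codebook are hypothetical names and values.

```python
import math
from collections import Counter


def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codebook entry (squared L2 distance)."""

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
        for v in vectors
    ]


def code_perplexity(indices):
    """Exponential of the Shannon entropy of the empirical code distribution.

    Ranges from 1 (a single code is used) to the number of distinct codes
    (all used equally often).
    """
    counts = Counter(indices)
    n = len(indices)
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return math.exp(entropy)


# Toy example: a two-entry codebook in two dimensions.
codebook = [[0.0, 0.0], [1.0, 1.0]]
latents = [[0.1, -0.1], [0.9, 1.2], [0.4, 0.4], [2.0, 2.0]]
tokens = quantize(latents, codebook)
```

A perplexity far below the codebook size usually indicates that many codes are unused, which is the kind of behaviour codebook-usage diagnostics are meant to surface.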

Executed notebook previews are available on the docs/executed-notebook-previews branch. The main branch keeps notebooks output-stripped for reproducibility and package size. Preview outputs depend on local artefacts and checkpoints and are not the package source of truth.

Diagnostics

| Diagnostic family | Examples | Notes |
| --- | --- | --- |
| Distributional distances | MMD, sliced Wasserstein, terminal and volatility Wasserstein distances. | Used for registry summaries and model comparison. |
| Path-risk summaries | Drawdown, return autocorrelation, squared-return autocorrelation, VaR, and ES. | Intended for generated-vs-real path checks, not investment advice. |
| Conditional checks | VIX-bucket summaries and prefix-safe condition handling. | Used by the public S&P500/VIX workflow. |
| Token diagnostics | Codebook usage, active codes, token perplexity, transition summaries, and latent geometry. | Used to inspect discrete-token behaviour. |
| Jump diagnostics | Jump count, inter-arrival, jump-size, and lower-tail summaries. | Used by the optional Hawkes/SVMHJD benchmark. |
| Cross-sectional checks | Covariance, correlation, eigenspectrum, sector-block, and portfolio-risk summaries. | Experimental multidimensional infrastructure. |
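As an illustration of the path-risk family, the sketch below computes maximum drawdown from a price path and historical VaR and ES from a return sample. The conventions are assumptions (losses reported as positive numbers, tail size rounded to the nearest observation); the package's own definitions may differ, and the function names are hypothetical.

```python
def max_drawdown(prices):
    """Largest relative drop from a running peak, as a fraction in [0, 1)."""
    peak = prices[0]
    worst = 0.0
    for p in prices:
        peak = max(peak, p)
        worst = max(worst, (peak - p) / peak)
    return worst


def historical_var_es(returns, alpha=0.95):
    """Empirical value-at-risk and expected shortfall at confidence level alpha.

    Both are returned as positive loss numbers. The tail holds the
    round((1 - alpha) * n) largest losses, with at least one observation.
    """
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    k = max(1, int(round((1.0 - alpha) * len(losses))))
    tail = losses[:k]
    var = tail[-1]  # smallest loss inside the tail
    es = sum(tail) / len(tail)  # average loss inside the tail
    return var, es


# Toy inputs for a generated-vs-real comparison.
dd = max_drawdown([100.0, 120.0, 90.0, 110.0, 80.0])
var, es = historical_var_es(
    [-0.05, -0.03, -0.01, 0.0, 0.01, 0.02, 0.02, 0.03, 0.04, 0.05], alpha=0.8
)
```

Applying the same functions to real and generated paths gives directly comparable summaries, provided both use the same return definition and sampling frequency.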

Local Data Policy

The package does not redistribute empirical market data. The S&P500/VIX data file is expected locally at:

data/processed/sp500vix/sp500vix_normalized.npy
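Because the file is not shipped, a workflow can fail early with a clear message when it is missing. The sketch below checks the expected path relative to a repository root using only the standard library; the helper name `find_local_dataset` is hypothetical and not part of the package.

```python
from pathlib import Path

# Relative location stated in the local data policy above.
EXPECTED_RELATIVE_PATH = Path("data/processed/sp500vix/sp500vix_normalized.npy")


def find_local_dataset(repo_root="."):
    """Return the path to the local S&P500/VIX file, or raise if it is absent."""
    path = Path(repo_root) / EXPECTED_RELATIVE_PATH
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected local S&P500/VIX data at {path}. "
            "The package does not ship or download this file."
        )
    return path
```

Calling `find_local_dataset()` from the repository root before training or evaluation turns a later array-loading failure into an immediate, readable error.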

The S&P500 50-stock panel downloader uses optional yfinance access and writes local raw and processed files under data/raw/ and data/processed/. Yahoo-backed data is subject to Yahoo's terms and must not be redistributed or committed.

Generated artefacts belong under local paths such as outputs/, wandb/, or data/processed/. They are intentionally excluded from the public repository and package.

Repository Layout

| Path | Purpose |
| --- | --- |
| src/time_causal_vae | Importable package source. |
| configs/experiments | Repository workflow configs used by scripts and notebooks. |
| scripts | Inspection, extraction, evaluation, no-leakage, and smoke helpers. |
| trained_models | Lightweight registry metadata and model cards only. |
| docs/benchmarks | Public benchmark notes. |
| assets/figures | Small curated README figures generated from local runs. |
| notebooks | Output-stripped demos and report-facing notebooks. |

Background

TimeCausalVAE keeps the no-anticipation contract from upstream TC-VAE: at time t, encoders, tokenizers, priors, and diagnostics should only use observations and conditions available up to that point. The public branch preserves the continuous TC-VAE baseline and adds a discrete two-stage path: causal tokenizer first, causal token prior second.
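The no-anticipation contract can be checked by perturbation: change inputs strictly after time t and confirm that outputs up to t are unchanged. The sketch below does this for a toy causal convolution and, as a counterexample, for a centered moving average. It is a plain-Python illustration, not the repository's no-leakage script; all function names are hypothetical.

```python
def causal_conv1d(x, kernel):
    """y[t] = sum_k kernel[k] * x[t - k], treating x as zero before t = 0."""
    return [
        sum(w * x[t - k] for k, w in enumerate(kernel) if t - k >= 0)
        for t in range(len(x))
    ]


def centered_average(x):
    """Non-causal counterexample: y[t] averages x[t - 1], x[t], and x[t + 1]."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - 1) : t + 2]
        out.append(sum(window) / len(window))
    return out


def leaks_future(fn, x, t, bump=1.0):
    """True if changing inputs strictly after time t alters any output up to t."""
    perturbed = list(x)
    for s in range(t + 1, len(x)):
        perturbed[s] += bump
    return fn(x)[: t + 1] != fn(perturbed)[: t + 1]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
kernel = [0.5, 0.25]
causal_ok = not leaks_future(lambda v: causal_conv1d(v, kernel), x, t=2)
centered_leaks = leaks_future(centered_average, x, t=2)
```

The same perturbation test applies to any map from a path to a path, so it can wrap an encoder, tokenizer, or prior as long as the component is deterministic at evaluation time.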

The package is research software for generative modelling diagnostics. It is not a calibrated pricing library, a trading system, or a source of financial advice.

Citation And Acknowledgement

This repository refactors selected parts of the original Time-Causal VAE code and extends the public workflow with causal VQ-style discrete latent models. Please cite or acknowledge the relevant upstream work when using the package.

License

This project is released under the GNU General Public License v3. See LICENSE.
