Time-causal financial generative models: refactored TC-VAE baselines with causal VQ/RVQ tokenizers, token priors, S&P500/VIX, Hawkes/SVMHJD, multi-dimensional benchmarks, and path-risk diagnostics.
TimeCausalVAE
time-causal-vae is a research package for time-causal financial generative
modelling across synthetic and empirical market time series.
The Python distribution is time-causal-vae; the import package is
time_causal_vae. The GitHub repository remains TimeCausalVQVAE because it
also hosts the discrete VQ extension work.
Release notes: 0.1.1.
Discrete time-causal VQ-VAE architecture. The diagram shows the S&P 500/VIX input window, causal convolutional encoder and decoder stacks, vector quantization, the VIX conditioning branch, and the receptive-field structure used to preserve no-anticipation behaviour.
Installation
Install the package from PyPI:
pip install time-causal-vae
Wheel installs include the runtime package only. From a source checkout, use Poetry groups for development tools, local empirical data access, notebooks, and optional tracking:
poetry install --only main
poetry install --with dev
poetry install --with notebooks
poetry install --with data
poetry install --with tracking
The docs URL currently points to the repository documentation directory. No
hosted Sphinx documentation is published yet.
Quickstart
Check the installed package:
python - <<'PY'
import time_causal_vae
print(time_causal_vae.__version__)
PY
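Because the distribution name and the import name differ, a metadata lookup uses `time-causal-vae` while `import` uses `time_causal_vae`. The following is a minimal standard-library sketch; the helper name `installed_version` is illustrative and not part of the package:

```python
from importlib import metadata


def installed_version(distribution: str = "time-causal-vae"):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string when the wheel is installed, otherwise None.
print(installed_version())
```

Returning `None` instead of raising keeps the check usable in environments where the package has not been installed yet.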
Inspect installed command-line entry points:
tcvae-train --help
tcvae-train-tokenizer --help
tcvae-train-token-prior --help
tcvae-evaluate --help
tcvae-select-model --help
Repository examples use configs, scripts, and registry files from the source tree. Clone the repository when running the public workflows:
git clone https://github.com/GVourvachakis/TimeCausalVQVAE.git
cd TimeCausalVQVAE
poetry install --with dev,data
Inspect the public S&P500/VIX registry entry:
poetry run python scripts/select_registered_model.py \
--experiment sp500_vix \
--family discrete
Run the continuous S&P500/VIX smoke test as a dry run:
poetry run tcvae-train \
--config configs/experiments/sp500_vix_beta_cvae.yaml \
--output-dir outputs/sp500_vix_continuous \
--epochs 1 \
--no-wandb \
--dry-run
Remove --dry-run only when you intentionally want to train locally.
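When the command is driven from a script or notebook, the dry-run guard can be made the default. The sketch below only assembles the argument list from the flags shown above; the function `build_train_command` and its keyword names are illustrative, not part of the package:

```python
import shlex


def build_train_command(config, output_dir, epochs=1, dry_run=True, use_wandb=False):
    """Assemble the tcvae-train argument list; dry_run defaults to True as a guard."""
    cmd = [
        "poetry", "run", "tcvae-train",
        "--config", str(config),
        "--output-dir", str(output_dir),
        "--epochs", str(epochs),
    ]
    if not use_wandb:
        cmd.append("--no-wandb")
    if dry_run:
        cmd.append("--dry-run")
    return cmd


cmd = build_train_command(
    "configs/experiments/sp500_vix_beta_cvae.yaml",
    "outputs/sp500_vix_continuous",
)
# Printable form of the command; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

Training then requires an explicit `dry_run=False`, which mirrors the advice to drop `--dry-run` only on purpose.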
Public Status
S&P500/VIX is the stable public default one-dimensional workflow.
Hawkes/SVMHJD is an optional research benchmark with research-candidate
metadata. Multidimensional benchmarks are experimental infrastructure, and no
multidimensional model is selected in
trained_models/model_registry.yaml.
Experimental multidimensional profile metadata is kept in
trained_models/multidim_profiles.yaml.
No downloaded data, trained weights, checkpoints, token tensors, generated paths, W&B runs, notebooks with outputs, or local result summaries are shipped with the package.
Stable Benchmarks
| Benchmark | Role | Public status |
|---|---|---|
| S&P500/VIX | Empirical one-dimensional market workflow with VIX conditioning and a local processed data convention. | Public default. Uses local-only processed data and selected continuous/discrete registry metadata. |
| Black-Scholes | Synthetic geometric Brownian motion baseline for smoke tests and one-dimensional generation checks. | Stable baseline config and registry metadata. |
| Heston | Synthetic stochastic-volatility baseline with a latent variance channel. | Stable baseline config and registry metadata. |
| Path-dependent volatility | Conditional synthetic volatility baseline with a prefix volatility feature. | Stable baseline config and registry metadata. |
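For orientation, the Black-Scholes baseline is geometric Brownian motion. The sketch below simulates one path with the exact log-space update; the function name and all parameter values are illustrative assumptions and are not taken from the package configs:

```python
import math
import random


def simulate_gbm(s0, mu, sigma, horizon, n_steps, rng):
    """Simulate one GBM path on a uniform grid using the exact log-space update."""
    dt = horizon / n_steps
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        log_increment = (mu - 0.5 * sigma**2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(log_increment))
    return path


rng = random.Random(0)  # seeded for repeatable smoke checks
path = simulate_gbm(s0=1.0, mu=0.05, sigma=0.2, horizon=1.0, n_steps=252, rng=rng)
print(len(path), min(path) > 0.0)
```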
The selected public S&P500/VIX discrete baseline is a standard causal VQ tokenizer plus an additive scalar-conditioned causal autoregressive token prior:
configs/experiments/sp500_vix_causal_vq_tokenizer.yaml
configs/experiments/sp500_vix_causal_token_prior_additive.yaml
Optional Research Benchmark
| Benchmark | Description | Public status |
|---|---|---|
| Hawkes/SVMHJD | Marked Hawkes jump-diffusion benchmark with Ogata event simulation and fixed-grid observation. | Optional rare-event research benchmark with public_default: false. No weights or generated outputs are committed. |
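To illustrate the two ingredients named in the table, the sketch below applies Ogata thinning to a univariate Hawkes process with an exponential kernel and then observes the events on a fixed grid. It is a simplified stand-in, not the package's marked jump-diffusion simulator: marks and the diffusion component are omitted, and the names `simulate_hawkes`, `count_on_grid`, `mu`, `alpha`, and `beta` are assumptions:

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, rng):
    """Ogata thinning for intensity mu + sum_i alpha * exp(-beta * (t - t_i))."""
    events = []
    t = 0.0
    excitation = 0.0  # self-exciting part of the intensity at time t
    while True:
        # Between events the intensity only decays, so its current value is an upper bound.
        bound = mu + excitation
        wait = rng.expovariate(bound)
        t += wait
        if t >= horizon:
            return events
        excitation *= math.exp(-beta * wait)
        # Accept the candidate with probability intensity(t) / bound.
        if rng.random() * bound <= mu + excitation:
            events.append(t)
            excitation += alpha


def count_on_grid(events, horizon, n_steps):
    """Fixed-grid observation: number of events in each of n_steps equal bins."""
    dt = horizon / n_steps
    counts = [0] * n_steps
    for t in events:
        counts[min(int(t / dt), n_steps - 1)] += 1
    return counts


rng = random.Random(0)
events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, horizon=50.0, rng=rng)
counts = count_on_grid(events, horizon=50.0, n_steps=10)
print(len(events), counts)
```

The example keeps `alpha / beta` below one so that the process is stationary and event counts do not explode.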
Experimental Benchmarks
| Benchmark | Description | Public status |
|---|---|---|
| Multifactor market | 50-dimensional low-rank factor market with sector structure and optional common/sector jumps. | Experimental infrastructure for shape, covariance, and no-leakage checks. |
| S&P500 50-stock panel | Local-only yfinance/Yahoo-backed daily 50-stock equity panel. | Experimental infrastructure. Downloaded Yahoo-backed data must remain local and is not redistributed. |
The benchmark notes live under
docs/benchmarks.
They document the synthetic SDE or simulator specification, empirical data
source conventions, tensor and condition layouts, preprocessing rules, and
local-data boundaries for each workflow.
Benchmark Data Conventions
| Benchmark | Data convention |
|---|---|
| S&P500/VIX | Local processed benchmark data is expected at data/processed/sp500vix/sp500vix_normalized.npy. |
| Hawkes/SVMHJD | Synthetic paths are generated locally from the marked Hawkes jump-diffusion simulator. |
| Multifactor market | Synthetic 50D panels are generated locally from the low-rank sector-factor simulator. |
| S&P500 50-stock panel | Daily panels are downloaded locally through optional yfinance access and must not be redistributed. |
Models And Features
| Area | Included | Release status |
|---|---|---|
| Continuous TC-VAE | No-anticipation continuous VAE baseline, RealNVP-compatible prior paths, and financial dataset conventions. | Stable baseline surface. |
| Causal VQ tokenizers | Causal convolutional tokenizers with vector-quantized latent codes. | Public S&P500/VIX discrete baseline. |
| RVQ and multi-code tokenizers | Residual and multi-code tokenizer infrastructure. | Experimental. No multidimensional model is registry-selected. |
| Token priors | Additive autoregressive priors and causal conv-transformer research variants. | Additive prior is the public S&P500/VIX default; conv-transformer variants are research candidates. |
| Registry metadata | Selected configs, local checkpoint conventions, metrics, caveats, and no-leakage status. | Metadata only. It does not contain weights. |
| Notebook demos | Output-stripped notebooks that print guarded commands and read local outputs when available. | Demonstration only. They should not train or evaluate by default. |
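The VQ and RVQ rows share one core operation: a nearest-code lookup, applied once for plain VQ and repeatedly on the remaining residual for RVQ. The sketch below shows only that lookup on toy codebooks. The package's tokenizers are learned causal convolutional networks, so the function names and codebook values here are purely illustrative:

```python
def nearest_code(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vector, code))
    return min(range(len(codebook)), key=lambda k: sq_dist(codebook[k]))


def residual_quantize(vector, codebooks):
    """Quantize with a stack of codebooks; each stage encodes the previous residual."""
    residual = list(vector)
    indices = []
    for codebook in codebooks:
        k = nearest_code(residual, codebook)
        indices.append(k)
        residual = [r - c for r, c in zip(residual, codebook[k])]
    return indices, residual


coarse = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
fine = [[0.0, 0.0], [0.25, 0.0], [0.0, -0.25]]
indices, residual = residual_quantize([1.2, 1.0], [coarse, fine])
print(indices, residual)
```

Passing a single codebook reduces `residual_quantize` to plain vector quantization, with the returned residual equal to the quantization error.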
Executed notebook previews are available on the docs/executed-notebook-previews branch. The
main branch keeps notebooks output-stripped for reproducibility and package size. Preview outputs
depend on local artefacts and checkpoints and are not the package source of truth.
Diagnostics
| Diagnostic family | Examples | Notes |
|---|---|---|
| Distributional distances | MMD, sliced Wasserstein, terminal and volatility Wasserstein distances. | Used for registry summaries and model comparison. |
| Path-risk summaries | Drawdown, return autocorrelation, squared-return autocorrelation, VaR, and ES. | Intended for generated-vs-real path checks, not investment advice. |
| Conditional checks | VIX-bucket summaries and prefix-safe condition handling. | Used by the public S&P500/VIX workflow. |
| Token diagnostics | Codebook usage, active codes, token perplexity, transition summaries, and latent geometry. | Used to inspect discrete-token behaviour. |
| Jump diagnostics | Jump count, inter-arrival, jump-size, and lower-tail summaries. | Used by the optional Hawkes/SVMHJD benchmark. |
| Cross-sectional checks | Covariance, correlation, eigenspectrum, sector-block, and portfolio-risk summaries. | Experimental multidimensional infrastructure. |
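As a rough illustration of the path-risk row, the sketch below computes maximum drawdown, historical VaR and ES, and lag autocorrelation in plain Python. The package's own estimators may use different conventions (quantile definition, sign of losses, bias correction), so treat the function names and choices here as assumptions:

```python
import math


def max_drawdown(prices):
    """Largest peak-to-trough relative decline along a price path."""
    peak = prices[0]
    worst = 0.0
    for p in prices:
        peak = max(peak, p)
        worst = max(worst, (peak - p) / peak)
    return worst


def historical_var_es(returns, level=0.95):
    """Empirical VaR (loss quantile) and ES (mean loss at or beyond VaR)."""
    losses = sorted(-r for r in returns)  # losses in ascending order
    idx = max(math.ceil(level * len(losses)) - 1, 0)
    tail = losses[idx:]
    return tail[0], sum(tail) / len(tail)


def autocorr(values, lag=1):
    """Sample autocorrelation at a positive lag (biased estimator)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag))
    return cov / var


prices = [100.0, 120.0, 90.0, 110.0, 80.0]
returns = [b / a - 1.0 for a, b in zip(prices, prices[1:])]
print(max_drawdown(prices), historical_var_es(returns, level=0.75), autocorr(returns))
```

Squared-return autocorrelation follows by passing `[r * r for r in returns]` to `autocorr`. Running the same functions on generated and real paths gives the generated-vs-real comparison described in the table.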
Local Data Policy
The package does not redistribute empirical market data. The S&P500/VIX data file is expected locally at:
data/processed/sp500vix/sp500vix_normalized.npy
The S&P500 50-stock panel downloader uses optional yfinance access and writes
local raw and processed files under data/raw/ and data/processed/.
Yahoo-backed data is subject to Yahoo's terms and must not be redistributed or
committed.
Generated artefacts belong under local paths such as outputs/, wandb/, or
data/processed/. They are intentionally excluded from the public repository
and package.
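Since the data file is never shipped, a workflow can check for it before doing anything else. A minimal sketch using the path convention above; the helper `locate_sp500vix` and the constant name are hypothetical:

```python
from pathlib import Path

SP500VIX_RELATIVE_PATH = Path("data/processed/sp500vix/sp500vix_normalized.npy")


def locate_sp500vix(repo_root="."):
    """Return the local S&P500/VIX data path if the file exists, else None."""
    candidate = Path(repo_root) / SP500VIX_RELATIVE_PATH
    return candidate if candidate.is_file() else None


path = locate_sp500vix()
if path is None:
    print(f"Missing local data file: {SP500VIX_RELATIVE_PATH}")
else:
    print(f"Found local data file: {path}")
```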
Repository Layout
| Path | Purpose |
|---|---|
| src/time_causal_vae | Importable package source. |
| configs/experiments | Repository workflow configs used by scripts and notebooks. |
| scripts | Inspection, extraction, evaluation, no-leakage, and smoke helpers. |
| trained_models | Lightweight registry metadata and model cards only. |
| docs/benchmarks | Public benchmark notes. |
| assets/figures | Small curated README figures generated from local runs. |
| notebooks | Output-stripped demos and report-facing notebooks. |
Background
TimeCausalVAE keeps the no-anticipation contract from upstream TC-VAE: at time
t, encoders, tokenizers, priors, and diagnostics should only use observations
and conditions available up to that point. The public branch preserves the
continuous TC-VAE baseline and adds a discrete two-stage path: causal tokenizer
first, causal token prior second.
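The no-anticipation contract can be checked mechanically: perturb the input from time t0 onward and confirm that every output before t0 is unchanged. The sketch below does this for a toy causal convolution and for a centred moving average that deliberately breaks the contract. It illustrates the idea only; it is not the repository's no-leakage helper, and all names are assumptions:

```python
def causal_conv(x, weights):
    """y[t] = sum_k weights[k] * x[t - k], treating x before time 0 as zero."""
    return [
        sum(w * x[t - k] for k, w in enumerate(weights) if t - k >= 0)
        for t in range(len(x))
    ]


def leaks_future(transform, x, t0, bump=1.0):
    """True if changing x[t0:] alters any output strictly before t0."""
    perturbed = x[:t0] + [v + bump for v in x[t0:]]
    base, shifted = transform(x), transform(perturbed)
    return base[:t0] != shifted[:t0]


def causal(series):
    return causal_conv(series, [0.5, 0.3, 0.2])


def centred(series):
    # A centred moving average reads series[t + 1], so it violates the contract.
    last = len(series) - 1
    return [
        (series[max(t - 1, 0)] + series[t] + series[min(t + 1, last)]) / 3
        for t in range(len(series))
    ]


x = [0.1, -0.2, 0.3, 0.0, 0.5, -0.1]
print(leaks_future(causal, x, t0=3), leaks_future(centred, x, t0=3))  # prints: False True
```

The same perturbation test applies to any component that maps a path to a per-time-step output, including tokenizers and priors.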
The package is research software for generative modelling diagnostics. It is not a calibrated pricing library, a trading system, or a source of financial advice.
Citation And Acknowledgement
This repository refactors selected parts of the original Time-Causal VAE code and extends the public workflow with causal VQ-style discrete latent models. Please cite or acknowledge the relevant upstream work when using the package:
- Time-Causal VAE: Robust Financial Time Series Generator - Beatrice Acciaio, Stephan Eckstein, and Songyan Hou. DOI: 10.48550/arXiv.2411.02947; code: justinhou95/TimeCausalVAE.
- Neural Discrete Representation Learning - Aaron van den Oord, Oriol Vinyals, and Koray Kavukcuoglu. DOI: 10.48550/arXiv.1711.00937.
- Vector Quantized Time Series Generation with a Bidirectional Prior Model - Daesoo Lee, Sara Malacarne, and Erlend Aune. DOI: 10.48550/arXiv.2303.04743; code: ML4ITS/TimeVQVAE.
- vector-quantize-pytorch - lucidrains. Repository: lucidrains/vector-quantize-pytorch.
License
This project is released under the GNU General Public License v3. See
LICENSE.