
Finance-specific latent-factor and portfolio-learning models for ML4T


ml4t-models

Python 3.12+ | PyPI | License: MIT

Finance-native model implementations for latent-factor estimation, stochastic discount factor learning, direct asset prediction, and end-to-end portfolio learning.

Documentation: https://ml4trading.io/docs/models/

Part of the ML4T Library Ecosystem

This library is one of six interconnected ML4T libraries supporting the research and production workflow described in Machine Learning for Trading.

ML4T Library Ecosystem

What This Library Does

ml4t-models provides paper-faithful implementations of model families common in modern empirical asset pricing and portfolio learning:

  • Latent-factor estimators with explicit structural outputs:
    • PCAModel
    • RPPCAModel
    • IPCAModel
    • CAEModel
  • Weight-native stochastic discount factor modeling:
    • StochasticDiscountFactorModel
  • Direct asset prediction:
    • SAEModel (SAE = supervised autoencoder)
  • End-to-end portfolio learning:
    • LinearFeaturePortfolioModel
    • LSTMPortfolioModel
    • DeepPortfolioModel

The library is built around finance-native contracts rather than generic tensor trainers:

  • PersistentPanelBatch for stable-ID panels
  • CrossSectionBatch for ragged dated cross-sections
  • PortfolioSequenceBatch for sequence-to-allocation models
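As a rough sketch of the shape conventions these three contracts imply (illustrative NumPy only — the field layout mirrors the `CrossSectionBatch` example in the Quick Start, but treat the exact names and shapes as assumptions, not the library's API):

```python
import numpy as np

T, N, K = 24, 200, 12  # dates, assets, characteristics

# PersistentPanelBatch: one stable asset universe across all dates,
# so a single dense (T, N, K) characteristics array suffices.
panel_chars = np.random.randn(T, N, K)
panel_rets = np.random.randn(T, N)

# CrossSectionBatch: dated cross-sections may be ragged -- a different
# asset count per date -- so think "one array per date".
counts = [180, 195, 200]
ragged_chars = [np.random.randn(n, K) for n in counts]
ragged_rets = [np.random.randn(n) for n in counts]

# PortfolioSequenceBatch: sequence-to-allocation models consume
# (samples, sequence length, assets, features).
seq_feats = np.random.randn(8, 63, N, 10)

assert panel_chars.shape == (T, N, K)
assert [c.shape[0] for c in ragged_chars] == counts
assert seq_feats.shape == (8, 63, 200, 10)
```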

It also keeps the predictive steps explicit:

  • structural extraction
  • factor-premium forecasting
  • asset mapping
  • downstream prediction and weight frames for ml4t-backtest and ml4t-diagnostic

ml4t-models Architecture

Installation

pip install ml4t-models

Optional extras:

pip install ml4t-models[deep]         # torch-backed neural models
pip install ml4t-models[integration]  # polars + ml4t-specs bridges
pip install ml4t-models[docs]         # mkdocs site build
pip install ml4t-models[all]          # all of the above

Quick Start

1. Latent-Factor Forecast Pipeline

import numpy as np

from ml4t.models import (
    BetaLambdaMapper,
    CrossSectionBatch,
    ExpandingMeanFactorForecaster,
    IPCAConfig,
    IPCAModel,
    LatentFactorForecastPipeline,
)

batch = CrossSectionBatch(
    characteristics=np.random.randn(24, 200, 12),  # (dates, assets, characteristics)
    returns=np.random.randn(24, 200),              # (dates, assets)
    timestamps=tuple(range(24)),
)

pipeline = LatentFactorForecastPipeline(
    model=IPCAModel(IPCAConfig(n_factors=3)),
    forecaster=ExpandingMeanFactorForecaster(),
    mapper=BetaLambdaMapper(),
)
pipeline.fit(batch)
prediction = pipeline.predict(batch)

print(prediction.asset_forecast.expected_returns.shape)
# (24, 200)

2. Weight-Native Stochastic Discount Factor

import numpy as np

from ml4t.models import (
    CrossSectionBatch,
    StochasticDiscountFactorConfig,
    StochasticDiscountFactorModel,
)

batch = CrossSectionBatch(
    characteristics=np.random.randn(36, 300, 16),  # (dates, assets, characteristics)
    returns=np.random.randn(36, 300),              # (dates, assets)
    context_features=np.random.randn(36, 8),       # (dates, context features)
    timestamps=tuple(range(36)),
)

model = StochasticDiscountFactorModel(
    StochasticDiscountFactorConfig(checkpoint_epochs=(256, 512, 768, 1024, 1280))
)
model.fit(batch)
state = model.extract(batch, checkpoint=1280)

print(state.asset_weights.shape)
# (36, 300)

3. End-to-End Portfolio Learning

import numpy as np

from ml4t.models import LSTMPortfolioConfig, LSTMPortfolioModel, PortfolioSequenceBatch

batch = PortfolioSequenceBatch(
    features=np.random.randn(8, 63, 20, 10),  # (samples, sequence length, assets, features)
    returns=np.random.randn(8, 63, 20),       # (samples, sequence length, assets)
    timestamps=tuple(range(63)),
    asset_ids=tuple(f"asset_{i}" for i in range(20)),
)

model = LSTMPortfolioModel(LSTMPortfolioConfig(max_iters=20, checkpoint_every=5))
model.fit(batch)
weights = model.predict(batch, checkpoint=20)

print(weights.weights.shape)
# (8, 63, 20)

4. Hand Off Predictions To The Rest Of ML4T

from ml4t.models import predictions_frame_from_asset_forecast, write_backtest_frames

# `prediction` is the pipeline output from Quick Start example 1.
frame = predictions_frame_from_asset_forecast(prediction.asset_forecast)
write_backtest_frames("artifacts/run_001", predictions=frame)

Model Families

Latent Factors

These models estimate a structural representation first, then let a separate forecaster produce ex ante factor premia.

| Model | Contract | Native output | Predictive step |
| --- | --- | --- | --- |
| PCAModel | PersistentPanelBatch | static loadings, factor returns | factor-premium forecaster + mapper |
| RPPCAModel | PersistentPanelBatch | risk-premium-aware latent factors | factor-premium forecaster + mapper |
| IPCAModel | CrossSectionBatch | characteristic-implied betas, factor history | factor-premium forecaster + mapper |
| CAEModel | CrossSectionBatch | nonlinear characteristic betas, factor history | factor-premium forecaster + mapper |
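The two-stage split above can be sketched in plain NumPy: take the structural outputs (loadings and realized factor returns), forecast premia with an expanding mean, then map the premia back to assets. This is an illustration of the pattern, not the library's internals; shapes and the expanding-mean rule are assumptions modeled on `ExpandingMeanFactorForecaster` and `BetaLambdaMapper`.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, K = 24, 200, 3

betas = rng.standard_normal((N, K))           # structural output: asset loadings
factor_returns = rng.standard_normal((T, K))  # structural output: factor history

# Predictive step 1: ex ante factor premia via an expanding mean --
# lambda_hat[t] uses only factor returns observed through date t.
lambda_hat = np.cumsum(factor_returns, axis=0) / np.arange(1, T + 1)[:, None]

# Predictive step 2: map premia back to assets, E[r] ~= beta @ lambda.
expected_returns = lambda_hat @ betas.T

print(expected_returns.shape)
# (24, 200)
```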

Stochastic Discount Factor

StochasticDiscountFactorModel is not a beta × lambda latent-factor model. It learns a weight-native no-arbitrage object and exposes:

  • asset weights
  • SDF series
  • checkpointed phase-aware training state

Optional return projections are handled by separate mappers.
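One common weight-native parameterization in the literature (shown here only as a plausible form, not necessarily this model's exact loss) writes the SDF as an affine function of a managed portfolio return and trains the weights to shrink the no-arbitrage pricing errors:

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 36, 300

weights = rng.standard_normal((T, N))  # per-date asset weights w_t
returns = rng.standard_normal((T, N))  # per-date excess returns r_{t+1}

# SDF series: M_{t+1} = 1 - w_t . r_{t+1}, so the model's native
# outputs are the weights and the implied discount-factor path.
sdf = 1.0 - np.einsum("tn,tn->t", weights, returns)

# No-arbitrage moment the training objective pushes toward zero:
# E[M_{t+1} * r_{t+1,i}] ~= 0 for each asset i.
pricing_errors = (sdf[:, None] * returns).mean(axis=0)

print(sdf.shape, pricing_errors.shape)
# (36,) (300,)
```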

Direct Asset Prediction

SAEModel is a supervised autoencoder signal model. In this library it is treated as a direct predictor, not a latent-factor model.

Portfolio Learning

Portfolio models learn allocations directly:

  • LinearFeaturePortfolioModel as a deterministic baseline
  • LSTMPortfolioModel as a sequence baseline
  • DeepPortfolioModel as a structured DeePM-style allocator
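A typical end-to-end objective for such allocators (a standard choice in this literature, not necessarily the exact loss these models use) maps raw scores to portfolio weights and minimizes the negative Sharpe ratio of the realized portfolio path:

```python
import numpy as np

rng = np.random.default_rng(2)
T, N = 63, 20

returns = rng.standard_normal((T, N)) * 0.01  # daily asset returns
scores = rng.standard_normal((T, N))          # raw model outputs

# Map scores to long-only weights summing to one per date (softmax).
z = np.exp(scores - scores.max(axis=1, keepdims=True))
weights = z / z.sum(axis=1, keepdims=True)

# Portfolio return path, and negative Sharpe as the training loss.
port = (weights * returns).sum(axis=1)
neg_sharpe = -port.mean() / (port.std() + 1e-8)

assert np.allclose(weights.sum(axis=1), 1.0)
```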

Design Principles

  • Finance-native data contracts rather than generic dataloaders
  • Explicit structural and predictive stages
  • Checkpoint-aware neural training
  • Clear separation between:
    • model estimation
    • forecasting
    • backtest and diagnostic integration
  • Integration boundaries with sibling libraries instead of duplicated evaluation logic


Download files

Download the file for your platform.

Source Distribution

ml4t_models-0.1.0a0.tar.gz (55.9 kB)

Uploaded Source

Built Distribution


ml4t_models-0.1.0a0-py3-none-any.whl (75.2 kB)

Uploaded Python 3

File details

Details for the file ml4t_models-0.1.0a0.tar.gz.

File metadata

  • Download URL: ml4t_models-0.1.0a0.tar.gz
  • Upload date:
  • Size: 55.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ml4t_models-0.1.0a0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | d5028b090aaba7e65798282bb94a698005ab46d90077f13eae30e695d508057c |
| MD5 | c1b564fa9b0055205214902d02a3bb96 |
| BLAKE2b-256 | 1ede0ec24efc31694ce4a56fa24825c16a8b5f3b526d7bc5c70aa188555c3588 |


Provenance

The following attestation bundles were made for ml4t_models-0.1.0a0.tar.gz:

Publisher: release.yml on ml4t/models

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ml4t_models-0.1.0a0-py3-none-any.whl.

File metadata

  • Download URL: ml4t_models-0.1.0a0-py3-none-any.whl
  • Upload date:
  • Size: 75.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for ml4t_models-0.1.0a0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | dc8433dbff6549dfe8b9abe29eccd2a14c4495f51503861662b416588d07633e |
| MD5 | b25a038af692b122775831eb32bc6324 |
| BLAKE2b-256 | 7277c6c709f05ed8129d92eb405aba9a53d170f89658fd687971b6098e6e5f88 |


Provenance

The following attestation bundles were made for ml4t_models-0.1.0a0-py3-none-any.whl:

Publisher: release.yml on ml4t/models

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
