Finance-specific latent-factor and portfolio-learning models for ML4T

ml4t-models

Requires Python 3.12+. Available on PyPI. License: MIT.

Finance-native model implementations for latent-factor estimation, stochastic discount factor learning, direct asset prediction, and end-to-end portfolio learning.

Documentation: https://ml4trading.io/docs/models/

Part of the ML4T Library Ecosystem

This library is one of six interconnected ML4T libraries supporting the research and production workflow described in Machine Learning for Trading.

What This Library Does

ml4t-models packages paper-faithful model families that are common in modern empirical asset pricing and portfolio learning:

  • Latent-factor estimators with explicit structural outputs:
    • PCAModel
    • RPPCAModel
    • IPCAModel
    • CAEModel
  • Weight-native stochastic discount factor modeling:
    • StochasticDiscountFactorModel
  • Direct asset prediction:
    • SAEModel (SAE = supervised autoencoder)
  • End-to-end portfolio learning:
    • LinearFeaturePortfolioModel
    • LSTMPortfolioModel
    • DeepPortfolioModel

The library is built around finance-native contracts rather than generic tensor trainers:

  • PersistentPanelBatch for stable-ID panels
  • CrossSectionBatch for ragged dated cross-sections
  • PortfolioSequenceBatch for sequence-to-allocation models

It also keeps the predictive steps explicit:

  • structural extraction
  • factor-premium forecasting
  • asset mapping
  • downstream prediction and weight frames for ml4t-backtest and ml4t-diagnostic
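The first three steps can be sketched with plain NumPy. This is not the library's implementation: it assumes a balanced panel, uses PCA via SVD for structural extraction, an expanding mean for the factor premia, and a beta × lambda product for asset mapping. All variable names are illustrative, and the loadings are estimated on the full sample, so the result is an in-sample illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, K = 60, 25, 3                      # periods, assets, latent factors
returns = rng.standard_normal((T, N))    # toy balanced panel of excess returns

# 1. Structural extraction: PCA loadings and realized factor returns via SVD.
demeaned = returns - returns.mean(axis=0)
_, _, vt = np.linalg.svd(demeaned, full_matrices=False)
loadings = vt[:K].T                      # (N, K) static betas, orthonormal columns
factor_returns = returns @ loadings      # (T, K) realized factor returns

# 2. Factor-premium forecasting: expanding mean using data through row t only.
cumulative = np.cumsum(factor_returns, axis=0)
counts = np.arange(1, T + 1)[:, None]
premia = cumulative / counts             # (T, K); row t is the forecast made at t

# 3. Asset mapping: expected return = beta x lambda.
expected_returns = premia @ loadings.T   # (T, N)
```

Keeping the three arrays separate makes it possible to swap the forecaster (step 2) without re-estimating the structural model (step 1).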

Installation

pip install ml4t-models

Optional extras:

# Quote the requirement so shells such as zsh do not expand the brackets.
pip install "ml4t-models[deep]"         # torch-backed neural models
pip install "ml4t-models[integration]"  # polars + ml4t-specs bridges
pip install "ml4t-models[docs]"         # mkdocs site build
pip install "ml4t-models[all]"

Quick Start

1. Latent-Factor Forecast Pipeline

import numpy as np

from ml4t.models import (
    BetaLambdaMapper,
    CrossSectionBatch,
    ExpandingMeanFactorForecaster,
    IPCAConfig,
    IPCAModel,
    LatentFactorForecastPipeline,
)

# Synthetic data: 24 dates, 200 assets, 12 characteristics per asset.
batch = CrossSectionBatch(
    characteristics=np.random.randn(24, 200, 12),
    returns=np.random.randn(24, 200),
    timestamps=tuple(range(24)),
)

pipeline = LatentFactorForecastPipeline(
    model=IPCAModel(IPCAConfig(n_factors=3)),
    forecaster=ExpandingMeanFactorForecaster(),
    mapper=BetaLambdaMapper(),
)
pipeline.fit(batch)
prediction = pipeline.predict(batch)

print(prediction.asset_forecast.expected_returns.shape)
# (24, 200)

2. Weight-Native Stochastic Discount Factor

import numpy as np

from ml4t.models import CrossSectionBatch, StochasticDiscountFactorConfig, StochasticDiscountFactorModel

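# Synthetic data: 36 dates, 300 assets, 16 characteristics per asset,
# plus 8 date-level context features.
batch = CrossSectionBatch(
    characteristics=np.random.randn(36, 300, 16),
    returns=np.random.randn(36, 300),
    context_features=np.random.randn(36, 8),
    timestamps=tuple(range(36)),
)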

model = StochasticDiscountFactorModel(
    StochasticDiscountFactorConfig(checkpoint_epochs=(256, 512, 768, 1024, 1280))
)
model.fit(batch)
state = model.extract(batch, checkpoint=1280)

print(state.asset_weights.shape)
# (36, 300)

3. End-to-End Portfolio Learning

import numpy as np

from ml4t.models import LSTMPortfolioConfig, LSTMPortfolioModel, PortfolioSequenceBatch

batch = PortfolioSequenceBatch(
    features=np.random.randn(8, 63, 20, 10),
    returns=np.random.randn(8, 63, 20),
    timestamps=tuple(range(63)),
    asset_ids=tuple(f"asset_{i}" for i in range(20)),
)

model = LSTMPortfolioModel(LSTMPortfolioConfig(max_iters=20, checkpoint_every=5))
model.fit(batch)
weights = model.predict(batch, checkpoint=20)

print(weights.weights.shape)
# (8, 63, 20)

4. Hand Off Predictions To The Rest Of ML4T

from ml4t.models import predictions_frame_from_asset_forecast, write_backtest_frames

# `prediction` is the pipeline output from example 1 above.
frame = predictions_frame_from_asset_forecast(prediction.asset_forecast)
write_backtest_frames("artifacts/run_001", predictions=frame)

Model Families

Latent Factors

These models estimate a structural representation first, then let a separate forecaster produce ex ante factor premia.

| Model | Contract | Native output | Predictive step |
|---|---|---|---|
| PCAModel | PersistentPanelBatch | static loadings, factor returns | factor-premium forecaster + mapper |
| RPPCAModel | PersistentPanelBatch | risk-premium-aware latent factors | factor-premium forecaster + mapper |
| IPCAModel | CrossSectionBatch | characteristic-implied betas, factor history | factor-premium forecaster + mapper |
| CAEModel | CrossSectionBatch | nonlinear characteristic betas, factor history | factor-premium forecaster + mapper |
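As a rough illustration of "characteristic-implied betas" and "factor history", the sketch below maps characteristics to loadings through a matrix `gamma` and then recovers one factor realization per date by cross-sectional least squares. It is NumPy-only and does not call IPCAModel. `gamma` is drawn at random here purely for illustration (an IPCA estimator would fit it, typically by alternating between the two steps), and row t of the characteristics is assumed to be known before the returns in row t are realized.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, L, K = 12, 50, 6, 2                    # periods, assets, characteristics, factors
characteristics = rng.standard_normal((T, N, L))
returns = rng.standard_normal((T, N))
gamma = rng.standard_normal((L, K))          # assumed given; an estimator would fit it

# Characteristic-implied betas: loadings are a linear map of characteristics.
betas = characteristics @ gamma              # (T, N, K)

# Factor history: per-date cross-sectional least squares of returns on betas.
factor_history = np.stack(
    [np.linalg.lstsq(betas[t], returns[t], rcond=None)[0] for t in range(T)]
)                                            # (T, K)

fitted = np.einsum("tnk,tk->tn", betas, factor_history)
residuals = returns - fitted                 # orthogonal to betas on each date
```

A nonlinear variant replaces the `characteristics @ gamma` line with a neural network, while the factor-history step keeps the same role.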

Stochastic Discount Factor

StochasticDiscountFactorModel is not a beta × lambda latent-factor model. Its native output is a set of asset weights learned under a no-arbitrage objective, and it exposes:

  • asset weights
  • SDF series
  • checkpointed phase-aware training state

Optional return projections are handled by separate mappers.
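To make the relationship between asset weights and the SDF series concrete, here is a small NumPy sketch under the common textbook assumption that the SDF is one minus the return of a weighted portfolio of excess returns, M_t = 1 - Σ_i w_{t,i} R^e_{t,i}. Whether StochasticDiscountFactorModel uses exactly this form is not stated in this README, and the weights below are random placeholders rather than trained output.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N = 36, 10
excess_returns = 0.05 * rng.standard_normal((T, N))

# Placeholder weights scaled to unit gross exposure on each date.
raw = rng.standard_normal((T, N))
asset_weights = raw / np.abs(raw).sum(axis=1, keepdims=True)

# SDF series implied by the weights: M_t = 1 - sum_i w_{t,i} * R^e_{t,i}.
sdf_portfolio_return = (asset_weights * excess_returns).sum(axis=1)   # (T,)
sdf_series = 1.0 - sdf_portfolio_return

# No-arbitrage moment per asset: the sample mean of M_t * R^e_{t,i}.
# Training would push these toward zero; random weights will not achieve that.
pricing_errors = (sdf_series[:, None] * excess_returns).mean(axis=0)  # (N,)
```

Because the weights are the primary output, there is no beta or lambda to read off; any expected-return view has to come from a separate projection step.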

Direct Asset Prediction

SAEModel is a supervised autoencoder signal model. In this library it is treated as a direct predictor, not a latent-factor model.
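The general idea of a supervised autoencoder can be sketched as a forward pass with a shared bottleneck and a two-part loss. The snippet is NumPy-only with untrained random weights. The layer sizes, the tanh activation, and the trade-off weight `alpha` are all hypothetical choices for illustration and say nothing about how SAEModel is configured.

```python
import numpy as np

rng = np.random.default_rng(4)
n, d, h = 200, 12, 4                      # samples, input features, bottleneck width
x = rng.standard_normal((n, d))
y = rng.standard_normal(n)                # supervised target, e.g. a next-period return

# Untrained parameters; a real model would learn these by gradient descent.
w_enc = 0.1 * rng.standard_normal((d, h))
w_dec = 0.1 * rng.standard_normal((h, d))
w_head = 0.1 * rng.standard_normal(h)

code = np.tanh(x @ w_enc)                 # shared bottleneck representation
reconstruction = code @ w_dec             # autoencoder branch
forecast = code @ w_head                  # supervised branch: the direct prediction

reconstruction_loss = np.mean((x - reconstruction) ** 2)
prediction_loss = np.mean((y - forecast) ** 2)
alpha = 0.5                               # hypothetical trade-off weight
total_loss = alpha * reconstruction_loss + (1 - alpha) * prediction_loss
```

The forecast is read directly from the supervised head, with no factor-premium or mapping stage in between, which is what distinguishes a direct predictor from the latent-factor models above.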

Portfolio Learning

Portfolio models learn allocations directly:

  • LinearFeaturePortfolioModel as a deterministic baseline
  • LSTMPortfolioModel as a sequence baseline
  • DeepPortfolioModel as a structured DeePM-style allocator
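"Learning allocations directly" means the model output is a weight vector and the training objective is a portfolio-level metric rather than a forecast error. The NumPy sketch below shows one such setup under stated assumptions: a linear score per asset, a softmax to obtain long-only weights that sum to one, and a negative Sharpe ratio as the loss. The function names, the softmax constraint, and the choice of loss are illustrative and not taken from the library.

```python
import numpy as np

rng = np.random.default_rng(3)
T, N, F = 63, 20, 10                                 # dates, assets, features
features = rng.standard_normal((T, N, F))
next_returns = 0.01 * rng.standard_normal((T, N))    # earned after weights are set
theta = rng.standard_normal(F)                       # hypothetical scoring parameters


def allocate(features: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """Map features to long-only weights that sum to one on each date."""
    scores = features @ theta                        # (T, N)
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilise the softmax
    expd = np.exp(scores)
    return expd / expd.sum(axis=1, keepdims=True)


def negative_sharpe(weights: np.ndarray, next_returns: np.ndarray) -> float:
    """Loss to minimise: minus the per-period Sharpe ratio of the portfolio."""
    portfolio = (weights * next_returns).sum(axis=1)
    return float(-portfolio.mean() / portfolio.std(ddof=1))


weights = allocate(features, theta)
loss = negative_sharpe(weights, next_returns)
```

A sequence model follows the same pattern but replaces the linear score with a recurrent network over a lookback window.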

Design Principles

  • Finance-native data contracts rather than generic dataloaders
  • Explicit structural and predictive stages
  • Checkpoint-aware neural training
  • Clear separation between:
    • model estimation
    • forecasting
    • backtest and diagnostic integration
  • Integration boundaries with sibling libraries instead of duplicated evaluation logic
