Finance-specific latent-factor and portfolio-learning models for ML4T

ml4t-models

Python 3.12+ | PyPI | License: MIT

Finance-native model implementations for latent-factor estimation, stochastic discount factor learning, direct asset prediction, and end-to-end portfolio learning.

Documentation: https://ml4trading.io/docs/models/

Part of the ML4T Library Ecosystem

This library is one of six interconnected ML4T libraries supporting the research and production workflow described in Machine Learning for Trading.

[Figure: ML4T Library Ecosystem]

What This Library Does

ml4t-models packages paper-faithful model families that are common in modern empirical asset pricing and portfolio learning:

  • Latent-factor estimators with explicit structural outputs:
    • PCAModel
    • RPPCAModel
    • IPCAModel
    • CAEModel
  • Weight-native stochastic discount factor modeling:
    • StochasticDiscountFactorModel
  • Direct asset prediction:
    • SAEModel (SAE = supervised autoencoder)
  • End-to-end portfolio learning:
    • LinearFeaturePortfolioModel
    • LSTMPortfolioModel
    • DeepPortfolioModel

The library is built around finance-native contracts rather than generic tensor trainers:

  • PersistentPanelBatch for stable-ID panels
  • CrossSectionBatch for ragged dated cross-sections
  • PortfolioSequenceBatch for sequence-to-allocation models

It also keeps the predictive steps explicit:

  • structural extraction
  • factor-premium forecasting
  • asset mapping
  • downstream prediction and weight frames for ml4t-backtest and ml4t-diagnostic
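
The first three steps above can be sketched in plain NumPy. This is an illustration under stated assumptions, not the library's implementation: it uses ordinary PCA on synthetic returns for the structural step, a lagged expanding mean for the premia, and a beta × lambda product for the mapping. The zero forecast on the first date is an arbitrary choice made here.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, K = 60, 25, 3
returns = rng.standard_normal((T, N)) * 0.05  # synthetic excess returns

# 1. Structural extraction: static loadings from PCA of the return covariance.
demeaned = returns - returns.mean(axis=0)
cov = demeaned.T @ demeaned / (T - 1)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
loadings = eigvecs[:, ::-1][:, :K]          # (N, K), top-K components
factor_returns = returns @ loadings         # (T, K) realized factor returns

# 2. Factor-premium forecasting: expanding mean, lagged by one period so the
#    forecast for date t only uses factor returns observed before t.
expanding_mean = np.cumsum(factor_returns, axis=0) / np.arange(1, T + 1)[:, None]
premia = np.vstack([np.zeros((1, K)), expanding_mean[:-1]])  # (T, K)

# 3. Asset mapping: expected return = beta x lambda.
expected_returns = premia @ loadings.T      # (T, N)
```

Keeping the three arrays (`loadings`, `premia`, `expected_returns`) separate mirrors the idea that each stage can be inspected or swapped on its own.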

[Figure: ml4t-models Architecture]

Installation

pip install ml4t-models

Optional extras:

# Quote the requirement so shells such as zsh do not expand the brackets.
pip install "ml4t-models[deep]"         # torch-backed neural models
pip install "ml4t-models[integration]"  # polars + ml4t-specs bridges
pip install "ml4t-models[docs]"         # mkdocs site build
pip install "ml4t-models[all]"
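
If you are unsure which extras are present in an environment, a small standard-library check can help. This is a sketch: it assumes only that the `deep` extra provides `torch` and the `integration` extra provides `polars`, as the comments above indicate, and it does not query ml4t-models itself.

```python
from importlib.util import find_spec


def has_module(name):
    """Return True if a top-level module can be imported in this environment."""
    return find_spec(name) is not None


# Map each extra to the third-party module the install comments associate with it.
extras = {
    "deep": has_module("torch"),
    "integration": has_module("polars"),
}

for extra, available in extras.items():
    print(f"{extra}: {'available' if available else 'missing'}")
```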

Quick Start

1. Latent-Factor Forecast Pipeline

import numpy as np

from ml4t.models import (
    BetaLambdaMapper,
    CrossSectionBatch,
    ExpandingMeanFactorForecaster,
    IPCAConfig,
    IPCAModel,
    LatentFactorForecastPipeline,
)

batch = CrossSectionBatch(
    characteristics=np.random.randn(24, 200, 12),
    returns=np.random.randn(24, 200),
    timestamps=tuple(range(24)),
)

pipeline = LatentFactorForecastPipeline(
    model=IPCAModel(IPCAConfig(n_factors=3)),
    forecaster=ExpandingMeanFactorForecaster(),
    mapper=BetaLambdaMapper(),
)
pipeline.fit(batch)
prediction = pipeline.predict(batch)

print(prediction.asset_forecast.expected_returns.shape)
# (24, 200)

2. Weight-Native Stochastic Discount Factor

import numpy as np

from ml4t.models import (
    CrossSectionBatch,
    StochasticDiscountFactorConfig,
    StochasticDiscountFactorModel,
)

batch = CrossSectionBatch(
    characteristics=np.random.randn(36, 300, 16),
    returns=np.random.randn(36, 300),
    context_features=np.random.randn(36, 8),
    timestamps=tuple(range(36)),
)

model = StochasticDiscountFactorModel(
    StochasticDiscountFactorConfig(checkpoint_epochs=(256, 512, 768, 1024, 1280))
)
model.fit(batch)
state = model.extract(batch, checkpoint=1280)

print(state.asset_weights.shape)
# (36, 300)

3. End-to-End Portfolio Learning

import numpy as np

from ml4t.models import LSTMPortfolioConfig, LSTMPortfolioModel, PortfolioSequenceBatch

batch = PortfolioSequenceBatch(
    features=np.random.randn(8, 63, 20, 10),
    returns=np.random.randn(8, 63, 20),
    timestamps=tuple(range(63)),
    asset_ids=tuple(f"asset_{i}" for i in range(20)),
)

model = LSTMPortfolioModel(LSTMPortfolioConfig(max_iters=20, checkpoint_every=5))
model.fit(batch)
weights = model.predict(batch, checkpoint=20)

print(weights.weights.shape)
# (8, 63, 20)

4. Hand Off Predictions To The Rest Of ML4T

from ml4t.models import predictions_frame_from_asset_forecast, write_backtest_frames

# `prediction` is the pipeline output from example 1 above.
frame = predictions_frame_from_asset_forecast(prediction.asset_forecast)
write_backtest_frames("artifacts/run_001", predictions=frame)
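
For intuition about what a predictions frame carries, the sketch below flattens a (dates × assets) forecast array into long-format records. The column names `timestamp`, `asset`, and `prediction` are hypothetical; the actual schema produced by predictions_frame_from_asset_forecast is not stated here and may differ.

```python
import numpy as np

rng = np.random.default_rng(6)
timestamps = tuple(range(3))
asset_ids = ("A", "B")
expected_returns = rng.standard_normal((len(timestamps), len(asset_ids)))

# One record per (date, asset) pair, ordered by date and then by asset.
rows = [
    {"timestamp": ts, "asset": asset, "prediction": float(expected_returns[t, n])}
    for t, ts in enumerate(timestamps)
    for n, asset in enumerate(asset_ids)
]
```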

Model Families

Latent Factors

These models estimate a structural representation first, then let a separate forecaster produce ex ante factor premia.

| Model | Contract | Native output | Predictive step |
|---|---|---|---|
| PCAModel | PersistentPanelBatch | static loadings, factor returns | factor-premium forecaster + mapper |
| RPPCAModel | PersistentPanelBatch | risk-premium-aware latent factors | factor-premium forecaster + mapper |
| IPCAModel | CrossSectionBatch | characteristic-implied betas, factor history | factor-premium forecaster + mapper |
| CAEModel | CrossSectionBatch | nonlinear characteristic betas, factor history | factor-premium forecaster + mapper |
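
To illustrate what "characteristic-implied betas" and "factor history" mean, the sketch below builds betas as a linear map of characteristics and recovers factor returns by period-by-period cross-sectional least squares. It is a simplification under stated assumptions: the mapping `gamma` is taken as known, whereas an IPCA-style estimator would alternate between estimating the mapping and the factors. None of the names below come from the library.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N, L, K = 24, 50, 6, 2

characteristics = rng.standard_normal((T, N, L))        # z_{i,t}
gamma = np.linalg.qr(rng.standard_normal((L, K)))[0]    # (L, K), assumed known
true_factors = rng.standard_normal((T, K)) * 0.03

# Characteristic-implied betas: beta_{i,t} = z_{i,t}' gamma.
betas = characteristics @ gamma                         # (T, N, K)

# Synthetic returns generated from the factor structure plus small noise.
noise = rng.standard_normal((T, N)) * 0.001
returns = np.einsum("tnk,tk->tn", betas, true_factors) + noise

# Factor history: regress each date's returns on that date's betas.
factors = np.stack(
    [np.linalg.lstsq(betas[t], returns[t], rcond=None)[0] for t in range(T)]
)
```

Because betas vary with characteristics, the same factor history can price assets whose exposures change over time, which is why these models pair with a ragged cross-section contract.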

Stochastic Discount Factor

StochasticDiscountFactorModel is not a beta × lambda latent-factor model. Instead of loadings and factor premia, it learns asset weights directly under a no-arbitrage objective and exposes:

  • asset weights
  • SDF series
  • checkpointed phase-aware training state

Optional return projections are handled by separate mappers.
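
One common way to write a weight-native SDF is M_{t+1} = 1 - sum_i w_{t,i} R^e_{t+1,i}, with the no-arbitrage condition E[M_{t+1} R^e_{t+1,i}] = 0 for every asset. That functional form is an assumption here; this page does not spell out the model's objective. The NumPy sketch below computes the SDF series and the pricing errors for given weights, using random weights and a gross-exposure normalization chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N = 36, 10
excess_returns = rng.standard_normal((T, N)) * 0.04

# Placeholder weights; a trained model would produce these from characteristics.
weights = rng.standard_normal((T, N))
weights /= np.abs(weights).sum(axis=1, keepdims=True)  # unit gross exposure (assumed)

# SDF series: one minus the return of the weight-implied portfolio.
sdf_portfolio = (weights * excess_returns).sum(axis=1)  # (T,)
sdf = 1.0 - sdf_portfolio                               # (T,)

# Sample analogue of E[M * R^e] per asset; zero under exact no-arbitrage pricing.
pricing_errors = (sdf[:, None] * excess_returns).mean(axis=0)  # (N,)
loss = float(np.mean(pricing_errors**2))
```

A training procedure would adjust the weights to shrink `loss`; here the value only shows how the moment condition is evaluated.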

Direct Asset Prediction

SAEModel is a supervised autoencoder signal model. In this library it is treated as a direct predictor, not a latent-factor model.
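
The idea of a supervised autoencoder can be sketched as a shared encoder feeding both a reconstruction head and a prediction head, trained on a weighted sum of the two losses. The forward pass below is a minimal NumPy illustration with untrained random weights; the layer sizes, the tanh activation, and the `supervised_weight` parameter are assumptions, not SAEModel's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(3)
n_obs, n_features, n_latent = 200, 12, 4
x = rng.standard_normal((n_obs, n_features))   # asset features
y = rng.standard_normal(n_obs) * 0.02          # target returns

# Untrained parameters, for shape illustration only.
w_enc = rng.standard_normal((n_features, n_latent)) * 0.1
w_dec = rng.standard_normal((n_latent, n_features)) * 0.1
w_head = rng.standard_normal(n_latent) * 0.1


def sae_forward(x):
    """Encode once, then decode (reconstruction) and predict (supervised head)."""
    code = np.tanh(x @ w_enc)
    return code @ w_dec, code @ w_head


def sae_loss(x, y, supervised_weight=1.0):
    """Reconstruction MSE plus weighted prediction MSE."""
    reconstruction, prediction = sae_forward(x)
    recon_loss = np.mean((reconstruction - x) ** 2)
    pred_loss = np.mean((prediction - y) ** 2)
    return float(recon_loss + supervised_weight * pred_loss)


reconstruction, prediction = sae_forward(x)
total = sae_loss(x, y)
```

The prediction head is what makes the model a direct predictor: its output is the signal, and the latent code is not interpreted as a set of priced factors.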

Portfolio Learning

Portfolio models learn allocations directly:

  • LinearFeaturePortfolioModel as a deterministic baseline
  • LSTMPortfolioModel as a sequence baseline
  • DeepPortfolioModel as a structured DeePM-style allocator
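
"Learning allocations directly" means the loss is computed on portfolio outcomes rather than on return forecasts. The sketch below shows that idea with a linear scoring rule, a softmax to obtain long-only weights, and a negative Sharpe ratio as the objective. The scoring rule, the long-only constraint, and the objective are assumptions for illustration and do not describe any of the three models above.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


rng = np.random.default_rng(4)
T, N, F = 63, 20, 10
features = rng.standard_normal((T, N, F))
next_returns = rng.standard_normal((T, N)) * 0.01
theta = rng.standard_normal(F) * 0.1   # parameters of the linear scoring rule


def allocate(features, theta):
    """Map features to long-only weights that sum to one on each date."""
    return softmax(features @ theta, axis=-1)


def negative_sharpe(theta):
    """Objective evaluated on realized portfolio returns, not on forecasts."""
    weights = allocate(features, theta)
    portfolio_returns = (weights * next_returns).sum(axis=1)
    return float(-portfolio_returns.mean() / (portfolio_returns.std() + 1e-12))


weights = allocate(features, theta)
loss = negative_sharpe(theta)
```

An optimizer would update `theta` to reduce `loss`, so the allocation rule itself is what gets trained.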

Design Principles

  • Finance-native data contracts rather than generic dataloaders
  • Explicit structural and predictive stages
  • Checkpoint-aware neural training
  • Clear separation between:
    • model estimation
    • forecasting
    • backtest and diagnostic integration
  • Integration boundaries with sibling libraries instead of duplicated evaluation logic
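
Checkpoint-aware training can be pictured as a loop that snapshots parameters at chosen iterations so that any snapshot can be evaluated later. The sketch below does this for plain gradient descent on a least-squares problem. The function `train_with_checkpoints` is hypothetical; its `max_iters` and `checkpoint_every` arguments only echo the names used in the Quick Start and do not reproduce the library's training loop.

```python
import numpy as np


def train_with_checkpoints(x, y, max_iters, checkpoint_every, lr=0.05):
    """Gradient descent on mean squared error, keeping periodic snapshots.

    Hypothetical helper. Returns a dict mapping iteration number to a copy
    of the weight vector at that iteration.
    """
    w = np.zeros(x.shape[1])
    checkpoints = {}
    for it in range(1, max_iters + 1):
        grad = 2.0 * x.T @ (x @ w - y) / len(y)
        w = w - lr * grad
        if it % checkpoint_every == 0:
            checkpoints[it] = w.copy()
    return checkpoints


rng = np.random.default_rng(5)
x = rng.standard_normal((100, 3))
y = x @ np.array([0.5, -1.0, 2.0])

checkpoints = train_with_checkpoints(x, y, max_iters=20, checkpoint_every=5)

# Evaluate every snapshot after the fact, e.g. to pick one on validation data.
mse = {it: float(np.mean((x @ w - y) ** 2)) for it, w in checkpoints.items()}
```

Selecting a checkpoint by key after training is what calls such as `model.predict(batch, checkpoint=20)` in the Quick Start suggest.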
