ml4t-models

Finance-specific latent-factor and portfolio-learning models for ML4T

Python 3.12+ · PyPI · License: MIT

Finance-native model implementations for latent-factor estimation, stochastic discount factor learning, direct asset prediction, and end-to-end portfolio learning.

Documentation: https://ml4trading.io/docs/models/

Part of the ML4T Library Ecosystem

This library is one of six interconnected ML4T libraries supporting the research and production workflow described in Machine Learning for Trading.

[Diagram: ML4T Library Ecosystem]

What This Library Does

ml4t-models packages paper-faithful model families that are common in modern empirical asset pricing and portfolio learning:

  • Latent-factor estimators with explicit structural outputs:
    • PCAModel
    • RPPCAModel
    • IPCAModel
    • CAEModel
  • Weight-native stochastic discount factor modeling:
    • StochasticDiscountFactorModel
  • Direct asset prediction:
    • SAEModel (SAE = supervised autoencoder)
  • End-to-end portfolio learning:
    • LinearFeaturePortfolioModel
    • LSTMPortfolioModel
    • DeepPortfolioModel

The library is built around finance-native contracts rather than generic tensor trainers:

  • PersistentPanelBatch for stable-ID panels
  • CrossSectionBatch for ragged dated cross-sections
  • PortfolioSequenceBatch for sequence-to-allocation models
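To illustrate what such a contract enforces, here is a hypothetical stand-in for a dated cross-section container (the class name, fields, and checks are illustrative, not the library's actual implementation):

```python
from dataclasses import dataclass
import numpy as np

@dataclass(frozen=True)
class CrossSectionBatchSketch:
    """Illustrative stand-in for a dated cross-section contract.

    characteristics: (T, N, C) asset characteristics per date
    returns:         (T, N)    realized returns per date
    timestamps:      length-T date index
    """
    characteristics: np.ndarray
    returns: np.ndarray
    timestamps: tuple

    def __post_init__(self):
        T, N, _ = self.characteristics.shape
        # All fields must share the same leading (date, asset) dimensions.
        assert self.returns.shape == (T, N), "returns must align with characteristics"
        assert len(self.timestamps) == T, "one timestamp per cross-section"

batch = CrossSectionBatchSketch(
    characteristics=np.random.randn(24, 200, 12),
    returns=np.random.randn(24, 200),
    timestamps=tuple(range(24)),
)
```

The point of the contract is that shape mismatches fail at construction time rather than deep inside a model's fit loop.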

It also keeps the predictive steps explicit:

  • structural extraction
  • factor-premium forecasting
  • asset mapping
  • downstream prediction and weight frames for ml4t-backtest and ml4t-diagnostic
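To make the stage boundary concrete, here is a pure-NumPy sketch of the forecasting and mapping steps under the usual beta-lambda convention. The expanding-mean premia forecast and the shapes are assumptions for illustration, not the library's internals:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, K = 24, 200, 3

# Structural extraction yields conditional betas and realized factor returns.
betas = rng.standard_normal((T, N, K))        # (dates, assets, factors)
factor_returns = rng.standard_normal((T, K))  # (dates, factors)

# Factor-premium forecasting: expanding mean of realized factor returns.
cum = np.cumsum(factor_returns, axis=0)
counts = np.arange(1, T + 1)[:, None]
lambda_hat = cum / counts                     # (dates, factors) ex ante premia

# Asset mapping: expected return is the beta-lambda inner product per date.
expected_returns = np.einsum("tnk,tk->tn", betas, lambda_hat)
print(expected_returns.shape)  # (24, 200)
```

Each stage is a separate object in the library, so the same structural estimator can be paired with different premia forecasters.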

[Diagram: ml4t-models architecture]

Installation

pip install ml4t-models

Optional extras:

pip install ml4t-models[deep]         # torch-backed neural models
pip install ml4t-models[integration]  # polars + ml4t-specs bridges
pip install ml4t-models[docs]         # mkdocs site build
pip install ml4t-models[all]

Quick Start

1. Latent-Factor Forecast Pipeline

import numpy as np

from ml4t.models import (
    BetaLambdaMapper,
    CrossSectionBatch,
    ExpandingMeanFactorForecaster,
    IPCAConfig,
    IPCAModel,
    LatentFactorForecastPipeline,
)

batch = CrossSectionBatch(
    characteristics=np.random.randn(24, 200, 12),  # (dates, assets, characteristics)
    returns=np.random.randn(24, 200),              # (dates, assets)
    timestamps=tuple(range(24)),
)

pipeline = LatentFactorForecastPipeline(
    model=IPCAModel(IPCAConfig(n_factors=3)),
    forecaster=ExpandingMeanFactorForecaster(),
    mapper=BetaLambdaMapper(),
)
pipeline.fit(batch)
prediction = pipeline.predict(batch)

print(prediction.asset_forecast.expected_returns.shape)
# (24, 200)

2. Weight-Native Stochastic Discount Factor

import numpy as np

from ml4t.models import (
    CrossSectionBatch,
    StochasticDiscountFactorConfig,
    StochasticDiscountFactorModel,
)

batch = CrossSectionBatch(
    characteristics=np.random.randn(36, 300, 16),  # (dates, assets, characteristics)
    returns=np.random.randn(36, 300),              # (dates, assets)
    context_features=np.random.randn(36, 8),       # (dates, context features)
    timestamps=tuple(range(36)),
)

model = StochasticDiscountFactorModel(
    StochasticDiscountFactorConfig(checkpoint_epochs=(256, 512, 768, 1024, 1280))
)
model.fit(batch)
state = model.extract(batch, checkpoint=1280)

print(state.asset_weights.shape)
# (36, 300)

3. End-to-End Portfolio Learning

import numpy as np

from ml4t.models import LSTMPortfolioConfig, LSTMPortfolioModel, PortfolioSequenceBatch

batch = PortfolioSequenceBatch(
    features=np.random.randn(8, 63, 20, 10),  # (sequences, steps, assets, features)
    returns=np.random.randn(8, 63, 20),       # (sequences, steps, assets)
    timestamps=tuple(range(63)),
    asset_ids=tuple(f"asset_{i}" for i in range(20)),
)

model = LSTMPortfolioModel(LSTMPortfolioConfig(max_iters=20, checkpoint_every=5))
model.fit(batch)
weights = model.predict(batch, checkpoint=20)

print(weights.weights.shape)
# (8, 63, 20)

4. Hand Off Predictions To The Rest Of ML4T

from ml4t.models import predictions_frame_from_asset_forecast, write_backtest_frames

# `prediction` is the LatentFactorForecastPipeline output from Quick Start 1.
frame = predictions_frame_from_asset_forecast(prediction.asset_forecast)
write_backtest_frames("artifacts/run_001", predictions=frame)

Model Families

Latent Factors

These models estimate a structural representation first, then let a separate forecaster produce ex ante factor premia.

| Model      | Contract             | Native output                                | Predictive step                     |
|------------|----------------------|----------------------------------------------|-------------------------------------|
| PCAModel   | PersistentPanelBatch | static loadings, factor returns              | factor-premium forecaster + mapper  |
| RPPCAModel | PersistentPanelBatch | risk-premium-aware latent factors            | factor-premium forecaster + mapper  |
| IPCAModel  | CrossSectionBatch    | characteristic-implied betas, factor history | factor-premium forecaster + mapper  |
| CAEModel   | CrossSectionBatch    | nonlinear characteristic betas, factor history | factor-premium forecaster + mapper |
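In equation form, the shared two-stage convention (a sketch of the standard beta-lambda decomposition, not library-specific notation) is:

```latex
% Structural stage: conditional betas and factor realizations
r_{i,t+1} = \beta_{i,t}^{\top} f_{t+1} + \epsilon_{i,t+1}
% Predictive stage: an ex ante premium forecast mapped back to assets
\hat{\lambda}_{t+1} = \mathbb{E}_t\!\left[f_{t+1}\right], \qquad
\hat{r}_{i,t+1} = \hat{\beta}_{i,t}^{\top} \hat{\lambda}_{t+1}
```

The models in the table differ in how they estimate the betas and factors; the predictive stage is always delegated to a separate forecaster and mapper.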

Stochastic Discount Factor

StochasticDiscountFactorModel is not a beta × lambda latent-factor model. It learns a weight-native no-arbitrage object and exposes:

  • asset weights
  • SDF series
  • checkpointed phase-aware training state

Optional return projections are handled by separate mappers.
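For orientation, the weight-native SDF object in this literature (a sketch in generic notation; the library's exact parameterization may differ) has the form:

```latex
M_{t+1} = 1 - \sum_{i=1}^{N_t} w\!\left(z_{i,t}, h_t\right) R^{e}_{i,t+1},
\qquad
\mathbb{E}\!\left[M_{t+1}\, R^{e}_{i,t+1}\right] = 0 \;\; \forall i
```

where $w$ is a weight function of asset characteristics $z_{i,t}$ and market context $h_t$, and $R^{e}_{i,t+1}$ are excess returns. The learned weights are themselves the primary output, which is why the model exposes them directly rather than a beta-lambda decomposition.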

Direct Asset Prediction

SAEModel is a supervised autoencoder signal model. In this library it is treated as a direct predictor, not a latent-factor model.
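A supervised autoencoder in this sense trains reconstruction and return prediction jointly; a sketch of the combined objective, with an illustrative weighting $\gamma$:

```latex
\mathcal{L} =
\underbrace{\lVert x_{i,t} - \hat{x}_{i,t} \rVert^{2}}_{\text{reconstruction}}
\; + \;
\gamma \, \underbrace{\left( r_{i,t+1} - \hat{r}_{i,t+1} \right)^{2}}_{\text{supervised prediction}}
```

Because the supervised head predicts returns directly, no separate factor-premium forecaster or mapper is involved.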

Portfolio Learning

Portfolio models learn allocations directly:

  • LinearFeaturePortfolioModel as a deterministic baseline
  • LSTMPortfolioModel as a sequence baseline
  • DeepPortfolioModel as a structured DeePM-style allocator
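End-to-end allocators of this kind typically train weights against a risk-adjusted performance objective rather than a prediction loss. A pure-NumPy sketch of a differentiable-Sharpe-style objective (the specific loss each model optimizes is an assumption here, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 63, 20

# Stand-in for model output: long-only allocations, each row sums to 1.
weights = rng.dirichlet(np.ones(N), size=T)   # (steps, assets)
returns = rng.standard_normal((T, N)) * 0.01  # (steps, assets) per-asset returns

# Realized portfolio return series implied by the weights.
port = (weights * returns).sum(axis=1)        # (steps,)

# Negative Sharpe ratio as the quantity a trainer would minimize.
sharpe = port.mean() / (port.std() + 1e-8)
loss = -sharpe
```

Training the weight function against such an objective end to end is what distinguishes these models from the prediction-then-optimize pipeline in the latent-factor families.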

Design Principles

  • Finance-native data contracts rather than generic dataloaders
  • Explicit structural and predictive stages
  • Checkpoint-aware neural training
  • Clear separation between:
    • model estimation
    • forecasting
    • backtest and diagnostic integration
  • Integration boundaries with sibling libraries instead of duplicated evaluation logic
