Empirical asset pricing toolkit: ML prediction, factor models, cross-sectional tests, SDF/GMM

eapctf: Empirical Asset Pricing Toolkit

A Python package for empirical asset pricing research, covering factor construction, cross-sectional tests, SDF/GMM estimation, ML-based return prediction, and portfolio optimization.

Installation

# with uv (recommended)
uv add eapctf

# with pip
pip install eapctf

Optional ML extras (PyTorch, LightGBM):

uv add "eapctf[ml]"

Quick Start

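All examples assume a long-format panel DataFrame named data with columns date, permno, ret, mktcap, exchcd, plus whichever characteristic columns an example references (for instance bm, op, inv). The factor examples also read a risk-free rate column rf.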
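For readers without a CRSP-style dataset at hand, the following is a minimal sketch of a synthetic panel with the column layout described above. The helper name make_toy_panel, the value distributions, and the month-start date stamps are illustrative assumptions, not part of eapctf; the numbers carry no economic meaning.

```python
import numpy as np
import pandas as pd

def make_toy_panel(n_months=24, n_stocks=50, seed=0):
    """Synthetic long-format panel with the columns the examples expect (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range("2000-01-01", periods=n_months, freq="MS")  # month-start stamps
    permnos = np.arange(10001, 10001 + n_stocks)
    # One row per (date, permno) pair; dates vary slowest
    idx = pd.MultiIndex.from_product([dates, permnos], names=["date", "permno"])
    n = len(idx)
    exch = rng.choice([1, 2, 3], size=n_stocks)  # one exchange code per stock
    panel = pd.DataFrame(
        {
            "ret": rng.normal(0.01, 0.08, n),       # monthly return
            "mktcap": rng.lognormal(6.0, 1.5, n),   # market capitalization
            "exchcd": np.tile(exch, n_months),      # repeats per date, aligned with permnos
            "bm": rng.lognormal(-0.5, 0.6, n),      # book-to-market characteristic
            "rf": 0.003,                            # constant risk-free rate
        },
        index=idx,
    )
    return panel.reset_index()

data = make_toy_panel()
print(data.head())
```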

1. Portfolio Sorting

from eapctf.sorting import univariate_sort, bivariate_sort, ff3_factors, ff5_factors

# Decile sort on book-to-market with NYSE breakpoints (JKP micro-cap filter on by default)
result = univariate_sort(data, char_col="bm", n_portfolios=10, weighting="vw")
print(result.portfolio_returns)   # DataFrame: date x [port_1, ..., port_10, long_short]
print(result.portfolio_stats)     # mean, std, Sharpe, t-stat per decile

# Fama-French 3-factor model
ff3 = ff3_factors(data, bm_col="bm", rf_col="rf")
print(ff3.factors)   # DataFrame: date x [mkt_rf, smb, hml]

# Fama-French 5-factor model
ff5 = ff5_factors(data, bm_col="bm", op_col="op", inv_col="inv", rf_col="rf")
print(ff5.factors)   # DataFrame: date x [mkt_rf, smb, hml, rmw, cma]
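To make the sorting step concrete, here is a minimal pandas sketch of a value-weighted decile sort with NYSE-only breakpoints. It is not the package's implementation: the function name nyse_decile_sort is hypothetical, exchcd == 1 is assumed to mark NYSE stocks (the CRSP convention), and the sketch skips the micro-cap filter and the lagging of characteristics and weights that a real study needs.

```python
import numpy as np
import pandas as pd

def nyse_decile_sort(panel, char_col, n_portfolios=10):
    """Value-weighted portfolio returns per date, breakpoints from NYSE stocks only."""
    qs = np.linspace(0, 1, n_portfolios + 1)[1:-1]  # interior quantiles (0.1, ..., 0.9)
    rows = {}
    for date, g in panel.groupby("date"):
        nyse_vals = g.loc[g["exchcd"] == 1, char_col].to_numpy()
        breaks = np.quantile(nyse_vals, qs)
        # Assign every stock (NYSE or not) to a bucket 0..n_portfolios-1
        bucket = np.searchsorted(breaks, g[char_col].to_numpy(), side="right")
        w = g["mktcap"].to_numpy()
        r = g["ret"].to_numpy()
        rows[date] = {
            f"port_{k + 1}": (
                np.average(r[bucket == k], weights=w[bucket == k])
                if (bucket == k).any() else np.nan
            )
            for k in range(n_portfolios)
        }
    out = pd.DataFrame.from_dict(rows, orient="index")
    out["long_short"] = out[f"port_{n_portfolios}"] - out["port_1"]
    return out

# Small synthetic demo: two dates, 200 stocks each
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "date": pd.to_datetime(["2000-01-31", "2000-02-29"]).repeat(200),
    "permno": np.tile(np.arange(200), 2),
    "ret": rng.normal(0.01, 0.05, 400),
    "mktcap": rng.lognormal(6, 1, 400),
    "exchcd": rng.choice([1, 3], 400),
    "bm": rng.lognormal(0, 0.5, 400),
})
sorted_ret = nyse_decile_sort(demo, "bm")
print(sorted_ret.round(4))
```

Using NYSE breakpoints keeps the many small non-NYSE stocks from dominating the extreme deciles; value weighting then limits their influence on portfolio returns.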

2. Fama-MacBeth Cross-Sectional Regression

from eapctf.crosssection import fama_macbeth

# Characteristic-based FM: cross-sectional regression of ret on chars each period
result = fama_macbeth(data, char_cols=["bm", "size", "mom"])
print(result.lambdas[["coef", "t_shanken"]])  # risk premia with Shanken-corrected t-stats
print(result.r_squared)                        # time-series average cross-sectional R²

# Factor-based FM (two-pass): estimate betas first, then price them
result2 = fama_macbeth(data, factor_cols=["mkt_rf", "smb", "hml"])
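The characteristic-based procedure can be sketched in a few lines of NumPy: run one cross-sectional OLS per period, then average the slopes over time. The function name fama_macbeth_sketch and the array layout are assumptions for illustration. The standard error here is the plain Fama-MacBeth one, without the Newey-West or Shanken adjustments that the package reports.

```python
import numpy as np

def fama_macbeth_sketch(y, X):
    """y: (T, N) returns; X: (T, N, K) characteristics.

    Returns time-series mean of the per-period coefficients and their t-stats.
    """
    T, N, K = X.shape
    lambdas = np.empty((T, K + 1))
    for t in range(T):
        Xt = np.column_stack([np.ones(N), X[t]])              # add intercept
        lambdas[t] = np.linalg.lstsq(Xt, y[t], rcond=None)[0]  # cross-sectional OLS
    mean = lambdas.mean(axis=0)
    se = lambdas.std(axis=0, ddof=1) / np.sqrt(T)             # plain FM standard error
    return mean, mean / se

# Simulated panel with known slopes 0.5 and -0.2
rng = np.random.default_rng(1)
T, N, K = 120, 200, 2
true_slopes = np.array([0.5, -0.2])
X = rng.normal(size=(T, N, K))
y = 0.01 + X @ true_slopes + rng.normal(0, 0.1, size=(T, N))

fm_mean, fm_t = fama_macbeth_sketch(y, X)
print(fm_mean)  # intercept followed by the two slope estimates
```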

3. Time-Series Alpha and GRS Test

from eapctf.timeseries import time_series_alpha, grs_test

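# `result` is the univariate_sort output from section 1, not the Fama-MacBeth result
ports = result.portfolio_returns

# Single portfolio (one column): returns one AlphaResult
alpha_res = time_series_alpha(ports["long_short"], ff3.factors)
print(alpha_res.alpha)    # intercept
print(alpha_res.alpha_t)  # Newey-West t-statistic

# Multiple portfolios (DataFrame): returns list[AlphaResult]
alpha_list = time_series_alpha(ports, ff3.factors)

# GRS test: are all portfolio alphas jointly zero?
# long_short is a linear combination of the decile columns, so drop it to keep
# the residual covariance matrix invertible
grs = grs_test(ports.drop(columns="long_short"), ff3.factors)
print(grs.statistic, grs.p_value)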
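For reference, the GRS statistic itself can be sketched directly in NumPy. This is the textbook form with maximum-likelihood covariance estimates, not necessarily what grs_test computes internally, and the function name grs_statistic is hypothetical. Under the null of zero alphas and normal errors, the statistic follows an F distribution with N and T - N - K degrees of freedom.

```python
import numpy as np

def grs_statistic(R, F):
    """R: (T, N) excess returns of test assets; F: (T, K) factor returns."""
    T, N = R.shape
    K = F.shape[1]
    X = np.column_stack([np.ones(T), F])
    coef = np.linalg.lstsq(X, R, rcond=None)[0]   # (K+1, N): alphas in row 0
    alpha = coef[0]
    resid = R - X @ coef
    sigma = resid.T @ resid / T                   # MLE residual covariance
    mu_f = F.mean(axis=0)
    Fc = F - mu_f
    omega = Fc.T @ Fc / T                         # MLE factor covariance
    quad_alpha = alpha @ np.linalg.solve(sigma, alpha)
    quad_f = mu_f @ np.linalg.solve(omega, mu_f)
    return (T - N - K) / N * quad_alpha / (1.0 + quad_f)

# Simulated returns that the factors price exactly, then the same with a 5% alpha added
rng = np.random.default_rng(2)
T, N, K = 240, 5, 3
F = rng.normal(0.005, 0.04, size=(T, K))
beta = rng.normal(1.0, 0.3, size=(N, K))
R0 = F @ beta.T + rng.normal(0, 0.02, size=(T, N))

stat_null = grs_statistic(R0, F)
stat_alt = grs_statistic(R0 + 0.05, F)
print(stat_null, stat_alt)
```

The residual covariance must be invertible, so the test assets cannot include exact linear combinations of one another.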

4. SDF / GMM Estimation

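import pandas as pd
from eapctf.sdf import gmm_estimate, hj_distance, hj_bounds

# port_ret: date-indexed DataFrame of test-portfolio returns, aligned with ff3.factors
# (for example, the decile columns of portfolio_returns from section 1)

# Two-step efficient GMM (default)
gmm = gmm_estimate(port_ret, ff3.factors, two_step=True)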
print(gmm.b)           # SDF loadings (K,)
print(gmm.t_stats)     # t-statistics
print(gmm.j_statistic, gmm.j_p_value)  # overidentification J-test

# HJ distance: pass a pre-computed SDF proxy (e.g., from GMM)
f_demeaned = ff3.factors.sub(ff3.factors.mean())
sdf_proxy = 1 - f_demeaned.values @ gmm.b
hj = hj_distance(port_ret, pd.Series(sdf_proxy, index=port_ret.index))
print(hj.distance)

# HJ volatility bounds
bounds = hj_bounds(port_ret)
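The SDF proxy above has the linear form m_t = 1 - b'(f_t - mean(f)), and the moment condition E[m R] = 0 for excess returns implies E[R] = Cov(R, f) b. A minimal sketch of the first-stage (identity-weighted) GMM estimate of b follows; it assumes R holds excess returns, the function name is hypothetical, and the efficient second step with the inverse moment covariance as weighting matrix is not shown.

```python
import numpy as np

def first_stage_sdf_loadings(R, F):
    """R: (T, N) excess returns; F: (T, K) factors. Returns (b, m)."""
    T = R.shape[0]
    Fc = F - F.mean(axis=0)                       # demeaned factors
    d = R.T @ Fc / T                              # (N, K) sample E[R (f - mean f)']
    mu = R.mean(axis=0)                           # sample mean excess returns
    b = np.linalg.lstsq(d, mu, rcond=None)[0]     # minimizes |mu - d b|^2
    m = 1.0 - Fc @ b                              # fitted SDF series, mean one
    return b, m

# Simulated returns whose sample means satisfy the pricing restriction exactly
rng = np.random.default_rng(3)
T, N, K = 600, 10, 3
b_true = np.array([2.0, -1.0, 0.5])
F = rng.normal(0.0, 0.04, size=(T, K))
R = F @ rng.normal(1.0, 0.5, size=(N, K)).T + rng.normal(0, 0.02, size=(T, N))
d_sample = R.T @ (F - F.mean(axis=0)) / T
R = R - R.mean(axis=0) + d_sample @ b_true        # shift means; covariances unchanged

b_hat, m_hat = first_stage_sdf_loadings(R, F)
pricing_err = (m_hat[:, None] * R).mean(axis=0)   # sample E[m R] per asset
print(b_hat)
```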

5. ML Out-of-Sample Return Prediction

from eapctf.predict import expanding_window_oos, make_predictor

model = make_predictor("lasso", alpha=0.01)
oos = expanding_window_oos(
    data,
    char_cols=["bm", "size", "mom", "op", "inv"],
    models=[model],
    train_min_periods=240,   # 240 monthly periods, i.e. at least 20 years of training data
)
print(oos.oos_r2)              # OOS R² averaged across models (GKX 2020)
print(oos.oos_r2_by_model)     # OOS R² per model
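The out-of-sample R² in Gu, Kelly & Xiu (2020) benchmarks forecasts against a prediction of zero, so the denominator is the sum of squared returns without demeaning. Below is a minimal sketch of that metric together with a generic expanding-window split. Both helper names are hypothetical, and the package's actual refit schedule and validation handling may differ.

```python
import numpy as np

def oos_r2(realized, predicted):
    """OOS R-squared against a zero-forecast benchmark (no demeaning)."""
    realized = np.asarray(realized, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 1.0 - np.sum((realized - predicted) ** 2) / np.sum(realized ** 2)

def expanding_splits(n_periods, train_min_periods):
    """Yield (train_idx, test_idx): train on periods [0, t), test on period t."""
    for t in range(train_min_periods, n_periods):
        yield np.arange(t), np.array([t])

r = np.array([0.02, -0.01, 0.03, 0.00, -0.02])
print(oos_r2(r, r))                 # perfect forecast
print(oos_r2(r, np.zeros_like(r)))  # zero forecast equals the benchmark

splits = list(expanding_splits(n_periods=5, train_min_periods=3))
for train_idx, test_idx in splits:
    print(train_idx, test_idx)
```

A forecast that does no better than always predicting zero scores 0 under this definition, which is a stricter hurdle than the historical-mean benchmark for individual stock returns.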

6. Portfolio Optimization

from eapctf.sorting import long_short_portfolio
from eapctf.portfolio import mean_variance_weights, hrp_weights

# Long-short portfolio from a signal
ls = long_short_portfolio(data, signal_col="bm", n_portfolios=10, weighting="vw")
print(ls.returns["long_short"])   # long-short return series
print(ls.metrics)                 # mean, std, Sharpe, etc.

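# Mean-variance optimization
# mu: expected returns (length N); sigma: N x N covariance matrix, both estimated beforehand
# (for example from port_ret.mean() and port_ret.cov())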
weights = mean_variance_weights(
    expected_returns=mu,
    cov_matrix_input=sigma,
    method="max_sharpe",
)

# Hierarchical Risk Parity
weights_hrp = hrp_weights(returns_data=port_ret)
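As a point of reference for the max_sharpe method, the unconstrained tangency portfolio has weights proportional to the inverse covariance matrix times expected excess returns. The sketch below is not the package's optimizer: it ignores constraints such as long-only bounds, assumes mu holds excess returns, and assumes the unnormalized weights sum to a positive number. The function names and input values are illustrative.

```python
import numpy as np

def max_sharpe_weights(mu, sigma):
    """Unconstrained tangency weights, normalized to sum to one."""
    raw = np.linalg.solve(sigma, mu)   # proportional to inv(sigma) @ mu
    return raw / raw.sum()

def portfolio_sharpe(w, mu, sigma):
    """Ex-ante Sharpe ratio of weights w given mu and sigma."""
    return (w @ mu) / np.sqrt(w @ sigma @ w)

mu = np.array([0.05, 0.07, 0.10])
sigma = np.array([
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.16],
])

w_tan = max_sharpe_weights(mu, sigma)
w_eq = np.full(3, 1 / 3)
print(w_tan)
print(portfolio_sharpe(w_tan, mu, sigma), portfolio_sharpe(w_eq, mu, sigma))
```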

CTF (Common Task Framework)

eapctf.ctf provides a local replication pipeline for the Common Task Framework introduced in Hoberg, Jensen, Kelly & Pedersen (2025). The CTF evaluates portfolio strategies on a shared holdout test set across 402 firm characteristics (153 JKP + 249 additional GFD factors).

Pipeline

from eapctf.ctf import run_local, compute_metrics, validate

# 1. Run a CTF model script locally
weights = run_local("models/my-model.py", data_dir="data/ctf/")

# 2. Evaluate performance (10% vol-targeting matches CTF server methodology)
daily_ret = pd.read_parquet("data/ctf/ctff_daily_ret.parquet")
metrics = compute_metrics(weights, daily_ret, vol_target=0.10)
print(metrics)

# 3. Check compliance before submission
report = validate("models/my-model.py", data_dir="data/ctf/")
print(report)

Starting a New Model

cp reference/template-ctf-model.py models/my-model.py
# Edit models/my-model.py — replace TODO sections with your implementation

The template provides a complete rolling-window train/predict loop with rank-normalized features, OLS prediction, and z-score portfolio weights. Replace train_model() / predict_returns() / construct_weights() with your approach; the rest of the pipeline stays the same.
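The three steps the template describes (rank-normalized features, OLS prediction, z-score weights) can be sketched for a single cross-section as follows. This is an illustration under assumptions, not the template's code: the function names are hypothetical stand-ins for train_model() / predict_returns() / construct_weights(), the [-0.5, 0.5] rank scaling is one common choice, and the weights are normalized to unit gross exposure.

```python
import numpy as np

def rank_normalize(x):
    """Map each column of x (N, K) to cross-sectional ranks scaled into [-0.5, 0.5]."""
    ranks = x.argsort(axis=0).argsort(axis=0)   # 0..N-1 within each column
    return ranks / (x.shape[0] - 1) - 0.5

def fit_ols(X, y):
    """OLS with intercept; returns coefficients [intercept, slopes...]."""
    Xc = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(Xc, y, rcond=None)[0]

def predict(beta, X):
    return beta[0] + X @ beta[1:]

def zscore_weights(pred):
    """Dollar-neutral weights proportional to the z-score of predicted returns."""
    z = (pred - pred.mean()) / pred.std(ddof=0)
    return z / np.abs(z).sum()                  # unit gross exposure

# Synthetic training cross-section and a new cross-section to trade
rng = np.random.default_rng(4)
X_train = rank_normalize(rng.normal(size=(500, 3)))
y_train = X_train @ np.array([0.02, -0.01, 0.0]) + rng.normal(0, 0.05, 500)
beta = fit_ols(X_train, y_train)

X_new = rank_normalize(rng.normal(size=(100, 3)))
w = zscore_weights(predict(beta, X_new))
print(w.sum(), np.abs(w).sum())
```

In a rolling-window loop, the fit would be repeated on each training window and the weights recomputed for every test date.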

Replication Results

The table below compares eapctf.ctf.compute_metrics() output with known CTF leaderboard entries. Local evaluation with vol_target=0.10 reproduces the CTF server's Sharpe ratio to within 0.5% for IPCA and to within about 12% for 1/N. All returns are scaled to 10% annualized volatility before computing statistics (CTF standard).

Model	Sharpe (local)	Sharpe (CTF)	Diff %	Annual Return	Vol	Max Drawdown
1/N (equal weight)	0.551	0.491	+12.2%	5.13%	10.00%	-30.43%
IPCA (KPS 2019)	1.939	1.948	-0.5%	20.64%	10.00%	-10.30%

The IPCA replication uses the parallelized benchmark script at reference/benchmark-ipca-pf.py (n_factors=5, window=120 months, 402 features, 408 test dates). The IPCA Sharpe ratio is within 0.5% of the CTF leaderboard value; the larger 1/N gap (+12.2%) reflects minor differences in stock universe filtering conventions between local evaluation and the CTF server.
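To illustrate the volatility scaling and the Diff % column, here is a minimal sketch that rescales a return series to a target annualized volatility and computes summary statistics. It is not compute_metrics(): the function names are hypothetical, the scaling is ex post over the full sample, and 252 periods per year is an assumption. The CTF server's conventions for compounding and annualization may differ.

```python
import numpy as np

def vol_target_metrics(returns, vol_target=0.10, periods_per_year=252):
    """Scale returns to vol_target (annualized), then compute summary statistics."""
    r = np.asarray(returns, dtype=float)
    realized_vol = r.std(ddof=1) * np.sqrt(periods_per_year)
    scaled = r * (vol_target / realized_vol)
    ann_ret = scaled.mean() * periods_per_year              # arithmetic annualization
    ann_vol = scaled.std(ddof=1) * np.sqrt(periods_per_year)
    wealth = np.cumprod(1.0 + scaled)
    max_dd = np.min(wealth / np.maximum.accumulate(wealth) - 1.0)
    return {
        "annual_return": ann_ret,
        "vol": ann_vol,
        "sharpe": ann_ret / ann_vol,
        "max_drawdown": max_dd,
    }

def pct_diff(local, reference):
    """Relative difference of a local value against a reference value, in percent."""
    return (local - reference) / reference * 100.0

rng = np.random.default_rng(5)
daily = rng.normal(0.0004, 0.01, 2520)   # synthetic daily returns
metrics_1x = vol_target_metrics(daily)
metrics_3x = vol_target_metrics(3 * daily)
print(metrics_1x)

# Diff % column from the table above
print(round(pct_diff(0.551, 0.491), 1), round(pct_diff(1.939, 1.948), 1))
```

Because the series is rescaled before any statistic is computed, the Sharpe ratio is unaffected by leverage, which is what makes strategies with different raw volatility comparable.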

Module Overview

Module	Key Functions	Reference
`eapctf.ctf`	`run_local`, `compute_metrics`, `validate`, `fetch_leaderboard`, `pipeline`, `download_ctf_data`	Hoberg, Jensen, Kelly & Pedersen (2025)
`eapctf.sorting`	`univariate_sort`, `bivariate_sort`, `char_factor`, `ff3_factors`, `ff5_factors`, `hxz4_factors`, `sy4_factors`, `mom_factor`, `long_short_portfolio`	Fama & French (1993, 2015); Hou, Xue & Zhang (2015); Stambaugh & Yuan (2017)
`eapctf.crosssection`	`fama_macbeth`, `cs_regression`, `multiple_testing_correction`	Fama & MacBeth (1973); Shanken (1992)
`eapctf.timeseries`	`time_series_alpha`, `grs_test`, `spanning_test`, `rolling_beta`	Gibbons, Ross & Shanken (1989)
`eapctf.sdf`	`gmm_estimate`, `hj_distance`, `hj_bounds`, `pricing_errors`	Hansen (1982); Hansen & Jagannathan (1991)
`eapctf.predict`	`expanding_window_oos`, `make_predictor`, `char_prep`	Gu, Kelly & Xiu (2020)
`eapctf.portfolio`	`mean_variance_weights`, `hrp_weights`, `black_litterman_weights`, `ParametricPolicy`, `evaluate_portfolio`	Markowitz (1952); Lopez de Prado (2016)
`eapctf.utils`	`rank_normalize`, `classify`, `EAPPanel`, `JKP_153`, `load_gfd_chars`	—

Development

# install with dev dependencies
uv sync --dev

# run tests
uv run python -m pytest

# lint and type check
uv run ruff check eapctf/
uv run mypy eapctf/ --ignore-missing-imports

License

MIT

Release history

0.3.1: Mar 28, 2026
0.3.0: Mar 28, 2026
0.2.0: Mar 27, 2026
0.1.0: Mar 27, 2026 (this version)

Download files

File details

Details for the file eapctf-0.1.0.tar.gz.

File metadata

Download URL: eapctf-0.1.0.tar.gz
Upload date: Mar 27, 2026
Size: 8.5 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.0

File hashes

Hashes for eapctf-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`f8b67fddcbfbce8628f451459f041c66dfeff4f45bbf7d70b4eadfeb9f624ca6`
MD5	`712174a56088a4eed7c3cd8b9e6f8aa7`
BLAKE2b-256	`3c162dd8b395bad2d0d658329fe1d5bf740b2e7efdcb32d31aed632a4ef7c0d0`

File details

Details for the file eapctf-0.1.0-py3-none-any.whl.

File metadata

Download URL: eapctf-0.1.0-py3-none-any.whl
Upload date: Mar 27, 2026
Size: 132.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.0

File hashes

Hashes for eapctf-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`ad74307fe40d0aeaebe63ce191ba308d80da7d1e0c8f88c675c215818a16b4c3`
MD5	`22d47bc75a1a72ebae8e25774647f860`
BLAKE2b-256	`d4af6df918b3f7d7db01f8a54f26d6a1fc218800f3d555bda3f0f11e427ad279`
