ML-based portfolio selection from historical price patterns (Murray, Xia, Xiao 2024)
Charting by Machines
A Python package reproducing the ML-based portfolio selection methodology from "Charting by Machines" by Murray, Xia, and Xiao (2024, Journal of Financial Economics).
Overview
This package implements a machine learning approach to testing the efficient market hypothesis: it forecasts stock returns from historical price patterns alone. Following the paper, CNN-LSTM neural networks generate the return forecasts, which the paper reports to be strong predictors of the cross-section of future stock returns.
Key Features
- Multiple ML Architectures: FNN, CNN, LSTM, CNN-LSTM
- Flexible Data Sources: Yahoo Finance, WRDS (CRSP), local files
- Portfolio Construction: Univariate, bivariate, and trivariate quintile sorting
- Risk Analysis: Factor models (CAPM, FF3, FF5, Carhart), Sharpe ratios
- Experiment Tracking: MLflow integration for reproducibility
- Multiple Interfaces: Python API, CLI, Jupyter notebooks
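The risk metrics listed above can be illustrated without the package. The sketch below computes an annualized Sharpe ratio and a CAPM alpha and beta by ordinary least squares, using only the standard library. The function names `annualized_sharpe` and `capm_alpha_beta` are hypothetical and are not part of the `cbm` API; the monthly frequency (12 periods per year) is an assumption.

```python
import math
import statistics


def annualized_sharpe(excess_returns, periods_per_year=12):
    """Annualized Sharpe ratio from a list of periodic excess returns."""
    mean = statistics.fmean(excess_returns)
    std = statistics.stdev(excess_returns)  # sample standard deviation
    return mean / std * math.sqrt(periods_per_year)


def capm_alpha_beta(portfolio_excess, market_excess):
    """Single-factor OLS: portfolio excess return = alpha + beta * market excess return."""
    mean_p = statistics.fmean(portfolio_excess)
    mean_m = statistics.fmean(market_excess)
    cov = sum(
        (p - mean_p) * (m - mean_m)
        for p, m in zip(portfolio_excess, market_excess)
    )
    var = sum((m - mean_m) ** 2 for m in market_excess)
    beta = cov / var
    alpha = mean_p - beta * mean_m  # per-period alpha, not annualized
    return alpha, beta


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    market = [0.01, -0.02, 0.03, 0.00]
    portfolio = [0.002 + 1.5 * m for m in market]
    print(capm_alpha_beta(portfolio, market))
    print(annualized_sharpe([0.01, 0.03, 0.02]))
```

Multi-factor models such as FF3, FF5, and Carhart extend the same regression with additional factor columns, which in practice calls for a linear-algebra or statistics library rather than the closed-form single-factor solution used here.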
Installation
# Using pip
pip install charting-by-machines
# Using poetry (recommended for development)
git clone https://github.com/yourusername/charting-by-machines.git
cd charting-by-machines
poetry install
Quick Start
from cbm import PortfolioEngine
# Initialize the engine
engine = PortfolioEngine()
# Load data (using Yahoo Finance)
engine.load_data(
    tickers=["AAPL", "MSFT", "GOOGL", ...],  # or use universe="sp500"
    start_date="2010-01-01",
    end_date="2023-12-31"
)
# Train the CNN-LSTM model
model_id = engine.train_model(
    architecture="cnn_lstm",
    loss_function="mse",
    weighting="ewpm",
    optimization_period=("2010-01", "2018-12")
)
# Generate forecasts
forecasts = engine.forecast(model_id=model_id)
# Construct portfolios sorted by ML forecasts
portfolios = engine.construct_portfolios(
    forecasts=forecasts,
    n_portfolios=10,
    weighting="value"
)
# Analyze performance
performance = engine.analyze_performance(portfolios)
print(performance.summary())
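To make the portfolio construction step concrete, here is a minimal standalone sketch of a univariate sort on forecasts with value weighting. It is an illustration under stated assumptions, not the package's implementation: the helper names are hypothetical, stocks are split into equal-count rank buckets (the package or paper may use different breakpoints), and the inputs are plain dictionaries keyed by ticker for a single month.

```python
def sort_into_portfolios(forecasts, n_portfolios=10):
    """Assign each ticker to a portfolio 1..n_portfolios by forecast rank (1 = lowest)."""
    ranked = sorted(forecasts, key=forecasts.get)
    n = len(ranked)
    return {
        ticker: (i * n_portfolios) // n + 1
        for i, ticker in enumerate(ranked)
    }


def value_weighted_return(tickers, returns, market_caps):
    """Market-cap-weighted average return over the given tickers."""
    total_cap = sum(market_caps[t] for t in tickers)
    return sum(market_caps[t] * returns[t] for t in tickers) / total_cap


def long_short_return(forecasts, returns, market_caps, n_portfolios=10):
    """Return of the highest-forecast portfolio minus the lowest-forecast portfolio."""
    assignment = sort_into_portfolios(forecasts, n_portfolios)
    top = [t for t, p in assignment.items() if p == n_portfolios]
    bottom = [t for t, p in assignment.items() if p == 1]
    return (
        value_weighted_return(top, returns, market_caps)
        - value_weighted_return(bottom, returns, market_caps)
    )


if __name__ == "__main__":
    # Made-up single-month cross-section of ten stocks.
    forecasts = {f"S{i}": i / 100 for i in range(10)}
    returns = {f"S{i}": 0.01 * i for i in range(10)}
    market_caps = {f"S{i}": 1.0 for i in range(10)}
    print(long_short_return(forecasts, returns, market_caps, n_portfolios=5))
```

Repeating the sort every month and collecting the long-short returns gives the time series that the performance analysis step would then evaluate.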
CLI Usage
# Train a model
cbm train --config config/default.yaml
# Generate forecasts
cbm forecast --model-id <model_id> --output forecasts.parquet
# Run backtest
cbm backtest --config config/backtest.yaml
Methodology
Following Murray, Xia, and Xiao (2024), the package uses this setup:
- Input Features: Uses 12 cumulative monthly returns as input
- Neural Network: CNN-LSTM architecture with MSE loss
- Weighting: Equal-weighted per month (EWPM)
- Target Variable: Normalized excess returns (RetNorm)
- Ensemble: Averages 30 model fits for robust forecasts
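Three of the steps above can be sketched in a few lines of standard-library Python. This is an illustrative reading, not the package's code: the function names are hypothetical, "cumulative monthly returns" is assumed to mean compounded returns over a 12-month window, and EWPM is assumed to mean that every month carries the same total weight in the loss, split evenly across that month's observations. The RetNorm target is not reproduced here because its exact definition is not given in this description.

```python
from collections import Counter
from statistics import fmean


def cumulative_returns(monthly_returns):
    """Compound a sequence of monthly returns into cumulative returns."""
    cumulative, growth = [], 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
        cumulative.append(growth - 1.0)
    return cumulative


def ewpm_weights(months):
    """Per-observation weights so that each month has equal total weight.

    `months` holds one month label per observation; the weights sum to 1.
    """
    counts = Counter(months)
    n_months = len(counts)
    return [1.0 / (n_months * counts[m]) for m in months]


def ensemble_forecast(forecasts_per_fit):
    """Average forecasts element-wise across independent model fits."""
    return [fmean(values) for values in zip(*forecasts_per_fit)]


if __name__ == "__main__":
    print(cumulative_returns([0.10, -0.10]))
    print(ewpm_weights(["2010-01", "2010-01", "2010-02"]))
    print(ensemble_forecast([[1.0, 2.0], [3.0, 4.0]]))
```

Under this reading of EWPM, a month with few stocks influences the fit as much as a month with many, which keeps the later, more populated part of the sample from dominating the loss.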
Project Structure
charting-by-machines/
├── src/cbm/
│   ├── core/       # Configuration, types, main engine
│   ├── data/       # Data adapters, feature engineering
│   ├── ml/         # Neural network models, training
│   ├── portfolio/  # Portfolio construction, analysis
│   ├── api/        # CLI, Python API
│   └── utils/      # Logging, metrics, helpers
├── tests/          # Unit and integration tests
├── config/         # Hydra configuration files
├── examples/       # Jupyter notebooks
└── docs/           # Documentation
Citation
If you use this package in your research, please cite:
@article{murray2024charting,
title={Charting by Machines: Machine Learning-Based Portfolio Selection from Historical Price Patterns},
author={Murray, Scott and Xia, Yusen and Xiao, Houping},
journal={Journal of Financial Economics},
year={2024}
}
License
MIT License - see LICENSE for details.
File details
Details for the file charting_by_machines-0.2.0.tar.gz.
File metadata
- Download URL: charting_by_machines-0.2.0.tar.gz
- Upload date:
- Size: 44.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0bf462272d3ab5c831662eea000bef91826265e9661c2fbdb7910b8a00e0597b |
| MD5 | 5ab1d5235bd4815f0cd225cfcd132d87 |
| BLAKE2b-256 | 2a57aa01b62c663838b216eea8fdcec5f1b71373d32553754e626161d9bbe7ea |
File details
Details for the file charting_by_machines-0.2.0-py3-none-any.whl.
File metadata
- Download URL: charting_by_machines-0.2.0-py3-none-any.whl
- Upload date:
- Size: 58.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ba293b9a542a8f24a1ba185c073d285d66dedf72ff00b95268e34559742c1c0f |
| MD5 | 0e99c58ef0b2aec79918c694546129a4 |
| BLAKE2b-256 | 975a3cc4ad2b537804098f0dd11405e321ea1b8b31e40b4c58c27b48dece1bdb |