ML-based portfolio selection from historical price patterns (Murray, Xia, Xiao 2024)

Charting by Machines

A Python package reproducing the ML-based portfolio selection methodology from "Charting by Machines" by Murray, Xia, and Xiao (2024, Journal of Financial Economics).

Overview

This package implements a machine learning approach to testing the efficient market hypothesis by forecasting stock returns from historical price patterns. Following the paper, it uses CNN-LSTM neural networks to generate return forecasts; the paper reports that such forecasts strongly predict the cross-section of future stock returns.

Key Features

  • Multiple ML Architectures: FNN, CNN, LSTM, CNN-LSTM
  • Flexible Data Sources: Yahoo Finance, WRDS (CRSP), local files
  • Portfolio Construction: Univariate, bivariate, and trivariate quintile sorting
  • Risk Analysis: Factor models (CAPM, FF3, FF5, Carhart), Sharpe ratios
  • Experiment Tracking: MLflow integration for reproducibility
  • Multiple Interfaces: Python API, CLI, Jupyter notebooks
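
To make the portfolio construction feature concrete, here is a minimal sketch of a univariate quintile sort written in plain pandas. It is not the package's implementation: the function name `quintile_sort` and the column names `month`, `forecast`, and `ret` are assumptions chosen for illustration, and the buckets are equal-weighted for simplicity.

```python
import pandas as pd


def quintile_sort(df, signal="forecast", ret="ret", date="month", n=5):
    """Sort stocks into n signal buckets per month; return mean bucket returns.

    Hypothetical helper for illustration, not part of the cbm API.
    """
    df = df.copy()
    # Rank within each month first so ties cannot produce duplicate bin edges.
    df["bucket"] = df.groupby(date)[signal].transform(
        lambda s: pd.qcut(s.rank(method="first"), n, labels=False) + 1
    )
    # Equal-weighted average return of each bucket in each month.
    out = df.groupby([date, "bucket"])[ret].mean().unstack("bucket")
    # Long the highest-forecast bucket, short the lowest.
    out["high_minus_low"] = out[n] - out[1]
    return out


# Toy data: two months, ten stocks each, returns increasing in the forecast.
demo = pd.DataFrame({
    "month": ["2020-01"] * 10 + ["2020-02"] * 10,
    "forecast": list(range(10)) * 2,
})
demo["ret"] = demo["forecast"] * 0.01
sorted_returns = quintile_sort(demo)
print(sorted_returns)
```

Bivariate and trivariate sorts follow the same idea, with the bucketing repeated on a second or third characteristic.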

Installation

# Using pip
pip install charting-by-machines

# Using poetry (recommended for development)
git clone https://github.com/yourusername/charting-by-machines.git
cd charting-by-machines
poetry install

Quick Start

from cbm import PortfolioEngine

# Initialize the engine
engine = PortfolioEngine()

# Load data (using Yahoo Finance)
engine.load_data(
    tickers=["AAPL", "MSFT", "GOOGL", ...],  # or use universe="sp500"
    start_date="2010-01-01",
    end_date="2023-12-31"
)

# Train the CNN-LSTM model
model_id = engine.train_model(
    architecture="cnn_lstm",
    loss_function="mse",
    weighting="ewpm",
    optimization_period=("2010-01", "2018-12")
)

# Generate forecasts
forecasts = engine.forecast(model_id=model_id)

# Construct portfolios sorted by ML forecasts
portfolios = engine.construct_portfolios(
    forecasts=forecasts,
    n_portfolios=10,
    weighting="value"
)

# Analyze performance
performance = engine.analyze_performance(portfolios)
print(performance.summary())
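
The exact contents of `performance.summary()` are not documented here. As a rough illustration of the risk analysis listed under Key Features, the sketch below computes an annualized Sharpe ratio and a CAPM alpha and beta with NumPy. The function names are hypothetical and the inputs are assumed to be monthly excess returns; the package's own calculations may differ.

```python
import numpy as np


def annualized_sharpe(excess_returns, periods_per_year=12):
    """Annualized Sharpe ratio from periodic excess returns (illustrative helper)."""
    r = np.asarray(excess_returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)


def capm_alpha_beta(excess_returns, market_excess):
    """OLS of portfolio excess returns on an intercept and market excess returns."""
    y = np.asarray(excess_returns, dtype=float)
    x = np.asarray(market_excess, dtype=float)
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    alpha, beta = coef
    return alpha, beta


# Toy example: a portfolio built to have alpha 0.002 and beta 1.5.
market = np.array([0.01, -0.02, 0.03, 0.00])
portfolio = 0.002 + 1.5 * market
alpha, beta = capm_alpha_beta(portfolio, market)
sharpe = annualized_sharpe(portfolio)
print(alpha, beta, sharpe)
```

Multi-factor models such as FF3, FF5, or Carhart extend the same regression by adding more factor columns to the design matrix.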

CLI Usage

# Train a model
cbm train --config config/default.yaml

# Generate forecasts
cbm forecast --model-id <model_id> --output forecasts.parquet

# Run backtest
cbm backtest --config config/backtest.yaml

Methodology

Following Murray, Xia, and Xiao (2024), the package is built around five components:

  1. Input Features: Uses 12 cumulative monthly returns as input
  2. Neural Network: CNN-LSTM architecture with MSE loss
  3. Weighting: Equal-weighted per month (EWPM)
  4. Target Variable: Normalized excess returns (RetNorm)
  5. Ensemble: Averages 30 model fits for robust forecasts
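
The steps above can be sketched in a few lines of NumPy. This is one plausible reading, not the package's code: it assumes the 12 inputs are the cumulative returns at the end of each month of a rolling 12-month window, that EWPM gives every observation in month t the weight 1/N_t so each month contributes equally to the loss, and that the ensemble is a simple mean across fits. All function names are hypothetical.

```python
import numpy as np


def cumulative_return_features(monthly_returns, window=12):
    """One row per rolling window: cumulative return after 1, 2, ..., window months."""
    r = np.asarray(monthly_returns, dtype=float)
    rows = [
        np.cumprod(1.0 + r[end - window:end]) - 1.0
        for end in range(window, len(r) + 1)
    ]
    return np.array(rows)


def ewpm_weights(months):
    """Weight 1 / N_t for each observation in month t (assumed EWPM definition)."""
    _, inverse, counts = np.unique(months, return_inverse=True, return_counts=True)
    return 1.0 / counts[inverse]


def weighted_mse(y_true, y_pred, weights):
    """MSE loss in which each observation is scaled by its weight."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return np.average(err ** 2, weights=weights)


def ensemble_mean(predictions):
    """Average forecasts across model fits (the methodology averages 30)."""
    return np.mean(np.stack([np.asarray(p, dtype=float) for p in predictions]), axis=0)


# Toy usage: 14 months of 1% returns give three 12-month windows.
features = cumulative_return_features([0.01] * 14)
weights = ewpm_weights(["2020-01", "2020-01", "2020-02"])
loss = weighted_mse([1.0, 0.0, 2.0], [0.0, 0.0, 0.0], weights)
forecast = ensemble_mean([[1.0, 2.0], [3.0, 4.0]])
print(features.shape, weights, loss, forecast)
```

Under this weighting, a month with thousands of stocks does not dominate a month with only a few hundred.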

Project Structure

charting-by-machines/
├── src/cbm/
│   ├── core/           # Configuration, types, main engine
│   ├── data/           # Data adapters, feature engineering
│   ├── ml/             # Neural network models, training
│   ├── portfolio/      # Portfolio construction, analysis
│   ├── api/            # CLI, Python API
│   └── utils/          # Logging, metrics, helpers
├── tests/              # Unit and integration tests
├── config/             # Hydra configuration files
├── examples/           # Jupyter notebooks
└── docs/               # Documentation

Citation

If you use this package in your research, please cite:

@article{murray2024charting,
  title={Charting by Machines: Machine Learning-Based Portfolio Selection from Historical Price Patterns},
  author={Murray, Scott and Xia, Yusen and Xiao, Houping},
  journal={Journal of Financial Economics},
  year={2024}
}

License

MIT License - see LICENSE for details.
