A portfolio construction and analysis tool for asset-pricing strategies.

paper-portfolio: Portfolio Construction & Performance Evaluation 📈

paper-portfolio is the final component of the P.A.P.E.R (Platform for Asset Pricing Experimentation and Research) monorepo. It provides a configuration-driven framework for constructing investment portfolios from the predictive outputs of machine learning models and for evaluating their economic significance.

Using the prediction files generated by paper-model, this package lets you backtest long-short portfolio strategies, calculate standard performance metrics, and generate reports and visualizations.


✨ Features

  • Flexible Portfolio Construction:
    • Long-Short Strategies: Easily construct long-short portfolios by sorting assets based on their predicted returns each month.
    • Quantile-Based Selection: Define portfolio legs using specific quantile ranges (e.g., long top 10%, short bottom 10%).
    • Weighting Schemes: Supports both equal (equally-weighted) and value (e.g., market-cap weighted) portfolio construction.
  • Comprehensive Performance Evaluation:
    • Standard Metrics: Calculates the annualized Sharpe ratio (sharpe_ratio), the expected shortfall, also known as CVaR (expected_shortfall), and the cumulative return (cumulative_return).
    • Benchmarking: Automatically compares strategies against the risk-free rate and an optional, user-provided market index benchmark.
  • In-Depth Analysis:
    • Cross-Sectional Analysis: Optionally generates plots showing the cumulative performance of assets sorted into deciles by prediction. This is crucial for checking if the model's predictions are monotonically related to returns.
  • Configuration-Driven Workflow:
    • Define all portfolio strategies, the models to test, benchmarks, and metrics to calculate in a single, human-readable portfolio-config.yaml file. This ensures reproducibility and simplifies experimentation.
  • Automated Reporting & Visualization:
    • Generates detailed summary reports in text files for each model-strategy combination.
    • Automatically creates and saves PNG plots of cumulative returns, providing a clear visual comparison of the long, short, and combined portfolios against benchmarks.
    • Saves detailed monthly portfolio returns to Parquet files for deeper, custom analysis.
  • Seamless Integration:
    • Directly consumes the .parquet prediction files produced by paper-model.
    • Orchestrated by the paper-asset-pricing CLI for a smooth, end-to-end research pipeline.

🚀 Installation

paper-portfolio is designed to be part of the larger P.A.P.E.R monorepo.

Recommended (as part of paper-asset-pricing):

This method ensures paper-portfolio is available to the main paper CLI orchestrator.

# Using pip
pip install "paper-asset-pricing[portfolio]"

# Using uv
uv pip install "paper-asset-pricing[portfolio]"

Standalone Installation:

Use this method if you only need paper-portfolio and its core functionality in a different project.

# Using pip
pip install paper-portfolio

# Using uv
uv pip install paper-portfolio

From Source (for development within the monorepo):

Navigate to the root of your P.A.P.E.R monorepo and install paper-portfolio in editable mode.

# Using pip
pip install -e ./paper-portfolio

# Using uv
uv pip install -e ./paper-portfolio

📖 Usage Workflow

The paper-portfolio pipeline is the final step in the P.A.P.E.R workflow.

1. Prerequisites: Data and Model Pipelines

Before running the portfolio phase, you must first run the data and model pipelines to generate the necessary inputs.

# Assuming you are in your project directory (e.g., MyProjectExample)

# 1. Run the data phase
paper execute data

# 2. Run the models phase
paper execute models

After these steps, your project's models/predictions/ directory should contain files like OLS_model_predictions.parquet.
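
Before moving on, it can help to confirm that every model you plan to evaluate has a prediction file. The following is a minimal sketch, not part of paper-portfolio: the helper name missing_prediction_files is hypothetical, and it only assumes the models/predictions/ location and the {model_name}_predictions.parquet naming described in this README.

```python
import tempfile
from pathlib import Path


def missing_prediction_files(project_dir, model_names):
    """Return the expected prediction files that do not exist yet."""
    predictions_dir = Path(project_dir) / "models" / "predictions"
    expected = [predictions_dir / f"{name}_predictions.parquet" for name in model_names]
    return [path for path in expected if not path.is_file()]


# Demonstration against a throwaway project directory.
with tempfile.TemporaryDirectory() as tmp:
    predictions_dir = Path(tmp) / "models" / "predictions"
    predictions_dir.mkdir(parents=True)
    (predictions_dir / "OLS_model_predictions.parquet").touch()

    missing = missing_prediction_files(tmp, ["OLS_model", "GBRT_tuned"])
    missing_names = [path.name for path in missing]
    print(missing_names)
```

In a real project you would pass your project directory and the names listed under prediction_model_names.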

2. Portfolio Configuration (portfolio-config.yaml)

Create or edit the portfolio-config.yaml file in your project's configs directory. This file defines which models to test and which portfolio strategies to apply.

# MyProjectExample/configs/portfolio-config.yaml

input_data:
  # List of model names whose predictions you want to evaluate.
  # These must match the names from models-config.yaml.
  prediction_model_names:
    - "OLS_model"
    - "GBRT_tuned"

  # The base name of the processed dataset used by the models.
  processed_dataset_name: "processed_panel_data"

  # Column names required for calculations.
  date_column: "date"
  id_column: "permno"
  risk_free_rate_col: "rf"
  value_weight_col: "marketcap" # For value-weighting

# Optional: Define a market index for benchmark comparison.
# The CSV file must be placed in the `portfolios/indexes/` directory.
market_benchmark:
  name: "Market Index"
  file_name: "market_index.csv"
  date_column: "caldt"
  return_column: "vwretd"
  date_format: "%Y%m%d"

# A list of portfolio strategies to backtest for each model.
strategies:
  - name: "Decile_Sort_Equal_Weighted"
    weighting_scheme: "equal"
    long_quantiles: [0.9, 1.0]   # Long the top 10%
    short_quantiles: [0.0, 0.1]  # Short the bottom 10%

  - name: "Decile_Sort_Value_Weighted"
    weighting_scheme: "value"
    long_quantiles: [0.9, 1.0]
    short_quantiles: [0.0, 0.1]

# A list of performance metrics to calculate and report.
metrics:
  - "sharpe_ratio"
  - "expected_shortfall"
  - "cumulative_return"

# Enable the generation of cross-sectional decile return plots.
cross_sectional_analysis: true
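
A quick sanity check on the strategies section can catch typos before a long backtest. The sketch below works on a strategy that has already been loaded into a Python dict. It is an illustration only: validate_strategy is a hypothetical helper, not the package's own validation, and the rule that the lower bound must be strictly below the upper bound is an assumption.

```python
def validate_strategy(strategy):
    """Return a list of problems found in one strategy mapping (empty if none)."""
    problems = []
    if strategy.get("weighting_scheme") not in {"equal", "value"}:
        problems.append("weighting_scheme must be 'equal' or 'value'")
    for key in ("long_quantiles", "short_quantiles"):
        bounds = strategy.get(key)
        if not isinstance(bounds, (list, tuple)) or len(bounds) != 2:
            problems.append(f"{key} must be a list of two floats")
            continue
        lower, upper = bounds
        # Assumed rule: quantile bounds are ordered and lie inside [0, 1].
        if not 0.0 <= lower < upper <= 1.0:
            problems.append(f"{key} must satisfy 0 <= lower < upper <= 1")
    return problems


good = {
    "name": "Decile_Sort_Equal_Weighted",
    "weighting_scheme": "equal",
    "long_quantiles": [0.9, 1.0],
    "short_quantiles": [0.0, 0.1],
}
bad = {
    "name": "Broken",
    "weighting_scheme": "rank",
    "long_quantiles": [1.0, 0.9],
    "short_quantiles": [0.0],
}

good_problems = validate_strategy(good)
bad_problems = validate_strategy(bad)
print(good_problems)
print(bad_problems)
```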

3. Running the Portfolio Pipeline

Execute the portfolio phase using the paper-asset-pricing CLI from your project directory.

# Assuming you are in your project directory (e.g., MyProjectExample)
paper execute portfolio

4. Expected Output

Console Output:

The console will show a high-level success message.

>>> Executing Portfolio Phase <<<
Portfolio phase completed successfully. Additional information in 'MyProjectExample/logs.log'

MyProjectExample/portfolios/results/ Directory:

The results directory will be populated with detailed reports and plots for each model-strategy combination.

├── cross_sectional_analysis/
│   ├── GBRT_tuned_cross_sectional_returns.png
│   └── OLS_model_cross_sectional_returns.png
├── GBRT_tuned_Decile_Sort_Equal_Weighted_cumulative_return.png
├── GBRT_tuned_Decile_Sort_Equal_Weighted_monthly_returns.parquet
├── GBRT_tuned_Decile_Sort_Equal_Weighted_report.txt
├── GBRT_tuned_Decile_Sort_Value_Weighted_cumulative_return.png
└── ... (and so on for all models and strategies)

Example Report (OLS_model_Decile_Sort_Value_Weighted_report.txt):

--- Portfolio Performance Report ---
Model: OLS_model
Strategy: Decile_Sort_Value_Weighted
------------------------------
sharpe_ratio: 1.2543
expected_shortfall: -0.0312
final_cumulative_return: 8.1234
------------------------------
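
If you want to compare reports across models, the text format above is simple enough to parse by hand. This is a minimal sketch that assumes every result line has the "key: value" shape shown in the example; parse_report is a hypothetical helper and not part of paper-portfolio.

```python
def parse_report(text):
    """Parse 'key: value' lines from a report into a dict, converting numbers to float."""
    parsed = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # header and separator lines carry no key/value pair
        value = value.strip()
        try:
            parsed[key.strip()] = float(value)
        except ValueError:
            parsed[key.strip()] = value  # keep non-numeric fields as text
    return parsed


example = """--- Portfolio Performance Report ---
Model: OLS_model
Strategy: Decile_Sort_Value_Weighted
------------------------------
sharpe_ratio: 1.2543
expected_shortfall: -0.0312
final_cumulative_return: 8.1234
------------------------------
"""

report = parse_report(example)
print(report)
```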

⚙️ Configuration Reference

The portfolio-config.yaml file controls the entire portfolio evaluation process.

input_data

  • prediction_model_names (list, required): A list of model names. The manager will look for prediction files named {model_name}_predictions.parquet.
  • processed_dataset_name (string, required): The base name of the processed dataset used for modeling. This is needed to fetch columns like the risk-free rate and value-weighting characteristic.
  • date_column, id_column, risk_free_rate_col, value_weight_col (string, optional): Names of key columns.

market_benchmark (optional)

  • name (string, required): Display name for the benchmark.
  • file_name (string, required): The name of the CSV file in the portfolios/indexes/ directory.
  • date_column, return_column, date_format (string, required): Column names and date format for the benchmark file.
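
To make the role of date_column, return_column, and date_format concrete, here is a sketch of reading a benchmark file with the Python standard library. It is an illustration under assumptions, not the package's loader: read_benchmark is a hypothetical helper, and the two sample rows are made-up values used only to show the file layout.

```python
import csv
import io
from datetime import datetime


def read_benchmark(handle, date_column, return_column, date_format):
    """Read (date, return) pairs from an open CSV file handle."""
    rows = []
    for row in csv.DictReader(handle):
        date = datetime.strptime(row[date_column], date_format).date()
        rows.append((date, float(row[return_column])))
    return rows


# Made-up sample in the layout implied by the configuration example.
sample = io.StringIO("caldt,vwretd\n20200131,0.0012\n20200228,-0.0801\n")
benchmark = read_benchmark(sample, "caldt", "vwretd", "%Y%m%d")
print(benchmark)
```

With a real file you would pass open("portfolios/indexes/market_index.csv", newline="") instead of the in-memory sample.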

strategies

A list of portfolio strategies to backtest. Each strategy requires:

  • name (string, required): A unique name for the strategy (e.g., "Value_Weighted_Decile").
  • weighting_scheme (string, required): Must be either "equal" or "value".
  • long_quantiles (list of two floats, required): The lower and upper quantile boundaries for the long leg (e.g., [0.9, 1.0] for the top 10%).
  • short_quantiles (list of two floats, required): The lower and upper quantile boundaries for the short leg (e.g., [0.0, 0.1] for the bottom 10%).
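
To illustrate how these fields could translate into a one-month long-short return, here is a dependency-free sketch. It is not the package's implementation: the function names are hypothetical, the percentile-rank rule (rank divided by the number of assets, lower bound exclusive) is an assumption, and ties or empty legs are not handled.

```python
def select_leg(predictions, lower, upper):
    """Return asset ids whose prediction percentile rank lies in (lower, upper]."""
    ranked = sorted(predictions, key=predictions.get)  # ascending by prediction
    n = len(ranked)
    return [asset for i, asset in enumerate(ranked) if lower < (i + 1) / n <= upper]


def leg_return(assets, returns, weighting_scheme, caps=None):
    """Weighted average realized return of one leg."""
    if weighting_scheme == "equal":
        weights = {a: 1.0 for a in assets}
    elif weighting_scheme == "value":
        weights = {a: caps[a] for a in assets}  # e.g. market capitalization
    else:
        raise ValueError(f"unknown weighting_scheme: {weighting_scheme}")
    total = sum(weights.values())
    return sum(weights[a] * returns[a] for a in assets) / total


def long_short_return(predictions, returns, strategy, caps=None):
    """Return of the long leg minus the return of the short leg."""
    long_leg = select_leg(predictions, *strategy["long_quantiles"])
    short_leg = select_leg(predictions, *strategy["short_quantiles"])
    scheme = strategy["weighting_scheme"]
    return leg_return(long_leg, returns, scheme, caps) - leg_return(short_leg, returns, scheme, caps)


# Toy cross-section of 20 assets for a single month.
predictions = {f"A{i}": i / 100 for i in range(20)}
returns = {f"A{i}": i / 100 for i in range(20)}
caps = {f"A{i}": i + 1 for i in range(20)}

equal_strategy = {"weighting_scheme": "equal", "long_quantiles": [0.9, 1.0], "short_quantiles": [0.0, 0.1]}
value_strategy = dict(equal_strategy, weighting_scheme="value")

equal_ls = long_short_return(predictions, returns, equal_strategy)
value_ls = long_short_return(predictions, returns, value_strategy, caps)
print(equal_ls, value_ls)
```

With 20 assets, each decile leg holds two assets, so the two weighting schemes differ only in how those two returns are averaged.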

metrics

A list of performance metrics to compute. Supported values: "sharpe_ratio", "expected_shortfall", "cumulative_return".
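
For orientation, the three metrics can be sketched from a series of monthly returns as follows. These are common textbook definitions, not necessarily the exact formulas used by paper-portfolio: the sqrt(12) annualization, the sample standard deviation, and the 5% tail level are all assumptions, and the function names are hypothetical.

```python
import math
import statistics


def annualized_sharpe(monthly_returns, monthly_rf, periods_per_year=12):
    """Mean excess return over its sample standard deviation, scaled to annual terms."""
    excess = [r - f for r, f in zip(monthly_returns, monthly_rf)]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


def expected_shortfall(monthly_returns, alpha=0.05):
    """Average of the worst alpha share of returns (at least one observation)."""
    k = max(1, math.ceil(alpha * len(monthly_returns)))
    worst = sorted(monthly_returns)[:k]
    return statistics.mean(worst)


def cumulative_return(monthly_returns):
    """Compounded return over the whole sample."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    return growth - 1.0


sharpe = annualized_sharpe([0.02, 0.04], [0.0, 0.0])
shortfall = expected_shortfall([0.01, -0.05, 0.02, -0.03], alpha=0.5)
total = cumulative_return([0.1, -0.1])
print(sharpe, shortfall, total)
```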

cross_sectional_analysis (optional)

  • Set to true to enable the generation of decile-sorted performance plots for each model. Defaults to false.
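
The idea behind the decile plots can also be checked numerically: sort assets by prediction, average realized returns within each decile, and see whether the averages rise from the lowest to the highest decile. The sketch below is a simplified single-period illustration with hypothetical helper names; it is not how the package builds its plots.

```python
def decile_mean_returns(predictions, returns, n_bins=10):
    """Average realized return of assets grouped into bins by ascending prediction."""
    ranked = sorted(predictions, key=predictions.get)
    n = len(ranked)
    if n < n_bins:
        raise ValueError("need at least one asset per bin")
    bins = [[] for _ in range(n_bins)]
    for i, asset in enumerate(ranked):
        bins[i * n_bins // n].append(returns[asset])
    return [sum(b) / len(b) for b in bins]


def is_monotonic_increasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Toy data: realized returns line up with predictions in the first case only.
predictions = {f"A{i}": i / 100 for i in range(20)}
aligned = {f"A{i}": i / 100 for i in range(20)}
reversed_returns = {f"A{i}": (19 - i) / 100 for i in range(20)}

aligned_means = decile_mean_returns(predictions, aligned)
reversed_means = decile_mean_returns(predictions, reversed_returns)
print(is_monotonic_increasing(aligned_means), is_monotonic_increasing(reversed_means))
```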

🤝 Contributing

Contributions to paper-portfolio are highly welcome! If you have ideas for new performance metrics, portfolio construction techniques, or reporting features, please feel free to open an issue or submit a pull request.


📄 License

paper-portfolio is distributed under the MIT License. See the LICENSE file for more information.

