Tools for portfolio construction experiments using datasets and models.
paper-portfolio: Portfolio Construction & Performance Evaluation 📈
paper-portfolio is the final component of the P.A.P.E.R (Platform for Asset Pricing Experimentation and Research) monorepo. It provides a powerful, configuration-driven framework for constructing investment portfolios based on the predictive outputs of machine learning models and evaluating their economic significance.
Using the prediction files generated by paper-model, this package allows you to backtest various long-short portfolio strategies, calculate standard performance metrics, and generate insightful reports and visualizations.
✨ Features
- Flexible Portfolio Construction:
- Long-Short Strategies: Easily construct long-short portfolios by sorting assets based on their predicted returns each month.
- Quantile-Based Selection: Define portfolio legs using specific quantile ranges (e.g., long top 10%, short bottom 10%).
- Weighting Schemes: Supports both equal (equally weighted) and value (e.g., market-cap weighted) portfolio construction.
- Comprehensive Performance Evaluation:
- Standard Metrics: Calculates annualized_sharpe_ratio and expected_shortfall (CVaR), and tracks cumulative_return.
- Benchmarking: Automatically compares strategies against the risk-free rate and an optional, user-provided market index benchmark.
- In-Depth Analysis:
- Cross-Sectional Analysis: Optionally generates plots showing the cumulative performance of assets sorted into deciles by prediction. This is crucial for checking if the model's predictions are monotonically related to returns.
- Configuration-Driven Workflow:
- Define all portfolio strategies, the models to test, benchmarks, and metrics to calculate in a single, human-readable portfolio-config.yaml file. This ensures reproducibility and simplifies experimentation.
- Automated Reporting & Visualization:
- Generates detailed summary reports in text files for each model-strategy combination.
- Automatically creates and saves PNG plots of cumulative returns, providing a clear visual comparison of the long, short, and combined portfolios against benchmarks.
- Saves detailed monthly portfolio returns to Parquet files for deeper, custom analysis.
- Seamless Integration:
- Directly consumes the .parquet prediction files produced by paper-model.
- Orchestrated by the paper-tools CLI for a smooth, end-to-end research pipeline.
🚀 Installation
paper-portfolio is designed to be part of the larger PAPER monorepo.
Recommended (as part of paper-tools):
This method ensures paper-portfolio is available to the main paper CLI orchestrator.
# Using pip
pip install "paper-tools[portfolio]"
# Using uv
uv pip install "paper-tools[portfolio]"
Standalone Installation:
Use this method if you only need paper-portfolio and its core functionality in a different project.
# Using pip
pip install paper-portfolio
# Using uv
uv pip install paper-portfolio
From Source (for development within the monorepo):
Navigate to the root of your PAPER monorepo and install paper-portfolio in editable mode.
# Using pip
pip install -e ./paper-portfolio
# Using uv
uv pip install -e ./paper-portfolio
📖 Usage Workflow
The paper-portfolio pipeline is the final step in the P.A.P.E.R workflow.
1. Prerequisites: Data and Model Pipelines
Before running the portfolio phase, you must first run the data and model pipelines to generate the necessary inputs.
# Assuming you are in your project directory (e.g., MyProjectExample)
# 1. Run the data phase
paper execute data
# 2. Run the models phase
paper execute models
After these steps, your project's models/predictions/ directory should contain files like OLS_model_predictions.parquet.
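Before moving on, it can help to confirm that every model you plan to evaluate actually has a prediction file. The following is a minimal sketch, not part of paper-portfolio: the helper name `missing_prediction_files` is hypothetical, and it assumes the `models/predictions/` location and the `{model_name}_predictions.parquet` naming described in this README.

```python
from pathlib import Path


def missing_prediction_files(project_dir, model_names):
    """Return the expected prediction files that are not present on disk."""
    # Assumed layout: <project_dir>/models/predictions/<model>_predictions.parquet
    predictions_dir = Path(project_dir) / "models" / "predictions"
    expected = [predictions_dir / f"{name}_predictions.parquet" for name in model_names]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Run from the project directory; model names are the ones used in this README.
    for path in missing_prediction_files(".", ["OLS_model", "GBRT_tuned"]):
        print(f"missing: {path}")
```

An empty result suggests the models phase produced everything the portfolio phase will look for.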
2. Portfolio Configuration (portfolio-config.yaml)
Create or edit the portfolio-config.yaml file in your project's configs directory. This file defines which models to test and which portfolio strategies to apply.
# MyProjectExample/configs/portfolio-config.yaml
input_data:
  # List of model names whose predictions you want to evaluate.
  # These must match the names from models-config.yaml.
  prediction_model_names:
    - "OLS_model"
    - "GBRT_tuned"
  # The base name of the processed dataset used by the models.
  processed_dataset_name: "processed_panel_data"
  # Column names required for calculations.
  date_column: "date"
  id_column: "permno"
  risk_free_rate_col: "rf"
  value_weight_col: "marketcap" # For value-weighting
# Optional: Define a market index for benchmark comparison.
# The CSV file must be placed in the `portfolios/indexes/` directory.
market_benchmark:
  name: "Market Index"
  file_name: "market_index.csv"
  date_column: "caldt"
  return_column: "vwretd"
  date_format: "%Y%m%d"
# A list of portfolio strategies to backtest for each model.
strategies:
  - name: "Decile_Sort_Equal_Weighted"
    weighting_scheme: "equal"
    long_quantiles: [0.9, 1.0] # Long the top 10%
    short_quantiles: [0.0, 0.1] # Short the bottom 10%
  - name: "Decile_Sort_Value_Weighted"
    weighting_scheme: "value"
    long_quantiles: [0.9, 1.0]
    short_quantiles: [0.0, 0.1]
# A list of performance metrics to calculate and report.
metrics:
  - "sharpe_ratio"
  - "expected_shortfall"
  - "cumulative_return"
# Enable the generation of cross-sectional decile return plots.
cross_sectional_analysis: true
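To make the strategy settings concrete, here is a dependency-free sketch of how one month's long-short return could be derived from `long_quantiles`, `short_quantiles`, and `weighting_scheme`. It is an illustration under stated assumptions, not the package's implementation: the function names, the percentile-rank rule, the tie handling, and the "long minus short" convention are all assumptions, and the asset records are made up.

```python
def leg_return(assets, quantiles, weighting_scheme="equal"):
    """Weighted realized return of the assets whose prediction rank lies in (lo, hi]."""
    lo, hi = quantiles
    ranked = sorted(assets, key=lambda a: a["prediction"])
    n = len(ranked)
    # Assumed ranking rule: the i-th smallest prediction gets percentile (i + 1) / n.
    selected = [a for i, a in enumerate(ranked) if lo < (i + 1) / n <= hi]
    if not selected:
        raise ValueError("no assets fall inside the requested quantile range")
    if weighting_scheme == "equal":
        weights = [1.0] * len(selected)
    elif weighting_scheme == "value":
        weights = [a["marketcap"] for a in selected]
    else:
        raise ValueError("weighting_scheme must be 'equal' or 'value'")
    total = sum(weights)
    return sum(w * a["return"] for w, a in zip(weights, selected)) / total


def long_short_return(assets, long_quantiles, short_quantiles, weighting_scheme="equal"):
    """Return of the long leg minus the return of the short leg for one period."""
    long_leg = leg_return(assets, long_quantiles, weighting_scheme)
    short_leg = leg_return(assets, short_quantiles, weighting_scheme)
    return long_leg - short_leg


if __name__ == "__main__":
    # Hypothetical cross-section of ten assets for a single month.
    month = [
        {"prediction": float(i), "return": i / 100, "marketcap": 100.0 * i}
        for i in range(1, 11)
    ]
    print(long_short_return(month, [0.9, 1.0], [0.0, 0.1], "equal"))
    print(long_short_return(month, [0.8, 1.0], [0.0, 0.2], "value"))
```

Repeating this calculation for every month yields a return series like the one the pipeline saves to the monthly returns Parquet files.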
3. Running the Portfolio Pipeline
Execute the portfolio phase using the paper-tools CLI from your project directory.
# Assuming you are in your project directory (e.g., MyProjectExample)
paper execute portfolio
4. Expected Output
Console Output:
The console will show a high-level success message.
>>> Executing Portfolio Phase <<<
Portfolio phase completed successfully. Additional information in 'MyProjectExample/logs.log'
MyProjectExample/portfolios/results/ Directory:
The results directory will be populated with detailed reports and plots for each model-strategy combination.
├── cross_sectional_analysis/
│ ├── GBRT_tuned_cross_sectional_returns.png
│ └── OLS_model_cross_sectional_returns.png
├── GBRT_tuned_Decile_Sort_Equal_Weighted_cumulative_return.png
├── GBRT_tuned_Decile_Sort_Equal_Weighted_monthly_returns.parquet
├── GBRT_tuned_Decile_Sort_Equal_Weighted_report.txt
├── GBRT_tuned_Decile_Sort_Value_Weighted_cumulative_return.png
├── ... (and so on for all models and strategies)
Example Report (OLS_model_Decile_Sort_Value_Weighted_report.txt):
--- Portfolio Performance Report ---
Model: OLS_model
Strategy: Decile_Sort_Value_Weighted
------------------------------
sharpe_ratio: 1.2543
expected_shortfall: -0.0312
final_cumulative_return: 8.1234
------------------------------
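If you want to collect the reported figures across many model-strategy combinations, the report's "key: value" layout is easy to read programmatically. The sketch below assumes reports look like the example above; `parse_report` is a hypothetical helper, not something paper-portfolio provides.

```python
def parse_report(text):
    """Parse 'key: value' lines from a report; numeric values become floats."""
    parsed = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the title line, and dashed separators.
        if not line or line.startswith("-") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        try:
            parsed[key.strip()] = float(value)
        except ValueError:
            parsed[key.strip()] = value  # e.g. the model and strategy names
    return parsed


if __name__ == "__main__":
    example = """--- Portfolio Performance Report ---
Model: OLS_model
Strategy: Decile_Sort_Value_Weighted
------------------------------
sharpe_ratio: 1.2543
expected_shortfall: -0.0312
final_cumulative_return: 8.1234
------------------------------"""
    print(parse_report(example))
```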
⚙️ Configuration Reference
The portfolio-config.yaml file controls the entire portfolio evaluation process.
input_data
- prediction_model_names (list, required): A list of model names. The manager will look for prediction files named {model_name}_predictions.parquet.
- processed_dataset_name (string, required): The base name of the processed dataset used for modeling. This is needed to fetch columns like the risk-free rate and value-weighting characteristic.
- date_column, id_column, risk_free_rate_col, value_weight_col (string, optional): Names of key columns.
market_benchmark (optional)
- name (string, required): Display name for the benchmark.
- file_name (string, required): The name of the CSV file in the portfolios/indexes/ directory.
- date_column, return_column, date_format (string, required): Column names and date format for the benchmark file.
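The date_format value is a strftime-style pattern, so "%Y%m%d" matches dates written like 20200131. As a rough illustration of how these three settings describe a benchmark file, the sketch below reads such a CSV with the standard library. `load_benchmark` is a hypothetical helper and the sample rows are invented; how paper-portfolio itself loads the file is not specified here.

```python
import csv
import io
from datetime import datetime


def load_benchmark(csv_file, date_column="caldt", return_column="vwretd",
                   date_format="%Y%m%d"):
    """Read (date, return) pairs from an open CSV file, sorted by date."""
    rows = []
    for record in csv.DictReader(csv_file):
        # Parse the date with the configured format, e.g. "20200131" for "%Y%m%d".
        date = datetime.strptime(record[date_column].strip(), date_format).date()
        rows.append((date, float(record[return_column])))
    return sorted(rows)


if __name__ == "__main__":
    # Invented sample standing in for portfolios/indexes/market_index.csv.
    sample = io.StringIO("caldt,vwretd\n20200228,-0.08\n20200131,0.01\n")
    for date, ret in load_benchmark(sample):
        print(date.isoformat(), ret)
```

A mismatch between date_format and the file's actual date layout surfaces here as a `ValueError` from `strptime`, which is a quick way to check the setting before running the pipeline.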
strategies
A list of portfolio strategies to backtest. Each strategy requires:
- name (string, required): A unique name for the strategy (e.g., "Value_Weighted_Decile").
- weighting_scheme (string, required): Must be either "equal" or "value".
- long_quantiles (list of two floats, required): The lower and upper quantile boundaries for the long leg (e.g., [0.9, 1.0] for the top 10%).
- short_quantiles (list of two floats, required): The lower and upper quantile boundaries for the short leg (e.g., [0.0, 0.1] for the bottom 10%).
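A quick pre-flight check of each strategy entry can catch typos before a long backtest. The sketch below encodes the requirements listed above; the rule that bounds satisfy 0 <= lower < upper <= 1 is an assumption inferred from the examples, and `validate_strategy` is a hypothetical helper rather than the package's own validation.

```python
def validate_strategy(strategy):
    """Return a list of problems found in one strategy mapping (empty if none)."""
    problems = []
    if not strategy.get("name"):
        problems.append("name is required")
    if strategy.get("weighting_scheme") not in ("equal", "value"):
        problems.append("weighting_scheme must be 'equal' or 'value'")
    for key in ("long_quantiles", "short_quantiles"):
        bounds = strategy.get(key)
        is_pair = isinstance(bounds, (list, tuple)) and len(bounds) == 2
        if not is_pair or not all(isinstance(b, (int, float)) for b in bounds):
            problems.append(f"{key} must be a list of two floats")
            continue
        lower, upper = bounds
        # Assumed constraint: quantile bounds are ordered fractions in [0, 1].
        if not 0.0 <= lower < upper <= 1.0:
            problems.append(f"{key} must satisfy 0 <= lower < upper <= 1")
    return problems


if __name__ == "__main__":
    strategy = {
        "name": "Decile_Sort_Equal_Weighted",
        "weighting_scheme": "equal",
        "long_quantiles": [0.9, 1.0],
        "short_quantiles": [0.0, 0.1],
    }
    print(validate_strategy(strategy))
```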
metrics
A list of performance metrics to compute. Supported values: "sharpe_ratio", "expected_shortfall", "cumulative_return".
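For orientation, the three metrics can be written down in a few lines using textbook definitions. This is a sketch under assumptions, since this README does not state the exact formulas paper-portfolio uses: monthly data annualized with the square root of 12, the sample standard deviation, a 5% tail for expected shortfall, and compounded net growth for the cumulative return are all choices made for illustration.

```python
import math
import statistics


def annualized_sharpe_ratio(monthly_returns, monthly_risk_free, periods_per_year=12):
    """Mean excess return over its sample standard deviation, scaled to a year."""
    excess = [r - rf for r, rf in zip(monthly_returns, monthly_risk_free)]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


def expected_shortfall(monthly_returns, alpha=0.05):
    """Average of the worst alpha share of returns (at least one observation)."""
    tail_size = max(1, math.floor(alpha * len(monthly_returns)))
    worst = sorted(monthly_returns)[:tail_size]
    return statistics.mean(worst)


def cumulative_return(monthly_returns):
    """Compounded net return over the whole sample."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    return growth - 1.0


if __name__ == "__main__":
    # Invented monthly returns and a zero risk-free rate, for illustration only.
    returns = [0.02, -0.01, 0.03, -0.04, 0.01, 0.02]
    risk_free = [0.0] * len(returns)
    print(annualized_sharpe_ratio(returns, risk_free))
    print(expected_shortfall(returns, alpha=0.2))
    print(cumulative_return(returns))
```

Note that the Sharpe ratio is undefined when excess returns have zero variance; this sketch would raise a `ZeroDivisionError` in that case.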
cross_sectional_analysis (optional)
- Set to true to enable the generation of decile-sorted performance plots for each model. Defaults to false.
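The idea behind the decile plots can also be checked numerically: sort assets by prediction, bucket them into deciles, and see whether average realized returns rise from the lowest to the highest bucket. The sketch below does this for a single cross-section. Both function names are hypothetical and the bucketing rule is an assumption; the package's plots show cumulative performance over time, which this simplified check does not reproduce.

```python
def decile_mean_returns(predictions, realized_returns, n_buckets=10):
    """Mean realized return per prediction-sorted bucket, lowest predictions first."""
    pairs = sorted(zip(predictions, realized_returns))
    n = len(pairs)
    if n < n_buckets:
        raise ValueError("need at least one asset per bucket")
    buckets = [[] for _ in range(n_buckets)]
    for rank, (_, realized) in enumerate(pairs):
        # Assumed rule: map rank 0..n-1 onto bucket 0..n_buckets-1 proportionally.
        buckets[rank * n_buckets // n].append(realized)
    return [sum(bucket) / len(bucket) for bucket in buckets]


def is_monotonic_increasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    # Invented data: 20 assets whose realized returns rise with the prediction.
    predictions = [float(i) for i in range(20)]
    realized = [i / 100 for i in range(20)]
    means = decile_mean_returns(predictions, realized)
    print(means)
    print(is_monotonic_increasing(means))
```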
🤝 Contributing
Contributions to paper-portfolio are highly welcome! If you have ideas for new performance metrics, portfolio construction techniques, or reporting features, please feel free to open an issue or submit a pull request.
📄 License
paper-portfolio is distributed under the MIT License. See the LICENSE file for more information.