Financial portfolio optimisation in python
Project description
PyPortfolioOpt is a library that implements portfolio optimisation methods, including classical meanvariance optimisation techniques and BlackLitterman allocation, as well as more recent developments in the field like shrinkage and Hierarchical Risk Parity, along with some novel experimental features like exponentiallyweighted covariance matrices.
It is extensive yet easily extensible, and can be useful for both the casual investor and the serious practitioner. Whether you are a fundamentalsoriented investor who has identified a handful of undervalued picks, or an algorithmic trader who has a basket of interesting signals, PyPortfolioOpt can help you combine your alpha streams in a riskefficient way.
Head over to the documentation on ReadTheDocs to get an indepth look at the project, or check out the cookbook to see some examples showing the full process from downloading data to building a portfolio.
<center> </center>Table of contents
 Table of contents
 Getting started
 A quick example
 What's new
 An overview of classical portfolio optimisation methods
 Features
 Advantages over existing implementations
 Project principles and design decisions
 Roadmap
 Testing
 Contributing
 Getting in touch
Getting started
If you would like to play with PyPortfolioOpt interactively in your browser, you may launch Binder here. It takes a while to set up, but it lets you try out the cookbook recipes without having to deal with all of the requirements.
Note: if you are on windows, you first need to installl C++. (download, install instructions)
This project is available on PyPI, meaning that you can just:
pip install PyPortfolioOpt
However, it is best practice to use a dependency manager within a virtual environment. My current recommendation is to get yourself set up with poetry then just run
poetry add PyPortfolioOpt
Otherwise, clone/download the project and in the project directory run:
python setup.py install
Thanks to Thomas Schmelzer, PyPortfolioOpt now supports Docker (requires make, docker, dockercompose). Build your first container with make build
; run tests with make test
. For more information, please read this guide.
For development
If you would like to make major changes to integrate this with your proprietary system, it probably makes sense to clone this repository and to just use the source code.
git clone https://github.com/robertmartin8/PyPortfolioOpt
Alternatively, you could try:
pip install e git+https://github.com/robertmartin8/PyPortfolioOpt.git
A quick example
Here is an example on real life stock data, demonstrating how easy it is to find the longonly portfolio that maximises the Sharpe ratio (a measure of riskadjusted returns).
import pandas as pd from pypfopt import EfficientFrontier from pypfopt import risk_models from pypfopt import expected_returns # Read in price data df = pd.read_csv("tests/resources/stock_prices.csv", parse_dates=True, index_col="date") # Calculate expected returns and sample covariance mu = expected_returns.mean_historical_return(df) S = risk_models.sample_cov(df) # Optimise for maximal Sharpe ratio ef = EfficientFrontier(mu, S) raw_weights = ef.max_sharpe() cleaned_weights = ef.clean_weights() ef.save_weights_to_file("weights.csv") # saves to file print(cleaned_weights) ef.portfolio_performance(verbose=True)
This outputs the following weights:
{'GOOG': 0.01269, 'AAPL': 0.09202, 'FB': 0.19856, 'BABA': 0.09642, 'AMZN': 0.07158, 'GE': 0.0, 'AMD': 0.0, 'WMT': 0.0, 'BAC': 0.0, 'GM': 0.0, 'T': 0.0, 'UAA': 0.0, 'SHLD': 0.0, 'XOM': 0.0, 'RRC': 0.0, 'BBY': 0.06129, 'MA': 0.24562, 'PFE': 0.18413, 'JPM': 0.0, 'SBUX': 0.03769} Expected annual return: 33.0% Annual volatility: 21.7% Sharpe Ratio: 1.43
This is interesting but not useful in itself. However, PyPortfolioOpt provides a method which allows you to convert the above continuous weights to an actual allocation that you could buy. Just enter the most recent prices, and the desired portfolio size ($10,000 in this example):
from pypfopt.discrete_allocation import DiscreteAllocation, get_latest_prices latest_prices = get_latest_prices(df) da = DiscreteAllocation(weights, latest_prices, total_portfolio_value=10000) allocation, leftover = da.lp_portfolio() print("Discrete allocation:", allocation) print("Funds remaining: ${:.2f}".format(leftover))
11 out of 20 tickers were removed Discrete allocation: {'GOOG': 0, 'AAPL': 5, 'FB': 11, 'BABA': 5, 'AMZN': 1, 'BBY': 7, 'MA': 14, 'PFE': 50, 'SBUX': 5} Funds remaining: $8.42
Disclaimer: nothing about this project constitues investment advice, and the author bears no responsibiltiy for your subsequent investment decisions. Please refer to the license for more information.
What's new
As of v1.2.0:
 Docker support
 Idzorek's method for specifying BlackLitterman views using percentage confidences.
 Industry constraints: limit your sector exposure.
 Multiple additions and improvements to
risk_models
: Introduced a new API, in which the function
risk_models.risk_matrix(method="...")
allows all the different risk models to be called. This should make testing easier.  All methods now accept returns data instead of prices, if you set the flag
returns_data=True
.
 Introduced a new API, in which the function
 Automatically fix nonpositive semidefinite covariance matrices!
 Additions and improvements to
expected_returns
: Introduced a new API, in which the function
expected_returns.return_model(method="...")
allows all the different return models to be called. This should make testing easier.  Added option to 'properly' compound returns.
 JamesStein shrinkage estimator
 CAPM return model.
 Introduced a new API, in which the function
from pypfopt import plotting
: moved all plotting functionality into a new class and added new plots. All other plotting functions (scattered in different classes) have been retained, but are now deprecated.
An overview of classical portfolio optimisation methods
Harry Markowitz's 1952 paper is the undeniable classic, which turned portfolio optimisation from an art into a science. The key insight is that by combining assets with different expected returns and volatilities, one can decide on a mathematically optimal allocation which minimises the risk for a target return – the set of all such optimal portfolios is referred to as the efficient frontier.
<center> </center>Although much development has been made in the subject, more than half a century later, Markowitz's core ideas are still fundamentally important and see daily use in many portfolio management firms. The main drawback of meanvariance optimisation is that the theoretical treatment requires knowledge of the expected returns and the future riskcharacteristics (covariance) of the assets. Obviously, if we knew the expected returns of a stock life would be much easier, but the whole game is that stock returns are notoriously hard to forecast. As a substitute, we can derive estimates of the expected return and covariance based on historical data – though we do lose the theoretical guarantees provided by Markowitz, the closer our estimates are to the real values, the better our portfolio will be.
Thus this project provides four major sets of functionality (though of course they are intimately related)
 Estimates of expected returns
 Estimates of risk (i.e covariance of asset returns)
 Objective functions to be optimised
 Optimisers.
A key design goal of PyPortfolioOpt is modularity – the user should be able to swap in their components while still making use of the framework that PyPortfolioOpt provides.
Features
In this section, we detail PyPortfolioOpt's current available functionality as per the above breakdown. More examples are offered in the Jupyter notebooks here. Another good resource is the tests.
A far more comprehensive version of this can be found on ReadTheDocs, as well as possible extensions for more advanced users.
Expected returns
 Mean historical returns:
 the simplest and most common approach, which states that the expected return of each asset is equal to the mean of its historical returns.
 easily interpretable and very intuitive
 Exponentially weighted mean historical returns:
 similar to mean historical returns, except it gives exponentially more weight to recent prices
 it is likely the case that an asset's most recent returns hold more weight than returns from 10 years ago when it comes to estimating future returns.
 JamesStein shrinkage:
 a slightly more robust estimate of future returns
 by shrinking mean returns to the grand average, we can reduce loss.
 Capital Asset Pricing Model (CAPM):
 a simple model to predict returns based on the beta to the market
 this is used all over finance!
Risk models (covariance)
The covariance matrix encodes not just the volatility of an asset, but also how it correlated to other assets. This is important because in order to reap the benefits of diversification (and thus increase return per unit risk), the assets in the portfolio should be as uncorrelated as possible.
 Sample covariance matrix:
 an unbiased estimate of the covariance matrix
 relatively easy to compute
 the de facto standard for many years
 however, it has a high estimation error, which is particularly dangerous in meanvariance optimisation because the optimiser is likely to give excess weight to these erroneous estimates.
 Semicovariance: a measure of risk that focuses on downside variation.
 Exponential covariance: an improvement over sample covariance that gives more weight to recent data
 Covariance shrinkage: techniques that involve combining the sample covariance matrix with a structured estimator, to reduce the effect of erroneous weights. PyPortfolioOpt provides wrappers around the efficient vectorised implementations provided by
sklearn.covariance
. manual shrinkage
 Ledoit Wolf shrinkage, which chooses an optimal shrinkage parameter. We offer three shrinkage targets:
constant_variance
,single_factor
, andconstant_correlation
.  Oracle Approximating Shrinkage
 Minimum Covariance Determinant:
 a robust estimate of the covariance
 implemented in
sklearn.covariance
(This plot was generated using plotting.plot_covariance
)
Objective functions
 Maximum Sharpe ratio: this results in a tangency portfolio because on a graph of returns vs risk, this portfolio corresponds to the tangent of the efficient frontier that has a yintercept equal to the riskfree rate. This is the default option because it finds the optimal return per unit risk.
 Minimum volatility. This may be useful if you're trying to get an idea of how low the volatility could be, but in practice it makes a lot more sense to me to use the portfolio that maximises the Sharpe ratio.
 Efficient return, a.k.a. the Markowitz portfolio, which minimises risk for a given target return – this was the main focus of Markowitz 1952
 Efficient risk: the Sharpemaximising portfolio for a given target risk.
 Maximum quadratic utility. You can provide your own riskaversion level and compute the appropriate portfolio.
Adding constraints or different objectives
 Long/short: by default all of the meanvariance optimisation methods in PyPortfolioOpt are longonly, but they can be initialised to allow for short positions by changing the weight bounds:
ef = EfficientFrontier(mu, S, weight_bounds=(1, 1))
 Market neutrality: for the
efficient_risk
andefficient_return
methods, PyPortfolioOpt provides an option to form a marketneutral portfolio (i.e weights sum to zero). This is not possible for the max Sharpe portfolio and the min volatility portfolio because in those cases because they are not invariant with respect to leverage. Market neutrality requires negative weights:
ef = EfficientFrontier(mu, S, weight_bounds=(1, 1)) ef.efficient_return(target_return=0.2, market_neutral=True)
 Minimum/maximum position size: it may be the case that you want no security to form more than 10% of your portfolio. This is easy to encode:
ef = EfficientFrontier(mu, S, weight_bounds=(0, 0.1))
One issue with meanvariance optimisation is that it leads to many zeroweights. While these are
"optimal" insample, there is a large body of research showing that this characteristic leads
meanvariance portfolios to underperform outofsample. To that end, I have introduced an
objective function that can reduce the number of negligible weights for any of the objective functions. Essentially, it adds a penalty (parameterised by gamma
) on small weights, with a term that looks just like L2 regularisation in machine learning. It may be necessary to try several gamma
values to achieve the desired number of nonnegligible weights. For the test portfolio of 20 securities, gamma ~ 1
is sufficient
ef = EfficientFrontier(mu, S) ef.add_objective(objective_functions.L2_reg, gamma=1) ef.max_sharpe()
BlackLitterman allocation
As of v0.5.0, we now support BlackLitterman asset allocation, which allows you to combine a prior estimate of returns (e.g the marketimplied returns) with your own views to form a posterior estimate. This results in much better estimates of expected returns than just using the mean historical return. Check out the docs for a discussion of the theory, as well as advice on formatting inputs.
S = risk_models.sample_cov(df) viewdict = {"AAPL": 0.20, "BBY": 0.30, "BAC": 0, "SBUX": 0.2, "T": 0.131321} bl = BlackLittermanModel(S, pi="equal", absolute_views=viewdict, omega="default") rets = bl.bl_returns() ef = EfficientFrontier(rets, S) ef.max_sharpe()
Other optimisers
The features above mostly pertain to solving efficient frontier optimisation problems via quadratic programming (though this is taken care of by cvxpy
). However, we offer different optimisers as well:
 Hierarchical Risk Parity, using clustering algorithms to choose uncorrelated assets
 Markowitz's critical line algorithm (CLA)
Please refer to the documentation for more.
Advantages over existing implementations
 Includes both classical methods (Markowitz 1952 and BlackLitterman), suggested best practices (e.g covariance shrinkage), along with many recent developments and novel features, like L2 regularisation, shrunk covariance, hierarchical risk parity.
 Native support for pandas dataframes: easily input your daily prices data.
 Extensive practical tests, which use reallife data.
 Easy to combine with your proprietary strategies and models.
 Robust to missing data, and priceseries of different lengths (e.g FB data only goes back to 2012 whereas AAPL data goes back to 1980).
Project principles and design decisions
 It should be easy to swap out individual components of the optimisation process with the user's proprietary improvements.
 Usability is everything: it is better to be selfexplanatory than consistent.
 There is no point in portfolio optimisation unless it can be practically applied to real asset prices.
 Everything that has been implemented should be tested.
 Inline documentation is good: dedicated (separate) documentation is better. The two are not mutually exclusive.
 Formatting should never get in the way of coding: because of this, I have deferred all formatting decisions to Black.
Roadmap
Feel free to raise an issue requesting any new features – here are some of the things I want to implement:
 Optimising for higher moments (i.e skew and kurtosis)
 Factor modelling: doable but not sure if it fits within the API.
 Proper CVaR optimisation – remove NoisyOpt and use linear programming
 More objective functions, including the Calmar Ratio, Sortino Ratio, etc.
 Monte Carlo optimisation with custom distributions
 Opensource backtests using either
Backtrader <https://www.backtrader.com/>
_ orZipline <https://github.com/quantopian/zipline>
_.  Further support for different risk/return models
Testing
Tests are written in pytest (much more intuitive than unittest
and the variants in my opinion), and I have tried to ensure close to 100% coverage. Run the tests by navigating to the package directory and simply running pytest
on the command line.
PyPortfolioOpt provides a test dataset of daily returns for 20 tickers:
['GOOG', 'AAPL', 'FB', 'BABA', 'AMZN', 'GE', 'AMD', 'WMT', 'BAC', 'GM', 'T', 'UAA', 'SHLD', 'XOM', 'RRC', 'BBY', 'MA', 'PFE', 'JPM', 'SBUX']
These tickers have been informally selected to meet several criteria:
 reasonably liquid
 different performances and volatilities
 different amounts of data to test robustness
Currently, the tests have not explored all of the edge cases and combinations of objective functions and parameters. However, each method and parameter has been tested to work as intended.
Contributing
Contributions are most welcome. Have a look at the Contribution Guide for more.
Getting in touch
If you would like to reach out for any reason, be it consulting opportunities or just for a chat, please do so via the form on my website.
Project details
Release history Release notifications  RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Filename, size  File type  Python version  Upload date  Hashes 

Filename, size PyPortfolioOpt1.2.3py3noneany.whl (48.3 kB)  File type Wheel  Python version py3  Upload date  Hashes View 
Filename, size PyPortfolioOpt1.2.3.tar.gz (50.2 kB)  File type Source  Python version None  Upload date  Hashes View 
Hashes for PyPortfolioOpt1.2.3py3noneany.whl
Algorithm  Hash digest  

SHA256  5386d8bed35ae35ef2094c968761289f725890b8dabc5acee0ad8f944c70ab29 

MD5  8dbd37c9ef306c9ea7e8b3a483b00fa6 

BLAKE2256  8ff54571e069d4870ede26995e191f2ef341329f881d7086cebea504141c79a8 