Retrieve data for various multifactor asset pricing models.
Project description
getfactormodels
Reliably retrieve data for various multi-factor asset pricing models.
Models
- The 3-factor, 5-factor, and 6-factor models of Fama & French [1] [3] [4]
- Mark Carhart's 4-factor model [2]
- Pastor and Stambaugh's liquidity factors [5]
- Mispricing factors of Stambaugh and Yuan[6]
- The $q$-factor model of Hou, Mo, Xue and Zhang[7]
- The augmented $q^5$-factor model of Hou, Mo, Xue and Zhang[8]
- Intermediary Capital Ratio (ICR) of He, Kelly & Manela[9]
- The DHS behavioural factors of Daniel, Hirshleifer & Sun[10]
- The HML $^{DEVIL}$ factor of Asness & Frazzini[11]
- The 6-factor model of Barillas and Shanken[12]
Thanks to: Kenneth French, Robert Stambaugh, Lin Sun, Zhiguo He, AQR Capital Management (AQR.com) and Hou, Xue and Zhang (global-q.org), for their research and for the datasets they provide.
Installation
getfactormodels
requires Python >=3.7
-
The easiest way to install getfactormodels is via pip:
$ pip install getfactormodels
Usage
[!IMPORTANT]
getfactormodels
is new. It was released on December 20, 2023. Don't rely on it for anything.
After installation, import and call the get_factors()
function with the model
and frequency
params:
-
For example, retrieving the monthly ${q}^{5}$ factor model:
import getfactormodels data = getfactormodels.get_factors(model='q', frequency='m')
Trimmed output:
> print(data) Mkt-RF R_ME R_IA R_ROE R_EG RF date 1967-01-03 0.000778 0.004944 0.001437 -0.007118 -0.008563 0.000187 1967-01-04 0.001667 -0.003487 -0.000631 -0.002044 -0.000295 0.000187 1967-01-05 0.012990 0.004412 -0.005688 0.000838 -0.003075 0.000187 1967-01-06 0.007230 0.006669 0.008897 0.003603 0.002669 0.000187 1967-01-09 0.008439 0.006315 0.000331 0.004949 0.002979 0.000187 ... ... ... ... ... ... ... 2022-12-23 0.005113 -0.001045 0.004000 0.010484 0.003852 0.000161 2022-12-27 -0.005076 -0.001407 0.010190 0.009206 0.003908 0.000161 2022-12-28 -0.012344 -0.004354 0.000133 -0.010457 -0.004953 0.000161 2022-12-29 0.018699 0.008568 -0.008801 -0.012686 -0.002162 0.000161 2022-12-30 -0.002169 0.001840 0.001011 -0.004151 -0.003282 0.000161 [14096 rows x 6 columns]
-
Retrieving the daily data for the Fama-French 3-factor model, since
start_date
:import getfactormodels as gfm df = gfm.get_factors(model='ff3', frequency='d', start_date=`2006-01-01`)
-
Retrieving data for Stambaugh and Yuan's monthly Mispricing factors, between
start_date
andend_date
, and saving the data to a file:import getfactormodels as gfm df = gfm.get_factors(model='mispricing', start_date='1970-01-01', end_date=1999-12-31, output='mispricing_factors.csv')
output
can be a filename, directory, or path. If no extension is specified, defaults to .csv (can be one of: .xlsx, .csv, .txt, .pkl, .md)
You can import only the models that you need:
-
For example, to import only the ICR and q-factor models:
from getfactormodels import icr_factors, q_factors # Passing a model function without params defaults to monthly data. df = icr_factors() # The 'q' models, and the 3-factor model of Fama-French have weekly data available: df = q_factors(frequency="W", start_date="1992-01-01, output='.xlsx')
output
allows just a file extension (with the.
, else it'll be passed as a filename). -
When using
ff_factors()
, specify an additionalmodel
parameter (this might be changed):# To get annual data for the 5-factor model: data = ff_factors(model="5", frequency="Y", output=".xlsx") # Daily 3-factor model data, since 1970 (not specifying an end date # will return data up until today): data = ff_factors(model="3", frequency="D", start_date="1970-01-01")
There's also a FactorExtractor
class (which doesn't do much yet, it's mainly used by the CLI; lots to do):
from getfactormodels import FactorExtractor
fe = FactorExtractor(model='carhart', start_date='1980-01-01', end_date='1980-05-01)
fe.get_factors()
fe.drop_rf()
fe.to_file('~/carhart_factors.md')
-
The resulting
carhart_factors.md
file will look like this:date Mkt-RF SMB HML MOM 1980-01-31 00:00:00 0.0551 0.0162 0.0175 0.0755 1980-02-29 00:00:00 -0.0122 -0.0185 0.0061 0.0788 1980-03-31 00:00:00 -0.129 -0.0664 -0.0101 -0.0955 1980-04-30 00:00:00 0.0397 0.0105 0.0106 -0.0043
.drop_rf()
will return the DataFrame without theRF
column. You can also drop the "Mkt-RF" column with.drop_mkt()
CLI
bash >=4.2
-
You can also use getfactormodels from the command line. It's very barebones, here's the
-h
:$ getfactormodels -h usage: getfactormodels [-h] -m MODEL [-f FREQ] [-s START] [-e END] [-o OUTPUT] [--no_rf] [--no_mkt]
-
An example of how to use the CLI to retrieve the Fama-French 3-factor model data:
$ getfactormodels --model ff3 --frequency M --start-date 1960-01-01 --end-date 2020-12-31 --output ".csv"
-
Here's another example that retrieves the annual Fama-French 5-factor data without the RF column (using
--no_rf
)$ getfactormodels -m ff5 -f Y -s 1960-01-01 -e 2020-12-31 --no_rf -o ~/some_dir/filename.xlsx
-
To return the factors without the risk-free rate
RF
, or the excess market returnMkt-RF
, columns:
Data Availability
[TODO]
References
- E. F. Fama and K. R. French, ‘Common risk factors in the returns on stocks and bonds’, Journal of Financial Economics, vol. 33, no. 1, pp. 3–56, 1993. PDF
- M. Carhart, ‘On Persistence in Mutual Fund Performance’, Journal of Finance, vol. 52, no. 1, pp. 57–82, 1997. PDF
- E. F. Fama and K. R. French, ‘A five-factor asset pricing model’, Journal of Financial Economics, vol. 116, no. 1, pp. 1–22, 2015. PDF
- E. F. Fama and K. R. French, ‘Choosing factors’, Journal of Financial Economics, vol. 128, no. 2, pp. 234–252, 2018. PDF
- L. Pastor and R. Stambaugh, ‘Liquidity Risk and Expected Stock Returns’, Journal of Political Economy, vol. 111, no. 3, pp. 642–685, 2003. PDF
- R. F. Stambaugh and Y. Yuan, ‘Mispricing Factors’, The Review of Financial Studies, vol. 30, no. 4, pp. 1270–1315, 12 2016. PDF
- K. Hou, H. Mo, C. Xue, and L. Zhang, ‘Which Factors?’, National Bureau of Economic Research, Inc, 2014. PDF
- K. Hou, H. Mo, C. Xue, and L. Zhang, ‘An Augmented q-Factor Model with Expected Growth*’, Review of Finance, vol. 25, no. 1, pp. 1–41, 02 2020. PDF
- Z. He, B. Kelly, and A. Manela, ‘Intermediary asset pricing: New evidence from many asset classes’, Journal of Financial Economics, vol. 126, no. 1, pp. 1–35, 2017. PDF
- K. Daniel, D. Hirshleifer, and L. Sun, ‘Short- and Long-Horizon Behavioral Factors’, Review of Financial Studies, vol. 33, no. 4, pp. 1673–1736, 2020. PDF
- C. Asness and A. Frazzini, ‘The Devil in HML’s Details’, The Journal of Portfolio Management, vol. 39, pp. 49–68, 2013. PDF
- F. Barillas and J. Shanken, ‘Comparing Asset Pricing Models’, Journal of Finance, vol. 73, no. 2, pp. 715–754, 2018. PDF
Data sources:
- K. French, "Data Library," Tuck School of Business at Dartmouth. Link
- R. Stambaugh, "Liquidity" and "Mispricing" factor datasets, Wharton School, University of Pennsylvania. Link
- Z. He, "Intermediary Capital Ratio and Risk Factor" dataset, University of Chicago. Link
- K. Hou, G. Xue, R. Zhang, "The Hou-Xue-Zhang q-factors data library," at global-q.org. Link
- AQR Capital Management's Data Sets.
- Lin Sun, DHS Behavioural factors Link
License
The code in this project is released under the MIT License.
Known issues
- The first
hml_devil_factors()
retrieval is slow, because the download from aqr.com is slow. It's the only model, so far, implementing a cache—daily data expires at the end of the day and is only re-downloaded when the requestedend_date
exceeds the file's last index date. Similar for monthly, expiring at EOM and re-downloaded when needed.
Todo
- Docs
- Examples
- Tests
- Error handling
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Hashes for getfactormodels-0.0.4-py3-none-any.whl
Algorithm | Hash digest | |
---|---|---|
SHA256 | fab82d870c68a85ba783b9b19049506bde54ce62f8d2d18063ae2f5f80eb5a98 |
|
MD5 | bc4d53570f4e1ab5e6da04988780d92a |
|
BLAKE2b-256 | 9ad763c8dda96ddb734aff829f0c03264427856d83068697eb7025edcfcb8fb5 |