Download datasets for various multifactor asset pricing models.
getfactormodels
Reliably retrieve data for various multi-factor asset pricing models.
Models
- The 3-factor, 5-factor, and 6-factor models of Fama & French [1] [3] [4]
- Mark Carhart's 4-factor model [2]
- Pastor and Stambaugh's liquidity factors [5]
- Mispricing factors of Stambaugh and Yuan [6]
- The $q$-factor model of Hou, Mo, Xue and Zhang [7]
- The augmented $q^5$-factor model of Hou, Mo, Xue and Zhang [8]
- Intermediary Capital Ratio (ICR) of He, Kelly & Manela [9]
- The DHS behavioural factors of Daniel, Hirshleifer & Sun [10]
- The HML $^{DEVIL}$ factor of Asness & Frazzini[11]
- The 6-factor model of Barillas and Shanken[12]
Thanks to: Kenneth French, Robert Stambaugh, Lin Sun, Zhiguo He, AQR Capital Management (AQR.com) and Hou, Xue and Zhang (global-q.org), for their research and for the datasets they provide.
Installation
Important: getfactormodels is pre-alpha (until version 0.1.0); don't rely on it for anything.
Requires:
- Python >=3.10
pip install -U getfactormodels
Quick start
Basic usage:
- Import getfactormodels and use the get_factors function, with a model parameter:
import getfactormodels
data = getfactormodels.get_factors(model='q', frequency='d')
- All other parameters are optional. By default monthly data is returned.
import getfactormodels as gfm
# Download Fama-French 3-factor daily data since start_date:
df = gfm.get_factors(model='ff3', frequency='d', start_date='2006-01-01')
# Monthly DHS factors until end_date:
dhs = gfm.get_factors(model='dhs', end_date='2010-12-31')
# Mispricing factors (monthly) with date range and export:
df = gfm.get_factors(
    model='mispricing',
    start_date='1970-01-01',
    end_date='1999-12-31',
    output='mispricing_factors.csv',  # .csv, .txt, .pkl
)
- You can import only the models you need. For example, the ICR and q-factor models:
from getfactormodels import ICRFactors, QFactors
# Passing a model class without params defaults to monthly data.
icr = ICRFactors() # look! no params!
# Use the download() method to get the data
df = icr.download()
# The 'q' models, and the 3-factor model of Fama-French have weekly data available
df = QFactors(frequency='w',
              start_date='1992-05-22',
              end_date='2019-01-05').download()  # chained! Wow!
#print(df.tail(3))
- All model classes: HMLDevil, CarhartFactors, FamaFrenchFactors, QFactors, LiquidityFactors, MispricingFactors, BarillasShankenFactors, ICRFactors, DHSFactors.
Parameters
- All model classes support:
  - frequency: d, w, m, y
  - start_date: YYYY-[MM-DD] format
  - end_date: YYYY-[MM-DD] format
  - output_file: Export path
- FamaFrenchFactors has a model param. Accepts 3, 4, 5, 6 or an equivalent model name (ff3, famafrench3) (default: 3).
- QFactors has a classic boolean param (default: false) for returning the 4-factor q-factor model (2015).
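The YYYY-[MM-DD] format suggests that the month and day parts of a date are optional. As a rough illustration only, the sketch below shows one way such partial dates could be expanded into full dates. `normalize_date` and its `is_end` flag are hypothetical names, not part of getfactormodels, and accepting a bare YYYY-MM is an assumption; how the package itself parses dates is not documented here.

```python
import calendar
from datetime import date


def normalize_date(value: str, is_end: bool = False) -> date:
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into a full date.

    Hypothetical helper: missing parts default to the start of the
    period, or to the end of the period when is_end=True.
    """
    parts = [int(p) for p in value.split("-")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected YYYY[-MM[-DD]], got {value!r}")

    year = parts[0]
    # Missing month: January for a start date, December for an end date.
    month = parts[1] if len(parts) >= 2 else (12 if is_end else 1)
    if len(parts) == 3:
        day = parts[2]
    else:
        # Missing day: first or last day of the month.
        day = calendar.monthrange(year, month)[1] if is_end else 1
    return date(year, month, day)


start = normalize_date("2006")
end = normalize_date("2010", is_end=True)
```

With a helper like this, a start date of "2006" would cover the whole year from 1 January, and an end date of "2010" would run through 31 December.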
CLI
Requires: bash >=4.2
This section is outdated, but the CLI should still work until it is redone.
- You can also use getfactormodels from the command line. It's very basic at the moment; here's the --help:
$ getfactormodels -h
usage: getfactormodels [-h] -m MODEL [-f FREQ] [-s START] [-e END] [-o OUTPUT] [--no_rf] [--no_mkt]
- An example of how to use the CLI to retrieve the Fama-French 3-factor model data:
$ getfactormodels --model ff3 --frequency M --start-date 1960-01-01 --end-date 2020-12-31 --output .csv
- Here's another example that retrieves the annual 5-factor data of Fama-French, without the RF column (using --no[_]rf):
$ getfactormodels -m ff5 -f Y -s 1960-01-01 -e 2020-12-31 --norf -o ~/some_dir/filename.xlsx
- To return the factors without the risk-free rate (RF) or the excess market return (Mkt-RF) columns:
$ getfactormodels -m ff5 -f Y -s 1960-01-01 -e 2020-12-31 --norf --nomkt -o ~/some_dir/filename.xlsx
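For readers who want to see how the options in the usage line fit together, here is a minimal argparse sketch that accepts the same flags as the examples above. It is not the package's actual CLI code: the destination names, the monthly default for the frequency, and the way both spellings of the `--no[_]rf` and `--no[_]mkt` flags are registered are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="getfactormodels")
    parser.add_argument("-m", "--model", required=True, metavar="MODEL")
    # Assumption: the CLI defaults to monthly data, like the Python API.
    parser.add_argument("-f", "--frequency", default="M", metavar="FREQ")
    parser.add_argument("-s", "--start-date", dest="start", metavar="START")
    parser.add_argument("-e", "--end-date", dest="end", metavar="END")
    parser.add_argument("-o", "--output", metavar="OUTPUT")
    # Both spellings are registered so that --no_rf and --norf behave alike.
    parser.add_argument("--no_rf", "--norf", dest="no_rf", action="store_true")
    parser.add_argument("--no_mkt", "--nomkt", dest="no_mkt", action="store_true")
    return parser


# Parse the flags from the annual Fama-French 5-factor example.
args = build_parser().parse_args(
    ["-m", "ff5", "-f", "Y", "-s", "1960-01-01", "-e", "2020-12-31", "--norf"]
)
```

Registering two option strings for one destination is a simple way to support an optional underscore without extra parsing logic.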
Data Availability
This table shows each model's start date, available frequencies, and the latest datapoint if not current. The id column contains the shortest identifier for each model. These should all work in Python and the CLI.
| id | Factor Model | Start | D | W | M | Q | Y | End |
|---|---|---|---|---|---|---|---|---|
| 3 | Fama-French 3 | 1926-07-01 | $\checkmark$ | $\checkmark$ | $\checkmark$ | | $\checkmark$ | - |
| 4 | Carhart 4 | 1926-11-03 | $\checkmark$ | | $\checkmark$ | | $\checkmark$ | - |
| 5 | Fama-French 5 | 1963-07-01 | $\checkmark$ | | $\checkmark$ | | $\checkmark$ | - |
| 6 | Fama-French 6 | 1963-07-01 | $\checkmark$ | | $\checkmark$ | | $\checkmark$ | - |
| hmld | HML $^{DEVIL}$ | 1990-07-02 | $\checkmark$ | | $\checkmark$ | | | - |
| dhs | DHS | 1972-07-03 | $\checkmark$ | | $\checkmark$ | | | 2023-12-29 |
| icr | ICR | 1970-01-31 (Daily: 1999-05-03) | $\checkmark$ | | $\checkmark$ | $\checkmark$ | | 2025-06-27 |
| mis | Mispricing | 1963-01-02 | $\checkmark$ | | $\checkmark$ | | | 2016-12-30 |
| liq | Liquidity | 1962-08-31 | | | $\checkmark$ | | | 2024-12-31 |
| q, q4 | $q^5$-factors, $q$-factors | 1967-01-03 | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | $\checkmark$ | 2022-12-30 |
| bs | Barillas-Shanken | 1967-01-03 | $\checkmark$ | | $\checkmark$ | | | 2024-12-31 |
[TODO] Docs!
References
Publications:
1. E. F. Fama and K. R. French, ‘Common risk factors in the returns on stocks and bonds’, Journal of Financial Economics, vol. 33, no. 1, pp. 3–56, 1993.
2. M. Carhart, ‘On Persistence in Mutual Fund Performance’, Journal of Finance, vol. 52, no. 1, pp. 57–82, 1997.
3. E. F. Fama and K. R. French, ‘A five-factor asset pricing model’, Journal of Financial Economics, vol. 116, no. 1, pp. 1–22, 2015.
4. E. F. Fama and K. R. French, ‘Choosing factors’, Journal of Financial Economics, vol. 128, no. 2, pp. 234–252, 2018.
5. L. Pastor and R. Stambaugh, ‘Liquidity Risk and Expected Stock Returns’, Journal of Political Economy, vol. 111, no. 3, pp. 642–685, 2003.
6. R. F. Stambaugh and Y. Yuan, ‘Mispricing Factors’, The Review of Financial Studies, vol. 30, no. 4, pp. 1270–1315, Dec. 2016.
7. K. Hou, H. Mo, C. Xue, and L. Zhang, ‘Which Factors?’, National Bureau of Economic Research, Inc, 2014.
8. K. Hou, H. Mo, C. Xue, and L. Zhang, ‘An Augmented q-Factor Model with Expected Growth’, Review of Finance, vol. 25, no. 1, pp. 1–41, Feb. 2020.
9. Z. He, B. Kelly, and A. Manela, ‘Intermediary asset pricing: New evidence from many asset classes’, Journal of Financial Economics, vol. 126, no. 1, pp. 1–35, 2017.
10. K. Daniel, D. Hirshleifer, and L. Sun, ‘Short- and Long-Horizon Behavioral Factors’, Review of Financial Studies, vol. 33, no. 4, pp. 1673–1736, 2020.
11. C. Asness and A. Frazzini, ‘The Devil in HML’s Details’, The Journal of Portfolio Management, vol. 39, pp. 49–68, 2013.
12. F. Barillas and J. Shanken, ‘Comparing Asset Pricing Models’, Journal of Finance, vol. 73, no. 2, pp. 715–754, 2018.
Data sources:
- K. French, "Data Library," Tuck School of Business at Dartmouth.
- R. Stambaugh, "Liquidity" and "Mispricing" factor datasets, Wharton School, University of Pennsylvania.
- Z. He, "Intermediary Capital Ratio and Risk Factor" dataset, zhiguohe.net.
- K. Hou, C. Xue, L. Zhang, "The Hou-Xue-Zhang q-factors data library," global-q.org.
- AQR Capital Management's Data Sets.
- L. Sun, DHS Behavioural factors.
License
Known issues
- The first hml_devil_factors() retrieval is slow because the download from aqr.com is slow. It's the only model implementing a cache: daily data expires at the end of the day and is only re-downloaded when the requested end_date exceeds the cache's latest index date. Monthly data works similarly, expiring at the end of the month and re-downloading when next needed.
- Some models weren't downloading. Update: all models should be downloading.
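The cache rule described above can be sketched roughly as follows. This is only one reading of that description, not the package's implementation: `is_expired` and `needs_download` are hypothetical names, and treating a missing end date as "today" is an assumption.

```python
from datetime import date
from typing import Optional


def is_expired(cached_on: date, today: date, frequency: str) -> bool:
    """Daily caches expire at the end of the day, monthly at the end of the month."""
    if frequency == "d":
        return cached_on < today
    if frequency == "m":
        return (cached_on.year, cached_on.month) < (today.year, today.month)
    raise ValueError(f"unsupported frequency: {frequency!r}")


def needs_download(
    cached_on: date,
    latest_index: date,
    requested_end: Optional[date],
    today: date,
    frequency: str = "d",
) -> bool:
    """Re-download only if the cache has expired AND the request goes past it."""
    # Assumption: no end date means "up to the most recent data".
    end = requested_end if requested_end is not None else today
    return is_expired(cached_on, today, frequency) and end > latest_index


today = date(2024, 3, 5)
# Cache written yesterday holds data up to 2024-03-01.
stale_but_sufficient = needs_download(date(2024, 3, 4), date(2024, 3, 1), date(2024, 2, 1), today)
stale_and_short = needs_download(date(2024, 3, 4), date(2024, 3, 1), date(2024, 3, 5), today)
```

Under this reading, an expired cache is still reused when the requested range ends on or before its latest index date, which avoids the slow download for historical queries.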
Todo
- Refactor: a complete rewrite, implementing a better interface and design patterns, dropping dependencies.
- Docs
- Every model should have an about and author/copyright info, and general disclaimer
- This README
- Examples
- Tests
- Error handling
IGNORE THIS
There's also the FactorExtractor class (which doesn't do much yet, it's mainly used by the CLI):
from getfactormodels import FactorExtractor
fe = FactorExtractor(model='carhart', start_date='1980-01-01', end_date='1980-05-01')
fe.get_factors()
fe.drop_rf()
fe.to_file('~/carhart_factors.csv')
.drop_rf() will return the DataFrame without the RF column. You can also drop the Mkt-RF column with .drop_mkt().
REWRITE ME