getfactormodels

Reliably retrieve data for various multi-factor asset pricing models.

Models

  • The 3-factor, 5-factor, and 6-factor models of Fama & French [1] [3] [4]
  • Mark Carhart's 4-factor model [2]
  • Pastor and Stambaugh's liquidity factors [5]
  • Mispricing factors of Stambaugh and Yuan [6]
  • The $q$-factor model of Hou, Mo, Xue and Zhang [7]
  • The augmented $q^5$-factor model of Hou, Mo, Xue and Zhang [8]
  • Intermediary Capital Ratio (ICR) of He, Kelly & Manela [9]
  • The DHS behavioural factors of Daniel, Hirshleifer & Sun [10]
  • The HML$^{DEVIL}$ factor of Asness & Frazzini [11]
  • The 6-factor model of Barillas and Shanken [12]

Thanks to: Kenneth French, Robert Stambaugh, Lin Sun, Zhiguo He, AQR Capital Management (AQR.com) and Hou, Xue and Zhang (global-q.org), for their research and for the datasets they provide.

Installation

Important: getfactormodels is pre-alpha (until version 0.1.0); don't rely on it for anything.

But a huge thanks to anyone who has tried it!

Requires:

  • Python >=3.10

The easiest way to install (and update) getfactormodels is with pip:

pip install -U getfactormodels

You can also download the latest release and install using pip.

linux/macOS
 curl -LO https://github.com/x512/getfactormodels/archive/latest.tar.gz
 tar -xzf latest.tar.gz
 cd getfactormodels-*
 pip install .
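
After installing, it can be useful to confirm that the interpreter meets the Python >=3.10 requirement and that the package is visible. The following is a minimal sketch using only the standard library; installed_version is a hypothetical helper name, not part of getfactormodels.

```python
import sys
from importlib import metadata


def installed_version(package="getfactormodels"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# The README lists Python >=3.10 as the only requirement.
python_ok = sys.version_info >= (3, 10)
print(f"Python >= 3.10: {python_ok}")
print(f"getfactormodels version: {installed_version()}")
```

If the second line prints None, the package is not installed in the environment that ran the script.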

Quick start

Basic usage:

  • Import getfactormodels and use the get_factors function, with a model parameter:
 import getfactormodels

 data = getfactormodels.get_factors(model='q', frequency='d')
  • All other parameters are optional. By default monthly data is returned.
# monthly Fama-French 3-factors since start_date
df = getfactormodels.get_factors(model='ff3', start_date='2006-01-01')

# Daily Mispricing factors saved to file:
df = getfactormodels.get_factors(
    model='mispricing',
    frequency='d',
    start_date='1970-01-01',
    end_date='1999-12-31',
    output='~/mispricing_factors.csv'  # .csv, .pkl, .parquet, .txt
)
  • Using the model classes, you can import only the models you want:
from getfactormodels import ICRFactors, QFactors
model = ICRFactors(frequency='m', start_date='2000-01-01')

# use the download() method to get the data
df = model.download()

# use the extract() method to get a single factor
factor = model.extract("IC_RATIO")
  • Fama-French 3-Factors and the q-factors have weekly data available:
df = QFactors(frequency='w',
              start_date='1992-05-22',
              end_date='2019-01-05').download() # chained! Wow!
  • For more examples see the notebook: here
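
The parameters shown above can be checked before any download is attempted. The following is a minimal sketch, not part of getfactormodels: check_request is a hypothetical helper, and the accepted frequency codes and file suffixes are assumptions taken from the examples and the table header in this README.

```python
from datetime import date
from pathlib import Path

# Assumed from the examples in this README; the package may accept other values.
FREQUENCIES = {"d", "w", "m", "q", "y"}
OUTPUT_SUFFIXES = {".csv", ".pkl", ".parquet", ".txt"}


def check_request(model, frequency="m", start_date=None, end_date=None, output=None):
    """Hypothetical pre-flight check that returns keyword arguments for get_factors()."""
    frequency = frequency.lower()
    if frequency not in FREQUENCIES:
        raise ValueError(f"unknown frequency: {frequency!r}")

    # date.fromisoformat raises ValueError on malformed dates.
    start = date.fromisoformat(start_date) if start_date else None
    end = date.fromisoformat(end_date) if end_date else None
    if start and end and start > end:
        raise ValueError("start_date is after end_date")

    if output is not None:
        suffix = Path(output).suffix
        # A name without a suffix is let through unchecked.
        if suffix and suffix not in OUTPUT_SUFFIXES:
            raise ValueError(f"unsupported output type: {suffix!r}")

    kwargs = {"model": model, "frequency": frequency,
              "start_date": start_date, "end_date": end_date, "output": output}
    # Drop unset options so the package's own defaults apply.
    return {key: value for key, value in kwargs.items() if value is not None}


kwargs = check_request("ff3", start_date="2006-01-01")
print(kwargs)
# With the package installed: df = getfactormodels.get_factors(**kwargs)
```

Catching a mistyped date or frequency locally avoids a network round trip that would fail anyway.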

CLI

You can use getfactormodels from the command line. Call getfactormodels with the --model (-m) flag.

  • Frequency -f defaults to monthly and all other parameters are optional:
#monthly Fama-French 3 factor model
getfactormodels --model ff3

# daily mispricing factors since start
getfactormodels -m mis --frequency d --start 2000-01-01

Note: all data is cached for 1 day, so re-running commands isn't wasteful.

  • Save data to a file with --output (-o):
# save annual Fama-French 5-Factors to file:
getfactormodels -m 5 -f y --output "~/dir/filename.csv" # can be csv, pkl, parquet, txt.

getfactormodels -m liq -f m -o somefile # will be a csv, saved in the user's current directory.

Note: Fama-French models can be given as a string ("ff3") or an int (3, 4, 5, 6, where 4 = Carhart).

  • Extract a factor from a model with the --extract -x flag:
getfactormodels -m carhart -f m --extract MOM
# extract multiple factors to a file 
getfactormodels -m ff3 -f m -x SMB HML -o "dir/filename.pkl"
  • Access Fama-French Emerging and Developed/International markets using the --region -r flag:
# 3 factor model for developed markets
getfactormodels -m ff3 --region developed

# 5-Factor model for Europe saved to file 
getfactormodels -m 5 -r europe -o euro_factors

# extract the SMB and WML factors from the 4 Factor model
getfactormodels -m 4 --region emerging --extract SMB WML
  • See more in the example notebook: here
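
The same commands can be assembled from a script. The following is a sketch that only uses the long flags shown above (--model, --frequency, --start, --region, --extract, --output); build_command and run are hypothetical helper names, and any other flag the CLI may support is deliberately left out.

```python
import shlex
import subprocess


def build_command(model, frequency=None, start=None, output=None,
                  extract=None, region=None):
    """Assemble a getfactormodels command line from the flags shown above."""
    cmd = ["getfactormodels", "--model", str(model)]
    if frequency:
        cmd += ["--frequency", frequency]
    if start:
        cmd += ["--start", start]
    if region:
        cmd += ["--region", region]
    if extract:
        cmd += ["--extract", *extract]
    if output:
        cmd += ["--output", output]
    return cmd


def run(cmd):
    """Run the command; needs getfactormodels installed and on PATH."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


cmd = build_command(4, region="emerging", extract=["SMB", "WML"])
print(shlex.join(cmd))  # getfactormodels --model 4 --region emerging --extract SMB WML
```

Passing the arguments as a list means no shell is involved, so file names with spaces need no extra quoting.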

Classes

A list of model classes available:

  • FamaFrenchFactors
  • CarhartFactors
  • QFactors
  • ICRFactors
  • DHSFactors
  • LiquidityFactors
  • MispricingFactors
  • HMLDevilFactors
  • BarillasShankenFactors

For a list of parameters, see the example notebook. (Docs are coming)

Data Availability

This table shows each model's start date, available frequencies, and the latest datapoint if not current. The id column contains the shortest identifier for each model; these should all work in Python and the CLI.

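| id    | Factor Model               | Start                          | D | W            | M | Q | Y | End        |
|-------|----------------------------|--------------------------------|---|--------------|---|---|---|------------|
| 3     | Fama-French 3              | 1926-07-01                     |   |              |   |   |   | -          |
| 4     | Carhart 4                  | 1926-11-03                     |   |              |   |   |   | -          |
| 5     | Fama-French 5              | 1963-07-01                     |   |              |   |   |   | -          |
| 6     | Fama-French 6              | 1963-07-01                     |   |              |   |   |   | -          |
| hmld  | HML$^{DEVIL}$              | 1990-07-02                     |   |              |   |   |   | -          |
| dhs   | DHS                        | 1972-07-03                     |   |              |   |   |   | 2023-12-29 |
| icr   | ICR                        | 1970-01-31 (daily: 1999-05-03) |   |              |   |   |   | 2025-06-27 |
| mis   | Mispricing                 | 1963-01-02                     |   |              |   |   |   | 2016-12-30 |
| liq   | Liquidity                  | 1962-08-31                     |   |              |   |   |   | 2024-12-31 |
| q, q4 | $q^5$-factors, $q$-factors | 1967-01-03                     |   | $\checkmark$ |   |   |   | 2022-12-30 |
| bs    | Barillas-Shanken           | 1967-01-03                     |   |              |   |   |   | 2024-12-31 |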
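
The start and end dates in the table can be used to trim a requested date range before downloading. The following is a sketch under stated assumptions: the dates are copied from the table, "-" in the End column is read as "current" (stored as None), and AVAILABILITY and clip_range are hypothetical names that do not exist in getfactormodels.

```python
from datetime import date

# (start, end) per model id, copied from the table; None means the series is current.
AVAILABILITY = {
    "3": (date(1926, 7, 1), None),
    "4": (date(1926, 11, 3), None),
    "5": (date(1963, 7, 1), None),
    "6": (date(1963, 7, 1), None),
    "hmld": (date(1990, 7, 2), None),
    "dhs": (date(1972, 7, 3), date(2023, 12, 29)),
    "icr": (date(1970, 1, 31), date(2025, 6, 27)),  # monthly start date
    "mis": (date(1963, 1, 2), date(2016, 12, 30)),
    "liq": (date(1962, 8, 31), date(2024, 12, 31)),
    "q": (date(1967, 1, 3), date(2022, 12, 30)),
    "q4": (date(1967, 1, 3), date(2022, 12, 30)),
    "bs": (date(1967, 1, 3), date(2024, 12, 31)),
}


def clip_range(model_id, start, end):
    """Clip a requested (start, end) to the dates available for model_id.

    Returns None when the request does not overlap the available data.
    """
    first, last = AVAILABILITY[model_id]
    start = max(start, first)
    if last is not None:
        end = min(end, last)
    return (start, end) if start <= end else None


print(clip_range("mis", date(1950, 1, 1), date(2020, 1, 1)))
# (datetime.date(1963, 1, 2), datetime.date(2016, 12, 30))
```

A None result signals that the requested window lies entirely outside the published data, for example mispricing factors after 2016.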

References

Publications:

  1. E. F. Fama and K. R. French, ‘Common risk factors in the returns on stocks and bonds’, Journal of Financial Economics, vol. 33, no. 1, pp. 3–56, 1993. PDF
  2. M. Carhart, ‘On Persistence in Mutual Fund Performance’, Journal of Finance, vol. 52, no. 1, pp. 57–82, 1997. PDF
  3. E. F. Fama and K. R. French, ‘A five-factor asset pricing model’, Journal of Financial Economics, vol. 116, no. 1, pp. 1–22, 2015. PDF
  4. E. F. Fama and K. R. French, ‘Choosing factors’, Journal of Financial Economics, vol. 128, no. 2, pp. 234–252, 2018. PDF
  5. L. Pastor and R. Stambaugh, ‘Liquidity Risk and Expected Stock Returns’, Journal of Political Economy, vol. 111, no. 3, pp. 642–685, 2003. PDF
  6. R. F. Stambaugh and Y. Yuan, ‘Mispricing Factors’, The Review of Financial Studies, vol. 30, no. 4, pp. 1270–1315, Dec. 2016. PDF
  7. K. Hou, H. Mo, C. Xue, and L. Zhang, ‘Which Factors?’, National Bureau of Economic Research, Inc, 2014. PDF
  8. K. Hou, H. Mo, C. Xue, and L. Zhang, ‘An Augmented q-Factor Model with Expected Growth’, Review of Finance, vol. 25, no. 1, pp. 1–41, Feb. 2020. PDF
  9. Z. He, B. Kelly, and A. Manela, ‘Intermediary asset pricing: New evidence from many asset classes’, Journal of Financial Economics, vol. 126, no. 1, pp. 1–35, 2017. PDF
  10. K. Daniel, D. Hirshleifer, and L. Sun, ‘Short- and Long-Horizon Behavioral Factors’, Review of Financial Studies, vol. 33, no. 4, pp. 1673–1736, 2020. PDF
  11. C. Asness and A. Frazzini, ‘The Devil in HML’s Details’, The Journal of Portfolio Management, vol. 39, pp. 49–68, 2013. PDF
  12. F. Barillas and J. Shanken, ‘Comparing Asset Pricing Models’, Journal of Finance, vol. 73, no. 2, pp. 715–754, 2018. PDF

Data sources:

  • K. French, "Data Library," Tuck School of Business at Dartmouth. Link
  • R. Stambaugh, "Liquidity" and "Mispricing" factor datasets, Wharton School, University of Pennsylvania. Link
  • Z. He, "Intermediary Capital Ratio and Risk Factor" dataset, zhiguohe.net. Link
  • K. Hou, C. Xue, L. Zhang, "The Hou-Xue-Zhang q-factors data library," at global-q.org. Link
  • AQR Capital Management's Data Sets.
  • L. Sun, DHS behavioural factors dataset. Link

License

GitHub License

Known issues

  • The first hml_devil_factors() retrieval is slow, because the download from aqr.com is slow. It's the only model implementing a cache: daily data expires at the end of the day and is only re-downloaded when the requested end_date exceeds the cache's latest index date. Monthly data works the same way, expiring at the end of the month and being re-downloaded when next needed.
  • Some models aren't downloading. Update: all models should be downloading.
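
The cache rule described for the HML Devil factors can be sketched as follows. This is one reading of the description above, not the package's actual code: cache_expired and should_download are hypothetical names, and combining the two conditions with "and" is an assumption.

```python
from datetime import date


def cache_expired(cached_on, today, frequency):
    """Daily caches expire at the end of the day, monthly ones at the end of the month."""
    if frequency == "d":
        return today > cached_on
    if frequency == "m":
        return (today.year, today.month) > (cached_on.year, cached_on.month)
    raise ValueError(f"no cache rule for frequency {frequency!r}")


def should_download(cached_on, latest_index, requested_end, today, frequency):
    """Re-download only if the cache has expired and the request reaches past the cached data."""
    return cache_expired(cached_on, today, frequency) and requested_end > latest_index


# Cached on 1 May with data up to 30 April; asked again the next day.
print(should_download(date(2024, 5, 1), date(2024, 4, 30),
                      date(2024, 5, 2), date(2024, 5, 2), "d"))  # True
```

Under this reading, a request whose end_date falls inside the cached range never triggers the slow download, even after the cache has expired.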

Todo

  • Refactor: a complete rewrite, implementing a better interface and design patterns, dropping dependencies.
  • Refactor: FF models.
  • Docs
  • Every model should have an about and author/copyright info, and general disclaimer
  • This README
    • Example ipynb
  • Tests
  • Error handling!
  • Types
