Double Machine Learning in Python

DoubleML - Double Machine Learning in Python

The Python package DoubleML provides an implementation of the double / debiased machine learning framework of Chernozhukov et al. (2018). It is built on top of scikit-learn (Pedregosa et al., 2011).

Note that the Python package was developed together with an R twin based on mlr3. The R package is also available on GitHub.

Documentation and maintenance

Documentation and website: http://docs.doubleml.org/

DoubleML is currently maintained by @MalteKurz and @PhilippBach.

Bugs can be reported to the issue tracker at https://github.com/DoubleML/doubleml-for-py/issues.

Main Features

Double / debiased machine learning (Chernozhukov et al., 2018) for

  • Partially linear regression models (PLR)
  • Partially linear IV regression models (PLIV)
  • Interactive regression models (IRM)
  • Interactive IV regression models (IIVM)
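As a reminder of what the first of these models looks like, the partially linear regression model in Chernozhukov et al. (2018) can be written as below. The notation (theta_0 for the target parameter, g_0 and m_0 for the nuisance functions) follows that paper and is not taken from this README.

```latex
\begin{align*}
Y &= D\,\theta_0 + g_0(X) + \zeta, & \mathbb{E}[\zeta \mid D, X] &= 0, \\
D &= m_0(X) + V,                   & \mathbb{E}[V \mid X] &= 0.
\end{align*}
```

Here Y is the outcome, D the treatment variable and X the vector of controls; g_0 and m_0 are the nuisance functions that are estimated with machine learning methods.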

The object-oriented implementation of DoubleML is very flexible. The model classes DoubleMLPLR, DoubleMLPLIV, DoubleMLIRM and DoubleMLIIVM implement the estimation of the nuisance functions via machine learning methods and the computation of the Neyman orthogonal score function. All other functionality is implemented in the abstract base class DoubleML, in particular the methods fit, bootstrap, confint, p_adjust and tune, which are used to estimate double machine learning models and to perform statistical inference. This object-oriented implementation allows a high degree of flexibility in the model specification in terms of ...

  • ... the machine learners for the nuisance functions,
  • ... the resampling schemes,
  • ... the double machine learning algorithm,
  • ... the Neyman orthogonal score functions,
  • ...

It can also be readily extended with regard to

  • ... new model classes that come with Neyman orthogonal score functions being linear in the target parameter,
  • ... alternative score functions via callables,
  • ... alternative resampling schemes,
  • ...

An overview of the OOP structure of the DoubleML package is given in the graphic available at https://github.com/DoubleML/doubleml-for-py/blob/master/doc/oop.svg
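The ingredients listed above (machine learners for the nuisance functions, a resampling scheme, and a Neyman orthogonal score that is linear in the target parameter) can be illustrated with a from-scratch sketch for the partially linear regression model. This is not the package's implementation and does not use its API: the function names, the simulated data and the crude binned-mean learner (a stand-in for a scikit-learn estimator) are assumptions made for illustration, and only the Python standard library is used.

```python
import math
import random
from statistics import fmean


def fit_binned_mean(x, y, n_bins=20):
    """Hypothetical 1-D learner: mean of y within equal-width bins of x on [0, 1)."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for xi, yi in zip(x, y):
        b = min(int(xi * n_bins), n_bins - 1)
        sums[b] += yi
        counts[b] += 1
    overall = fmean(y)  # fallback prediction for empty bins
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda xi: means[min(int(xi * n_bins), n_bins - 1)]


def plr_partialling_out(y, d, x, n_folds=5, seed=0):
    """Cross-fitted estimate of theta in Y = D * theta + g(X) + noise."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]

    res_y = [0.0] * n  # Y - l_hat(X), with l(X) = E[Y | X]
    res_d = [0.0] * n  # D - m_hat(X), with m(X) = E[D | X]
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        # Nuisance functions are fitted on the other folds only (cross-fitting).
        l_hat = fit_binned_mean([x[i] for i in train], [y[i] for i in train])
        m_hat = fit_binned_mean([x[i] for i in train], [d[i] for i in train])
        for i in test:
            res_y[i] = y[i] - l_hat(x[i])
            res_d[i] = d[i] - m_hat(x[i])

    # Score linear in theta: psi = psi_a * theta + psi_b.
    psi_a = [-v * v for v in res_d]
    psi_b = [v * u for v, u in zip(res_d, res_y)]
    theta = -sum(psi_b) / sum(psi_a)

    # Plug-in variance: E[psi^2] / (E[psi_a])^2, scaled by 1 / n.
    psi = [a * theta + b for a, b in zip(psi_a, psi_b)]
    j = fmean(psi_a)
    se = math.sqrt(fmean([p * p for p in psi]) / (j * j) / n)
    return theta, se


def simulate(n=2000, theta=0.5, seed=42):
    """Simulated data with nonlinear confounding through X (illustration only)."""
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    d = [math.sin(2 * math.pi * xi) + rng.gauss(0, 1) for xi in x]
    y = [theta * di + math.cos(2 * math.pi * xi) + rng.gauss(0, 1)
         for di, xi in zip(d, x)]
    return y, d, x


y, d, x = simulate()
theta_hat, se = plr_partialling_out(y, d, x)
print(f"theta_hat = {theta_hat:.3f}, "
      f"95% CI = [{theta_hat - 1.96 * se:.3f}, {theta_hat + 1.96 * se:.3f}]")
```

Swapping fit_binned_mean for another learner, or changing how the folds are built, leaves the score and the final estimation step untouched, which is the kind of flexibility the class design described above aims for.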

Installation

DoubleML requires

  • Python
  • scikit-learn
  • numpy
  • scipy
  • pandas
  • statsmodels
  • joblib
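Before installing from source, it can be handy to check which of these dependencies are already importable. The snippet below is a small sketch using only the standard library; the helper name missing_dependencies is hypothetical, and it checks for presence only, not for any particular version.

```python
from importlib.util import find_spec

# Import names of the listed dependencies (scikit-learn is imported as "sklearn").
REQUIRED = ["sklearn", "numpy", "scipy", "pandas", "statsmodels", "joblib"]


def missing_dependencies(modules=REQUIRED):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_dependencies()
print("missing:", ", ".join(missing) if missing else "none")
```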

We plan to push a first release of the DoubleML package to pip and conda very soon.

Until then, we recommend installing from source via

git clone git@github.com:DoubleML/doubleml-for-py.git
cd doubleml-for-py
pip install --editable .

Citation

If you use the DoubleML package, a citation is highly appreciated:

Bach, P., Chernozhukov, V., Kurz, M. S., and Spindler, M. (2020), DoubleML - Double Machine Learning in Python. URL: https://github.com/DoubleML/doubleml-for-py, Python-Package version 0.1.0.

Bibtex-entry:

@Manual{DoubleML2020,
  title = {DoubleML - Double Machine Learning in Python},
  author = {Bach, P. and Chernozhukov, V. and Kurz, M. S. and Spindler, M.},
  year = {2020},
  note = {URL: \url{https://github.com/DoubleML/doubleml-for-py}, Python-Package version 0.1.0}
}

References

Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W. and Robins, J. (2018), Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21: C1-C68. doi:10.1111/ectj.12097.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M. and Duchesnay, E. (2011), Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12: 2825--2830, https://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html.
