py-earth [![Build Status](https://travis-ci.org/scikit-learn-contrib/py-earth.png?branch=master)](https://travis-ci.org/scikit-learn-contrib/py-earth?branch=master)
========

A Python implementation of Jerome Friedman's Multivariate Adaptive Regression Splines algorithm, in the style of scikit-learn. The py-earth package implements Multivariate Adaptive Regression Splines using Cython and provides an interface that is compatible with scikit-learn's Estimator, Predictor, Transformer, and Model interfaces. For more information about Multivariate Adaptive Regression Splines, see the references below.

## Now With Missing Data Support!

The py-earth package now supports missingness in its predictors. Just set `allow_missing=True` when constructing an `Earth` object.
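
As a rough illustration, the sketch below builds a small data set in which some predictor values are missing and fits a model with `allow_missing=True`. The helper names `make_data_with_missing` and `fit_with_missing` are hypothetical and not part of py-earth, and the sketch assumes that missing values are encoded as `numpy.nan`.

```python
import numpy


def make_data_with_missing(m=200, n=3, missing_fraction=0.1, seed=0):
    """Return (X, y) where roughly `missing_fraction` of the entries of X are NaN."""
    rng = numpy.random.RandomState(seed)
    X = 80 * rng.uniform(size=(m, n)) - 40
    # The response depends only on column 0 and is computed before values are removed
    y = numpy.abs(X[:, 0] - 4.0) + rng.normal(size=m)
    # Assumption: missing predictor values are encoded as NaN
    mask = rng.uniform(size=X.shape) < missing_fraction
    X[mask] = numpy.nan
    return X, y


def fit_with_missing(X, y):
    """Fit an Earth model that accepts missing predictor values."""
    # Imported here so that the data helper above works without py-earth installed
    from pyearth import Earth
    model = Earth(allow_missing=True)
    model.fit(X, y)
    return model
```

With py-earth installed, `model = fit_with_missing(*make_data_with_missing())` followed by `print(model.summary())` should show the fitted basis functions.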

## Requesting Feedback

If there are other features or improvements you'd like to see in py-earth, please send me an email, open an issue, or comment on an existing one. In particular, please let me know if any of the following are important to you:

1. Improved speed
2. Exporting models to additional formats
3. Support for shared memory multiprocessing during fitting
4. Support for cyclic predictors (such as time of day)
5. Better support for categorical predictors
6. Better support for large data sets
7. Iterative reweighting during fitting

## Installation

Make sure you have numpy and scikit-learn installed. Then do the following:

```
git clone https://github.com/scikit-learn-contrib/py-earth.git
cd py-earth
sudo python setup.py install
```

## Usage

```python
import numpy
from pyearth import Earth
from matplotlib import pyplot

# Create some fake data: y depends only on column 6 of X, plus Gaussian noise
numpy.random.seed(0)
m = 1000
n = 10
X = 80 * numpy.random.uniform(size=(m, n)) - 40
y = numpy.abs(X[:, 6] - 4.0) + 1 * numpy.random.normal(size=m)

# Fit an Earth model
model = Earth()
model.fit(X, y)

# Print the model
print(model.trace())
print(model.summary())

# Plot the data (red) and the model's predictions (blue) against x_6
y_hat = model.predict(X)
pyplot.figure()
pyplot.plot(X[:, 6], y, 'r.')
pyplot.plot(X[:, 6], y_hat, 'b.')
pyplot.xlabel('x_6')
pyplot.ylabel('y')
pyplot.title('Simple Earth Example')
pyplot.show()
```
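
The noise-free target in the example above, `|x_6 - 4|`, is exactly the sum of two hinge functions with a knot at 4, which is the kind of piecewise-linear basis function described in Friedman's paper (reference 1). The sketch below uses only numpy to check that identity. The `hinge` helper is a hypothetical name used for illustration and is not part of the py-earth API.

```python
import numpy


def hinge(x, knot, sign=1):
    """Return max(0, sign * (x - knot)) elementwise."""
    return numpy.maximum(0.0, sign * (numpy.asarray(x, dtype=float) - knot))


x = numpy.linspace(-40, 40, 801)
# |x - 4| = max(0, x - 4) + max(0, 4 - x)
reconstructed = hinge(x, 4.0, sign=1) + hinge(x, 4.0, sign=-1)
assert numpy.allclose(reconstructed, numpy.abs(x - 4.0))
```

A model fitted to the example data would therefore be expected to select basis functions on column 6 with a knot near 4, which can be checked against the output of `model.summary()`.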

## Other Implementations

I am aware of the following implementations of Multivariate Adaptive Regression Splines:

1. The R package earth (coded in C by Stephen Milborrow): http://cran.r-project.org/web/packages/earth/index.html
2. The R package mda (coded in Fortran by Trevor Hastie and Robert Tibshirani): http://cran.r-project.org/web/packages/mda/index.html
3. The Orange data mining library for Python (uses the C code from 1): http://orange.biolab.si/
4. The xtal package (uses Fortran code written in 1991 by Jerome Friedman): http://www.ece.umn.edu/users/cherkass/ee4389/xtalpackage.html
5. MARSplines by StatSoft: http://www.statsoft.com/textbook/multivariate-adaptive-regression-splines/
6. MARS by Salford Systems (also uses Friedman's code): http://www.salford-systems.com/products/mars
7. ARESLab (written in Matlab by Gints Jekabsons): http://www.cs.rtu.lv/jekabsons/regression.html

The R package earth was most useful to me in understanding the algorithm, particularly because of Stephen Milborrow's thorough and easy-to-read vignette (http://www.milbo.org/doc/earth-notes.pdf).

## References

1. Friedman, J. (1991). Multivariate adaptive regression splines. The Annals of Statistics, 19(1), 1–67. http://www.jstor.org/stable/10.2307/2241837
2. Milborrow, S. (2012). earth: Multivariate Adaptive Regression Spline Models. Derived from mda:mars by Trevor Hastie and Rob Tibshirani. R package version 3.2-3. http://CRAN.R-project.org/package=earth
3. Friedman, J. (1993). Fast MARS. Stanford University Department of Statistics, Technical Report No 110. https://statistics.stanford.edu/sites/default/files/LCS%20110.pdf
4. Friedman, J. (1991). Estimating functions of mixed ordinal and categorical variables using adaptive splines. Stanford University Department of Statistics, Technical Report No 108. http://media.salford-systems.com/library/MARS_V2_JHF_LCS-108.pdf
5. Stewart, G.W. (1998). Matrix Algorithms, Volume 1: Basic Decompositions. Society for Industrial and Applied Mathematics.
6. Bjorck, A. (1996). Numerical Methods for Least Squares Problems. Society for Industrial and Applied Mathematics.
7. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning (2nd Edition). Springer Series in Statistics.
8. Golub, G., & Van Loan, C. (1996). Matrix Computations (3rd Edition). Johns Hopkins University Press.

References 7, 2, 1, 3, and 4 contain discussions likely to be useful to users of py-earth. References 1, 2, 6, 5,
8, 3, and 4 were useful during the implementation process.





