High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression, and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
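The regularization options above can be made concrete with a small sketch. It assumes the conventional elastic net parameterization, alpha * (l1_ratio * ||b||_1 + 0.5 * (1 - l1_ratio) * ||b||_2^2), which is consistent with the `alpha` and `l1_ratio` arguments used in the example further down. The names `elastic_net_penalty` and `soft_threshold` are hypothetical helpers for illustration only and are not part of glum's API.

```python
def elastic_net_penalty(coefs, alpha, l1_ratio):
    """Elastic net penalty under the assumed conventional parameterization."""
    l1 = sum(abs(b) for b in coefs)   # L1 norm of the coefficients
    l2 = sum(b * b for b in coefs)    # squared L2 norm of the coefficients
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, snap small values to zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


coefs = [1.0, -2.0, 0.0]
lasso = elastic_net_penalty(coefs, alpha=0.001, l1_ratio=1.0)  # pure L1
ridge = elastic_net_penalty(coefs, alpha=0.001, l1_ratio=0.0)  # pure L2

# Values smaller in magnitude than the threshold become exactly zero,
# which is why L1 regularization tends to yield sparse solutions.
shrunk = [soft_threshold(b, 0.5) for b in [0.3, -0.2, 1.5]]
print(lasso, ridge, shrunk)
```

Setting `l1_ratio` strictly between 0 and 1 mixes the two penalties, which is the elastic net case.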
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), glum is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # .get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
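The first objective value in the diagnostics, about 0.6931, is close to ln 2 ≈ 0.693147. One plausible reading is that the median split makes the two classes almost equally frequent. This assumes that the objective at iteration 0 is the mean binomial negative log-likelihood of an intercept-only model, with all other coefficients at zero so the L1 penalty contributes nothing. The sketch below illustrates that arithmetic with a hypothetical helper, `intercept_only_log_loss`; it is not glum code and does not use the housing data.

```python
import math


def intercept_only_log_loss(y):
    """Mean binomial negative log-likelihood when every prediction equals mean(y)."""
    p = sum(y) / len(y)
    if p in (0.0, 1.0):
        return 0.0  # a constant target is predicted perfectly
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


# A perfectly balanced binary target gives exactly ln 2.
balanced = intercept_only_log_loss([0, 1] * 50)

# A slight imbalance (49 ones, 51 zeros) gives a slightly smaller value.
skewed = intercept_only_log_loss([1] * 49 + [0] * 51)

print(balanced, skewed)
```

Under these assumptions, the later rows of the diagnostics show how much the fitted coefficients improve on that intercept-only baseline.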
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge
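After installing, one way to confirm which version is visible to the current interpreter is the standard library's `importlib.metadata` (Python 3.8+). `installed_version` below is a hypothetical helper for illustration and is not part of glum.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


glum_version = installed_version("glum")
if glum_version is None:
    print("glum is not installed in this environment")
else:
    print(f"glum {glum_version}")
```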