High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression, and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction, and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
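To make the regularization options above concrete, here is a minimal pure-Python sketch of an elastic net penalty and of the soft-thresholding step that explains why L1 regularization yields sparse solutions. It assumes the scikit-learn-style parameterization suggested by the `alpha` and `l1_ratio` arguments, namely `alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2)`. The helper names `elastic_net_penalty` and `soft_threshold` are hypothetical and are not part of glum's API.

```python
def elastic_net_penalty(coefs, alpha, l1_ratio):
    """Elastic net penalty under the assumed scikit-learn-style convention.

    l1_ratio=1.0 gives a pure L1 (lasso) penalty, l1_ratio=0.0 a pure
    L2 (ridge) penalty, and values in between mix the two.
    """
    l1_norm = sum(abs(w) for w in coefs)
    squared_l2_norm = sum(w * w for w in coefs)
    return alpha * (l1_ratio * l1_norm + 0.5 * (1.0 - l1_ratio) * squared_l2_norm)


def soft_threshold(w, threshold):
    """Proximal operator of the L1 penalty.

    Shrinks w toward zero by `threshold` and maps any value whose
    magnitude is at most `threshold` to exactly zero.
    """
    if w > threshold:
        return w - threshold
    if w < -threshold:
        return w + threshold
    return 0.0


# Illustrative coefficients, not taken from any fitted model.
coefs = [2.0, -0.05, 0.3]

pure_l1 = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=1.0)
pure_l2 = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.0)

# The small coefficient (-0.05) is set exactly to zero; the others shrink.
shrunk = [soft_threshold(w, 0.1) for w in coefs]
```

Soft-thresholding is shown only to illustrate the sparsity mechanism; it is not a description of the solver glum uses internally.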
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), it is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
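The first objective value above, 0.693091, is close to ln 2 ≈ 0.693147, which is the mean binomial negative log-likelihood when every predicted probability is 0.5. One plausible reading, offered here as an assumption rather than something the diagnostics state, is that the solver starts from a nearly uninformative model and the objective drops as the coefficients are fitted. The sketch below reproduces that baseline in pure Python on a tiny made-up label vector; `mean_log_loss` is a hypothetical helper, not a glum function.

```python
import math


def mean_log_loss(y, p):
    """Mean binomial negative log-likelihood for 0/1 labels y and probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        total -= yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return total / len(y)


# Made-up labels for illustration only; not the house sales data.
y = [0, 1, 1, 0, 1, 0]

# Predicting 0.5 for every observation gives exactly ln 2 per observation.
baseline = mean_log_loss(y, [0.5] * len(y))

# Probabilities that lean toward the true labels give a lower loss.
informed = mean_log_loss(y, [0.2, 0.8, 0.8, 0.2, 0.8, 0.2])
```

The reported value sits slightly below ln 2, which would be consistent with the labels not being split exactly in half or with an initial intercept, but the output shown does not settle which.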
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge