High performance Python GLMs with all the features!
glum
Generalized linear models (GLMs) are a core statistical tool that includes many common methods, such as least-squares regression, Poisson regression and logistic regression, as special cases. At QuantCo, we have used GLMs in e-commerce pricing, insurance claims prediction and more. We have developed glum, a fast Python-first GLM library. The development was based on a fork of scikit-learn, so it has a scikit-learn-like API. We are thankful for the starting point provided by Christian Lorentzen in that PR!
The goal of glum is to be at least as feature-complete as existing GLM libraries like glmnet or h2o. It supports:
- Built-in cross validation for optimal regularization, efficiently exploiting a “regularization path”
- L1 regularization, which produces sparse and easily interpretable solutions
- L2 regularization, including variable matrix-valued (Tikhonov) penalties, which are useful in modeling correlated effects
- Elastic net regularization
- Normal, Poisson, logistic, gamma, and Tweedie distributions, plus varied and customizable link functions
- Box constraints, linear inequality constraints, sample weights, offsets
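To make the regularization options above more concrete, here is a minimal sketch of an elastic net penalty in plain Python. It assumes the common parametrization in which `alpha` scales the overall penalty and `l1_ratio` mixes the L1 and L2 terms. The helper `elastic_net_penalty` is hypothetical and not part of glum; the exact objective glum minimizes is described in its documentation.

```python
def elastic_net_penalty(coefs, alpha, l1_ratio):
    """Illustrative elastic net penalty (hypothetical helper, not glum's API).

    Assumed form:
        alpha * (l1_ratio * ||w||_1 + 0.5 * (1 - l1_ratio) * ||w||_2^2)
    """
    l1 = sum(abs(w) for w in coefs)   # L1 norm: encourages exact zeros
    l2 = sum(w * w for w in coefs)    # squared L2 norm: shrinks smoothly
    return alpha * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2)


coefs = [0.5, -2.0, 0.0]
lasso = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=1.0)  # pure L1
ridge = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.0)  # pure L2
mixed = elastic_net_penalty(coefs, alpha=0.1, l1_ratio=0.5)  # elastic net
```

Under this assumed form, `l1_ratio=1.0` leaves only the L1 term and `l1_ratio=0.0` leaves only the L2 term; values in between blend the two.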
This repo also includes tools for benchmarking GLM implementations in the glum_benchmarks module. For details on the benchmarking, see here. Although the performance of glum relative to glmnet and h2o depends on the specific problem, we find that when N >> K (there are many more observations than predictors), it is consistently much faster for a wide range of problems.
For more information on glum, including tutorials and API reference, please see the documentation.
Why did we choose the name glum? We wanted a name that had the letters GLM and wasn't easily confused with any existing implementation. And we thought glum sounded like a funny name (and not glum at all!). If you need a more professional-sounding name, feel free to pronounce it as G-L-um. Or maybe it stands for "Generalized linear... ummm... modeling?"
A classic example predicting housing prices
>>> from sklearn.datasets import fetch_openml
>>> from glum import GeneralizedLinearRegressor
>>>
>>> # This dataset contains house sale prices for King County, which includes
>>> # Seattle. It includes homes sold between May 2014 and May 2015.
>>> house_data = fetch_openml(name="house_sales", version=3, as_frame=True)
>>>
>>> # Use only select features
>>> X = house_data.data[
...     [
...         "bedrooms",
...         "bathrooms",
...         "sqft_living",
...         "floors",
...         "waterfront",
...         "view",
...         "condition",
...         "grade",
...         "yr_built",
...         "yr_renovated",
...     ]
... ].copy()
>>>
>>>
>>> # Model whether a house had an above or below median price via a Binomial
>>> # distribution. We'll be doing L1-regularized logistic regression.
>>> price = house_data.target
>>> y = (price < price.median()).values.astype(int)
>>> model = GeneralizedLinearRegressor(
...     family='binomial',
...     l1_ratio=1.0,
...     alpha=0.001
... )
>>>
>>> _ = model.fit(X=X, y=y)
>>>
>>> # get_formatted_diagnostics returns details about the steps taken by the iterative solver
>>> diags = model.get_formatted_diagnostics(full_report=True)
>>> diags[['objective_fct']]
        objective_fct
n_iter
0            0.693091
1            0.489500
2            0.449585
3            0.443681
4            0.443498
5            0.443497
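The first objective value in the diagnostics is close to log(2) ≈ 0.6931. One plausible reading, which is an assumption rather than something stated above, is that the objective starts out as the mean binomial negative log-likelihood of a model that predicts roughly the base rate, and the base rate is near 0.5 because `y` splits prices at the median. The sketch below only checks that arithmetic in plain Python; `mean_log_loss` is a hypothetical helper, not part of glum.

```python
import math


def mean_log_loss(y, p):
    """Mean binomial negative log-likelihood for 0/1 labels y and probabilities p."""
    total = sum(
        yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for yi, pi in zip(y, p)
    )
    return -total / len(y)


# A balanced toy sample: predicting 0.5 for every observation costs log(2).
y = [0, 1, 0, 1]
baseline = mean_log_loss(y, [0.5] * len(y))

# Predictions that lean towards the true label lower the loss.
informed = mean_log_loss(y, [0.2, 0.8, 0.3, 0.7])
```

Under that reading, the falling values in `objective_fct` show the solver improving on the uninformed baseline at each iteration until the changes become negligible.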
Installation
Please install the package through conda-forge:
conda install glum -c conda-forge