Skip to main content

Regression inference for Python

Project description

regression-inference

PyPI version License

https://pypi.org/project/regression-inference/

pip install regression-inference

Python packaged designed for statistical inference in machine learning, econometrics, and research, with support for hardware accelerated training.


Features

  • Linear Regression, Logistic Regression, Multinomial Logistic Regression, and Ordinal Logistic Regression model fitting.

  • Automatic inferential statistics for model predictions, standard errors, t/z statistic, significance, confidence ranges.

  • Support for accelerated model training using CUDA.

  • Modular regression tables.

  • Less overhead than other econometrics and machine learning packages.


Dependencies

numpy and scipy are required as dependencies in regression-inference

numpy>=2.0.0
scipy>=1.15.0

Using pandas is optional but recommended.


Hardware Acceleration with CUDA

CUDA acceleration can be used for model training on supported hardware.

CuPy is a required dependency for hardware acceleration, it is not installed with regression-inference by default. Additionally, the CUDA Toolkit is a required dependency unless cupy is installed from conda-forge.

cupy-cuda13x=13.6.0

Fit models with the parameter cuda = True to enable hardware GPU acceleration.

model = OrdinalLogisticRegression().fit(X, y, cuda=True)
model = MultinomialLogisticRegression().fit(X, y, cuda=True)

Documentation / How To

See the provided notebooks on GitHub for example workflows.

/tests/notebooks/linear_regression_example.ipynb

/tests/notebooks/logit_regression_example.ipynb

/tests/notebooks/multinomial_regression_example.ipynb

/tests/notebooks/ordinal_regression_example.ipynb

Import Libraries

from regression_inference import LinearRegression, LogisticRegression, MultinomialLogisticRegression, OrdinalLogisticRegression, summary

Ordinary Least Squares Regression Output

==================================================
OLS Regression Results
--------------------------------------------------
Dependent:                     educ     robust_edu
--------------------------------------------------
 
const                     7.3256***      7.3256***
                           (0.3684)       (0.4345)
 
paeduc                    0.2144***      0.2144***
                           (0.0241)       (0.0236)
 
maeduc                    0.2569***      0.2569***
                           (0.0271)       (0.0294)
 
age                       0.0241***      0.0241***
                           (0.0043)       (0.0042)

--------------------------------------------------
R-squared                     0.276          0.276
Adjusted R-squared            0.274          0.274
F Statistic                 177.548        177.548
Observations               1402.000       1402.000
Log Likelihood            -3359.107      -3359.107
AIC                        6726.213       6726.213
BIC                        6747.196       6747.196
TSS                       13663.270      13663.270
RSS                        9893.727       9893.727
ESS                        3769.543       3769.543
MSE                           7.077          7.077
==================================================
*p<0.1; **p<0.05; ***p<0.01

Logistic Regression Summary

===================================
Logistic Regression Results
-----------------------------------
Dependent:                    GRADE
-----------------------------------
 
const                    -13.0213**
                           (5.1976)
 
GPA                        2.8261**
                           (1.2675)
 
TUCE                         0.0952
                           (0.1179)
 
PSI                        2.3787**
                           (0.9644)

-----------------------------------
Pseudo R-squared              0.374
LR Statistic                 15.404
Observations                 32.000
Log Likelihood              -12.890
Deviance                     25.779
Null Deviance                41.183
AIC                          33.779
BIC                          39.642
===================================
*p<0.1; **p<0.05; ***p<0.01

Multinomial Logit Summary

=============================================
Multinomial Regression Results
---------------------------------------------
Dependent:                                PID
---------------------------------------------
Class:                                      1

const                                 -0.3734
                                     (0.5943)
 
logpopul                              -0.0115
                                     (0.0341)
 
selfLR                              0.2977***
                                     (0.0993)
 
age                                -0.0249***
                                     (0.0061)
 
educ                                   0.0825
                                     (0.0740)
 
income                                 0.0052
                                     (0.0168)
 
---------------------------------------------
Class:                                      2

const                              -2.2509***
                                     (0.7579)
 
logpopul                            -0.0888**
                                     (0.0377)
 
selfLR                              0.3917***
                                     (0.1089)
 
age                                -0.0229***
                                     (0.0084)
 
educ                                 0.1810**
                                     (0.0862)
 
income                               0.0479**
                                     (0.0234)
 
---------------------------------------------
Class:                                      3

const                              -3.6656***
                                     (1.3816)
 
logpopul                              -0.1060
                                     (0.0659)
 
selfLR                              0.5735***
                                     (0.1648)
 
age                                   -0.0149
                                     (0.0107)
 
educ                                  -0.0072
                                     (0.1234)
 
income                                 0.0576
                                     (0.0390)
 
---------------------------------------------
Class:                                      4

const                              -7.6138***
                                     (1.0433)
 
logpopul                            -0.0916**
                                     (0.0452)
 
selfLR                              1.2788***
                                     (0.1382)
 
age                                   -0.0087
                                     (0.0086)
 
educ                                 0.1998**
                                     (0.0966)
 
income                              0.0845***
                                     (0.0262)
 
---------------------------------------------
Class:                                      5

const                              -7.0605***
                                     (0.8462)
 
logpopul                            -0.0933**
                                     (0.0399)
 
selfLR                              1.3470***
                                     (0.1252)
 
age                                 -0.0179**
                                     (0.0078)
 
educ                                0.2169***
                                     (0.0816)
 
income                              0.0810***
                                     (0.0219)
 
---------------------------------------------
Class:                                      6

const                             -12.1058***
                                     (1.2198)
 
logpopul                           -0.1409***
                                     (0.0427)
 
selfLR                              2.0701***
                                     (0.1747)
 
age                                   -0.0094
                                     (0.0084)
 
educ                                0.3219***
                                     (0.0879)
 
income                              0.1089***
                                     (0.0260)
 
---------------------------------------------
Accuracy                                0.394
Pseudo R-squared                        0.165
LR Statistic                          576.848
Observations                          944.000
Log Likelihood                      -1461.923
Null Log Likelihood                 -1750.347
Deviance                             2923.845
Null Deviance                        3500.693
AIC                                  2995.845
BIC                                  3170.450
=============================================
*p<0.1; **p<0.05; ***p<0.01

Ordinal Regression Summary

=============================================
Ordinal Regression Results
---------------------------------------------
Dependent:                                PID
---------------------------------------------
 
logpopul                           -0.0707***
                                     (0.0191)
 
selfLR                              1.0192***
                                     (0.0533)
 
age                                   -0.0042
                                     (0.0037)
 
educ                                0.1777***
                                     (0.0408)
 
income                              0.0472***
                                     (0.0108)
 
0:1                                 3.6891***
                                     (0.3729)
 
1:2                                 0.2243***
                                     (0.0686)
 
2:3                                -0.3445***
                                     (0.0910)
 
3:4                                -1.3565***
                                     (0.1607)
 
4:5                                -0.4247***
                                     (0.0979)
 
5:6                                  0.1653**
                                     (0.0752)

---------------------------------------------
Accuracy                                0.388
Pseudo R-squared                        0.146
LR Statistic                          511.454
Observations                          944.000
Log Likelihood                      -1494.620
Null Log Likelihood                 -1750.347
Deviance                             2989.239
Null Deviance                        3500.693
AIC                                  3011.239
BIC                                  3064.590
=============================================
*p<0.1; **p<0.05; ***p<0.01

Coefficient Inference Table

Inference tables can be generated for the model features.

pd.DataFrame(model.inference_table())


Predictions

Extract the order of feature names using feature_names[:1]

model.feature_names[1:]
[Out]: Index(['paeduc', 'maeduc', 'age'], dtype='object')

Predict in the order of the features without a constant.

model.predict(np.array([[0, 0, 0], ]))
[Out]: array([7.32564767])

Inference

Use return_table = True to include inference statistics.

# Range over values of a feature

prediction_set = [
    (np.array([[i, X['maeduc'].mean(), X['age'].mean()],]))
    for i in range(int(X['paeduc'].min()), int(X['paeduc'].max())+1)
    ] 
    
predictions = pd.concat([pd.DataFrame(model.predict(i, return_table=True)) for i in prediction_set], ignore_index=True)

# Predict discrete values

prediction_set = [
    np.array([[2.66, 20.0, 0.0]]),
    np.array([[2.89, 22.0, 0.0]]),
    np.array([[3.28, 24.0, 0.0]]),
    np.array([[2.92, 12.0, 0.0]]),
]

predictions = pd.concat([pd.DataFrame(model.predict(test_set, return_table=True)) for test_set in prediction_set], ignore_index=True)


Variance Inflation Factor

Variance Inflation Factor table can be generated for the model features.

model.variance_inflation_factor()
{'feature': Index(['paeduc', 'maeduc', 'age'], dtype='object'),
 'VIF': array([2.0233, 2.0285, 1.0971])}

Heteroskedastic-Robust Standard Errors

Set the covariance matrix on fit using cov_type:

model = MultinomialLogisticRegression().fit(X, y, cov_type="HC0")

model = LogisticRegression().fit(X, y, cov_type="HC1")

model = LinearRegression().fit(X, y, cov_type="HC3")

Preview robust covariance without setting:

model.robust_se(type="HC3")

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

regression_inference-1.3.4.tar.gz (1.8 MB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

regression_inference-1.3.4-py3-none-any.whl (34.4 kB view details)

Uploaded Python 3

File details

Details for the file regression_inference-1.3.4.tar.gz.

File metadata

  • Download URL: regression_inference-1.3.4.tar.gz
  • Upload date:
  • Size: 1.8 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.9

File hashes

Hashes for regression_inference-1.3.4.tar.gz
Algorithm Hash digest
SHA256 0c8251ba7d9b58a42f67bbc582995b31e56f2d5734a0afb2622c5d55d68d64c3
MD5 3bc739e538bc75e3727ee8ed26989b4d
BLAKE2b-256 2ba48122da86f4652ff2f7c61b4060265b247e8982f68f6b74e0a9cfddbe7b9f

See more details on using hashes here.

File details

Details for the file regression_inference-1.3.4-py3-none-any.whl.

File metadata

File hashes

Hashes for regression_inference-1.3.4-py3-none-any.whl
Algorithm Hash digest
SHA256 67390b6bbe6886f1a8a3dc463f091e2599f881a7ca69107b7ffacb9f731af353
MD5 30ea9c31520d95dc95cdb61a4ab4e7de
BLAKE2b-256 86450a733b903ccb9cc523c70e52fe0b1497964df67e2287500f495c081dc175

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page