Diagnostic Plots for Linear Regression Models. Similar to plot.lm in R.
lmdiag
Python Library providing Diagnostic Plots for Linear Regression Models. (Like plot.lm in R.)
I built this because I missed R's diagnostic plots during a university project. Python has substitutes for the individual charts, but they are spread across different libraries and do not always show exactly the same thing. My implementation tries to mimic the R plots, but I did not reimplement the R code: the charts are based only on the available documentation.
Installation
pip install lmdiag
Usage
lmdiag generates plots for fitted linear regression models from statsmodels, linearmodels and scikit-learn.
You can find some usage examples in this jupyter notebook.
Example
import numpy as np
import statsmodels.api as sm
import lmdiag
# Fit model with random sample data
np.random.seed(20)
X = np.random.normal(size=30, loc=20, scale=3)
y = 5 + 5 * X + np.random.normal(size=30)
X = sm.add_constant(X) # intercept required by statsmodels
lm = sm.OLS(y, X).fit()
# Plot lmdiag facet chart
lmdiag.style.use(style="black_and_red") # Mimic R's plot.lm style
fig = lmdiag.plot(lm)
fig.show()
Methods
- Draw matrix of all plots:
  lmdiag.plot(lm)
- Draw individual plots:
  lmdiag.resid_fit(lm)
  lmdiag.q_q(lm)
  lmdiag.scale_loc(lm)
  lmdiag.resid_lev(lm)
- Print description to aid plot interpretation:
  lmdiag.help() (for all plots)
  lmdiag.help('<method name>') (for individual plot)
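As a rough guide to what the four individual plots display, the sketch below computes the plotted quantities by hand for a one-predictor model, using only the Python standard library. It follows the conventions documented for R's plot.lm (Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage) and is not taken from lmdiag's source; all variable names are illustrative, and the plotting positions used for the Q-Q quantiles are one common choice rather than a confirmed detail of lmdiag.

```python
import math
import random
from statistics import NormalDist, mean

# Sample data in the spirit of the example above (values are illustrative).
random.seed(20)
n = 30
x = [random.gauss(20, 3) for _ in range(n)]
y = [5 + 5 * xi + random.gauss(0, 1) for xi in x]

# Ordinary least squares for y = b0 + b1 * x (closed form, one predictor).
x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Leverage: diagonal of the hat matrix. It sums to the parameter count.
p = 2  # intercept and slope
leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# Standardized (internally studentized) residuals.
sigma = math.sqrt(sum(e**2 for e in resid) / (n - p))
std_resid = [e / (sigma * math.sqrt(1 - h)) for e, h in zip(resid, leverage)]

# Residuals vs Fitted: fitted (x-axis) against resid (y-axis).
# Normal Q-Q: theoretical normal quantiles against sorted std_resid.
theo_q = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
qq_points = list(zip(theo_q, sorted(std_resid)))

# Scale-Location: fitted against sqrt(|standardized residual|).
sqrt_abs_resid = [math.sqrt(abs(r)) for r in std_resid]

# Residuals vs Leverage: leverage against std_resid, usually drawn with
# Cook's distance contours to flag influential points.
cooks = [r**2 * h / (p * (1 - h)) for r, h in zip(std_resid, leverage)]

print(f"slope={b1:.3f}, intercept={b0:.3f}, max Cook's D={max(cooks):.3f}")
```

Points with both high leverage and a large standardized residual have the largest Cook's distance, which is why they stand out in the Residuals vs Leverage plot.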
Increase performance
Plotting models fitted on large datasets might be slow. There are some things you can try to speed it up:
1. Tune LOWESS-parameters
The red smoothing lines are calculated using the "Locally Weighted Scatterplot Smoothing" (LOWESS) algorithm, which can be quite expensive. Try a lower value for lowess_it and a higher value for lowess_delta to gain speed at the cost of accuracy:
lmdiag.plot(lm, lowess_it=1, lowess_delta=0.02)
# Defaults are: lowess_it=2, lowess_delta=0.005
(For details about those parameters, see statsmodels docs.)
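To get a feel for the trade-off, the following sketch calls the statsmodels lowess function directly with two settings and compares the resulting curves. It assumes that lowess_it and lowess_delta correspond to the it and delta arguments of statsmodels, and that delta is best expressed as a fraction of the x range (the rule of thumb given in the statsmodels docs); how lmdiag passes these values on internally is not confirmed here.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

# Illustrative data: a noisy linear relationship with 2000 points.
rng = np.random.default_rng(20)
x = rng.normal(loc=20, scale=3, size=2000)
y = 5 + 5 * x + rng.normal(size=2000)

# Reference fit: two robustness iterations, every point fitted (delta=0).
exact = lowess(y, x, it=2, delta=0.0)

# Faster fit: one robustness iteration, and points closer than 2% of the
# x range to the last fitted point are linearly interpolated instead of
# being fitted with their own weighted regression.
x_range = x.max() - x.min()
fast = lowess(y, x, it=1, delta=0.02 * x_range)

# Both results are (n, 2) arrays holding sorted x and the smoothed y.
max_gap = np.max(np.abs(exact[:, 1] - fast[:, 1]))
print(f"largest difference between the two curves: {max_gap:.4f}")
```

If the difference is small relative to the spread of the data, the faster settings are probably fine for a diagnostic plot.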
2. Change matplotlib backend
Try a different matplotlib backend. Static backends such as AGG or Cairo in particular should be faster, e.g.:
import matplotlib
matplotlib.use('agg')
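Note that AGG is a non-interactive backend, so a figure cannot be displayed in a window with fig.show() and is normally written to a file instead. The sketch below shows that workflow with a plain matplotlib figure; the output path is an arbitrary choice for illustration, and the same savefig call should apply to any matplotlib figure object.

```python
import tempfile
from pathlib import Path

import matplotlib

matplotlib.use("agg")  # select the backend before importing pyplot

import matplotlib.pyplot as plt
import numpy as np

# Illustrative data with many points, where rendering speed matters.
rng = np.random.default_rng(20)
x = rng.normal(loc=20, scale=3, size=50_000)
y = 5 + 5 * x + rng.normal(size=50_000)

fig, ax = plt.subplots()
ax.scatter(x, y, s=2)
ax.set_xlabel("x")
ax.set_ylabel("y")

# Write the figure to disk instead of opening a window.
out_path = Path(tempfile.gettempdir()) / "backend_demo.png"
fig.savefig(out_path, dpi=100)
plt.close(fig)  # release the figure's memory

print(f"backend: {matplotlib.get_backend()}, saved to: {out_path}")
```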
Setup development environment
python -m venv .venv
source .venv/bin/activate
pip install -e '.[dev]'
pre-commit install