A Procedure for Multicollinearity Testing using Bootstrap
Functions to detect and quantify multicollinearity via a nonparametric pairs bootstrap.
MTest reports achieved significance levels (ASL; bootstrap proportions) for two widely used rules:
- Klein's rule: flag multicollinearity if $R^2_j > R^2_g$
- VIF rule: flag multicollinearity if $\mathrm{VIF}_j$ is large, with $\mathrm{VIF}_j = \dfrac{1}{1 - R^2_j}$
Reference: Morales-Oñate & Morales-Oñate (2023). MTest: a Bootstrap Test for Multicollinearity. Revista Politécnica, 51(2), 53–62.
DOI: https://doi.org/10.33333/rp.vol51n2.05
What MTest does
Given a fitted linear model, MTest:
- Resamples rows of the model frame (pairs bootstrap) `nboot` times (`n_boot` in the Python API).
- At each bootstrap replicate, recomputes the global $R^2_g$ and the auxiliary $R^2_j$ (regressing each predictor on the rest), using the same expanded design matrix as the original fit. This is robust to `log()`, `I()`, interactions, factors, `poly()`, etc.
- Returns bootstrap distributions and ASL (bootstrap proportions) for:
  - VIF rule (threshold $c$ on $R^2_j$; `r2_threshold` in the Python API):
    $$ \mathrm{ASL}_{\mathrm{VIF}}(j) = \mathbb{P}\big(R^2_j > c\big) $$
    Example: `valor_vif = 0.90` implies a VIF cutoff of $1 / (1 - 0.90) = 10$.
  - Klein's rule:
    $$ \mathrm{ASL}_{\mathrm{Klein}}(j) = \mathbb{P}\big(R^2_g < R^2_j\big). $$
These ASLs are simple bootstrap proportions of the corresponding events (no additional parametric assumptions).
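The procedure above can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names `_r2` and `bootstrap_asl` are hypothetical, the data are simulated, and an intercept is added inside every regression.

```python
import numpy as np


def _r2(A, b):
    """R^2 of b regressed on A (intercept added), via least squares."""
    A1 = np.column_stack([np.ones(len(b)), A])
    coef, *_ = np.linalg.lstsq(A1, b, rcond=None)
    ss_res = float(((b - A1 @ coef) ** 2).sum())
    ss_tot = float(((b - b.mean()) ** 2).sum())
    # Guard against a constant response in a degenerate resample
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0


def bootstrap_asl(X, y, n_boot=200, c=0.9, seed=None):
    """Pairs-bootstrap proportions for the VIF rule and Klein's rule."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    rng = np.random.default_rng(seed)
    r2_g = np.empty(n_boot)
    r2_aux = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample rows with replacement
        Xb, yb = X[idx], y[idx]            # (X, y) pairs stay together
        r2_g[b] = _r2(Xb, yb)
        for j in range(p):
            # Auxiliary regression: predictor j on the remaining predictors
            r2_aux[b, j] = _r2(np.delete(Xb, j, axis=1), Xb[:, j])
    asl_vif = (r2_aux > c).mean(axis=0)                # P(R2_j > c)
    asl_klein = (r2_g[:, None] < r2_aux).mean(axis=0)  # P(R2_g < R2_j)
    return asl_vif, asl_klein


# Simulated data (an assumption for illustration only)
rng = np.random.default_rng(1)
n = 150
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)              # unrelated to x1 and x2
X = np.column_stack([x1, x2, x3])
y = 1.0 + x1 + x3 + rng.normal(size=n)

asl_vif, asl_klein = bootstrap_asl(X, y, n_boot=200, c=0.9, seed=42)
print("ASL (VIF rule):  ", asl_vif)
print("ASL (Klein rule):", asl_klein)
```

In this simulated setup the proportions for `x1` and `x2` should sit near 1 (the event occurs in almost every replicate), while the proportion for `x3` should sit near 0.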
Model context
Linear regression model:
$$ Y_i = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_p X_{pi} + u_i, \quad i=1,\ldots,n. $$
Auxiliary regressions (one per predictor):
$$ X_{ji} = \gamma_0 + \sum_{k \ne j} \gamma_k X_{ki} + e_{ji}, \quad j=1,\ldots,p. $$
Let $R^2_g$ be the global $R^2$ and $R^2_j$ the $R^2$ of the $j$-th auxiliary regression.
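To make the notation concrete, the following sketch computes $R^2_g$, each $R^2_j$, and the implied $\mathrm{VIF}_j$ on a single (non-bootstrapped) sample. The function names `r_squared` and `global_and_auxiliary_r2` are hypothetical and the data are simulated; the package's own routines may differ.

```python
import numpy as np


def r_squared(A, b):
    """R^2 of the least-squares fit of b on the columns of A plus an intercept."""
    A1 = np.column_stack([np.ones(len(b)), A])
    coef, *_ = np.linalg.lstsq(A1, b, rcond=None)
    resid = b - A1 @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((b - b.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0


def global_and_auxiliary_r2(X, y):
    """Return (R2_g, array of R2_j) for predictors X of shape (n, p)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    r2_g = r_squared(X, y)                       # Y on all predictors
    r2_aux = np.array([
        r_squared(np.delete(X, j, axis=1), X[:, j])  # X_j on the rest
        for j in range(X.shape[1])
    ])
    return r2_g, r2_aux


# Simulated data (an assumption for illustration only)
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)              # independent of x1 and x2
X = np.column_stack([x1, x2, x3])
y = 1.0 + x1 + x3 + rng.normal(size=n)

r2_g, r2_aux = global_and_auxiliary_r2(X, y)
vif = 1.0 / (1.0 - r2_aux)           # VIF_j = 1 / (1 - R2_j)
print("R2_g: ", round(r2_g, 3))
print("R2_j: ", np.round(r2_aux, 3))
print("VIF_j:", np.round(vif, 1))
```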
Installation
pip install mtest_py
Quickstart
Example 1: Multicollinearity Test (MTest)
import pandas as pd
from mtest import mtest, mtest_summary
# Load the mtcars dataset (the same data as R's built-in mtcars)
url = "https://raw.githubusercontent.com/selva86/datasets/master/mtcars.csv"
mtcars = pd.read_csv(url)
X = mtcars[["disp", "hp", "wt", "qsec"]] # predictors
y = mtcars["mpg"].to_numpy() # response
# Run MTest
res = mtest(X, y, n_boot=500, r2_threshold=0.9, seed=123, add_intercept=True)
# Print results
print("R² global:", res["R2_global"])
print("VIF:", res["VIF_named"])
print("p-values VIF rule:", res["p_vif"])
print("p-values Klein rule:", res["p_klein"])
# Tabular summary
df_sum = mtest_summary(res, sort_by="VIF")
print(df_sum)
Example 2: Pairwise Kolmogorov–Smirnov Test
from mtest import pairwise_ks_test, ks_summary
# Reuses the mtcars DataFrame loaded in Example 1
X = mtcars[["disp", "hp", "wt", "qsec"]]
ks_res = pairwise_ks_test(X, alternative="greater")
summary = ks_summary(ks_res, digits=6)
print(summary["summary_text"])
API
`mtest(X, y, n_boot=1000, nsam=None, r2_threshold=0.9, seed=None, return_distributions=True)`
- `X`: array-like `(n, p)` predictors. Intercept is not added automatically.
- `y`: array-like `(n,)` response.
- `n_boot`: bootstrap replicates.
- `nsam`: bootstrap sample size (default: `n`).
- `r2_threshold`: threshold on auxiliary R² used for VIF rule.
- `seed`: RNG seed.
- `return_distributions`: if `True`, returns bootstrap arrays.

Return: dict with keys
- `R2_global`
- `R2_aux` (original sample)
- `VIF` (original sample)
- `B_R2_global` `(n_boot,)`
- `B_R2_aux` `(n_boot, p)`, columns aligned with predictors
- `p_vif` (dict)
- `p_klein` (dict)
Notes
- For the VIF rule we use `Pr(R²_j > r2_threshold)`; pass `r2_threshold` accordingly.
- Klein's rule p-value is `Pr(R²_global < R²_j)` across bootstrap replicates.
- Numerical stability: we use least squares and guard divisions-by-zero.
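Since the VIF rule is expressed as a threshold on $R^2_j$, a small conversion helper can be handy when starting from a VIF cutoff. The two functions below are hypothetical helpers (not part of the package) that simply invert $\mathrm{VIF} = 1 / (1 - R^2)$.

```python
def r2_threshold_from_vif(vif_cutoff):
    """R^2_j threshold equivalent to a VIF cutoff, from VIF = 1 / (1 - R^2)."""
    if vif_cutoff < 1:
        raise ValueError("VIF is never below 1, so the cutoff must be >= 1")
    return 1.0 - 1.0 / vif_cutoff


def vif_from_r2(r2):
    """VIF implied by an auxiliary R^2 (guards the division when R^2 is 1)."""
    return float("inf") if r2 >= 1.0 else 1.0 / (1.0 - r2)


# Common VIF cutoffs and the R^2_j threshold each one corresponds to
for cutoff in (5, 10):
    print(cutoff, "->", round(r2_threshold_from_vif(cutoff), 4))
```

The resulting value is what would be passed as `r2_threshold` when the rule of thumb is stated as a VIF cutoff.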
Citation
Morales-Oñate, V., & Morales-Oñate, B. (2023).
MTest: a Bootstrap Test for Multicollinearity. Revista Politécnica, 51(2), 53–62.
https://doi.org/10.33333/rp.vol51n2.05
License
MIT (or your package license). Include the corresponding LICENSE file in the repo.