Quantile-on-Quantile Kernel-Based Regularized Least Squares — A Python implementation of KRLS (Hainmueller & Hazlett, 2014) and QQKRLS (Adebayo et al., 2024) with publication-quality MATLAB-style visualizations.
QQKRLS — Quantile-on-Quantile Kernel-Based Regularized Least Squares
A production-grade Python library implementing KRLS (Kernel Regularized Least Squares), Quantile-on-Quantile Regression, and the combined QQKRLS estimator with publication-quality MATLAB-style visualizations and comprehensive diagnostic tests for academic research.
Table of Contents
- Overview
- Theoretical Background
- Installation
- Quick Start
- API Reference
- Complete Workflow Guide
- Ported From
- References
- Author
- License
Overview
QQKRLS brings two econometric methods, and the estimator that combines them, into a single Python toolkit:
| Component | Method | Purpose |
|---|---|---|
| KRLS | Kernel Regularized Least Squares | Nonparametric regression using Gaussian kernels with Tikhonov regularization — no linearity or additivity assumptions |
| QQ | Quantile-on-Quantile Regression | Distributional analysis examining how quantiles of X affect quantiles of Y |
| QQKRLS | Combined Estimator | Nonparametric marginal effects across all quantile pairs (θ, τ), capturing heterogeneous relationships at different distributional locations |
Why QQKRLS?
Traditional regression methods assume:
- Linearity: the effect of X on Y is constant across all values.
- Homogeneity: the effect is the same at the mean, the tails, and everywhere in between.
QQKRLS relaxes both assumptions simultaneously, allowing researchers to discover:
- Nonlinear effects that vary across the distribution of X (θ-dimension).
- Heterogeneous impacts at different quantiles of Y (τ-dimension).
- Complex interaction patterns invisible to OLS, quantile regression, or standard KRLS alone.
Theoretical Background
1. Kernel Regularized Least Squares (KRLS)
Reference: Hainmueller, J. & Hazlett, C. (2014). Political Analysis, 22(2), 143-168.
KRLS is a machine-learning regression method that finds a function $f$ in a reproducing kernel Hilbert space (RKHS) that minimises the penalised loss:
$$\hat{f} = \arg\min_{f \in \mathcal{H}_K} \sum_{i=1}^{N} (y_i - f(x_i))^2 + \lambda \|f\|_{\mathcal{H}_K}^2$$
By the representer theorem, the solution has the form:
$$\hat{f}(x) = \sum_{i=1}^{N} c_i \cdot K(x, x_i)$$
where $K$ is a Gaussian kernel:
$$K(x_i, x_j) = \exp\!\left(-\frac{\|x_i - x_j\|^2}{\sigma}\right)$$
The coefficient vector $c^*$ is obtained via eigendecomposition:
$$c^* = V (\Lambda + \lambda I)^{-1} V^\top y$$
where $K = V \Lambda V^\top$ is the eigendecomposition of the kernel matrix.
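The closed-form solution can be checked numerically. The following is a minimal NumPy sketch, not the library's implementation: `gaussian_kernel` and `krls_coefficients` are hypothetical helper names, $\lambda$ is fixed by hand rather than cross-validated, and no standardization of the data is applied.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Gaussian kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / sigma)."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(sq_dists, 0.0) / sigma)

def krls_coefficients(K, y, lam):
    """Compute c = V (Lambda + lam I)^{-1} V^T y from the eigendecomposition of K."""
    eigvals, V = np.linalg.eigh(K)  # K is symmetric, so eigh applies
    return V @ ((V.T @ y) / (eigvals + lam))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=50)

sigma = X.shape[1]  # bandwidth set to the number of predictors D
lam = 0.5           # assumption: fixed for illustration only
K = gaussian_kernel(X, sigma)
c = krls_coefficients(K, y, lam)

# Same coefficients from a direct linear solve of (K + lam I) c = y.
c_direct = np.linalg.solve(K + lam * np.eye(len(y)), y)
fitted = K @ c
```

The eigendecomposition route pays off when many values of $\lambda$ are tried on the same kernel matrix, because $V$ and $\Lambda$ are computed once and reused.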
Marginal Effects
Pointwise marginal effects for predictor dimension $d$ are computed analytically:
$$\frac{\partial \hat{f}(x_j)}{\partial x_j^{(d)}} = \frac{-2}{\sigma} \sum_{i=1}^{N} c_i \cdot K(x_j, x_i) \cdot (x_j^{(d)} - x_i^{(d)})$$
The average marginal effect is:
$$\overline{\text{ME}}_d = \frac{1}{N} \sum_{j=1}^{N} \frac{\partial \hat{f}(x_j)}{\partial x_j^{(d)}}$$
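A quick way to sanity-check an analytic derivative is to compare it with a finite difference. The sketch below differentiates the kernel expansion $\hat{f}(x) = \sum_i c_i K(x, x_i)$ directly, so the difference inside the sum is written as $x_j^{(d)} - x_i^{(d)}$, and then checks one entry against a central difference. The helper names are hypothetical and $\sigma$, $\lambda$ are fixed by hand.

```python
import numpy as np

def gaussian_kernel_cross(A, B, sigma):
    """K[a, b] = exp(-||A_a - B_b||^2 / sigma) for rows of A and B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(diff**2, axis=2) / sigma)

def marginal_effects(X, c, sigma):
    """Derivative of f(x) = sum_i c_i K(x, x_i) at every training point, per dimension."""
    K = gaussian_kernel_cross(X, X, sigma)
    diff = X[:, None, :] - X[None, :, :]  # diff[j, i, d] = x_j^(d) - x_i^(d)
    return (-2.0 / sigma) * np.einsum("ji,i,jid->jd", K, c, diff)

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2
sigma, lam = 2.0, 0.1  # assumption: fixed for illustration
K = gaussian_kernel_cross(X, X, sigma)
c = np.linalg.solve(K + lam * np.eye(len(y)), y)

derivs = marginal_effects(X, c, sigma)  # shape (N, D)
avg_me = derivs.mean(axis=0)            # average marginal effect per predictor

def f_hat(x):
    """Evaluate the fitted function at a single point x of shape (D,)."""
    return gaussian_kernel_cross(x[None, :], X, sigma)[0] @ c

# Central finite difference at the first observation, first predictor.
h = 1e-5
step = np.array([h, 0.0])
fd = (f_hat(X[0] + step) - f_hat(X[0] - step)) / (2 * h)
```

If the analytic and finite-difference values disagree in sign, the order of the difference term is the first thing to check.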
Bandwidth & Regularization
| Parameter | Symbol | Selection Method |
|---|---|---|
| Bandwidth | $\sigma$ | Default = number of predictors $D$ |
| Regularization | $\lambda$ | Leave-One-Out (LOO) cross-validation via golden-section search |
The LOO error is computed efficiently using the "shortcut" formula:
$$\text{LOO} = \sum_{i=1}^{N} \left(\frac{c_i}{G^{-1}_{ii}}\right)^2, \quad \text{where } G^{-1} = V (\Lambda + \lambda I)^{-1} V^\top$$
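The shortcut can be verified against an explicit refit that drops one observation at a time. This is an illustrative sketch with hypothetical helper names; the coarse grid over $\lambda$ at the end is a simple stand-in for the golden-section search named in the table, not a reproduction of it.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """K[i, j] = exp(-||x_i - x_j||^2 / sigma)."""
    diff = X[:, None, :] - X[None, :, :]
    return np.exp(-np.sum(diff**2, axis=2) / sigma)

def loo_shortcut(K, y, lam):
    """Sum of squared LOO residuals via c_i / G^{-1}_ii, with G = K + lam I."""
    eigvals, V = np.linalg.eigh(K)
    G_inv = (V / (eigvals + lam)) @ V.T  # V (Lambda + lam I)^{-1} V^T
    c = G_inv @ y
    return np.sum((c / np.diag(G_inv)) ** 2)

def loo_brute_force(K, y, lam):
    """Refit N times, each time leaving one observation out and predicting it."""
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        c_minus_i = np.linalg.solve(
            K[np.ix_(keep, keep)] + lam * np.eye(n - 1), y[keep]
        )
        total += (y[i] - K[i, keep] @ c_minus_i) ** 2
    return total

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=30)
K = gaussian_kernel(X, sigma=2.0)

fast = loo_shortcut(K, y, lam=0.5)
slow = loo_brute_force(K, y, lam=0.5)

# Assumption: a log-spaced grid replaces the golden-section search here.
lams = np.logspace(-3, 1, 25)
best_lam = lams[np.argmin([loo_shortcut(K, y, lam_k) for lam_k in lams])]
```

The shortcut needs one eigendecomposition per kernel matrix, whereas the brute-force version solves $N$ linear systems for every candidate $\lambda$.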
2. Quantile-on-Quantile Regression (QQ)
Reference: Sim, N. & Zhou, H. (2015). Journal of Banking & Finance, 55, 1-12.
Quantile-on-Quantile regression examines how the $\theta$-th quantile of the independent variable $X$ affects the $\tau$-th conditional quantile of the dependent variable $Y$. This creates a two-dimensional mapping $\beta(\theta, \tau)$ that captures distributional heterogeneity.
The QQ approach:
- For each $\theta \in (0, 1)$: compute $Q_X(\theta)$ and subset data where $X \leq Q_X(\theta)$.
- For each $\tau \in (0, 1)$: estimate the $\tau$-th conditional quantile of $Y$ given the subset.
- The coefficient $\beta(\theta, \tau)$ describes the marginal effect at quantile pair $(\theta, \tau)$.
3. QQKRLS: The Combined Estimator
Reference: Adebayo, T.S., Ozkan, O. & Eweade, B.S. (2024). Journal of Cleaner Production, 440, 140832.
QQKRLS synthesises KRLS and QQ regression:
Algorithm:
- For each X-quantile $\theta$:
  - Compute the $\theta$-th quantile threshold: $q_\theta = Q_X(\theta)$
  - Subset data: $\{(x_i, y_i) : x_i \leq q_\theta\}$
  - Fit KRLS on the subset to obtain pointwise marginal effects $\left\{\frac{\partial \hat{f}}{\partial x}\right\}$
- For each Y-quantile $\tau$:
  - The coefficient is the $\tau$-th quantile of the marginal effects distribution: $$\beta(\theta, \tau) = Q_\tau\!\left(\left\{\frac{\partial \hat{f}_{[\theta]}(x_j)}{\partial x_j}\right\}_{j=1}^{n_\theta}\right)$$
- Statistical inference via bootstrap:
  - Resample with replacement $B$ times within each subset.
  - Compute bootstrap standard errors, $t$-statistics, and $p$-values.
The result is a heatmap matrix $\beta(\theta, \tau)$ showing how the nonparametric marginal effect varies across both distributions.
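The algorithm above can be sketched in a few lines of NumPy for a single predictor. This is a simplified illustration under stated assumptions, not the package's code: the function names are hypothetical, $\sigma$ and $\lambda$ are fixed instead of selected per subset, and subsets smaller than `min_obs` are left as NaN.

```python
import numpy as np

def krls_marginal_effects_1d(x, y, sigma=1.0, lam=0.1):
    """Fit a Gaussian-kernel ridge on scalar x and return df/dx at each x_j."""
    diff = x[:, None] - x[None, :]  # diff[j, i] = x_j - x_i
    K = np.exp(-diff**2 / sigma)
    c = np.linalg.solve(K + lam * np.eye(len(x)), y)
    return (-2.0 / sigma) * (K * diff) @ c

def qqkrls_grid(y, x, thetas, taus, min_obs=15, sigma=1.0, lam=0.1):
    """Return beta[tau_index, theta_index]; NaN where a subset is too small."""
    beta = np.full((len(taus), len(thetas)), np.nan)
    for k, theta in enumerate(thetas):
        mask = x <= np.quantile(x, theta)  # subset below the theta-quantile of X
        if mask.sum() < min_obs:
            continue
        me = krls_marginal_effects_1d(x[mask], y[mask], sigma, lam)
        beta[:, k] = np.quantile(me, taus)  # tau-quantiles of the marginal effects
    return beta

def bootstrap_se(y, x, theta, tau, n_boot=50, seed=0, sigma=1.0, lam=0.1):
    """Bootstrap standard error of beta(theta, tau), resampling within the subset."""
    rng = np.random.default_rng(seed)
    mask = x <= np.quantile(x, theta)
    xs, ys = x[mask], y[mask]
    draws = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(xs), size=len(xs))  # resample with replacement
        me = krls_marginal_effects_1d(xs[idx], ys[idx], sigma, lam)
        draws[b] = np.quantile(me, tau)
    return draws.std(ddof=1)

rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 0.5 * np.sin(x) + 0.3 * rng.normal(size=200)
grid = np.linspace(0.05, 0.95, 19)

beta = qqkrls_grid(y, x, thetas=grid, taus=grid)
se_mid = bootstrap_se(y, x, theta=0.5, tau=0.5)
```

With 200 observations the $\theta = 0.05$ subset holds only 10 points, so its column stays NaN under `min_obs=15`. Low X-quantiles need either more data or a smaller threshold.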
Installation
From PyPI
pip install qqkrls
With full dependencies (interactive plots, diagnostics)
pip install qqkrls[full]
From source (development)
git clone https://github.com/merwanroudane/qqkrls.git
cd qqkrls
pip install -e .
Dependencies
| Package | Minimum Version | Purpose |
|---|---|---|
| numpy | ≥ 1.20 | Core numerical computation |
| pandas | ≥ 1.3 | Data manipulation and results storage |
| scipy | ≥ 1.7 | Kernel distances, statistical tests |
| matplotlib | ≥ 3.5 | Publication-quality visualizations |
Optional:
| Package | Purpose |
|---|---|
| plotly | Interactive 3D surface plots |
| seaborn | Enhanced statistical plots |
| statsmodels | Extended diagnostic tests |
Quick Start
KRLS — Kernel Regularized Least Squares
import numpy as np
from qqkrls import krls, plot_krls_derivatives, plot_krls_fit
# Generate nonlinear data
np.random.seed(42)
n = 200
X = np.random.randn(n, 2)
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]**2 + np.random.randn(n) * 0.3
# Fit KRLS
fit = krls(X, y, col_names=["x1", "x2"])
# Print summary
fit.summary()
# Visualize marginal effects
plot_krls_derivatives(fit, var_idx=0, title="Marginal Effect of x1")
plot_krls_fit(fit)
QQKRLS — Quantile-on-Quantile KRLS
import numpy as np
from qqkrls import qqkrls, plot_qqkrls_heatmap, plot_qqkrls_3d
# Generate data
np.random.seed(42)
x = np.random.randn(200)
y = 0.5 * np.sin(x) + np.random.randn(200) * 0.3
# Run QQKRLS
result = qqkrls(y, x, n_boot=100)
# Print summary
result.summary()
# Publication-quality heatmap (Adebayo et al. style)
plot_qqkrls_heatmap(result, title="QQKRLS: X → Y", colorscale="paper")
# 3D surface (MATLAB-style)
plot_qqkrls_3d(result, title="QQKRLS Surface", colorscale="jet")
# Export LaTeX table
latex = result.export_latex(caption="QQKRLS Coefficients")
print(latex)
API Reference
Core Estimation
krls(X, y, ...)
Fit Kernel Regularized Least Squares.
from qqkrls import krls
fit = krls(
    X,                  # (N, D) predictor matrix
    y,                  # (N,) outcome vector
    kernel="gaussian",  # Kernel type: 'gaussian', 'linear', 'poly2', 'poly3', 'poly4'
    lambda_=None,       # Regularization parameter (None = LOO cross-validation)
    sigma=None,         # Gaussian kernel bandwidth (None = D)
    derivative=True,    # Compute pointwise marginal effects
    binary=True,        # Compute first differences for binary predictors
    vcov=True,          # Compute variance-covariance matrices
    eigtrunc=None,      # Eigenvalue truncation threshold in (0, 1]
    col_names=None,     # List of predictor names
    verbose=1,          # 0=silent, 1=summary, 2+=detailed
)
Returns: KRLSResult with attributes:
| Attribute | Shape | Description |
|---|---|---|
| coeffs | (N, 1) | Solution vector $c^*$ |
| fitted | (N, 1) | Fitted values $\hat{y}$ |
| X, y | (N, D), (N, 1) | Original data |
| K | (N, N) | Kernel matrix |
| sigma, lambda_ | scalar | Bandwidth and regularization parameters |
| R2, Looe | scalar | R-squared and LOO error |
| derivatives | (N, D) | Pointwise marginal effects |
| avg_derivatives | (1, D) | Average marginal effects |
| var_avg_derivatives | (1, D) | Variance of average marginal effects |
| vcov_c | (N, N) | Variance-covariance of coefficients |
| vcov_fitted | (N, N) | Variance-covariance of fitted values |
Methods:
- fit.summary(): Print publication-quality summary.
- fit.predict(newdata): Predict for new observations.
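For intuition about what prediction involves, the representer form $\hat{f}(x) = \sum_i c_i K(x, x_i)$ applies to any new point, not only to the training data. The sketch below evaluates that formula directly with NumPy. `gaussian_kernel_cross` and `krls_predict` are hypothetical helpers, and the library's own `predict` may additionally rescale inputs or return extra quantities.

```python
import numpy as np

def gaussian_kernel_cross(A, B, sigma):
    """K[a, b] = exp(-||A_a - B_b||^2 / sigma) for rows of A and B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.exp(-np.sum(diff**2, axis=2) / sigma)

def krls_predict(X_new, X_train, c, sigma):
    """Evaluate sum_i c_i K(x_new, x_i) for every row of X_new."""
    return gaussian_kernel_cross(X_new, X_train, sigma) @ c

rng = np.random.default_rng(3)
X_train = rng.normal(size=(60, 1))
y_train = np.sin(X_train[:, 0])
sigma, lam = 1.0, 0.05  # assumption: fixed for illustration
K = gaussian_kernel_cross(X_train, X_train, sigma)
c = np.linalg.solve(K + lam * np.eye(60), y_train)

X_new = np.linspace(-1.0, 1.0, 5)[:, None]  # new points must be 2-D: (M, D)
y_new = krls_predict(X_new, X_train, c, sigma)
in_sample = krls_predict(X_train, X_train, c, sigma)
```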
qqkrls(y, x, ...)
Run Quantile-on-Quantile Kernel-Based Regularized Least Squares.
from qqkrls import qqkrls
result = qqkrls(
    y,                 # Dependent variable
    x,                 # Independent variable
    y_quantiles=None,  # Quantiles of Y (τ). Default: 0.05, 0.10, ..., 0.95
    x_quantiles=None,  # Quantiles of X (θ). Default: same as y_quantiles
    sigma=None,        # Gaussian kernel bandwidth (None = auto)
    lambda_=None,      # Regularization parameter (None = LOO CV per subset)
    min_obs=15,        # Minimum observations in a subset to fit KRLS
    n_boot=200,        # Bootstrap replications for standard errors
    verbose=True,      # Print progress
)
Returns: QQKRLSResult with attributes:
| Attribute | Type | Description |
|---|---|---|
| results | DataFrame | Long-format table: y_quantile, x_quantile, coefficient, std_error, t_value, p_value, significance |
| y_quantiles | ndarray | Y-quantile grid |
| x_quantiles | ndarray | X-quantile grid |
| n_obs | int | Number of observations |
Methods:
- result.summary(): Print summary statistics.
- result.to_matrix(value): Pivot to (τ × θ) matrix.
- result.significance_matrix(alpha): Binary significance matrix.
- result.stars_matrix(): Matrix of significance stars.
- result.to_dataframe(): Full results DataFrame.
- result.export_csv(path): Export to CSV.
- result.export_latex(...): Export to LaTeX table.
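To make the long-format and matrix views concrete, here is a small pandas sketch of the kind of reshaping these methods describe. It uses a synthetic table with the column names listed for `results`; the star thresholds (1%, 5%, 10%) are a common convention assumed here, not something confirmed for the library.

```python
import numpy as np
import pandas as pd

# Synthetic long-format results; values are random placeholders, not real output.
taus = [0.25, 0.50, 0.75]
thetas = [0.25, 0.50, 0.75]
rng = np.random.default_rng(0)
long_df = pd.DataFrame(
    [
        {"y_quantile": tau, "x_quantile": theta,
         "coefficient": rng.normal(), "p_value": rng.uniform()}
        for tau in taus
        for theta in thetas
    ]
)

# Rows are Y-quantiles (tau), columns are X-quantiles (theta).
coef = long_df.pivot(index="y_quantile", columns="x_quantile", values="coefficient")
pvals = long_df.pivot(index="y_quantile", columns="x_quantile", values="p_value")

# 1 where the cell is significant at the chosen level, else 0.
significant = (pvals < 0.05).astype(int)

def stars(p):
    """Assumed convention: *** for p < 0.01, ** for p < 0.05, * for p < 0.10."""
    return "***" if p < 0.01 else "**" if p < 0.05 else "*" if p < 0.10 else ""

star_matrix = pvals.apply(lambda col: col.map(stars))
```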
multi_qqkrls(y, X, ...)
Run QQKRLS for each column of X against y (multi-variable panel).
from qqkrls import multi_qqkrls
results_dict = multi_qqkrls(
    y, X,
    col_names=["GDP", "CO2", "Energy"],
    n_boot=200,
    verbose=True,
)
# Returns: dict mapping {var_name: QQKRLSResult}
Visualization Functions
All plots follow MATLAB-style aesthetics for publication in top academic journals.
| Function | Description | Key Parameters |
|---|---|---|
| plot_qqkrls_heatmap(result, ...) | Paper-style coefficient heatmap with significance stars | colorscale, show_stars, show_values, vmin, vmax |
| plot_qqkrls_3d(result, ...) | 3D surface of QQKRLS coefficients (MATLAB-style) | elev, azim, colorscale |
| plot_qqkrls_contour(result, ...) | Filled contour plot of coefficients | levels, colorscale |
| plot_qqkrls_pvalue(result, ...) | P-value heatmap with significance regions | alpha |
| plot_qqkrls_panel(results_dict, ...) | Multi-variable panel of heatmaps | dep_var, colorscale |
| plot_krls_derivatives(fit, ...) | Pointwise marginal effects scatter + density | var_idx |
| plot_krls_fit(fit, ...) | Actual vs fitted scatter (45° line) | - |
| plot_krls_panel(fit, ...) | Panel of marginal-effect scatters for all predictors | - |
Available Colorscales
| Name | Description |
|---|---|
| "paper" | Red–White–Green diverging (Adebayo et al. 2024 style); default |
| "paper_warm" | Red–White warm gradient |
| "jet" | Classic MATLAB Jet |
| "rdylgn" | Red–Yellow–Green diverging |
| "coolwarm" | Blue–Red diverging |
| "viridis" | Perceptually uniform |
| "plasma" | Perceptually uniform warm |
Diagnostic Functions
Pre-Estimation Diagnostics
from qqkrls import linearity_test_battery, print_diagnostics
# Run battery of tests to justify using KRLS/QQKRLS
diag_df = linearity_test_battery(y, X, col_names=["x1", "x2"])
print_diagnostics(diag_df)
Tests included:
- Ramsey RESET — Functional form misspecification
- Breusch-Pagan — Heteroskedasticity
- Jarque-Bera — Non-normality of residuals
- BDS — Nonlinear dependence in residuals
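As an illustration of what the first test in this list measures, the following is a textbook Ramsey RESET sketch using NumPy and SciPy: fit OLS, add powers of the fitted values, and F-test the added terms. `reset_test` is a hypothetical function, and the library's own version may differ in the powers used or in how results are reported.

```python
import numpy as np
from scipy import stats

def reset_test(y, X, max_power=3):
    """Ramsey RESET: F-test on powers 2..max_power of the OLS fitted values."""
    n = len(y)
    Z = np.column_stack([np.ones(n), X])  # restricted model: intercept + X
    beta_r, *_ = np.linalg.lstsq(Z, y, rcond=None)
    fitted = Z @ beta_r
    rss_r = np.sum((y - fitted) ** 2)

    # Unrestricted model adds fitted^2, ..., fitted^max_power.
    powers = np.column_stack([fitted**p for p in range(2, max_power + 1)])
    Z_u = np.column_stack([Z, powers])
    beta_u, *_ = np.linalg.lstsq(Z_u, y, rcond=None)
    rss_u = np.sum((y - Z_u @ beta_u) ** 2)

    q = powers.shape[1]        # number of added regressors
    dof = n - Z_u.shape[1]     # residual degrees of freedom
    f_stat = ((rss_r - rss_u) / q) / (rss_u / dof)
    return f_stat, stats.f.sf(f_stat, q, dof)

rng = np.random.default_rng(5)
x = rng.normal(size=200)
y_nonlinear = x + 0.5 * x**2 + 0.3 * rng.normal(size=200)

f_stat, p_value = reset_test(y_nonlinear, x[:, None])
```

A small p-value says the linear specification misses curvature in the data, which is the usual argument for moving to a nonparametric estimator such as KRLS.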
Post-Estimation KRLS Diagnostics
from qqkrls import krls_residual_diagnostics, print_krls_diagnostics
diag = krls_residual_diagnostics(fit)
print_krls_diagnostics(diag)
Reports: R², Adjusted R², Effective df, AIC, BIC, Durbin-Watson, MAE, RMSE, Jarque-Bera.
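Several of these statistics follow directly from the residuals and the kernel eigenvalues. The sketch below computes a subset of them with NumPy under common definitions: effective degrees of freedom as the trace of the smoother matrix $K(K + \lambda I)^{-1}$, and the standard Durbin-Watson ratio. `residual_summary` is a hypothetical helper, and the library may define some quantities differently; AIC and BIC are omitted because their definitions vary.

```python
import numpy as np

def residual_summary(y, fitted, K, lam):
    """Fit statistics for a kernel ridge fit with kernel matrix K and penalty lam."""
    resid = y - fitted
    eigvals = np.linalg.eigvalsh(K)
    return {
        "r2": float(1.0 - np.sum(resid**2) / np.sum((y - y.mean()) ** 2)),
        # trace of K (K + lam I)^{-1}, computed from the eigenvalues of K
        "effective_df": float(np.sum(eigvals / (eigvals + lam))),
        "durbin_watson": float(np.sum(np.diff(resid) ** 2) / np.sum(resid**2)),
        "mae": float(np.mean(np.abs(resid))),
        "rmse": float(np.sqrt(np.mean(resid**2))),
    }

rng = np.random.default_rng(7)
x = np.sort(rng.normal(size=80))
y = np.sin(x) + 0.2 * rng.normal(size=80)
lam = 0.5  # assumption: fixed for illustration
K = np.exp(-((x[:, None] - x[None, :]) ** 2) / 1.0)
fitted = K @ np.linalg.solve(K + lam * np.eye(80), y)

summary = residual_summary(y, fitted, K, lam)
```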
Individual Diagnostic Tests
from qqkrls import bds_test, parameter_stability_test, jarque_bera
# BDS test for nonlinear dependence
bds = bds_test(series, max_dim=6)
# Returns: {'dimensions': [2..6], 'z_stats': [...], 'p_values': [...]}
# Andrews parameter stability test
stab = parameter_stability_test(series, trim=0.15)
# Returns: {'max_f': ..., 'exp_f': ..., 'ave_f': ...}
# Jarque-Bera normality test
jb = jarque_bera(series)
# Returns: {'statistic': ..., 'p_value': ...}
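For reference, the Jarque-Bera statistic is simple enough to compute by hand: $JB = \frac{n}{6}\left(S^2 + \frac{(K - 3)^2}{4}\right)$ with sample skewness $S$ and kurtosis $K$, compared against a chi-square distribution with 2 degrees of freedom. The sketch below is a standalone NumPy version; `jarque_bera_manual` is a hypothetical name, and only the dictionary keys mirror the return shape shown above.

```python
import math
import numpy as np

def jarque_bera_manual(x):
    """Jarque-Bera statistic and asymptotic chi-square(2) p-value."""
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    m2 = np.mean(centered**2)
    skew = np.mean(centered**3) / m2**1.5
    kurt = np.mean(centered**4) / m2**2
    stat = n / 6.0 * (skew**2 + (kurt - 3.0) ** 2 / 4.0)
    # The chi-square(2) survival function has the closed form exp(-x / 2).
    return {"statistic": stat, "p_value": math.exp(-stat / 2.0)}

rng = np.random.default_rng(11)
skewed = rng.exponential(size=500)  # strongly non-normal sample
jb = jarque_bera_manual(skewed)

# Symmetric two-point sample: skewness 0, kurtosis 1, so JB = 4/6 * 1.
jb_small = jarque_bera_manual([-1.0, 1.0, -1.0, 1.0])
```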
Table & Export Functions
from qqkrls import (
    qqkrls_coefficient_table,
    krls_summary_table,
    descriptive_statistics,
    export_results_csv,
)
# LaTeX coefficient table with significance stars
latex = qqkrls_coefficient_table(result, digits=3, show_stars=True)
# LaTeX table of KRLS average marginal effects
latex = krls_summary_table(fit, digits=4)
# Descriptive statistics table (Mean, Median, Std, Skew, Kurt, JB)
latex = descriptive_statistics(df, caption="Descriptive Statistics")
# CSV export
export_results_csv(result, "qqkrls_results.csv", digits=4)
Complete Workflow Guide
This section demonstrates the full research workflow from data loading to publication-ready output, mirroring the methodology in Adebayo et al. (2024).
Step 1: Data Preparation & Descriptive Statistics
import numpy as np
import pandas as pd
from qqkrls import descriptive_statistics
# Load your data
df = pd.read_csv("your_data.csv")
y = df["dependent_var"].values
X = df[["x1", "x2", "x3"]].values
# Generate descriptive statistics LaTeX table
latex = descriptive_statistics(df[["dependent_var", "x1", "x2", "x3"]])
print(latex)
Step 2: Pre-Estimation Diagnostics
Before applying KRLS/QQKRLS, demonstrate that the data exhibits nonlinearity:
from qqkrls import linearity_test_battery, print_diagnostics
diag = linearity_test_battery(y, X, col_names=["x1", "x2", "x3"])
print_diagnostics(diag)
# If RESET test rejects linearity → KRLS is justified
# If BDS test rejects i.i.d. → nonparametric approach is justified
# If Breusch-Pagan rejects homoskedasticity → quantile approach is justified
Step 3: KRLS Estimation
from qqkrls import krls, plot_krls_derivatives, plot_krls_fit, plot_krls_panel
from qqkrls import krls_residual_diagnostics, print_krls_diagnostics, krls_summary_table
# Fit KRLS (full nonparametric regression)
fit = krls(X, y, col_names=["x1", "x2", "x3"])
# Post-estimation diagnostics
diag = krls_residual_diagnostics(fit)
print_krls_diagnostics(diag)
# Visualize all marginal effects
plot_krls_panel(fit, save_path="krls_marginal_effects.png")
# Fitted vs actual
plot_krls_fit(fit, save_path="krls_fitted_actual.png")
# LaTeX table
print(krls_summary_table(fit))
Step 4: QQKRLS Estimation
from qqkrls import qqkrls, plot_qqkrls_heatmap, plot_qqkrls_3d
from qqkrls import plot_qqkrls_contour, plot_qqkrls_pvalue
# Define quantile grids (linspace avoids floating-point drift at the endpoint)
taus = np.linspace(0.05, 0.95, 19)  # 19 quantiles: 0.05, 0.10, ..., 0.95
thetas = np.linspace(0.05, 0.95, 19)
# Run QQKRLS for a single X variable
result = qqkrls(y, X[:, 0], y_quantiles=taus, x_quantiles=thetas,
                n_boot=200, verbose=True)
# Summary
result.summary()
# Heatmap (paper-style with significance stars)
plot_qqkrls_heatmap(result, title="QQKRLS: x1 → Y",
                    colorscale="paper", show_stars=True,
                    save_path="qqkrls_heatmap_x1.png")
# 3D surface
plot_qqkrls_3d(result, title="QQKRLS Surface: x1 → Y",
               save_path="qqkrls_3d_x1.png")
# Contour plot
plot_qqkrls_contour(result, save_path="qqkrls_contour_x1.png")
# P-value heatmap
plot_qqkrls_pvalue(result, save_path="qqkrls_pvalue_x1.png")
Step 5: Multi-Variable QQKRLS
from qqkrls import multi_qqkrls, plot_qqkrls_panel
# Run QQKRLS for all independent variables
results_dict = multi_qqkrls(
    y, X,
    col_names=["x1", "x2", "x3"],
    n_boot=200,
    verbose=True,
)
# Panel of heatmaps (one per variable)
plot_qqkrls_panel(results_dict, dep_var="Y",
                  save_path="qqkrls_panel.png")
Step 6: Visualization
# Customize any plot
fig, ax = plot_qqkrls_heatmap(
    result,
    title="Custom Title",
    colorscale="paper",
    show_stars=True,
    show_values=True,  # Show coefficient values in cells
    x_label=r"Quantiles of Energy ($\theta$)",
    y_label=r"Quantiles of CO$_2$ ($\tau$)",
    figsize=(14, 10),
    vmin=-0.5,
    vmax=0.5,
    star_fontsize=7,
    dpi=400,
    save_path="custom_heatmap.png",
)
Step 7: Export & Reporting
# CSV export
result.export_csv("results.csv", digits=4)
# LaTeX table with significance stars
latex = result.export_latex(
    value="coefficient",
    caption="QQKRLS Marginal Effects: Energy → CO₂",
    show_stars=True,
)
print(latex)
# Save the table to a .tex file for inclusion in your LaTeX document
with open("table_qqkrls.tex", "w", encoding="utf-8") as f:
    f.write(latex)
Ported From
This library is a faithful Python port of:
| Source | Version | Authors |
|---|---|---|
| R CRAN KRLS | v1.0-0 | Hainmueller & Hazlett |
| R CRAN QuantileOnQuantile | v1.0.3 | Roudane |
| Python wqr | v1.0.1 | Roudane |
References
- Hainmueller, J. & Hazlett, C. (2014). Kernel Regularized Least Squares. Political Analysis, 22(2), 143-168. doi:10.1093/pan/mpt024
- Sim, N. & Zhou, H. (2015). Oil Prices, US Stock Return, and the Dependence Between Their Quantiles. Journal of Banking & Finance, 55, 1-12. doi:10.1016/j.jbankfin.2015.01.013
- Adebayo, T.S., Ozkan, O. & Eweade, B.S. (2024). Do energy efficiency R&D investments and ICT promote environmental sustainability in Sweden? A QQKRLS investigation. Journal of Cleaner Production, 440, 140832. doi:10.1016/j.jclepro.2024.140832
- Adebayo, T.S., Meo, M.S., Eweade, B.S. & Ozkan, O. (2024). Analyzing the effects of solar energy innovations, digitalization, and economic globalization on environmental quality in the United States. Clean Technologies and Environmental Policy, 26, 4157-4176. doi:10.1007/s10098-024-02831-0
Author
Dr. Merwan Roudane
- Email: merwanroudane920@gmail.com
- GitHub: github.com/merwanroudane/qqkrls
License
MIT License. See LICENSE for details.