Non-crossing quantile regression toolkit with joint multi-quantile fitting, inference, conformal calibration, and evaluation. Scikit-learn compatible.
quantile-regression-pdlp
Non-crossing quantile models with built-in inference, calibration, and evaluation.
A quantile modeling toolkit — not just a quantile regressor. Fits multiple quantiles jointly with monotonicity constraints that guarantee predictions never cross. Wraps the result in inference, conformal calibration, evaluation metrics, and crossing diagnostics.
Scikit-learn compatible. Validated against sklearn, statsmodels, and R's quantreg.
Why Not Just Fit Quantiles Independently?
When you fit quantiles one at a time (as sklearn and statsmodels do), nothing prevents the 90th percentile prediction from falling below the 10th. On real-world data with heavy tails, noise, or many quantile levels, this happens frequently:
| n | features | quantiles | Crossing rate (independent) | Crossing rate (this package) |
|---|---|---|---|---|
| 500 | 10 | 13 | 30.0% | 0% |
| 1,000 | 10 | 13 | 16.5% | 0% |
| 2,000 | 20 | 13 | 11.0% | 0% |
| 2,000 | 20 | 7 | 4.5% | 0% |
This package eliminates crossings by construction. In the benchmarks reported below, the joint fit also matched or slightly improved on the pinball loss of independent fitting, which suggests the non-crossing constraints act as a mild regularizer.
Full benchmark methodology and results: Benchmarks
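To make "crossing rate" concrete, the sketch below assumes one common definition: the fraction of samples for which at least one adjacent pair of quantile predictions is out of order. The benchmark may define it differently, and `crossing_rate` is a hypothetical helper written for illustration, not part of this package's API.

```python
def crossing_rate(predictions, tol=0.0):
    """Fraction of rows in which any adjacent pair of quantile predictions is out of order.

    predictions: sequence of rows; each row holds the predictions for one sample,
    ordered by increasing quantile level (for example tau = 0.1, 0.5, 0.9).
    """
    rows = list(predictions)
    if not rows:
        return 0.0
    crossed = sum(
        any(lo > hi + tol for lo, hi in zip(row[:-1], row[1:]))
        for row in rows
    )
    return crossed / len(rows)


# Three samples, three quantile levels. Only the second row crosses:
# its prediction at the highest level (2.4) is below the middle one (2.5).
preds = [
    [1.0, 2.0, 3.0],
    [1.5, 2.5, 2.4],
    [0.2, 0.2, 0.9],
]
rate = crossing_rate(preds)
```

Ties are not counted as crossings here; pass a negative `tol` to require strictly increasing predictions.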
What You Get
This is a toolkit, not a single estimator. It covers the workflow from raw quantile regression through calibrated prediction intervals:
| Workflow | What it does |
|---|---|
| Joint Quantile Regression | Fit multiple quantiles in one call with non-crossing guarantees |
| Conformalized Quantile Regression | Calibrate intervals for finite-sample coverage guarantees |
| Censored Quantile Regression | Handle right- or left-censored (survival) data |
| Evaluation & Metrics | Pinball loss, coverage, interval score, crossing diagnostics |
| Calibration Diagnostics | Coverage by group/bin, nominal vs empirical, sharpness analysis |
| Crossing Detection & Repair | Diagnose and fix crossings from any quantile model |
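For predictions that come from some other model, one standard repair is rearrangement: sort each sample's predictions so they are non-decreasing in the quantile level. The sketch below illustrates that technique only; `rearrange` is a hypothetical helper, and whether this package's repair routine uses sorting or another method is not stated here.

```python
def rearrange(predictions):
    """Return a copy in which each row is sorted, so quantile predictions never cross."""
    return [sorted(row) for row in predictions]


# Rows are samples; columns are increasing quantile levels.
preds = [[1.0, 2.0, 3.0], [1.5, 2.5, 2.4]]
fixed = rearrange(preds)  # the second row becomes [1.5, 2.4, 2.5]
```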
Feature Comparison
| Feature | This package | sklearn | statsmodels |
|---|---|---|---|
| Multiple quantiles (joint fit) | Yes | No | No |
| Non-crossing guarantee | Yes | No | No |
| Multi-output regression | Yes | No | No |
| Analytical / kernel / cluster / bootstrap SEs | Yes | No | Partial |
| L1 / Elastic Net / SCAD / MCP | Yes | L1 only | No |
| Conformal calibration (CQR) | Yes | No | No |
| Calibration diagnostics | Yes | No | No |
| Evaluation metrics suite | Yes | Partial | No |
| Crossing detection + fix | Yes | No | No |
| Censored QR | Yes | No | No |
| Prediction intervals | Yes | No | No |
| Pseudo R² | Yes | No | Yes |
| Formula interface | Yes | No | Yes |
| Sklearn pipeline compatible | Yes | Yes | No |
Installation
pip install quantile-regression-pdlp
Optional extras:
pip install "quantile-regression-pdlp[all]" # formula interface + plots
pip install "quantile-regression-pdlp[plot]" # matplotlib only
pip install "quantile-regression-pdlp[formula]" # patsy only
Quick Start
import numpy as np
from quantile_regression_pdlp import QuantileRegression
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.5, 0.8]) + rng.normal(scale=0.5, size=200)
# Fit 3 quantiles jointly — guaranteed non-crossing
model = QuantileRegression(tau=[0.1, 0.5, 0.9], se_method='analytical')
model.fit(X, y)
# Summaries with coefficients, SEs, p-values, and 95% CIs
print(model.summary()[0.5]['y'])
# Prediction intervals (non-crossing fit, so lower <= median <= upper)
interval = model.predict_interval(X[:5], coverage=0.80)
print(interval['y']['lower'], interval['y']['upper'])
Conformal Calibration
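Turn raw quantile predictions into intervals with marginal, finite-sample coverage guarantees:
from sklearn.model_selection import train_test_split
from quantile_regression_pdlp import QuantileRegression
from quantile_regression_pdlp.conformal import ConformalQuantileRegression
# Hold out a test set (X and y as defined in Quick Start)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
base = QuantileRegression(tau=[0.05, 0.5, 0.95], se_method='analytical')
cqr = ConformalQuantileRegression(base_estimator=base, coverage=0.90)
cqr.fit(X_train, y_train)
intervals = cqr.predict_interval(X_test)
# The guarantee is marginal: expect about 0.90 on average, not on every finite test set
print(cqr.empirical_coverage(X_test, y_test))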
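For intuition, the calibration step of split conformalized quantile regression can be sketched in a few lines. This assumes the standard split-CQR recipe (conformity scores on a held-out calibration set, then a single additive correction); the helper names `cqr_correction` and `conformalize` are hypothetical, and the package's internals may differ.

```python
import math


def cqr_correction(y_cal, lower_cal, upper_cal, coverage=0.90):
    """Conformal correction Q computed on a held-out calibration set.

    Scores are E_i = max(lower_i - y_i, y_i - upper_i); Q is the
    ceil((n + 1) * coverage)-th smallest score.
    """
    scores = sorted(
        max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lower_cal, upper_cal)
    )
    n = len(scores)
    rank = math.ceil((n + 1) * coverage)
    if rank > n:
        # Too few calibration points for the requested coverage:
        # only an unbounded interval is valid.
        return math.inf
    return scores[rank - 1]


def conformalize(lower, upper, q):
    """Widen each interval by q on both sides (shrink it if q is negative)."""
    return [(lo - q, hi + q) for lo, hi in zip(lower, upper)]


# Toy calibration set: raw lower/upper quantile predictions from any model.
y_cal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
lower_cal = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 9.5]
upper_cal = [1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.2, 10.0]

q = cqr_correction(y_cal, lower_cal, upper_cal, coverage=0.80)
intervals = conformalize([2.0], [3.0], q)
```

A negative `q` means the raw intervals were wider than needed on the calibration set, so the corrected intervals are tightened rather than widened.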
Censored Quantile Regression
For survival data with right- or left-censoring:
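from quantile_regression_pdlp import CensoredQuantileRegression
# observed_time: observed (possibly censored) outcomes; delta: event indicator
# (assumed convention: 1 = event observed, 0 = censored)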
model = CensoredQuantileRegression(tau=0.5, censoring='right', se_method='analytical')
model.fit(X, observed_time, event_indicator=delta)
Evaluate Any Quantile Model
The metrics and diagnostics modules work with predictions from any source — not just this package:
from quantile_regression_pdlp.metrics import quantile_evaluation_report
from quantile_regression_pdlp.postprocess import crossing_summary
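# Evaluate predictions from XGBoost, LightGBM, or any other model
# y_true: observed targets; taus: quantile levels; predictions: predicted quantiles at those levels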
report = quantile_evaluation_report(y_true, predictions, taus)
crossings = crossing_summary(predictions, taus)
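The core metrics behind such a report are simple to write down. The sketch below gives plain-Python versions of pinball loss, empirical coverage, and the interval (Winkler) score so the definitions are explicit; the function names are hypothetical and are not the package's `metrics` API.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss at level tau; lower is better."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y_true)


def empirical_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside [lower, upper]."""
    hits = sum(lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper))
    return hits / len(y_true)


def interval_score(y_true, lower, upper, alpha):
    """Mean interval score for a central (1 - alpha) interval; lower is better.

    Width plus a penalty of 2 / alpha per unit by which y misses the interval.
    """
    total = 0.0
    for y, lo, hi in zip(y_true, lower, upper):
        total += hi - lo
        total += (2.0 / alpha) * max(lo - y, 0.0)
        total += (2.0 / alpha) * max(y - hi, 0.0)
    return total / len(y_true)


y_true = [1.0, 2.0, 3.0, 4.0]
lower = [0.5, 1.5, 3.5, 3.0]
upper = [1.5, 2.5, 4.5, 5.0]

loss_median = pinball_loss(y_true, [1.0, 1.0, 4.0, 4.0], tau=0.5)
cov = empirical_coverage(y_true, lower, upper)
score = interval_score(y_true, lower, upper, alpha=0.2)
```

The interval score rewards narrow intervals and penalizes misses, so it captures sharpness and coverage in one number.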
Regularization
QuantileRegression(tau=0.5, regularization='l1', alpha=0.1) # Lasso
QuantileRegression(tau=0.5, regularization='elasticnet', alpha=0.1, l1_ratio=0.5)
QuantileRegression(tau=0.5, regularization='scad', alpha=0.3) # Less bias on large coefficients
QuantileRegression(tau=0.5, regularization='mcp', alpha=0.3)
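The "less bias on large coefficients" remark follows from the shape of the penalties: L1 keeps growing with the coefficient, while SCAD and MCP level off. The sketch below writes out the textbook penalty functions for a single coefficient. It assumes the package's `alpha` plays the role of `lam` here; the shape parameters `a=3.7` and `gamma=3.0` are conventional defaults chosen for illustration, not values taken from this package.

```python
def l1_penalty(beta, lam):
    """Lasso penalty: grows linearly without bound."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty; a > 2 controls where the penalty becomes constant."""
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2.0 * a * lam * b - b * b - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0


def mcp_penalty(beta, lam, gamma=3.0):
    """MCP penalty; gamma > 1 controls where the penalty becomes constant."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return gamma * lam * lam / 2.0


lam = 0.3
# Each row: (coefficient, L1, SCAD, MCP). For large coefficients only L1 keeps growing.
table = [
    (beta, l1_penalty(beta, lam), scad_penalty(beta, lam), mcp_penalty(beta, lam))
    for beta in (0.1, 1.0, 5.0)
]
```

Because the SCAD and MCP penalties are flat beyond a threshold, large coefficients are not shrunk further, which is the source of the reduced bias.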
Inference Options
QuantileRegression(tau=0.5, se_method='analytical') # Fast asymptotic SEs
QuantileRegression(tau=0.5, se_method='kernel') # Heteroscedasticity-robust
QuantileRegression(tau=0.5, se_method='bootstrap', n_bootstrap=500)
# Cluster-robust SEs
model.fit(X, y, clusters=group_labels)
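As a rough picture of what a bootstrap standard error involves, the sketch below resamples the data with replacement, recomputes the estimate, and takes the spread of the results. It uses the sample median, which is the intercept-only quantile regression fit at tau = 0.5, so no solver is needed. This is an ordinary nonparametric bootstrap written for illustration; `bootstrap_se_median` is a hypothetical helper, and the resampling scheme this package uses is not described here.

```python
import random
import statistics


def bootstrap_se_median(y, n_bootstrap=500, seed=0):
    """Bootstrap standard error of the sample median."""
    rng = random.Random(seed)
    n = len(y)
    estimates = []
    for _ in range(n_bootstrap):
        # Draw n observations with replacement and re-estimate the median.
        resample = [y[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistics.median(resample))
    # The standard deviation across replicates estimates the standard error.
    return statistics.stdev(estimates)


data_rng = random.Random(1)
y = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
se = bootstrap_se_median(y, n_bootstrap=500)
```

For a regression model the same loop applies with rows of `(X, y)` resampled together and the model refit on each replicate, which is why bootstrap SEs cost roughly `n_bootstrap` fits.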
Benchmarks
Tested on heavy-tailed heteroscedastic data (Student-t noise, 10-20 features, up to 13 quantiles):
| n | features | quantiles | Crossing (this) | Crossing (sklearn) | Pinball (this) | Pinball (sklearn) |
|---|---|---|---|---|---|---|
| 500 | 10 | 7 | 0% | 11.0% | 0.5148 | 0.5166 |
| 500 | 10 | 13 | 0% | 30.0% | 0.5095 | 0.5240 |
| 1,000 | 10 | 13 | 0% | 16.5% | 0.5048 | 0.5071 |
| 2,000 | 20 | 13 | 0% | 11.0% | 0.5599 | 0.5611 |
In every row above, the joint fit has slightly lower pinball loss than sklearn (by roughly 0.001 to 0.015), which is consistent with the non-crossing constraints acting as a mild regularizer.
Speed tradeoff: This package solves a single joint LP with non-crossing constraints, which is slower than fitting each quantile independently. The value is in the guarantee and the richer downstream workflows. For single-quantile fits where speed matters most, sklearn or statsmodels may be more appropriate.
Full results: Benchmarks | Reproduce locally
When to Use This Package
Use this when you need:
- Multiple quantile predictions that must not cross (production pipelines, interval forecasts)
- Statistical inference on quantile coefficients (SEs, p-values, confidence intervals)
- Calibrated prediction intervals (conformal quantile regression)
- Censored/survival quantile models
- A complete evaluation workflow for any quantile model's predictions
Use sklearn or statsmodels when:
- You only need a single quantile (e.g., median regression)
- Raw speed matters more than crossing guarantees
- You don't need inference, calibration, or evaluation tooling
Documentation
Full docs: joshvern.github.io/quantile_regression_pdlp
Implementation
Quantile regression is naturally a linear program. This package solves joint multi-quantile LPs with non-crossing constraints using:
- PDLP: first-order primal-dual solver (default, from Google OR-Tools)
- GLOP: revised simplex, also from OR-Tools (faster on small/medium problems)
- HiGHS: via scipy's sparse LP interface (memory-efficient)
QuantileRegression(tau=0.5, solver_backend='GLOP') # simplex
QuantileRegression(tau=0.5, use_sparse=True) # scipy sparse
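For reference, a standard way to write joint quantile regression with non-crossing constraints as a single LP is shown below, for levels $\tau_1 < \dots < \tau_K$, observations $(x_i, y_i)$, coefficient vectors $\beta_k$, and positive and negative residual parts $u_{ik}, v_{ik}$. This is a sketch under assumptions: the symbols are introduced here for illustration, and the package's exact formulation (for example, whether the ordering constraints are imposed at the training points or over a region of feature space) may differ.

```latex
\begin{aligned}
\min_{\beta_k,\,u_{ik},\,v_{ik}} \quad
  & \sum_{k=1}^{K} \sum_{i=1}^{n} \left[ \tau_k\, u_{ik} + (1 - \tau_k)\, v_{ik} \right] \\
\text{s.t.} \quad
  & y_i - x_i^\top \beta_k = u_{ik} - v_{ik},
  && i = 1,\dots,n,\; k = 1,\dots,K, \\
  & x_i^\top \beta_k \le x_i^\top \beta_{k+1},
  && i = 1,\dots,n,\; k = 1,\dots,K-1, \\
  & u_{ik} \ge 0,\; v_{ik} \ge 0 .
\end{aligned}
```

Without the second constraint family the problem separates into $K$ independent quantile regressions; the ordering constraints are what couple the quantiles and make the joint LP larger and slower to solve.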
Dependencies
Required: numpy, pandas, scipy, scikit-learn, ortools, tqdm, joblib
Optional: matplotlib (plots), patsy (formulas), statsmodels (benchmarks)
Contributing
Contributions welcome! Open an issue or submit a pull request on GitHub.
License
MIT