
Quantile regression via linear programming using Google OR-Tools PDLP, with a scikit-learn compatible API and statistical summaries.


quantile-regression-pdlp

Optimization-based quantile regression built on Google OR-Tools. Scikit-learn API, statsmodels-style summaries, and features that go beyond what either package offers.

What makes this different from sklearn or statsmodels?

  • Fits multiple quantiles jointly with non-crossing constraints
  • Multi-output regression in a single model
  • SCAD, MCP, and elastic net penalties (not just L1)
  • Analytical, bootstrap, kernel, and cluster-robust standard errors
  • Conformalized quantile regression for calibrated prediction intervals
  • Evaluation metrics: pinball loss, coverage, interval score, crossing diagnostics
  • Crossing detection and rearrangement for any quantile model's predictions
  • Prediction intervals, quantile process plots, and pseudo R²
  • Censored quantile regression for survival data
  • Scipy sparse solver for large-scale problems
  • Validated against sklearn, statsmodels, and R's quantreg
| Feature | This package | sklearn | statsmodels |
| --- | --- | --- | --- |
| Multiple quantiles (joint) | Yes | No | No |
| Non-crossing constraints | Yes | No | No |
| Multi-output | Yes | No | No |
| Analytical SEs | Yes | No | Yes |
| Kernel (robust) SEs | Yes | No | Yes |
| Cluster-robust SEs | Yes | No | No |
| Bootstrap SEs | Yes | No | No |
| L1 / Elastic Net / SCAD / MCP | Yes | L1 only | No |
| Conformal calibration (CQR) | Yes | No | No |
| Evaluation metrics suite | Yes | Partial | No |
| Crossing detection + fix | Yes | No | No |
| Prediction intervals | Yes | No | No |
| Pseudo R² | Yes | No | Yes |
| Formula interface | Yes | No | Yes |
| Censored QR | Yes | No | No |
| Sklearn pipeline compatible | Yes | Yes | No |
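
Two of the features above, the pinball loss and the crossing fix, have standard textbook definitions that are easy to state. The sketch below gives plain NumPy versions of both for orientation; these are the generic definitions, not this package's API, whose function names and signatures may differ:

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (check) loss at quantile level tau: under-prediction
    is penalized by tau, over-prediction by (1 - tau)."""
    u = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(tau * u, (tau - 1.0) * u)))

def rearrange(pred_matrix):
    """Monotone rearrangement: sort predictions across the quantile axis
    (rows = quantile levels), so fitted quantiles never cross."""
    return np.sort(np.asarray(pred_matrix), axis=0)

# Crossed predictions in column 0 (the 0.9 quantile sits below the 0.5)
crossed = np.array([[1.2, 0.5],   # tau = 0.5 predictions
                    [1.0, 0.9]])  # tau = 0.9 predictions
fixed = rearrange(crossed)        # each column is now monotone in tau
```

Rearrangement works on any model's stacked quantile predictions, which is why the package can offer crossing repair for external models as well as its own.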

Installation

pip install quantile-regression-pdlp

Optional extras:

pip install quantile-regression-pdlp[all]   # formula interface + plots
pip install quantile-regression-pdlp[plot]   # matplotlib only
pip install quantile-regression-pdlp[formula] # patsy only

Quick Start

import numpy as np
from quantile_regression_pdlp import QuantileRegression

X = np.random.default_rng(0).normal(size=(200, 3))
y = X @ [2.0, -1.5, 0.8] + np.random.default_rng(1).normal(scale=0.5, size=200)

model = QuantileRegression(tau=[0.1, 0.5, 0.9], n_bootstrap=200, random_state=0)
model.fit(X, y)

# Summaries with coefficients, SEs, p-values, and 95% CIs
print(model.summary()[0.5]['y'])

# Prediction intervals
interval = model.predict_interval(X[:5], coverage=0.80)
print(interval['y']['lower'], interval['y']['upper'])

# Pseudo R²
print(model.pseudo_r_squared_)

Features at a Glance

Regularization

# L1 (Lasso)
QuantileRegression(tau=0.5, regularization='l1', alpha=0.1)

# Elastic net
QuantileRegression(tau=0.5, regularization='elasticnet', alpha=0.1, l1_ratio=0.5)

# SCAD (less bias on large coefficients)
QuantileRegression(tau=0.5, regularization='scad', alpha=0.3)

# MCP
QuantileRegression(tau=0.5, regularization='mcp', alpha=0.3)

Inference Options

# Fast analytical SEs (no bootstrapping needed)
model = QuantileRegression(tau=0.5, se_method='analytical')
model.fit(X, y)

# Heteroscedasticity-robust kernel sandwich SEs
model = QuantileRegression(tau=0.5, se_method='kernel')
model.fit(X, y)

# Cluster-robust SEs
model = QuantileRegression(tau=0.5, se_method='analytical')
model.fit(X, y, clusters=group_labels)
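
For intuition on what `n_bootstrap` buys you: bootstrap standard errors come from refitting the estimator on resampled data and taking the spread of the resulting estimates. A minimal sketch of that idea, shown on the sample median rather than on this package's internals:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=3.0, scale=2.0, size=500)

# Resample with replacement, re-estimate, and take the standard
# deviation of the replicates as the bootstrap SE.
boot_medians = np.array([
    np.median(rng.choice(y, size=y.size, replace=True))
    for _ in range(500)
])
se = boot_medians.std(ddof=1)
```

Bootstrap SEs make no homoscedasticity assumption, at the cost of many refits; the analytical and kernel options above trade that robustness for speed.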

Quantile Process Plot

model = QuantileRegression(
    tau=[0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95],
    se_method='analytical'
)
model.fit(X, y)
model.plot_quantile_process(feature='X1')

Formula Interface

model = QuantileRegression(tau=0.5, se_method='analytical')
model.fit_formula('y ~ x1 + x2 + C(region)', data=df)

Censored Quantile Regression

from quantile_regression_pdlp import CensoredQuantileRegression

model = CensoredQuantileRegression(tau=0.5, censoring='right', se_method='analytical')
model.fit(X, observed_time, event_indicator=delta)

Solver Options

# GLOP simplex (faster on small/medium problems)
QuantileRegression(tau=0.5, solver_backend='GLOP')

# Scipy sparse solver (memory-efficient for large datasets)
QuantileRegression(tau=0.5, use_sparse=True)

# Solver tuning
QuantileRegression(tau=0.5, solver_tol=1e-8, solver_time_limit=60.0)

Documentation

Full docs: joshvern.github.io/quantile_regression_pdlp

Why PDLP?

Quantile regression is naturally a linear program. OR-Tools' PDLP is a first-order solver designed for large-scale LPs, making it efficient for high-dimensional problems. For smaller problems, the package also supports GLOP (simplex) and scipy's HiGHS solver.
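
To make the claim concrete, the tau-th quantile regression is the LP: minimize the sum of tau·u_i + (1-tau)·v_i subject to Xb + u - v = y with u, v >= 0, where u and v split each residual into its positive and negative parts. A generic sketch using SciPy's HiGHS backend (independent of this package's solvers):

```python
import numpy as np
from scipy.optimize import linprog

def quantile_lp(X, y, tau):
    """Fit the tau-th quantile regression by solving its LP directly.
    Variables: [beta (p, free), u (n, >= 0), v (n, >= 0)]."""
    n, p = X.shape
    c = np.concatenate([np.zeros(p), np.full(n, tau), np.full(n, 1 - tau)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])  # X @ beta + u - v = y
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:p]

# With an intercept-only design and tau = 0.5, the LP solution is the
# sample median.
rng = np.random.default_rng(0)
y = rng.normal(size=101)
beta = quantile_lp(np.ones((101, 1)), y, tau=0.5)
```

The dense constraint matrix here is O(n²), which is exactly why a first-order method like PDLP (or the sparse formulation) pays off at scale.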

Dependencies

Required: ortools, numpy, pandas, scipy, tqdm, joblib, scikit-learn

Optional: matplotlib (plots), patsy (formulas)

Contributing

Contributions welcome! Open an issue or submit a pull request on GitHub.

License

MIT
