Quantile regression via linear programming using Google OR-Tools PDLP, with a scikit-learn compatible API and statistical summaries.

quantile-regression-pdlp

Optimization-based quantile regression built on Google OR-Tools. Scikit-learn API, statsmodels-style summaries, and features that go beyond what either package offers.

What makes this different from sklearn or statsmodels?

  • Fits multiple quantiles jointly with non-crossing constraints
  • Multi-output regression in a single model
  • SCAD, MCP, and elastic net penalties (not just L1)
  • Analytical, bootstrap, and cluster-robust standard errors
  • Prediction intervals, quantile process plots, and pseudo R²
  • Censored quantile regression for survival data
  • SciPy sparse solver for large-scale problems
  • Validated against sklearn, statsmodels, and R's quantreg
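Quantile curves fitted one at a time can cross, for example a predicted 0.5 quantile that falls below the predicted 0.1 quantile for the same row. The sketch below is a package-independent check for that, assuming predictions are stacked as an array of shape (n_quantiles, n_samples) with rows ordered by increasing tau. The helper name `count_crossings` is hypothetical and not part of this package.

```python
import numpy as np

def count_crossings(preds):
    """Count samples whose predicted quantiles are not non-decreasing in tau.

    preds: array of shape (n_quantiles, n_samples), rows ordered by increasing tau.
    """
    preds = np.asarray(preds, dtype=float)
    # A crossing occurs wherever a higher-tau prediction falls below a lower-tau one.
    crossed = np.any(np.diff(preds, axis=0) < 0, axis=0)
    return int(crossed.sum())

# Rows: tau = 0.1, 0.5, 0.9; columns: three samples.
preds = np.array([
    [1.0, 2.0, 3.0],
    [1.5, 1.8, 3.5],  # second sample: 0.5 quantile below the 0.1 quantile
    [2.0, 2.5, 4.0],
])
n_crossed = count_crossings(preds)
```

A jointly fitted model with non-crossing constraints should report zero crossings on the training data under this check.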

Installation

pip install quantile-regression-pdlp

Optional extras:

pip install "quantile-regression-pdlp[all]"      # formula interface + plots
pip install "quantile-regression-pdlp[plot]"     # matplotlib only
pip install "quantile-regression-pdlp[formula]"  # patsy only

Quick Start

import numpy as np
from quantile_regression_pdlp import QuantileRegression

X = np.random.default_rng(0).normal(size=(200, 3))
y = X @ [2.0, -1.5, 0.8] + np.random.default_rng(1).normal(scale=0.5, size=200)

model = QuantileRegression(tau=[0.1, 0.5, 0.9], n_bootstrap=200, random_state=0)
model.fit(X, y)

# Summaries with coefficients, SEs, p-values, and 95% CIs
print(model.summary()[0.5]['y'])

# Prediction intervals
interval = model.predict_interval(X[:5], coverage=0.80)
print(interval['y']['lower'], interval['y']['upper'])

# Pseudo R²
print(model.pseudo_r_squared_)
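For readers unfamiliar with pseudo R² in quantile regression, here is a minimal sketch of the usual Koenker-Machado definition: one minus the ratio of the model's check (pinball) loss to the loss of an intercept-only model at the sample quantile. Whether `pseudo_r_squared_` is computed exactly this way is an assumption; the function names below are illustrative only.

```python
import numpy as np

def pinball_loss(y, pred, tau):
    """Total check (pinball) loss: tau*u for u >= 0, (tau - 1)*u for u < 0."""
    u = np.asarray(y, dtype=float) - pred
    return float(np.sum(np.where(u >= 0, tau * u, (tau - 1) * u)))

def pseudo_r2(y, pred, tau):
    """1 - loss(model) / loss(intercept-only model at the sample tau-quantile)."""
    # "inverted_cdf" returns an order statistic that minimizes the check loss.
    baseline = np.quantile(y, tau, method="inverted_cdf")
    return 1.0 - pinball_loss(y, pred, tau) / pinball_loss(y, baseline, tau)

y = np.array([1.0, 2.0, 3.0, 4.0, 10.0])
perfect = pseudo_r2(y, y, tau=0.5)                           # model reproduces y exactly
constant = pseudo_r2(y, np.full(5, np.median(y)), tau=0.5)   # no better than the baseline
```

Under this definition a value near 0 means the covariates add little over a constant quantile, and a value near 1 means the fitted quantile tracks the data closely.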

Features at a Glance

Regularization

# L1 (Lasso)
QuantileRegression(tau=0.5, regularization='l1', alpha=0.1)

# Elastic net
QuantileRegression(tau=0.5, regularization='elasticnet', alpha=0.1, l1_ratio=0.5)

# SCAD (less bias on large coefficients)
QuantileRegression(tau=0.5, regularization='scad', alpha=0.3)

# MCP
QuantileRegression(tau=0.5, regularization='mcp', alpha=0.3)
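To illustrate why SCAD and MCP put less bias on large coefficients than L1, the sketch below evaluates the textbook penalty functions (Fan and Li's SCAD, Zhang's MCP) elementwise. It assumes the package's `alpha` plays the role of the penalty level `lam`; the shape parameters `a=3.7` and `gamma=3.0` are conventional defaults chosen here for illustration, not values taken from the package.

```python
import numpy as np

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty, applied elementwise (requires a > 2)."""
    b = np.abs(np.asarray(beta, dtype=float))
    small = lam * b                                              # |b| <= lam: same as L1
    middle = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))   # lam < |b| <= a*lam
    large = lam**2 * (a + 1) / 2                                 # |b| > a*lam: constant
    return np.where(b <= lam, small, np.where(b <= a * lam, middle, large))

def mcp_penalty(beta, lam, gamma=3.0):
    """Minimax concave penalty, applied elementwise (requires gamma > 1)."""
    b = np.abs(np.asarray(beta, dtype=float))
    return np.where(b <= gamma * lam, lam * b - b**2 / (2 * gamma), gamma * lam**2 / 2)

coefs = np.array([0.1, 1.0, 10.0])
l1 = 0.3 * np.abs(coefs)             # grows without bound
scad = scad_penalty(coefs, lam=0.3)  # flattens once |coef| exceeds a*lam
mcp = mcp_penalty(coefs, lam=0.3)    # flattens once |coef| exceeds gamma*lam
```

Both penalties match L1 near zero and become constant for large coefficients, so large effects are not shrunk further. They are non-convex; the README does not say how the package handles that within an LP solver.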

Inference Options

# Fast analytical SEs (no bootstrapping needed)
model = QuantileRegression(tau=0.5, se_method='analytical')
model.fit(X, y)

# Heteroscedasticity-robust kernel sandwich SEs
model = QuantileRegression(tau=0.5, se_method='kernel')
model.fit(X, y)

# Cluster-robust SEs (group_labels: one cluster ID per row of X)
model = QuantileRegression(tau=0.5, se_method='analytical')
model.fit(X, y, clusters=group_labels)

Quantile Process Plot

model = QuantileRegression(
    tau=[0.05, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95],
    se_method='analytical'
)
model.fit(X, y)
model.plot_quantile_process(feature='X1')

Formula Interface

# Requires the [formula] extra (patsy); df is a pandas DataFrame
model = QuantileRegression(tau=0.5, se_method='analytical')
model.fit_formula('y ~ x1 + x2 + C(region)', data=df)

Censored Quantile Regression

from quantile_regression_pdlp import CensoredQuantileRegression

model = CensoredQuantileRegression(tau=0.5, censoring='right', se_method='analytical')
# observed_time: observed (possibly censored) times; delta: event indicator per row
model.fit(X, observed_time, event_indicator=delta)
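The snippet above uses `observed_time` and `delta` without defining them. A minimal way to simulate right-censored data for experimentation is sketched below. It assumes the common convention that the indicator is 1 when the event was observed and 0 when the row was censored; check the package docs for the convention `event_indicator` actually expects. The data-generating choices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))

# Latent event times depend on X; censoring times are drawn independently.
event_time = np.exp(0.5 * X[:, 0] - 0.3 * X[:, 1] + rng.normal(scale=0.5, size=n))
censor_time = rng.exponential(scale=3.0, size=n)

# Right censoring: we observe whichever comes first, plus a flag for which it was.
observed_time = np.minimum(event_time, censor_time)
delta = (event_time <= censor_time).astype(int)  # 1 = event observed, 0 = censored
```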

Solver Options

# GLOP simplex (faster on small/medium problems)
QuantileRegression(tau=0.5, solver_backend='GLOP')

# SciPy sparse solver (memory-efficient for large datasets)
QuantileRegression(tau=0.5, use_sparse=True)

# Solver tuning
QuantileRegression(tau=0.5, solver_tol=1e-8, solver_time_limit=60.0)

Documentation

Full docs: joshvern.github.io/quantile_regression_pdlp

Why PDLP?

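Quantile regression is naturally a linear program: once each residual is split into a positive and a negative part, the check loss becomes a linear objective over linear constraints. OR-Tools' PDLP is a first-order solver (based on the primal-dual hybrid gradient method) designed for large-scale LPs, which suits problems with many observations or features. For smaller problems, the package also supports GLOP (simplex) and SciPy's HiGHS solver.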
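To make the LP connection concrete, here is a minimal sketch of the standard formulation: minimize tau * sum(u_plus) + (1 - tau) * sum(u_minus) subject to X @ beta + u_plus - u_minus = y with u_plus, u_minus >= 0. It uses `scipy.optimize.linprog` with the HiGHS backend rather than PDLP, and it is not the package's internal code; the function name `quantile_regression_lp` is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def quantile_regression_lp(X, y, tau):
    """Solve min sum_i rho_tau(y_i - x_i' beta) as an LP with the HiGHS backend."""
    n, p = X.shape
    # Variables: [beta (p, free), u_plus (n, >= 0), u_minus (n, >= 0)].
    c = np.concatenate([np.zeros(p), np.full(n, tau), np.full(n, 1.0 - tau)])
    # Equality constraints: X beta + u_plus - u_minus = y.
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    bounds = [(None, None)] * p + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p]

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=(200, 2))])  # intercept + 2 features
true_beta = np.array([1.0, 2.0, -1.5])
y = X @ true_beta + rng.normal(scale=0.5, size=200)
beta_median = quantile_regression_lp(X, y, tau=0.5)
```

The dense identity blocks make the constraint matrix O(n²) in memory, which is fine for a demonstration but is the kind of cost that sparse matrices and first-order solvers are meant to avoid on large datasets.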

Dependencies

Required: ortools, numpy, pandas, scipy, tqdm, joblib, scikit-learn

Optional: matplotlib (plots), patsy (formulas)

Contributing

Contributions welcome! Open an issue or submit a pull request on GitHub.

License

MIT
