Skip to main content

A packge for solving k-sparse ridge regression certifiably!

Project description

OKRidge

docs pypi license Downloads downloads arxiv badge

This repository contains source code to our NeurIPS 2023 paper:

OKRidge: Scalable Optimal k-Sparse Ridge Regression

Table of Content

Introduction

We consider the following optimization problem:

$$\min_{\mathbf{\beta}} \sum_{i=1}^n (y_i - \mathbf{x}_i^T \mathbf{\beta})^2 + \lambda_2 \lVert \mathbf{\beta} \rVert_2^2 \quad \text{s.t.} \quad \lVert \mathbf{\beta} \rVert_0 \leq k$$

Optimal k-sparse ridge regression is a crucial ML problem that has many applications in statistics, machine learning, and data mining. However, the problem is NP-hard, and existing algorithms are either slow (using commercial MIP solvers) or suboptimal (using convex or nonconvex regularizers to approximate $\ell_0$). We propose a novel algorithm, OKRidge, that can solve the problem to provable optimality in a scalable manner.

OKRidge is based on the branch-and-bound framework. The insight leading to the computational efficiency comes from a novel lower bound calculation involving, first, a saddle point formulation, and from there, either solving (i) a linear system or (ii) using an ADMM-based approach, where the proximal operators can be efficiently evaluated by solving another linear system and an isotonic regression problem. We also propose a method to warm-start our solver, which leverages a beam search.

Experimentally, our methods attain provable optimality with run times that are orders of magnitude faster than those of the existing MIP formulations solved by the commercial solver Gurobi.

Installation

$ pip install okridge

Usage

Please see the example.ipynb jupyter notebook on GitHub for a detailed tutorial on how to use OKRidge in a python environment.

k = 10 # cardinality constraint
lambda2 = 0.1 # l2 regularization parameter
gap_tol = 1e-4 # optimality gap tolerance
verbose = True # print out the progress
time_limit = 180 # time limit in seconds

BnB_optimizer = BNBTree(X=X, y=y, lambda2=lambda2)

upper_bound, betas, optimality_gap, max_lower_bound, running_time = BnB_optimizer.solve(k = k, gap_tol = gap_tol, verbose = verbose, time_limit = time_limit)

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

okridge was created by Jiachang Liu. It is licensed under the terms of the BSD 3-Clause license.

Credits

okridge was created with cookiecutter and the py-pkgs-cookiecutter template.

Citing Our Work

If you find our work useful in your research, please consider citing the following paper:

@inproceedings{liu2023okridge,
  title={OKRidge: Scalable Optimal k-Sparse Ridge Regression},
  author={Liu, Jiachang and Rosen, Sam and Zhong, Chudi and Rudin, Cynthia},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023}
}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

okridge-0.1.1.tar.gz (16.5 kB view details)

Uploaded Source

Built Distribution

okridge-0.1.1-py3-none-any.whl (16.3 kB view details)

Uploaded Python 3

File details

Details for the file okridge-0.1.1.tar.gz.

File metadata

  • Download URL: okridge-0.1.1.tar.gz
  • Upload date:
  • Size: 16.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.9.18 Linux/5.15.0-86-generic

File hashes

Hashes for okridge-0.1.1.tar.gz
Algorithm Hash digest
SHA256 ba346329965f00f9b72625717235306a103c41f3964077ea46acc65e31c44040
MD5 c9c64a02576bbb7441e212c8567c8ce5
BLAKE2b-256 8a274ca5409c366e2907b2f450ea2c383ffadec958e945396759eeb9a6427690

See more details on using hashes here.

File details

Details for the file okridge-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: okridge-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 16.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.7.1 CPython/3.9.18 Linux/5.15.0-86-generic

File hashes

Hashes for okridge-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 48d9da37354b57f9ae2baf63b900f0d5a924fa3dfa244531ab6c046b193a8dc8
MD5 009ce39f8a1e43ac6b85191e21c37df4
BLAKE2b-256 577c2c8a6c03a3c8439cf4b47cf8126a9b16058fd025e77beead0176d70dd349

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page