LABS: Linear-time Adaptive Best-subset Selection

Overview

Best-subset selection plays a vital role in regression analysis: it aims to identify a parsimonious subset of variables that maximizes the prediction accuracy of the resulting linear model. The task matters in many scientific fields, including physics, biology, and medicine, where large datasets are routinely generated. However, selecting the best subset from massive datasets is a formidable computational challenge, because the problem is well known to be NP-hard.
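To make the combinatorial cost concrete, the sketch below solves best-subset selection by exhaustive search on a tiny simulated problem. It is an illustration of the problem only, not the algorithm implemented in scikit-labs; the function name best_subset_exhaustive and the simulated data are assumptions introduced for this example, and NumPy is assumed to be installed.

```python
from itertools import combinations
from math import comb

import numpy as np


def best_subset_exhaustive(X, y, k):
    """Return the size-k column subset with the smallest residual sum of squares.

    Hypothetical helper for illustration; it tries every subset of size k.
    """
    n, p = X.shape
    best_rss, best_support = np.inf, None
    for support in combinations(range(p), k):
        cols = list(support)
        # Ordinary least squares restricted to the candidate columns.
        beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        resid = y - X[:, cols] @ beta
        rss = float(resid @ resid)
        if rss < best_rss:
            best_rss, best_support = rss, support
    return best_support, best_rss


# Small simulated problem with a planted support (assumed values).
rng = np.random.default_rng(0)
n, p, k = 60, 10, 3
X = rng.standard_normal((n, p))
true_support = (1, 4, 7)
beta_true = np.zeros(p)
beta_true[list(true_support)] = [3.0, -2.0, 4.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

support, rss = best_subset_exhaustive(X, y, k)
print("recovered support:", support)
print("subsets searched:", comb(p, k))
# The same search with p = 500 and k = 6 would need comb(500, 6) fits.
print("subsets for p=500, k=6:", comb(500, 6))
```

With 10 predictors and k = 3 the search covers only 120 subsets, but with p = 500 and k = 6 (the sizes used in the quick-start example) the count is on the order of 10^13, which is why exhaustive search does not scale.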

To address this challenge, we introduce scikit-labs, which implements a new tuning-free iterative algorithm built on a novel subset splicing procedure. Under mild conditions, the algorithm provably identifies the best subset while maintaining linear time complexity, so it is optimal in both computation and statistics. Extensive numerical test cases confirm the performance of scikit-labs.

Quick Start

Installation

Install the stable scikit-labs Python package from PyPI:

pip install scikit-labs

The package can then be imported as:

import sklabs

Example

Best subset selection for linear regression on a simulated dataset in Python:

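from sklabs.datasets import make_glm_data
from sklabs.linear import LinearRegression
# Simulate Gaussian data: n = 350 observations, p = 500 predictors, sparsity k = 6
sim_dat = make_glm_data(n=350, p=500, k=6, family="gaussian")
# Fit the best-subset linear regression to the simulated data
model = LinearRegression()
model.fit(sim_dat.x, sim_dat.y)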
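After fitting, the usual next step is to read off which variables were selected. The helper below is a minimal sketch that works on any coefficient vector; selected_support is a hypothetical name, and the idea that the fitted model exposes a scikit-learn-style coef_ attribute is an assumption that should be checked against the package documentation.

```python
def selected_support(coef, tol=1e-12):
    """Return the indices of coefficients whose magnitude exceeds tol."""
    return [j for j, b in enumerate(coef) if abs(b) > tol]


# Stand-in coefficient vector for illustration. With a fitted model one would
# pass its coefficients instead, e.g. selected_support(model.coef_), ASSUMING
# the estimator exposes a coef_ attribute (not confirmed by this README).
coef = [0.0, 1.7, 0.0, 0.0, -0.4, 0.0]
print(selected_support(coef))  # [1, 4]
```

Comparing the returned indices with the true support of the simulated data gives a quick check of whether the relevant variables were recovered.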

Open source software

scikit-labs is free software, and its source code is publicly available on GitHub. The core framework is written in C++, and user-friendly Python interfaces are provided. You can redistribute it and/or modify it under the terms of the GPL-v3 License. We welcome contributions to scikit-labs, especially those that extend it to other best-subset selection problems.

Citation

If you use scikit-labs or reference our tutorials in a presentation or publication, we would appreciate a citation of our library:

@article{scikit-labs,
    title   = {Selecting the Best Subset in Regression in Linear Time},
    author  = {Jin Zhu and Junxian Zhu and Junhao Huang and Xueqin Wang and Heping Zhang},
    journal = {Submitted},
    year    = {2023},
}
