abess: Fast Best Subset Selection

Overview

The abess (Adaptive BEst Subset Selection) library aims to solve the general best subset selection problem, i.e., to find a small subset of predictors such that the resulting model is expected to have the highest accuracy. Best subset selection is valuable in both scientific research and practical applications. For example, clinicians want to know whether a patient is healthy or not based on the expression levels of a few important genes.
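To make the problem statement concrete, here is a minimal brute-force sketch of best subset selection for least squares on a toy dataset. It is not the abess algorithm: it enumerates every subset of a fixed size, which is only feasible for a handful of predictors. All function names (`solve_linear_system`, `residual_sum_of_squares`, `brute_force_best_subset`) and the toy data are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy [a | b] so the inputs stay untouched.
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def residual_sum_of_squares(x_rows, y, subset):
    """Fit least squares (no intercept) on the chosen columns; return RSS."""
    cols = [[row[j] for row in x_rows] for j in subset]
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve_linear_system(gram, xty)
    fitted = [sum(b * row[j] for b, j in zip(beta, subset)) for row in x_rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def brute_force_best_subset(x_rows, y, size):
    """Return the column subset of the given size with the smallest RSS."""
    p = len(x_rows[0])
    return min(combinations(range(p), size),
               key=lambda s: residual_sum_of_squares(x_rows, y, s))


# Toy data (made up): the response depends only on columns 0 and 2.
x_rows = [
    [1.0, 2.0, 0.5, -1.0],
    [2.0, -1.0, 1.5, 0.0],
    [3.0, 0.5, -0.5, 2.0],
    [4.0, 1.0, 2.0, 1.0],
    [5.0, -2.0, 1.0, -0.5],
    [6.0, 0.0, -1.5, 1.5],
]
y = [2.0 * row[0] - 3.0 * row[2] for row in x_rows]

best = brute_force_best_subset(x_rows, y, size=2)
print(best)
```

The number of candidate subsets grows combinatorially with the number of predictors, so exhaustive enumeration like this quickly becomes impractical as the problem grows.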

This library implements a generic algorithmic framework that finds the optimal solution very quickly [1]. The framework currently supports best subset detection under: linear regression, (multi-class) classification, censored-response modeling [2], multi-response modeling (a.k.a. multi-task learning), etc. It also supports variants of best subset selection such as group best subset selection [3] and nuisance best subset selection [4]. In particular, the time complexity of (group) best subset selection for linear regression is certifiably polynomial [1] [3].

Quick start

Install the stable abess Python package from PyPI:

$ pip install abess

Best subset selection for linear regression on a simulated dataset in Python:

from abess.linear import LinearRegression
from abess.datasets import make_glm_data

# Simulate a Gaussian dataset with n = 300, p = 1000 and k = 10
sim_dat = make_glm_data(n=300, p=1000, k=10, family="gaussian")
# Fit best subset selection for linear regression
model = LinearRegression()
model.fit(sim_dat.x, sim_dat.y)

See more examples analyzed with Python in the tutorials; the notebooks are available here.

Runtime Performance

To demonstrate the computational performance of abess, we measure its CPU execution time (in seconds) on synthetic datasets and compare it with state-of-the-art variable selection methods. The variable selection and estimation results are deferred to performance.

We compare the abess Python package with scikit-learn on linear and logistic regression. The results are presented in the figure below and can be reproduced by running the following command in a shell:

$ python ./simulation/Python/timings.py

This produces the runtime comparison figure:

[Figure: runtime comparison of abess and scikit-learn]

abess is highly efficient, especially in linear regression, where it gives the fastest solution among the compared methods.
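The kind of CPU-time measurement described in this section can be sketched with the standard library alone. This is a minimal illustration, not the content of `timings.py`: the helper `time_call` and the stand-in workload `toy_workload` are hypothetical, and in a real comparison the workload would be a model's fitting routine.

```python
import statistics
import time


def time_call(func, *args, repeats=5):
    """Return the median CPU time (seconds) over several calls to func."""
    durations = []
    for _ in range(repeats):
        start = time.process_time()  # CPU time of the current process
        func(*args)
        durations.append(time.process_time() - start)
    return statistics.median(durations)


def toy_workload(n):
    """Stand-in for a model fit; here just a sum of squares."""
    return sum(i * i for i in range(n))


elapsed = time_call(toy_workload, 100_000)
print(f"median CPU time: {elapsed:.6f} s")
```

Taking the median over repeated runs reduces the influence of one-off delays; `time.perf_counter` could be used instead if wall-clock time is of interest.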

Open source software

abess is free software and its source code is publicly available on GitHub. The core framework is written in C++, and user-friendly R and Python interfaces are provided. You can redistribute it and/or modify it under the terms of the GPL-v3 License. We welcome contributions to abess, especially those extending abess to other best subset selection problems.

Citation

If you use abess or reference our tutorials in a presentation or publication, we would appreciate citations of our library [5].

Jin Zhu, Liyuan Hu, Junhao Huang, Kangkang Jiang, Yanhang Zhang, Shiyun Lin, Junxian Zhu, Xueqin Wang (2021). “abess: A Fast Best Subset Selection Library in Python and R.” arXiv:2110.09697.

The corresponding BibTeX entry:

@article{zhu-abess-arxiv,
   author  = {Jin Zhu and Liyuan Hu and Junhao Huang and Kangkang Jiang and Yanhang Zhang and Shiyun Lin and Junxian Zhu and Xueqin Wang},
   title   = {abess: A Fast Best Subset Selection Library in Python and R},
   journal = {arXiv:2110.09697},
   year    = {2021},
}

References
