Scalable sparse linear models in Python

⚡ sparsely ⚡

sparsely is a sklearn-compatible Python module for sparse linear regression and classification. It uses an efficient cutting-plane algorithm to optimize feature selection, which scales to thousands of samples and features. This implementation follows Bertsimas & Van Parys (2017) for regression, and Bertsimas, Pauphilet & Van Parys (2021) for classification.

Full API documentation can be found here.

Quick start

You can install sparsely using pip as follows:

pip install sparsely

Here is a simple example of how to use a sparsely estimator:

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sparsely import SparseLinearRegressor

X, y = make_regression(n_samples=1000, n_features=100, n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

estimator = SparseLinearRegressor(k=10)  # k is the max number of non-zero coefficients
estimator.fit(X_train, y_train)
print(estimator.score(X_test, y_test))
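
The k argument caps the number of non-zero coefficients in the fitted model. As a rough illustration of what that constraint means (and not of the cutting-plane algorithm that sparsely uses), the sketch below solves a tiny instance by brute force with NumPy. The best_subset helper is hypothetical and is not part of sparsely. It tries every set of k features and keeps the least-squares fit with the lowest error, which is only feasible for a handful of features.

```python
from itertools import combinations

import numpy as np


def best_subset(X, y, k):
    """Exhaustively search for the k-feature least-squares fit with the lowest error.

    Hypothetical helper for illustration only; it is not part of sparsely.
    Searching subsets of exactly k features is enough here, because adding a
    feature never increases the least-squares error.
    """
    best_rss, best_support, best_coef = np.inf, None, None
    for support in combinations(range(X.shape[1]), k):
        cols = list(support)
        # Ordinary least squares restricted to the candidate columns
        coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        residual = y - X[:, cols] @ coef
        rss = float(residual @ residual)
        if rss < best_rss:
            best_rss, best_support, best_coef = rss, support, coef
    return best_support, best_coef


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
y = 2.0 * X[:, 1] - 3.0 * X[:, 4]  # noiseless target built from features 1 and 4

support, coef = best_subset(X, y, k=2)
print(support, coef)  # the search recovers features 1 and 4
```

The number of candidate subsets grows combinatorially with the number of features, which is why a dedicated solver is needed for problems with thousands of features.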

Development

Clone the repository using git:

git clone https://github.com/joshivanhoe/sparsely

Create a fresh virtual environment using venv or conda. Activate the environment and navigate to the cloned sparsely directory. Install a locally editable version of the package using pip:

pip install -e .

To check the installation has worked, you can run the tests (with coverage metrics) using pytest as follows:

pytest --cov=sparsely tests/

Contributions are welcome! To see our development priorities, refer to the open issues. Please submit a pull request with a clear description of the changes you've made.
