Large-scale sparse linear classification, regression and ranking in Python
lightning
lightning is a library for large-scale linear classification, regression and ranking in Python.
Highlights:
follows the scikit-learn API conventions
supports natively both dense and sparse data representations
computationally demanding parts implemented in Cython
Solvers supported:
primal coordinate descent
dual coordinate descent (SDCA, Prox-SDCA)
SGD, AdaGrad, SAG, SAGA, SVRG
FISTA
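To give a feel for the first solver family in the list, here is a minimal pure-Python sketch of primal coordinate descent. It is not lightning's Cython implementation: it assumes a lasso objective (squared loss plus an L1 penalty) on a small dense list-of-lists matrix, and the names `soft_threshold` and `lasso_cd` are hypothetical helpers introduced only for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (proximal operator of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, alpha, n_sweeps=100):
    """Minimize 0.5 * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent.

    Illustrative sketch only; X is a dense list of rows, y a list of targets.
    """
    n_samples, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    residual = list(y)  # equals y - Xw, since w starts at zero
    for _ in range(n_sweeps):
        for j in range(n_features):
            col = [X[i][j] for i in range(n_samples)]
            z = sum(v * v for v in col)
            if z == 0.0:
                continue  # an all-zero column cannot change the fit
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(col[i] * (residual[i] + col[i] * w[j])
                      for i in range(n_samples))
            new_wj = soft_threshold(rho, alpha) / z
            delta = new_wj - w[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n_samples):
                    residual[i] -= col[i] * delta
                w[j] = new_wj
    return w


w = lasso_cd([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], alpha=1.0)
```

Each update touches a single coefficient and only the rows where that feature is nonzero, which is why this family of solvers pairs well with sparse data.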
Example
The following example shows how to learn a multiclass classifier with a group lasso penalty on the News20 dataset (cf. Blondel et al. 2013):
from sklearn.datasets import fetch_20newsgroups_vectorized
from lightning.classification import CDClassifier
# Load News20 dataset from scikit-learn.
bunch = fetch_20newsgroups_vectorized(subset="all")
X = bunch.data
y = bunch.target
# Set classifier options.
clf = CDClassifier(penalty="l1/l2",
                   loss="squared_hinge",
                   multiclass=True,
                   max_iter=20,
                   alpha=1e-4,
                   C=1.0 / X.shape[0],
                   tol=1e-3)
# Train the model.
clf.fit(X, y)
# Accuracy
print(clf.score(X, y))
# Percentage of selected features
print(clf.n_nonzero(percentage=True))
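The group lasso ("l1/l2") penalty used above treats all per-class coefficients of a feature as one group, so a feature is either kept for every class or dropped entirely. The sketch below illustrates that effect with the standard block soft-thresholding operator in plain Python. It is an assumption-laden illustration, not lightning's code: `group_soft_threshold` and `fraction_selected` are hypothetical helper names, and the coefficients are made-up toy values laid out as one row per feature.

```python
import math


def group_soft_threshold(row, threshold):
    """Shrink one feature's coefficients (one per class) toward zero as a block."""
    norm = math.sqrt(sum(v * v for v in row))
    if norm <= threshold:
        # The whole group is zeroed, which removes the feature for all classes.
        return [0.0] * len(row)
    scale = 1.0 - threshold / norm
    return [scale * v for v in row]


def fraction_selected(coef_by_feature):
    """Share of features with at least one nonzero coefficient."""
    kept = sum(1 for row in coef_by_feature if any(v != 0.0 for v in row))
    return kept / len(coef_by_feature)


# Toy coefficients: 3 features (rows) x 2 classes (columns).
coef = [[3.0, 4.0], [0.3, 0.4], [0.0, 2.0]]
shrunk = [group_soft_threshold(row, 1.0) for row in coef]
print(fraction_selected(shrunk))
```

Here the second feature has a small group norm and is removed for both classes at once, while the other two are only shrunk.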
Dependencies
lightning requires Python >= 2.7, setuptools, NumPy >= 1.3, SciPy >= 0.7 and scikit-learn >= 0.15. Building from source also requires Cython and a working C/C++ compiler. Running the tests additionally requires nose >= 0.10.
Installation
Precompiled binaries for the stable version of lightning are available for the main platforms and can be installed using pip:
pip install sklearn-contrib-lightning
or conda:
conda install -c https://conda.anaconda.org/scikit-learn-contrib lightning
The development version of lightning can be installed from its git repository. This assumes that you have the git version control system, a working C++ compiler, Cython and the NumPy development libraries. To install the development version, type:
git clone https://github.com/scikit-learn-contrib/lightning.git
cd lightning
python setup.py build
sudo python setup.py install
Documentation
On Github
Source Distribution
Hashes for sklearn-contrib-lightning-0.2.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | ca0604bdbcb835b6ba2e3472d503717980c9be4eb95a94f06da5736fdec5499b
MD5 | cf373cf22e9c1b0c11133589d3b14926
BLAKE2b-256 | 3787fa8f01fd3efb04e02f74ef3481a1cd910bd8c3462a8179cf971ae63bf126
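A downloaded source archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `sha256_of` and `matches_published_digest` are hypothetical, and the path you pass in is whatever location you saved the archive to.

```python
import hashlib

# SHA256 digest published for sklearn-contrib-lightning-0.2.0.tar.gz.
EXPECTED_SHA256 = "ca0604bdbcb835b6ba2e3472d503717980c9be4eb95a94f06da5736fdec5499b"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """True if the file at `path` hashes to the expected digest."""
    return sha256_of(path) == expected
```

For example, `matches_published_digest("sklearn-contrib-lightning-0.2.0.tar.gz")` should return `True` for an intact download saved under that name in the current directory.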