The package scikit-activeml is a library that covers the most relevant query strategies in active learning and implements tools for working with partially labeled data.
The project was initiated in 2020 by the Intelligent Embedded Systems Group at the University of Kassel.
Installation
The easiest way to install scikit-activeml is with pip:
pip install -U scikit-activeml
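To confirm that the installation succeeded, one option is to look up the installed distribution's version with the standard library. This is only a minimal sketch: the helper name `installed_version` is illustrative and not part of scikit-activeml.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Report whether the distribution installed by pip can be found.
found = installed_version("scikit-activeml")
print(found if found is not None else "scikit-activeml is not installed")
```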
Example
The following code implements an active learning cycle with 20 iterations using a logistic regression classifier and uncertainty sampling. To use other classifiers, you can wrap classifiers from scikit-learn or use the classifiers provided by scikit-activeml. Note that the main difficulty in using active learning with scikit-learn is handling unlabeled data, which we mark with a specific value (MISSING_LABEL) in the label vector y. More query strategies can be found in the documentation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from skactiveml.pool import UncertaintySampling
from skactiveml.utils import unlabeled_indices, MISSING_LABEL
from skactiveml.classifier import SklearnClassifier

# Generate data set.
X, y_true = make_classification(random_state=0)
y = np.full(shape=y_true.shape, fill_value=MISSING_LABEL)

# Create classifier and query strategy.
clf = SklearnClassifier(LogisticRegression(), classes=np.unique(y_true))
qs = UncertaintySampling(method='entropy')

# Execute active learning cycle.
n_cycles = 20
for c in range(n_cycles):
    clf.fit(X, y)
    unlbld_idx = unlabeled_indices(y)
    X_cand = X[unlbld_idx]
    query_idx = unlbld_idx[qs.query(X_cand=X_cand, clf=clf)]
    y[query_idx] = y_true[query_idx]
print(f'Accuracy: {clf.fit(X, y).score(X, y_true)}')
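The two ideas behind the example can be illustrated with NumPy alone: unlabeled samples are marked with a placeholder in y, and entropy-based uncertainty sampling picks the candidate whose predicted class distribution has the highest entropy. The sketch below assumes the missing-label marker is NaN and uses hypothetical helper names (`unlabeled_mask`, `entropy_query`) and made-up probabilities; it is not scikit-activeml's implementation.

```python
import numpy as np

MISSING = np.nan  # assumption: unlabeled entries are marked with NaN


def unlabeled_mask(y):
    """Boolean mask that is True wherever the label is still missing."""
    return np.isnan(np.asarray(y, dtype=float))


def entropy_query(probas):
    """Index of the row with the highest Shannon entropy (most uncertain)."""
    p = np.clip(np.asarray(probas, dtype=float), 1e-12, 1.0)
    entropy = -np.sum(p * np.log(p), axis=1)
    return int(np.argmax(entropy))


# Label vector with two known labels and three unlabeled samples.
y = np.array([0.0, MISSING, 1.0, MISSING, MISSING])
candidates = np.flatnonzero(unlabeled_mask(y))  # positions 1, 3 and 4

# Hypothetical class probabilities predicted for the three candidates.
probas = np.array([[0.9, 0.1], [0.5, 0.5], [0.7, 0.3]])

# Map the winning candidate row back to an index into y.
query_idx = candidates[entropy_query(probas)]
print(query_idx)
```

The candidate with probabilities [0.5, 0.5] is the most uncertain, so the query maps back to position 3 in y, mirroring the `unlbld_idx[...]` indexing step in the cycle.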
Development
More information is available in the Developer’s Guide.
Documentation
The documentation is available here: https://scikit-activeml.readthedocs.io
Hashes for scikit_activeml-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 18f336470025c48b0617d5c533a227f5c9cdef24acbaad3ef7938647debe2df1
MD5 | 3407e39cba696ab7deddbae5ee3f0b65
BLAKE2b-256 | 211186899a079030818b160f817a07706708648b18acf8de07c31adaadab979a
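The SHA256 digest listed above can be used to check a downloaded wheel before installing it. A minimal sketch with the standard library follows; the function names `sha256_of` and `matches_published_digest` are assumptions made for illustration, as is the local file path in the usage note.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest published for scikit_activeml-0.1.0-py3-none-any.whl.
EXPECTED = "18f336470025c48b0617d5c533a227f5c9cdef24acbaad3ef7938647debe2df1"


def matches_published_digest(path, expected=EXPECTED):
    """True if the file at `path` hashes to the expected digest."""
    return sha256_of(path) == expected.lower()
```

Assuming the wheel was saved to the current directory, `matches_published_digest("scikit_activeml-0.1.0-py3-none-any.whl")` should return True for an unmodified download.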