A simple, modular active learning library for text classification.
Active Learning for Text Classification in Python.
Active Learning allows you to efficiently label training data in a small-data scenario.
This library provides state-of-the-art active learning for text classification, built with modularity and extensibility in mind.
Features
- Provides unified interfaces for Active Learning so that you can easily use any classifier provided by sklearn.
- Optionally, you can also use pytorch classifiers, including transformers models.
- Re-implements multiple strategies from the scientific literature: Query Strategies, Initialization Strategies
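To make the terms above concrete, here is a minimal sketch of a pool-based active learning loop with a least-confidence query strategy. It uses plain sklearn and numpy only and does not use this library's API; the names `least_confidence_query` and `run_active_learning`, the toy texts, and the choice of `LogisticRegression` are assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression


def least_confidence_query(clf, x_pool, unlabeled_idx, n=2):
    """Pick the n unlabeled samples whose top class probability is lowest."""
    proba = clf.predict_proba(x_pool[unlabeled_idx])
    confidence = proba.max(axis=1)
    return unlabeled_idx[np.argsort(confidence)[:n]]


def run_active_learning(x_pool, oracle_labels, initial_idx, n_rounds=2, batch_size=2):
    """Hypothetical helper: alternate between training and querying new labels."""
    labeled_idx = np.array(initial_idx)
    for _ in range(n_rounds):
        clf = LogisticRegression().fit(x_pool[labeled_idx], oracle_labels[labeled_idx])
        unlabeled_idx = np.setdiff1d(np.arange(x_pool.shape[0]), labeled_idx)
        if len(unlabeled_idx) == 0:
            break
        queried = least_confidence_query(clf, x_pool, unlabeled_idx, n=batch_size)
        # In a real setting a human annotator would label the queried samples here;
        # this sketch simply looks them up in oracle_labels.
        labeled_idx = np.concatenate([labeled_idx, queried])
    # Final model trained on everything labeled so far.
    clf = LogisticRegression().fit(x_pool[labeled_idx], oracle_labels[labeled_idx])
    return clf, labeled_idx


# Toy data (made up for this example): 1 = positive, 0 = negative.
texts = [
    "great movie, loved it", "wonderful and fun", "excellent acting", "really enjoyable film",
    "terrible movie, hated it", "boring and slow", "awful acting", "really bad film",
]
oracle_labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])

x_pool = TfidfVectorizer().fit_transform(texts)

# Initialization strategy in this sketch: start with one labeled sample per class.
clf, labeled_idx = run_active_learning(x_pool, oracle_labels, initial_idx=[0, 4])
print(sorted(labeled_idx.tolist()))
```

The query strategy decides which samples to label next, and the initialization strategy decides which samples are labeled before the first model is trained. Keeping both as separate, swappable functions mirrors the modular design described above.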
Installation
pip install small-text
Requires Python 3.7 or newer. GPU usage requires CUDA 10.1 or newer.
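The requirements above can be checked with a short script before installing. This is a sketch, not part of the library: `check_environment` is a hypothetical helper, and the reported CUDA version is the one the installed PyTorch build was compiled against (if PyTorch is installed at all), which may differ from the CUDA version of the system driver.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in the installation notes


def check_environment():
    """Hypothetical helper: report whether the basic requirements are met."""
    report = {
        "python_ok": sys.version_info >= MIN_PYTHON,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_version": None,
    }
    if report["torch_installed"]:
        import torch

        # None for CPU-only builds, otherwise a string such as "10.1".
        report["cuda_version"] = torch.version.cuda
    return report


report = check_environment()
print(report)
```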
Quick Start
For a quick start, see the provided examples for binary classification, pytorch multi-class classification, or transformer-based multi-class classification.
Docs
The API docs (currently work in progress) can be generated using sphinx:
pip install sphinx sphinx-rtd-theme
cd docs/
make
Alternatives
Contribution
Contributions are welcome. Details can be found in CONTRIBUTING.md.
Acknowledgments
This software was created by @chschroeder at Leipzig University's NLP group, which is part of the Webis research network. The encompassing project was funded by the Development Bank of Saxony (SAB) under project number 100335729.
License
Hashes for small_text-1.0.0a3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 7bed3f4ec710482452571bcd98a9585db3a4925dc71b2c641600d4cdc20c9739
MD5 | f69caec1b30d62d99845a115dd08f178
BLAKE2b-256 | e56100abb762a435a6f744c15eac86edf5a31af03015de8c8d685150e4b193d5