skift
scikit-learn wrappers for Python fastText.
>>> import pandas
>>> from skift import FirstColFtClassifier
>>> df = pandas.DataFrame([['woof', 0], ['meow', 1]], columns=['txt', 'lbl'])
>>> sk_clf = FirstColFtClassifier()
>>> sk_clf.fit(df[['txt']], df['lbl'])
>>> sk_clf.predict([['woof']])
[0]
1 Installation
Dependencies:
numpy
scikit-learn
fastText Python package
pip install skift
NOTICE: Installing skift will not install any of its dependencies. They must be installed separately.
2 Features
Adheres to the scikit-learn classifier API, including predict_proba.
Caters to the common use case of pandas.DataFrame inputs.
Enables easy stacking of fastText with other types of scikit-learn-compliant classifiers.
Pickle-able classifier objects.
Pure python.
Supports Python 3.4+.
Fully tested.
3 Wrappers
skift includes several wrappers:
3.1 Standard wrappers
FirstColFtClassifier - An sklearn classifier adapter for fasttext that takes the first column of input ndarray objects as input.
IdxBasedFtClassifier - An sklearn classifier adapter for fasttext that takes input by index.
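The adapter idea behind these two wrappers can be illustrated with a minimal, self-contained sketch. It is not skift's implementation: the class names FirstColumnAdapter and MajorityLabelTextModel are hypothetical, and the toy text model stands in for fastText so that the example runs without any dependencies. The only point shown is that the adapter passes a single column of a 2-D input to the underlying text model.

```python
from collections import Counter, defaultdict


class MajorityLabelTextModel:
    """Hypothetical stand-in for a text model (NOT fastText).

    Remembers the most common label seen for each exact text.
    """

    def fit(self, texts, labels):
        counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            counts[text][label] += 1
        self._by_text = {t: c.most_common(1)[0][0] for t, c in counts.items()}
        # Fallback label for texts that were never seen during fit.
        self._default = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, texts):
        return [self._by_text.get(t, self._default) for t in texts]


class FirstColumnAdapter:
    """Hypothetical adapter: feeds only column 0 of a 2-D input to a text model."""

    def __init__(self, text_model):
        self.text_model = text_model

    @staticmethod
    def _first_column(X):
        # Works for lists of rows and for 2-D numpy arrays alike.
        return [row[0] for row in X]

    def fit(self, X, y):
        self.text_model.fit(self._first_column(X), list(y))
        return self

    def predict(self, X):
        return self.text_model.predict(self._first_column(X))


clf = FirstColumnAdapter(MajorityLabelTextModel())
# The second column is ignored; only the text in column 0 is used.
clf.fit([["woof", 5], ["meow", 83]], [0, 1])
predictions = clf.predict([["woof", 1], ["meow", 2]])
print(predictions)
```

An index-based variant would differ only in reading row[i] for a configurable index i instead of row[0].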
3.2 pandas-dependent wrappers
These wrappers assume that the X parameter given to the fit, predict, and predict_proba methods is a pandas.DataFrame object:
FirstObjFtClassifier - An sklearn adapter for fasttext using the first object column as input.
ColLblBasedFtClassifier - An sklearn adapter for fasttext taking input by column label.
4 Contributing
The package author and current maintainer is Shay Palachy (shay.palachy@gmail.com); you are more than welcome to approach him for help. Contributions are very welcome.
4.1 Installing for development
Clone:
git clone git@github.com:shaypal5/skift.git
Install in development mode:
cd skift
pip install -e .
4.2 Running the tests
To run the tests use:
pip install pytest pytest-cov coverage
cd skift
pytest
4.3 Adding documentation
The project is documented using the numpy docstring conventions. They were chosen because they are perhaps the most widespread conventions that are both supported by common tools such as Sphinx and result in human-readable docstrings. When documenting code you add to this project, follow these conventions.
5 Credits
Created by Shay Palachy (shay.palachy@gmail.com).