An easy-to-use open-source library for advanced Deep Learning and Natural Language Processing
Project description
Tamnun ML
tamnun is a Python framework for machine and deep learning algorithms and methods, especially in the fields of Natural Language Processing and Transfer Learning. The aim of tamnun is to provide easy-to-use interfaces for building powerful models based on the most recent SOTA methods.
For more about tamnun, feel free to read the introduction to TamnunML on Medium.
Getting Started
tamnun depends on several other machine learning and deep learning frameworks such as pytorch, keras and others. To install tamnun and all its dependencies, run:
$ git clone https://github.com/hiredscorelabs/tamnun-ml
$ cd tamnun-ml
$ python setup.py install
Or using PyPI:
pip install tamnun
Jump in and try out an example:
$ cd examples
$ python finetune_bert.py
Or take a look at the Jupyter notebooks here.
BERT
BERT stands for Bidirectional Encoder Representations from Transformers, a language model trained by Google and introduced in their paper.
Here we use the excellent PyTorch-Pretrained-BERT library and wrap it to provide an easy-to-use scikit-learn interface for BERT fine-tuning. At the moment, the tamnun BERT classifier supports binary and multi-class classification. To fine-tune BERT on a specific task:
from tamnun.bert import BertClassifier, BertVectorizer
from sklearn.pipeline import make_pipeline

# BertVectorizer tokenizes the raw texts, BertClassifier fine-tunes BERT on the labeled data
clf = make_pipeline(BertVectorizer(), BertClassifier(num_of_classes=2)).fit(train_X, train_y)
predicted = clf.predict(test_X)
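Because the fitted pipeline behaves like any other scikit-learn estimator, the usual evaluation utilities apply directly. A minimal sketch, assuming test_y holds the gold labels for test_X (not part of the original example):

from sklearn.metrics import accuracy_score, classification_report

# predicted comes from clf.predict(test_X) above; test_y is an assumed array of true labels
print(accuracy_score(test_y, predicted))
print(classification_report(test_y, predicted))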
Please see this notebook for full code example.
Fitting (almost) any PyTorch Module using just one line
You can use the TorchEstimator object to fit any pytorch module with just one line:
from torch import nn
from tamnun.core import TorchEstimator

# Wrap a plain linear layer in TorchEstimator and fit it like a scikit-learn classifier
module = nn.Linear(128, 2)
clf = TorchEstimator(module, task_type='classification').fit(train_X, train_y)
See this file for a full example of fitting an nn.Linear module on the MNIST (handwritten digit classification) dataset.
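As a rough illustration of that setup (not the linked example itself), a linear module for MNIST could be wired up as below. The 784 input size and 10 classes correspond to flattened 28x28 images and the ten digits; mnist_train_X, mnist_train_y and mnist_test_X are hypothetical placeholders, and a scikit-learn-style predict method is assumed:

from torch import nn
from tamnun.core import TorchEstimator

# 784 = 28*28 flattened pixels per image, 10 output classes (digits 0-9)
module = nn.Linear(784, 10)
clf = TorchEstimator(module, task_type='classification').fit(mnist_train_X, mnist_train_y)
predicted_digits = clf.predict(mnist_test_X)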
Distiller Transfer Learning
This module distills a very large model (like BERT) into a much smaller one. Inspired by this paper.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LinearRegression
from tamnun.bert import BertClassifier, BertVectorizer
from tamnun.transfer import Distiller

# Teacher: a BERT pipeline (truncated here for brevity); student: a simple bag-of-ngrams linear model
bert_clf = make_pipeline(BertVectorizer(do_truncate=True, max_len=3), BertClassifier(num_of_classes=2))
distilled_clf = make_pipeline(CountVectorizer(ngram_range=(1,3)), LinearRegression())

distiller = Distiller(teacher_model=bert_clf, teacher_predict_func=bert_clf.decision_function, student_model=distilled_clf).fit(train_texts, train_y, unlabeled_X=unlabeled_texts)
predicted_logits = distiller.transform(test_texts)
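Since transform returns logits rather than class labels, a common final step is to take the argmax over classes. A minimal sketch, assuming predicted_logits has one column per class:

import numpy as np

# Convert the distilled student's logits into hard class predictions
predicted_classes = np.argmax(predicted_logits, axis=1)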
For full BERT distillation example see this notebook.
Support
Getting Help
You can ask questions and join the development discussion on GitHub Issues.
License
Apache License 2.0 (same as TensorFlow)
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file tamnun-0.1.1.tar.gz.
File metadata
- Download URL: tamnun-0.1.1.tar.gz
- Upload date:
- Size: 8.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 973023048316115055cdf69febf56dcfcc907cb9035476839f4d80aeeb908918
MD5 | 64c8c2ad2737d1ced94bf6eb8ed71793
BLAKE2b-256 | ccec7d456fbefeaf3a3bdf2b6e2d9ff8b4eab92d62592c99b92b1d93d8f76d6f
File details
Details for the file tamnun-0.1.1-py3-none-any.whl.
File metadata
- Download URL: tamnun-0.1.1-py3-none-any.whl
- Upload date:
- Size: 9.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.3
File hashes
Algorithm | Hash digest
---|---
SHA256 | 82e1684a25883f660f9057bc88bf21f7b3f9db3c49faca204b3f72eb35898624
MD5 | a7146149da0b315d642bb3e259a94b86
BLAKE2b-256 | 85003905332b6dc3cd3f1d9921d64b6a6a80dc96938624389c25f960b58333ad