Transformers kit - an NLP library for different downstream tasks, built on the Hugging Face project
Project description
TFKit lets you apply transformer architectures to many tasks and models with only small changes to the configuration.
It also supports multi-task, multi-model learning, and you can add your own datasets and tasks with simple modifications.
Features
- One-click replacement of different pre-trained models
- Supports multi-model and multi-task learning
- Multi-label and multi-class classification
- Unified input format across different tasks
- Separation of data reading and model architecture
- Supports various loss functions and metrics
Supplement
- Model list: supports BERT/GPT/GPT2/XLM/XLNet/RoBERTa/CTRL/ALBERT/...
- NLPrep: download and preprocess data in one line
- nlp2go: create a demo API as quickly as possible
Documentation
Learn more from the docs.
Quick Start
Installing via pip
pip install tfkit
Running TFKit to train an NER model
install nlprep and nlp2go
pip install nlprep nlp2go -U
download dataset using nlprep
nlprep --dataset tag_clner --outdir ./clner_row --util s2t
train model with albert
tfkit-train --batch 20 \
--epoch 5 \
--lr 5e-5 \
--train ./clner_row/clner-train.csv \
--test ./clner_row/clner-test.csv \
--maxlen 512 \
--model tagRow \
--savedir ./albert_ner \
--config voidful/albert_chinese_small
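The "one-click replacement" of pre-trained models amounts to changing the `--config` value. The sketch below is only an illustration: it rebuilds the command above as an argument list for three config names (taken from the `--config` help text in the Overview) and prints them without running anything. The helper `build_train_command` and the save directories other than `./albert_ner` are hypothetical names, not part of tfkit.

```python
import shlex


def build_train_command(config, savedir):
    """Return the quick-start tfkit-train command as an argv-style list."""
    return [
        "tfkit-train",
        "--batch", "20",
        "--epoch", "5",
        "--lr", "5e-5",
        "--train", "./clner_row/clner-train.csv",
        "--test", "./clner_row/clner-test.csv",
        "--maxlen", "512",
        "--model", "tagRow",
        "--savedir", savedir,
        "--config", config,
    ]


# Config names come from the --config help text; save directories are made up
# for illustration (only ./albert_ner appears in the quick start).
runs = {
    "voidful/albert_chinese_small": "./albert_ner",
    "bert-base-multilingual-cased": "./bert_ner",
    "distilbert-base-multilingual-cased": "./distilbert_ner",
}

commands = [build_train_command(config, savedir) for config, savedir in runs.items()]
for command in commands:
    # Print a shell-safe version of each command; nothing is executed here.
    print(shlex.join(command))
```

To launch a run from Python, each list could be passed to `subprocess.run(command, check=True)`, assuming tfkit is installed and the dataset files exist.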
eval model
tfkit-eval --model ./albert_ner/3.pt --valid ./clner_row/validation.csv --metric clas
result
Task : default report
TASK: default 0
                 precision    recall  f1-score   support
    B_Abstract      0.00      0.00      0.00         1
    B_Location      1.00      1.00      1.00         1
      B_Metric      1.00      1.00      1.00         1
B_Organization      0.00      0.00      0.00         1
      B_Person      1.00      1.00      1.00         1
    B_Physical      0.00      0.00      0.00         1
       B_Thing      1.00      1.00      1.00         1
        B_Time      1.00      1.00      1.00         1
    I_Abstract      1.00      1.00      1.00         1
    I_Location      1.00      1.00      1.00         1
      I_Metric      1.00      1.00      1.00         1
I_Organization      0.00      0.00      0.00         1
      I_Person      1.00      1.00      1.00         1
    I_Physical      0.00      0.00      0.00         1
       I_Thing      1.00      1.00      1.00         1
        I_Time      1.00      1.00      1.00         1
             O      1.00      1.00      1.00         1
     micro avg      1.00      0.71      0.83        17
     macro avg      0.71      0.71      0.71        17
  weighted avg      0.71      0.71      0.71        17
   samples avg      1.00      0.71      0.83        17
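As a sanity check on how the averaged rows relate to the per-label rows, the sketch below recomputes two of them from the numbers printed above. It assumes the usual definitions (macro average as the unweighted mean of per-label scores, F1 as the harmonic mean of precision and recall); the README does not say how tfkit computes the report, so treat this as an interpretation rather than a description of the implementation.

```python
# Per-label F1 scores copied from the report above (every label has support 1).
f1_by_label = {
    "B_Abstract": 0.0, "B_Location": 1.0, "B_Metric": 1.0, "B_Organization": 0.0,
    "B_Person": 1.0, "B_Physical": 0.0, "B_Thing": 1.0, "B_Time": 1.0,
    "I_Abstract": 1.0, "I_Location": 1.0, "I_Metric": 1.0, "I_Organization": 0.0,
    "I_Person": 1.0, "I_Physical": 0.0, "I_Thing": 1.0, "I_Time": 1.0,
    "O": 1.0,
}

# Macro average: unweighted mean over the 17 labels (12 of them score 1.00).
macro_f1 = sum(f1_by_label.values()) / len(f1_by_label)

# Micro-averaged precision and recall as printed in the "micro avg" row.
micro_precision, micro_recall = 1.00, 0.71
# F1 is the harmonic mean of precision and recall.
micro_f1 = 2 * micro_precision * micro_recall / (micro_precision + micro_recall)

print(f"macro F1: {macro_f1:.2f}")  # 12/17, matches the 0.71 in the report
print(f"micro F1: {micro_f1:.2f}")  # matches the 0.83 in the report
```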
host prediction service
nlp2go --model ./albert_ner/3.pt --api_path ner
You can also try tfkit in Google Colab:
Overview
Train
$ tfkit-train
Run training
arguments:
--train TRAIN [TRAIN ...] train dataset path
--test TEST [TEST ...] test dataset path
--config CONFIG distilbert-base-multilingual-cased/bert-base-multilingual-cased/voidful/albert_chinese_small
--model {once,twice,onebyone,clas,tagRow,tagCol,qa,onebyone-neg,onebyone-pos,onebyone-both} [{once,twice,onebyone,clas,tagRow,tagCol,qa,onebyone-neg,onebyone-pos,onebyone-both} ...]
model task
--savedir SAVEDIR model saving dir, default /checkpoints
optional arguments:
-h, --help show this help message and exit
--batch BATCH batch size, default 20
--lr LR [LR ...] learning rate, default 5e-5
--epoch EPOCH epoch, default 10
--maxlen MAXLEN max tokenized sequence length, default 368
--lossdrop loss dropping for text generation
--tag TAG [TAG ...] tag to identify task in multi-task
--seed SEED random seed, default 609
--worker WORKER number of worker on pre-processing, default 8
--grad_accum gradient accumulation, default 1
--tensorboard Turn on tensorboard graphing
--resume RESUME resume training
--cache cache training data
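Several options above accept more than one value (`--train`, `--test`, `--model`, `--tag`), which is how a multi-task run is described on the command line. The sketch below uses a stand-in `argparse` parser that mirrors only a few of the listed options; it is not tfkit's actual parser. The dataset file names and tag names are invented, and pairing the i-th entry of each list into one task is an assumption the help text suggests but does not state. Treating `--batch` times `--grad_accum` as the effective batch size is the usual meaning of gradient accumulation, also assumed here.

```python
import argparse

MODEL_CHOICES = [
    "once", "twice", "onebyone", "clas", "tagRow", "tagCol", "qa",
    "onebyone-neg", "onebyone-pos", "onebyone-both",
]

# Hypothetical stand-in parser covering a subset of the options listed above.
parser = argparse.ArgumentParser(prog="tfkit-train")
parser.add_argument("--train", nargs="+", required=True)
parser.add_argument("--test", nargs="+", required=True)
parser.add_argument("--model", nargs="+", required=True, choices=MODEL_CHOICES)
parser.add_argument("--tag", nargs="+")
parser.add_argument("--batch", type=int, default=20)
parser.add_argument("--grad_accum", type=int, default=1)

# Invented file and tag names, for illustration only.
args = parser.parse_args([
    "--train", "sentiment-train.csv", "ner-train.csv",
    "--test", "sentiment-test.csv", "ner-test.csv",
    "--model", "clas", "tagRow",
    "--tag", "sentiment", "ner",
    "--grad_accum", "4",
])

# Assumption: entries at the same position belong to the same task.
tasks = list(zip(args.tag, args.model, args.train, args.test))
for tag, model, train_path, test_path in tasks:
    print(f"{tag}: model={model} train={train_path} test={test_path}")

# Assumption: gradients are accumulated over grad_accum batches per update.
effective_batch = args.batch * args.grad_accum
print(f"effective batch size: {effective_batch}")
```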
Eval
$ tfkit-eval
Run evaluation on different benchmark
arguments:
--model MODEL model path
--metric {emf1,nlg,clas} evaluate metric
--valid VALID evaluate data path
optional arguments:
-h, --help show this help message and exit
--print print each pair of evaluate data
--enable_arg_panel enable panel to input argument
Contributing
Thanks for your interest. There are many ways to contribute to this project. Get started here.
License
Icons reference
Icons modified from Freepik, from www.flaticon.com
Icons modified from Nikita Golubev, from www.flaticon.com