
Transformers kit - an NLP library for different downstream tasks, built on the Hugging Face project


TFKit lets you apply transformer architectures to many tasks and models with only a small change of config.
It also supports multi-task, multi-model learning, and you can add your own datasets and tasks through simple modifications.

Features

  • One-click replacement of different pre-trained models
  • Support for multi-model and multi-task training
  • Multi-label and multi-class classifiers
  • Unified input format across different tasks
  • Separation of data reading and model architecture
  • Support for various loss functions and metrics

Supplement

  • Model list: supports BERT/GPT/GPT2/XLM/XLNet/RoBERTa/CTRL/ALBERT/...
  • NLPrep: download and preprocess data in one line
  • nlp2go: create a demo API as quickly as possible

Quick Start

Installing via pip

pip install tfkit

Running TFKit to train an NER model

Download the dataset using nlprep

nlprep --dataset tag_clner --outdir ./clner_row --util s2t

Train the model with ALBERT

tfkit-train --batch 10 \
--epoch 5 \
--lr 5e-5 \
--train ./clner_row/train \
--test ./clner_row/test \
--maxlen 512 \
--model tagRow \
--savedir ./albert_ner \
--config voidful/albert_chinese_small

Evaluate the model

tfkit-eval --model ./albert_ner/3.pt --valid ./clner_row/validation --metric clas

Host the prediction service

nlp2go --model ./albert_ner/3.pt --api_path ner

Overview

Train

$ tfkit-train
Run training

arguments:
  --train TRAIN [TRAIN ...]     train dataset path
  --test TEST [TEST ...]        test dataset path
  --config CONFIG               distilbert-base-multilingual-cased/bert-base-multilingual-cased/voidful/albert_chinese_small
  --model {once,twice,onebyone,clas,tagRow,tagCol,qa,onebyone-neg,onebyone-pos,onebyone-both} [{once,twice,onebyone,clas,tagRow,tagCol,qa,onebyone-neg,onebyone-pos,onebyone-both} ...]
                                model task
  --savedir SAVEDIR     model saving dir, default /checkpoints
optional arguments:
  -h, --help            show this help message and exit
  --batch BATCH         batch size, default 20
  --lr LR [LR ...]      learning rate, default 5e-5
  --epoch EPOCH         epoch, default 10
  --maxlen MAXLEN       max tokenized sequence length, default 368
  --lossdrop            loss dropping for text generation
  --tag TAG [TAG ...]   tag to identify each task in multi-task training
  --seed SEED           random seed, default 609
  --worker WORKER       number of workers for pre-processing, default 8
  --grad_accum          gradient accumulation, default 1
  --tensorboard         Turn on tensorboard graphing
  --resume RESUME       resume training
  --cache               cache training data
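
The usage above shows that `--train`, `--test`, `--model`, and `--tag` each accept several values, which is presumably how a multi-task run is set up. The following is a minimal sketch, not the project's documented recipe: it only prints a candidate command rather than running it. It assumes that tfkit matches the Nth value of each flag by position, which the help text does not state, so check `tfkit-train --help` before relying on it. The helper `build_train_cmd`, the `tag:model:datadir` spec format, the tag names, and the `./clas_data` directory are all hypothetical.

```shell
#!/bin/sh
# Sketch: print (do not run) a multi-task tfkit-train command.
# Each spec is "tag:model:datadir"; datadir is assumed to contain train and
# test files laid out like ./clner_row in the quick start.
# Assumption: tfkit pairs the Nth --train/--test/--model/--tag values by position.
# Limitation: paths containing spaces or colons are not handled.
build_train_cmd() (
  config=$1
  savedir=$2
  shift 2
  trains= tests= models= tags=
  for spec in "$@"; do
    tag=${spec%%:*}     # text before the first colon
    rest=${spec#*:}     # text after the first colon
    model=${rest%%:*}   # middle field
    dir=${rest#*:}      # last field
    trains="$trains $dir/train"
    tests="$tests $dir/test"
    models="$models $model"
    tags="$tags $tag"
  done
  echo "tfkit-train --config $config --savedir $savedir" \
       "--train$trains --test$tests --model$models --tag$tags"
)

# Hypothetical example: an NER task plus a classification task.
build_train_cmd voidful/albert_chinese_small ./multi_task \
  "ner:tagRow:./clner_row" "sent:clas:./clas_data"
```

Printing the command first makes it easy to inspect the flag order before committing to a long training run.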

Eval

$ tfkit-eval
Run evaluation on different benchmarks
arguments:
  --model MODEL             model path
  --metric {emf1,nlg,clas}  evaluate metric
  --valid VALID             evaluate data path

optional arguments:
  -h, --help            show this help message and exit
  --print               print each pair of evaluate data
  --enable_arg_panel    enable panel to input argument
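
The quick start evaluates a single checkpoint (`./albert_ner/3.pt`), but it can be useful to compare every saved checkpoint. The sketch below prints one `tfkit-eval` command per `.pt` file in a save directory, using only the three required arguments listed above. It assumes checkpoints are stored directly in the save directory with a `.pt` extension, as the quick start suggests; the helper name `eval_cmds` is hypothetical.

```shell
#!/bin/sh
# Sketch: print one tfkit-eval command for each checkpoint in a save directory.
# Assumption: checkpoints are *.pt files directly inside the directory.
# Files are visited in glob (lexicographic) order, so 10.pt sorts before 2.pt.
eval_cmds() (
  savedir=$1
  valid=$2
  metric=$3
  for ckpt in "$savedir"/*.pt; do
    [ -e "$ckpt" ] || continue   # the glob matched nothing; skip the literal pattern
    echo "tfkit-eval --model $ckpt --valid $valid --metric $metric"
  done
)

# Prints nothing if ./albert_ner does not exist or holds no checkpoints.
eval_cmds ./albert_ner ./clner_row/validation clas
```

Pipe the output to `sh` to actually run the evaluations once the printed commands look right.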

Contributing

Thanks for your interest. There are many ways to contribute to this project. Get started here.

License

Icons reference

Icons modified from Freepik, from www.flaticon.com
Icons modified from Nikita Golubev, from www.flaticon.com
