BERT for Multi-task Learning
Install
pip install bert-multitask-learning
What is it
This is a project that uses BERT for multi-task learning, with multi-GPU support.
Why do I need this
In the original BERT code, neither multi-task learning nor multi-GPU training is possible. In addition, the original purpose of this project is NER, which does not have a working script in the original BERT code.
To sum up, compared to the original bert repo, this repo has the following features:
- Multi-task learning (the major reason for rewriting the majority of the code)
- Multi-GPU training
- Support for sequence labeling (for example, NER) and encoder-decoder seq2seq (with a transformer decoder)
What type of problems are supported?
- Masked LM and next sentence prediction pre-training (pretrain)
- Classification (cls)
- Sequence Labeling (seq_tag)
- Seq2seq Labeling (seq2seq_tag)
- Seq2seq Text Generation (seq2seq_text)
- Multi-Label Classification (multi_cls)
How to run pre-defined problems
There are two chaining operators that can be used to combine problems:

- `&`: if two problems have the same inputs, they can be chained with `&`. Problems chained by `&` are trained at the same time.
- `|`: if two problems do not have the same inputs, they must be chained with `|`. Problems chained by `|` are sampled for training at every instance.

For example, given `cws|NER|weibo_ner&weibo_cws`, one problem chunk is sampled at each turn. Say `weibo_ner&weibo_cws` is sampled; then `weibo_ner` and `weibo_cws` are trained together for that turn. Therefore, in a particular batch, some tasks might not be sampled, and their loss could be 0 for that batch.
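As a plain-Python illustration of the sampling behaviour described above (this is a sketch, not the package's actual implementation; the function names are made up for the example):

```python
import random

def problem_chunks(problem_string):
    """Split a runtime problem string on '|' into problem chunks.

    Problems joined by '&' share inputs and are trained together;
    chunks separated by '|' are sampled one at a time.
    """
    return problem_string.split("|")

def sample_chunk(problem_string, rng=random):
    """Pick one problem chunk to train on for this turn."""
    return rng.choice(problem_chunks(problem_string))

chunks = problem_chunks("cws|NER|weibo_ner&weibo_cws")
# chunks == ["cws", "NER", "weibo_ner&weibo_cws"]

# If "weibo_ner&weibo_cws" is sampled this turn, both problems in that
# chunk are trained together; the unsampled chunks contribute loss 0.
turn = sample_chunk("cws|NER|weibo_ner&weibo_cws")
trained_together = turn.split("&")
```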
Please see the examples in the notebooks for more details about training, evaluating and exporting models.
Terminology
The terms below might help you understand the code.

- Problem: a problem is a set of (features, labels, problem_type).
- Runtime Problem String: the string provided to the module to parse and train, for example `a|b&c`.
- Problem Chunk: a problem chunk is created at runtime by parsing the runtime problem string. A problem chunk can be one problem OR multiple problems chained by `&`. For example, for `a|b&c`, there are two problem chunks: `a` and `b&c`. The problem chunk is the sampling unit when sampling during training.
- Same Label Space Problems: problems that share the same label space. More specifically, they share the same final dense layer.
- Same Feature Space Problems: problems that share the same feature space. ONLY problems that have the same feature space can be chained by `&`.
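To make the terms concrete, here is a small sketch (illustrative only, not the package's parser) that extracts problem chunks and individual problems from a runtime problem string:

```python
def parse_problem_string(problem_string):
    """Parse a runtime problem string such as 'a|b&c'.

    Returns (chunks, problems): chunks are the sampling units
    ('|'-separated), problems are the individual tasks inside them.
    """
    chunks = problem_string.split("|")
    problems = [p for chunk in chunks for p in chunk.split("&")]
    return chunks, problems

chunks, problems = parse_problem_string("a|b&c")
# chunks == ["a", "b&c"]  -> two sampling units
# problems == ["a", "b", "c"]
```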
Currently supported pre-defined problems

- Chinese named entity recognition
- Chinese word segmentation (cws)
- Chinese part-of-speech tagging
Hashes for bert_multitask_learning-0.3.1.tar.gz

Algorithm | Hash digest
---|---
SHA256 | f56069c1a279113de1c1b9dc23bde5fd5e909fa0d27f080d1f460c65692dd6a1
MD5 | c918efa56d50ac58788b7c7ae605e0e9
BLAKE2b-256 | 78ac778d5f5eec8ef9ea39dfb18715696aeb096398a5350f53277d7b3c44f854
Hashes for bert_multitask_learning-0.3.1-py3-none-any.whl

Algorithm | Hash digest
---|---
SHA256 | 05d4ecd111f46ba2d0c64d54f78d424a71ba2dd3ba3f04d27b6351c874f5dace
MD5 | 1a2bf24008bd01af9b73965ca1555a7c
BLAKE2b-256 | e317887273fac8b703068727fed38d7fc91568c7c894e87394e9d243019150cb