one more transformers lib
Project description
A simple way to use transformer models
Features
- Multimodal Transformer (text, text -> text; for example, for the Automatic Post-Editing task);
- Label smoothing loss;
- Unlikelihood loss (sentence-level and for the full context), arXiv;
- LayerDrop technique for transformer encoder/decoder layers, arXiv;
- Pre/Post-LayerNorm encoders (decoders in progress), arXiv;
- ADMIN initialization (planned), arXiv;
- Top-k/top-p sampling with temperature, arXiv (see the sketch after this list).
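The package ships its own NucleusSampler (shown in the Usage section below), but as a reference for what top-k/top-p sampling with temperature actually does, here is a minimal, package-independent sketch that filters the next-token logits before sampling; the function name and default values are illustrative only.
import torch

def sample_next_token(logits, top_k=0, top_p=0.9, temperature=1.0):
    # logits: 1-D tensor of unnormalized next-token scores
    logits = logits / temperature
    if top_k > 0:
        # keep only the top_k highest-scoring tokens
        kth_best = torch.topk(logits, top_k).values[-1]
        logits = logits.masked_fill(logits < kth_best, float('-inf'))
    if top_p < 1.0:
        sorted_logits, sorted_idx = torch.sort(logits, descending=True)
        cumulative = torch.softmax(sorted_logits, dim=-1).cumsum(dim=-1)
        # drop tokens outside the smallest set whose cumulative probability exceeds top_p
        remove = cumulative > top_p
        remove[1:] = remove[:-1].clone()
        remove[0] = False  # always keep the single most likely token
        logits[sorted_idx[remove]] = float('-inf')
    probs = torch.softmax(logits, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()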
Installation
pip install plain-transformers
Usage
A multimodal transformer example with two tokenizers:
Step one: import the model and some useful utilities.
import torch
from plain_transformers.models import MultimodalTransformer
from plain_transformers.layers import MultimodalTransformerDecoder
from plain_transformers.layers import TransformerEncoder
from plain_transformers import BPEWrapper
from plain_transformers.initializations import normal_initialization, initialize_weights
from plain_transformers.samplers.nucleus_sampler import NucleusSampler
import youtokentome as yttm
Step two: train and load the tokenizers.
# train your encoder tokenizer
yttm.BPE.train(..., model='encoder_tokenizer.model')
# train your decoder tokenizer
yttm.BPE.train(..., model='decoder_tokenizer.model')
# load tokenizers
encoder_tokenizer = BPEWrapper(model='encoder_tokenizer.model')
decoder_tokenizer = BPEWrapper(model='decoder_tokenizer.model')
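For concreteness, a youtokentome training call takes the corpus path, the target model path, and the vocabulary size; the corpus file names and vocabulary sizes below are placeholders, not part of this package.
# placeholder corpus paths and vocab sizes -- substitute your own data
yttm.BPE.train(data='encoder_corpus.txt', model='encoder_tokenizer.model', vocab_size=16000)
yttm.BPE.train(data='decoder_corpus.txt', model='decoder_tokenizer.model', vocab_size=16000)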
Step three: define the model configuration.
cfg = {
'd_model': 768,
'first_encoder': {
'first_encoder_vocab_size': encoder_tokenizer.vocab_size(),
'first_encoder_max_length': 512,
'first_encoder_pad_token_id': encoder_tokenizer.pad_id,
'first_encoder_token_type_vocab_size': 2,
'first_encoder_n_heads': 8,
'first_encoder_dim_feedforward': 2048,
'first_encoder_num_layers': 3,
'first_encoder_type': 'post_ln'
},
'second_encoder': {
'second_encoder_vocab_size': encoder_tokenizer.vocab_size(),
'second_encoder_max_length': 512,
'second_encoder_pad_token_id': encoder_tokenizer.pad_id,
'second_encoder_token_type_vocab_size': 2,
'second_encoder_n_heads': 8,
'second_encoder_dim_feedforward': 2048,
'second_encoder_num_layers': 3,
'second_encoder_type': 'post_ln'
},
'decoder': {
'decoder_max_length': 512,
'decoder_vocab_size': decoder_tokenizer.vocab_size(),
'decoder_pad_token_id': decoder_tokenizer.pad_id,
'decoder_token_type_vocab_size': 2,
'decoder_n_heads': 8,
'decoder_dim_feedforward': 2048,
'decoder_num_layers': 3,
'decoder_type': 'post_ln'
},
}
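As in most transformer implementations, d_model should be divisible by the number of attention heads (here 768 / 8 = 96 per head). A quick sanity check over the configuration above:
# check that every sub-config's head count divides d_model evenly
for part in ('first_encoder', 'second_encoder', 'decoder'):
    n_heads = cfg[part][f'{part}_n_heads']
    assert cfg['d_model'] % n_heads == 0, f'{part}: d_model must be divisible by n_heads'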
Step four: initialize the model and apply weight initialization (default std=0.02).
model = MultimodalTransformer(
    TransformerEncoder,            # first encoder class
    TransformerEncoder,            # second encoder class
    MultimodalTransformerDecoder,  # decoder class
    cfg['d_model'],
    **cfg['first_encoder'],
    **cfg['second_encoder'],
    **cfg['decoder'],
    share_decoder_head_weights=True,
    share_encoder_decoder_embeddings=False,
    share_encoder_embeddings=True,
)
initialize_weights(model, normal_initialization, init_range=0.02)
Step five: train the model like an ordinary seq2seq model.
train(model, ...)
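The package does not prescribe a training loop, so train(model, ...) stands for whatever seq2seq setup you already use. A minimal sketch with shifted-target cross-entropy follows; the forward argument names and the batch fields are assumptions, so adapt them to the model's actual signature and your dataloader.
import torch.nn.functional as F

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(batch):
    # argument and batch-field names below are assumed -- check MultimodalTransformer's forward signature
    logits = model(
        first_input_ids=batch['first_src'],
        second_input_ids=batch['second_src'],
        decoder_input_ids=batch['tgt'][:, :-1],
    )
    # shifted-target cross-entropy, ignoring padding positions
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        batch['tgt'][:, 1:].reshape(-1),
        ignore_index=decoder_tokenizer.pad_id,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()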
Step six: initialize the sampler and generate a model answer.
sampler = NucleusSampler(model, encoder_tokenizer=(encoder_tokenizer, encoder_tokenizer), decoder_tokenizer=decoder_tokenizer)
sampler.generate('Hello Bob, what are you doing?', second_input_text='Fine, thanks!', top_k=5)
Example
You can find a working example of NMT here.