AllenNLP integration for Shiba: Japanese CANINE model
AllenNLP Integration for Shiba
allennlp-shiba-model is a Python library that provides AllenNLP integration for shiba-model.
SHIBA is an approximate reimplementation of CANINE [1] in raw PyTorch, pretrained on the Japanese Wikipedia corpus using random span masking. If you are unfamiliar with CANINE, you can think of it as a very efficient (approximately 4x as efficient) character-level BERT model. Of course, the name SHIBA comes from the identically named Japanese canine.
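To make "character-level" concrete, here is a minimal sketch of how text can be turned into model inputs without a subword vocabulary. It assumes the CANINE convention of using Unicode code points as input IDs; the helper name `to_codepoints` is hypothetical, and the actual shiba tokenizer may add special tokens, padding, or truncation on top of this.

```python
from typing import List


def to_codepoints(text: str) -> List[int]:
    """Map each character to its Unicode code point (assumed CANINE-style input IDs)."""
    return [ord(ch) for ch in text]


def from_codepoints(ids: List[int]) -> str:
    """Invert to_codepoints: rebuild the string from code points."""
    return "".join(chr(i) for i in ids)


text = "柴犬はかわいい"
ids = to_codepoints(text)

# One ID per character, and the mapping is lossless.
print(len(text), len(ids))
print(from_codepoints(ids) == text)
```

Because every character maps directly to an ID, no vocabulary file is needed, which is convenient for Japanese text where word boundaries are not marked by spaces.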
Example
This library lets users specify the SHIBA tokenizer, token indexer, and token embedder in a Jsonnet config file. Here is an example of such a config file:
{
    "dataset_reader": {
        "tokenizer": {
            "type": "shiba",
        },
        "token_indexers": {
            "tokens": {
                "type": "shiba",
            }
        },
    },
    "model": {
        "shiba_embedder": {
            "type": "basic",
            "token_embedders": {
                "shiba": {
                    "type": "shiba",
                    "eval_model": true,
                }
            }
        }
    }
}
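If you prefer to generate the config from Python, the following is a minimal sketch that builds the same fragment as a dict and writes it to a file. It relies only on the standard library and on the fact that plain JSON is valid Jsonnet. The file name `shiba_config.jsonnet` and the helper `indexer_and_embedder_names` are hypothetical, and the fragment is not a complete AllenNLP config (for example, it has no dataset reader or model `type`).

```python
import json
import os
import tempfile

# Same structure as the Jsonnet fragment above, as a plain Python dict.
config = {
    "dataset_reader": {
        "tokenizer": {"type": "shiba"},
        "token_indexers": {"tokens": {"type": "shiba"}},
    },
    "model": {
        "shiba_embedder": {
            "type": "basic",
            "token_embedders": {
                "shiba": {"type": "shiba", "eval_model": True},
            },
        }
    },
}


def indexer_and_embedder_names(cfg):
    """Return (token indexer names, token embedder names) found in cfg."""
    indexers = set(cfg["dataset_reader"]["token_indexers"])
    embedders = set(cfg["model"]["shiba_embedder"]["token_embedders"])
    return indexers, embedders


# Plain JSON is a subset of Jsonnet, so json.dump is enough to write the file.
path = os.path.join(tempfile.mkdtemp(), "shiba_config.jsonnet")
with open(path, "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)

# Read it back to confirm the file round-trips.
with open(path, encoding="utf-8") as f:
    loaded = json.load(f)

indexers, embedders = indexer_and_embedder_names(loaded)
print("indexers:", sorted(indexers))
print("embedders:", sorted(embedders))
```

In AllenNLP, a `basic` text field embedder generally expects the keys under `token_embedders` to match the keys under `token_indexers`. If the two sets printed above differ in your full config, renaming one of them so they agree may be necessary.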
Reference
- Joshua Tanner and Masato Hagiwara (2021). SHIBA: Japanese CANINE model. GitHub repository.