AllenNLP integration for Shiba: Japanese CANINE model
AllenNLP Integration for Shiba
allennlp-shiba-model is a Python library that provides AllenNLP integration for shiba-model.
SHIBA is an approximate reimplementation of CANINE [1] in raw PyTorch, pretrained on the Japanese Wikipedia corpus using random span masking. If you are unfamiliar with CANINE, you can think of it as a very efficient (approximately 4x as efficient) character-level BERT model. Of course, the name SHIBA comes from the identically named Japanese canine.
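To make "character-level" concrete, here is a minimal sketch of turning text into one integer per character using Unicode code points, which is the input representation CANINE is built around. The helper names `to_codepoints` and `from_codepoints` are hypothetical and are not part of shiba-model or this library; the actual SHIBA tokenizer may add special tokens or handle padding differently.

```python
from typing import List


def to_codepoints(text: str) -> List[int]:
    """Map each character to its Unicode code point (one id per character)."""
    return [ord(ch) for ch in text]


def from_codepoints(ids: List[int]) -> str:
    """Invert to_codepoints: rebuild the string from code points."""
    return "".join(chr(i) for i in ids)


# Illustrative input only; any Japanese string works the same way.
text = "柴犬はかわいい"
ids = to_codepoints(text)

# No vocabulary file is needed: the mapping is lossless and reversible.
print(len(text), len(ids))
print(from_codepoints(ids) == text)
```

Because every character becomes exactly one id, the sequence length equals the number of characters, which is why character-level models benefit from the efficiency techniques CANINE introduces.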
Example
This library enables users to specify the SHIBA components (tokenizer, token indexer, and token embedder) in a Jsonnet config file. Here is an example of such a config file:
{
    "dataset_reader": {
        "tokenizer": {
            "type": "shiba",
        },
        "token_indexers": {
            "tokens": {
                "type": "shiba",
            }
        },
    },
    "model": {
        "shiba_embedder": {
            "type": "basic",
            "token_embedders": {
                "shiba": {
                    "type": "shiba",
                    "eval_model": true,
                }
            }
        }
    }
}
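Since any JSON document is also valid Jsonnet, the same fragment can be generated programmatically, which may be handy when sweeping over settings. The following is a minimal sketch using only the Python standard library; the helper name `build_shiba_config` is hypothetical and not part of this library, and the result mirrors only the keys shown in the example, so a complete training config would still need further fields (data paths, trainer settings, and so on).

```python
import json
from typing import Any, Dict


def build_shiba_config(eval_model: bool = True) -> Dict[str, Any]:
    """Build the config fragment from the example as a plain dict.

    Hypothetical helper for illustration; it only reproduces the keys
    that appear in the example config.
    """
    return {
        "dataset_reader": {
            "tokenizer": {"type": "shiba"},
            "token_indexers": {"tokens": {"type": "shiba"}},
        },
        "model": {
            "shiba_embedder": {
                "type": "basic",
                "token_embedders": {
                    "shiba": {"type": "shiba", "eval_model": eval_model}
                },
            }
        },
    }


config = build_shiba_config(eval_model=True)

# Serialize to JSON, which a Jsonnet evaluator also accepts.
config_text = json.dumps(config, indent=4)
print(config_text)
```

The printed text can be saved to a file and used wherever the hand-written fragment would be.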
Reference
- Joshua Tanner and Masato Hagiwara (2021). SHIBA: Japanese CANINE model. GitHub repository, GitHub.
Hashes for allennlp_shiba-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | f72f7ec928b76ae4d2a770f9d19839fdd6d972a3fb37bd0b0bf4aadb5b79c269
MD5 | 369736d5cc06167e1f1bf85f31381da9
BLAKE2b-256 | 77f44bf27fa88cf8a5cd42406628235359a650c5ec74907cfa104e6193eb3edc