Japanese tokenizer with transformers library
jptranstokenizer: Japanese Tokenizer for transformers
This is a repository for a Japanese tokenizer that works with the Hugging Face Transformers library.
Issues may be written in Japanese.
Table of Contents
Usage
To be added
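Until the usage section is filled in, installation follows the standard PyPI flow, since the package is published there as jptranstokenizer:

```shell
# Install the package from PyPI (pinned to 0.0.3, the release described on this page)
pip install jptranstokenizer==0.0.3
```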
Roadmap
See the open issues for a full list of proposed features (and known issues).
Citation
Another paper on this pretrained model is planned. Check here again before citing.
This Implementation
@misc{suzuki-2022-github,
  author = {Masahiro Suzuki},
  title = {jptranstokenizer: Japanese Tokenizer for transformers},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/retarfi/jptranstokenizer}}
}
Licenses
The code in this repository is distributed under the Apache License 2.0.
Related Work
- Pretrained Japanese BERT models (including a Japanese tokenizer)
  - Author: Tohoku University
  - https://github.com/cl-tohoku/bert-japanese
Project details
Download files
Download the file for your platform.
Source Distribution
jptranstokenizer-0.0.3.tar.gz (20.9 kB)
Built Distribution
jptranstokenizer-0.0.3-py3-none-any.whl
Hashes for jptranstokenizer-0.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 857a57a39d7dc5585dc2ddbdcff4adcd308ed0545916e2edfd474c0cab5ed3b5
MD5 | 391d1477aea2e16f8a34dea62f8e6c85
BLAKE2b-256 | a8dc45589873f99414e3b2090b984e601b633a438a7b420ec1e70d3add8e2b60