Data Augmentation for Japanese Text
AugLy-jp
Data Augmentation for Japanese Text on AugLy
Augmenter
base_text = "あらゆる現実をすべて自分のほうへねじ曲げたのだ"
Augmenter | Augmented | Description |
---|---|---|
SynonymAugmenter | あらゆる現実をすべて自身のほうへねじ曲げたのだ | Substitutes a word with a synonym from the Sudachi synonym dictionary |
WordEmbsAugmenter | あらゆる現実をすべて関心のほうへねじ曲げたのだ | Uses word2vec, GloVe, or fastText word embeddings to apply augmentation |
FillMaskAugmenter | つまり現実を、未来な未来まで変えたいんだ | Uses a masked language model to generate text |
BackTranslationAugmenter | そして、ほかの人たちをそれぞれの道に安置しておられた | Uses two translation models to translate the text into another language and back |
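To make the synonym row of the table concrete, here is a minimal, self-contained sketch of synonym substitution. It does not use the AugLy-jp API or Sudachi: the `TOY_SYNONYMS` table, the `substitute_synonyms` helper, and the hand-made tokenization are all assumptions introduced for illustration only.

```python
import random
from typing import Dict, List, Optional

# Hypothetical toy synonym table. The real augmenter is described as using
# Sudachi synonyms, which this sketch does not load.
TOY_SYNONYMS: Dict[str, List[str]] = {
    "自分": ["自身"],
}


def substitute_synonyms(
    tokens: List[str],
    synonyms: Dict[str, List[str]],
    rng: Optional[random.Random] = None,
) -> str:
    """Replace each token that has an entry in `synonyms` with a random synonym."""
    rng = rng or random.Random()
    out = []
    for tok in tokens:
        candidates = synonyms.get(tok)
        # Tokens without a known synonym pass through unchanged.
        out.append(rng.choice(candidates) if candidates else tok)
    # Japanese text has no spaces between words, so join without a separator.
    return "".join(out)


# Segmented by hand for illustration; a real pipeline would use a
# morphological analyzer to split the sentence into words.
tokens = ["あらゆる", "現実", "を", "すべて", "自分", "の", "ほう", "へ", "ねじ曲げ", "た", "の", "だ"]
augmented = substitute_synonyms(tokens, TOY_SYNONYMS, random.Random(0))
print(augmented)
```

With this toy table the only replaceable token is 自分, so the result matches the SynonymAugmenter row above. A larger synonym table would make the output depend on the random seed.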
Prerequisites
Software | Install Command |
---|---|
Python 3.8.11 | pyenv install 3.8.11 |
Poetry 1.1.* | curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py | python |
Get Started
Installation
pip install augly-jp
Or clone this repository:
git clone https://github.com/chck/AugLy-jp.git
poetry install
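After either installation route, one way to confirm that the package is visible to the current interpreter is to query its distribution metadata with the standard library. This is a small sketch: the helper name `installed_version` is hypothetical, and it assumes the distribution is registered under the name `augly-jp` used in the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumes the distribution name matches the one passed to pip.
found = installed_version("augly-jp")
print(f"augly-jp: {found if found else 'not installed'}")
```

`importlib.metadata` is available from Python 3.8, which matches the version listed under Prerequisites.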
Test with reformat
poetry run task test
Reformat
poetry run task fmt
Lint
poetry run task lint
Inspired by
- https://github.com/facebookresearch/AugLy
- https://github.com/makcedward/nlpaug
- https://github.com/QData/TextAttack
License
This software includes work distributed under the Apache License 2.0 [1].