Data Augmentation for Japanese Text
AugLy-jp
Data Augmentation for Japanese Text on AugLy
Augmenter
base_text = "あらゆる現実をすべて自分のほうへねじ曲げたのだ"
| Augmenter | Augmented | Description |
|---|---|---|
| SynonymAugmenter | あらゆる現実をすべて自身のほうへねじ曲げたのだ | Substitutes similar words using the Sudachi synonym dictionary |
| WordEmbsAugmenter | あらゆる現実をすべて関心のほうへねじ曲げたのだ | Substitutes words using word2vec, GloVe, or fastText embeddings |
| FillMaskAugmenter | つまり現実を、未来な未来まで変えたいんだ | Generates text with a masked language model |
| BackTranslationAugmenter | そして、ほかの人たちをそれぞれの道に安置しておられた | Translates the text into another language and back using two translation models |
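To make the synonym substitution idea concrete, here is a minimal, self-contained sketch. It does not use the AugLy-jp API or the Sudachi dictionary: the function `synonym_augment`, the hand-segmented token list, and the one-entry `toy_synonyms` map are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional, Sequence


def synonym_augment(
    tokens: Sequence[str],
    synonyms: Dict[str, List[str]],
    p: float = 1.0,
    rng: Optional[random.Random] = None,
) -> str:
    """Replace each token that has known synonyms with probability p.

    Hypothetical helper for illustration only, not part of AugLy-jp.
    """
    rng = rng or random.Random()
    out = []
    for token in tokens:
        candidates = synonyms.get(token)
        if candidates and rng.random() < p:
            # Pick one synonym at random from the candidate list.
            out.append(rng.choice(candidates))
        else:
            out.append(token)
    # Japanese text has no spaces between words, so join without a separator.
    return "".join(out)


# base_text segmented by hand (assumption); a real pipeline would use a
# morphological analyzer to produce these tokens.
tokens = ["あらゆる", "現実", "を", "すべて", "自分", "の", "ほう", "へ", "ねじ曲げ", "た", "の", "だ"]

# Toy synonym map (assumption) standing in for a real synonym dictionary.
toy_synonyms = {"自分": ["自身"]}

augmented = synonym_augment(tokens, toy_synonyms, p=1.0, rng=random.Random(0))
print(augmented)
```

With `p=1.0` and a single candidate per word, the result is deterministic; lowering `p` or adding more candidates yields more varied outputs from the same sentence.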
Prerequisites
| Software | Install Command |
|---|---|
| Python 3.8.11 | pyenv install 3.8.11 |
| Poetry 1.1.* | curl -sSL https://raw.githubusercontent.com/python-poetry/poetry/master/get-poetry.py \| python |
Get Started
Installation
pip install augly-jp
Or clone this repository:
git clone https://github.com/chck/AugLy-jp.git
cd AugLy-jp
poetry install
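After either installation route, one way to confirm that the package is visible to the current Python environment is to query its distribution metadata. The sketch below uses only the standard library; the helper name `installed_version` is an assumption for illustration, and the distribution name `augly-jp` is taken from the `pip install` command above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("augly-jp")
print(version if version is not None else "augly-jp is not installed")
```

When working from a clone, run this inside the Poetry environment (for example via `poetry run python`) so that the lookup targets the environment where `poetry install` placed the package.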
Test with reformat
poetry run task test
Reformat
poetry run task fmt
Lint
poetry run task lint
Inspired by
- https://github.com/facebookresearch/AugLy
- https://github.com/makcedward/nlpaug
- https://github.com/QData/TextAttack
License
This software includes work distributed under the Apache License 2.0 [1].
File details
Details for the file augly_jp-2021.9.30.tar.gz.
File metadata
- Download URL: augly_jp-2021.9.30.tar.gz
- Upload date:
- Size: 10.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.10 CPython/3.8.12 Linux/5.8.0-1041-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 278e8ead5fbf1d5f4c94369a2d181959e59b413b1bab522d23efd474a2bb6678 |
| MD5 | c5330c90cadd5004bc2a1a9e109362ad |
| BLAKE2b-256 | 91458a008fd842a95bd1daf68bf783cb02eb180491d82dd48467e4d4489c6fc0 |
File details
Details for the file augly_jp-2021.9.30-py3-none-any.whl.
File metadata
- Download URL: augly_jp-2021.9.30-py3-none-any.whl
- Upload date:
- Size: 12.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.1.10 CPython/3.8.12 Linux/5.8.0-1041-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fa81349d72504281a369bb19ce8399d92ea64d4a020a23fca64925126e61acb1 |
| MD5 | b45de0f7f61733583322c6f2e030fdf1 |
| BLAKE2b-256 | 31f38e1ec4151e4753a9b2fa2c2a8e312910d7886edb4d6bd18166b273b227d4 |