bnaug (Bangla Text Augmentation)
bnaug is a text augmentation tool for Bangla text.
Installation
pip install bnaug
- Dependencies
  - pytorch >=1.7.0
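To check whether an installed PyTorch satisfies the `>=1.7.0` requirement, a small helper along these lines may be useful. This is only a sketch: `meets_minimum` is a hypothetical name, not part of bnaug, and it assumes the version string starts with a dotted numeric release such as `1.13.1+cu117`.

```python
import re


def meets_minimum(version: str, minimum=(1, 7, 0)) -> bool:
    """Return True if a version string such as '1.13.1+cu117' is >= minimum."""
    # Keep only the leading numeric release part, e.g. '1.13.1' from '1.13.1+cu117'.
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    # A missing patch component (e.g. '2.0') is treated as 0.
    parts = tuple(int(p) if p is not None else 0 for p in match.groups())
    return parts >= minimum


# Typical use, assuming PyTorch is installed:
#   import torch
#   print(meets_minimum(torch.__version__))
```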
Sentence Augmentation
Token Replacement
- Mask generation based augmentation
from bnaug.sentence import TokenReplacement
tokr = TokenReplacement()
text = "আমি ঢাকায় বাস করি।"
output = tokr.masking_based(text)
print(output)
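The general idea behind mask-based augmentation is to hide one token at a time and let a masked language model propose a replacement. The sketch below shows only the masking step and does not reproduce bnaug's internals. `masked_variants`, the `[MASK]` token, and whitespace tokenisation are all assumptions made for illustration.

```python
from typing import List


def masked_variants(text: str, mask_token: str = "[MASK]") -> List[str]:
    """Return one copy of `text` per token, with that token replaced by the mask."""
    tokens = text.split()  # naive whitespace tokenisation (an assumption)
    variants = []
    for i in range(len(tokens)):
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        variants.append(" ".join(masked))
    return variants


# Each variant would then be passed to a fill-mask model of your choice.
for variant in masked_variants("আমি ঢাকায় বাস করি।"):
    print(variant)
```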
- Word2Vec based augmentation
from bnaug.sentence import TokenReplacement
tokr = TokenReplacement()
text = "আমি ঢাকায় বাস করি।"
# Path to a trained Bangla Word2Vec model
model = "msc/bangla_word2vec/bnwiki_word2vec.model"
output = tokr.word2vec_based(text, model=model)
print(output)
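Embedding-based replacement generally swaps a token for its nearest neighbour in vector space. The sketch below illustrates that idea with hand-made two-dimensional toy vectors and cosine similarity. It is not bnaug's implementation, and `cosine`, `nearest_word`, `replace_token`, and `toy_vectors` are hypothetical names.

```python
import math
from typing import Dict, List, Optional


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_word(word: str, vectors: Dict[str, List[float]]) -> Optional[str]:
    """Return the most similar other word, or None if `word` has no vector."""
    if word not in vectors:
        return None
    candidates = [
        (cosine(vectors[word], vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    return max(candidates)[1] if candidates else None


def replace_token(text: str, index: int, vectors: Dict[str, List[float]]) -> str:
    """Replace the token at `index` with its nearest neighbour, if one exists."""
    tokens = text.split()
    substitute = nearest_word(tokens[index], vectors)
    if substitute is not None:
        tokens[index] = substitute
    return " ".join(tokens)


# Toy vectors, invented for illustration only.
toy_vectors = {"dhaka": [0.9, 0.1], "chittagong": [0.8, 0.2], "rice": [0.1, 0.9]}
print(replace_token("i live in dhaka", 3, toy_vectors))
```

Tokens without a vector are left unchanged, so out-of-vocabulary words pass through as they are.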
- Glove based augmentation
from bnaug.sentence import TokenReplacement
tokr = TokenReplacement()
text = "আমি ঢাকায় বাস করি।"
# Path to a GloVe vector file in text format
vector = "msc/bn_glove.300d.txt"
output = tokr.glove_based(text, vector_path=vector)
print(output)
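GloVe text files store one token per line followed by its space-separated vector components, so a file such as `bn_glove.300d.txt` would be expected to carry 300 values per line. The sketch below parses that format from an iterable of lines. `parse_glove_lines` is a hypothetical helper and the three-dimensional sample rows are made up for illustration.

```python
from typing import Dict, Iterable, List


def parse_glove_lines(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse GloVe-style text lines into a word -> vector dictionary."""
    vectors: Dict[str, List[float]] = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        vectors[word] = [float(v) for v in values]
    return vectors


# Made-up sample rows; a real file would be opened with encoding="utf-8".
sample = ["ঢাকা 0.1 0.2 0.3", "নদী -0.4 0.5 0.6"]
vectors = parse_glove_lines(sample)
print({word: len(vec) for word, vec in vectors.items()})
```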
Back Translation
Back translation based augmentation first translates a Bangla sentence to English and then translates the English back to Bangla.
from bnaug.sentence import BackTranslation
bt = BackTranslation()
text = "বাংলা ভাষা আন্দোলন তদানীন্তন পূর্ব পাকিস্তানে সংঘটিত একটি সাংস্কৃতিক ও রাজনৈতিক আন্দোলন। "
output = bt.get_augmented_sentences(text)
print(output)
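The round trip described above can be sketched independently of any particular translation model. In the example below the two translators are plain callables, and the lookup tables are hand-written stand-ins for real Bangla-to-English and English-to-Bangla models. `back_translate` and both tables are hypothetical and say nothing about how `get_augmented_sentences` is implemented.

```python
from typing import Callable


def back_translate(
    text: str,
    to_pivot: Callable[[str], str],
    from_pivot: Callable[[str], str],
) -> str:
    """Translate `text` into a pivot language and back again."""
    pivot = to_pivot(text)
    return from_pivot(pivot)


# Hand-written stand-ins for real translation models (illustration only).
bn_to_en = {"আমি ভাত খাই": "I eat rice"}
en_to_bn = {"I eat rice": "আমি ভাত খাই"}

result = back_translate("আমি ভাত খাই", bn_to_en.__getitem__, en_to_bn.__getitem__)
print(result)
```

With real models the returned sentence often differs in wording from the input, which is what makes it useful as an augmented sample.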
Text Generation
- Paraphrase generation
from bnaug.sentence import TextGeneration
tg = TextGeneration()
text = "বিমানটি যখন মাটিতে নামার জন্য এয়ারপোর্টের কাছাকাছি আসছে, তখন ল্যান্ডিং গিয়ারের খোপের ঢাকনাটি খুলে যায়।"
output = tg.parapharse_generation(text)
print(output)
Inspired from
Download files
Source Distribution
bnaug-1.0.0.dev0.tar.gz (3.4 kB)
Built Distribution
bnaug-1.0.0.dev0-py3-none-any.whl (4.0 kB)
File details
Details for the file bnaug-1.0.0.dev0.tar.gz.
File metadata
- Download URL: bnaug-1.0.0.dev0.tar.gz
- Upload date:
- Size: 3.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.7.15
File hashes
Algorithm | Hash digest
---|---
SHA256 | 35bb27924f0abd0fefc0c99c145a94de2c00996b1d78c3cd7ae121f906a4f3f0
MD5 | cd4dcc3d9df5f6a793360ec41ff70810
BLAKE2b-256 | b52a31e59f99c35ac2efd06b8eace1ec89173a898067cc4a4eca7afb9ea77076
File details
Details for the file bnaug-1.0.0.dev0-py3-none-any.whl.
File metadata
- Download URL: bnaug-1.0.0.dev0-py3-none-any.whl
- Upload date:
- Size: 4.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.7.15
File hashes
Algorithm | Hash digest
---|---
SHA256 | f02da604f7dbfb5b4f3ee271306e6889b513ea666ec9f4f631b9e6656b2c318e
MD5 | ab9169690c2c1478e8cacd97638b50ef
BLAKE2b-256 | 886eb26402bc8becdab39f5c8bbe50bed1d9f9106de596403733a289962893f2
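A downloaded file can be compared against the SHA256 digests listed for each distribution using Python's standard `hashlib` module. The following is a minimal sketch. `sha256_of_file` and `matches_published_hash` are hypothetical helper names, and the path in the usage comment assumes the archive sits in the current directory.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Typical use, assuming the source archive has been downloaded locally:
#   matches_published_hash(
#       "bnaug-1.0.0.dev0.tar.gz",
#       "35bb27924f0abd0fefc0c99c145a94de2c00996b1d78c3cd7ae121f906a4f3f0",
#   )
```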