BENDeep is a PyTorch-based deep learning solution for Bengali NLP tasks
BENDeep
BENDeep is a PyTorch-based deep learning solution for Bengali NLP tasks such as Bengali translation and Bengali sentiment analysis.
Installation
pip install bendeep
Dependency
- PyTorch 1.5.0+
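To check this requirement before running BENDeep, a version comparison can be sketched as follows. The helper names `parse_version` and `meets_minimum` are illustrative and not part of BENDeep; the sketch assumes only that `torch.__version__` behaves like a string such as `1.5.0` or `1.5.0+cu101`.

```python
import re

# Minimum version taken from the dependency list above.
MINIMUM_TORCH = (1, 5, 0)


def parse_version(version):
    """Turn '1.5.0+cu101' or '2.0' into a tuple of three ints.

    Anything after '+' (the local build tag) is ignored, and missing
    components are padded with zeros.
    """
    public = version.split("+", 1)[0]
    parts = [int(n) for n in re.findall(r"\d+", public)[:3]]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(version, minimum=MINIMUM_TORCH):
    """Return True if the version string is at least the minimum tuple."""
    return parse_version(version) >= minimum


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        installed = str(torch.__version__)
        status = "ok" if meets_minimum(installed) else "older than 1.5.0"
        print(installed, status)
```

Pre-release tags such as `a0` or `rc1` are not treated specially here; the comparison only looks at the first three numeric components.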
Pretrained Model
API
Sentiment Analysis
Analyzing Sentiment
This sentiment analysis model is a GRU-based recurrent network trained on the Socian sentiment dataset, reaching a loss of 0.073 after 150 epochs.
Dataset size: 4000 sentences
from bendeep import sentiment
model_path = "senti_trained.pt"
vocab_path = "vocab.txt"
text = "রোহিঙ্গা মুসলমানদের দুর্ভোগের অন্ত নেই।জলে কুমির ডাংগায় বাঘ।আজকে দুটি ঘটনা আমাকে ভীষণ ব্যতিত করেছে।নিরবে কিছুক্ষন অশ্রু বিসর্জন দিয়ে মনটাকে হাল্কা করার ব্যর্থ প্রয়াস চালিয়েছি।"
sentiment.analyze(model_path, vocab_path, text)
Training Sentiment Model
To train this model you need a CSV file with two columns: review, which holds the text, and sentiment, which holds the label, where 1 means positive and 0 means negative sentiment.
Example:
,review,sentiment
0,তোমাকে খুব সুন্দর লাগছে।,1
1,আজকের আবহাওয়া খুব খারাপ।,0
| | review | sentiment |
|---|---|---|
| 0 | তোমাকে খুব সুন্দর লাগছে। | 1 |
| 1 | আজকের আবহাওয়া খুব খারাপ। | 0 |
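A file in this layout can be produced and sanity-checked with the standard library before training. The following is a minimal sketch: `write_sentiment_csv` and `check_sentiment_csv` are hypothetical helpers, not BENDeep functions, and the unnamed index column is written only to mirror the example above (whether BENDeep requires it is an assumption).

```python
import csv


def write_sentiment_csv(path, rows):
    """Write (review, sentiment) pairs with a leading unnamed index column."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["", "review", "sentiment"])
        for index, (review, sentiment) in enumerate(rows):
            writer.writerow([index, review, sentiment])


def check_sentiment_csv(path):
    """Return (row_count, problems) for a CSV in the layout shown above."""
    problems = []
    count = 0
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        fields = reader.fieldnames or []
        missing = [name for name in ("review", "sentiment") if name not in fields]
        if missing:
            return 0, ["missing column(s): " + ", ".join(missing)]
        for row_no, row in enumerate(reader, start=1):
            count += 1
            review = (row["review"] or "").strip()
            label = (row["sentiment"] or "").strip()
            if not review:
                problems.append(f"row {row_no}: empty review")
            if label not in {"0", "1"}:
                problems.append(f"row {row_no}: sentiment must be 0 or 1, got {label!r}")
    return count, problems
```

Calling `check_sentiment_csv("sentiment_data.csv")` returns the number of data rows and a list of problems; an empty list means every row has a non-empty review and a 0 or 1 label.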
from bendeep import sentiment
data_path = "sentiment_data.csv"
sentiment.train(data_path)
# You can also pass these parameters:
# sentiment.train(data_path, batch_size=64, epochs=100, model_name="trained.pt")
After training completes successfully, the model is saved as trained.pt and the vocabulary file as vocab.txt.
Machine Translation
Translate Bengali to English
This model is a seq2seq model with attention, trained with this dataset with loss 0.0.
from bendeep import translation
from bendeep.translation import EncoderRNN
from bendeep.translation import AttnDecoderRNN
data_path = "data/translation/eng-ben.txt"
encoder = "models/translation/encoder.pt"
decoder = "models/translation/decoder.pt"
input_sentence = "আমার শীত করছে।"
translation.bn2en(data_path, encoder, decoder, input_sentence)
# output
# > আমার শীত করছে ।
# = i feel cold .
Training Translation Model
To train the translation model you need a dataset in .txt format, with the input and target sentences on each line separated by a tab.
Example:
I eat rice. আমি ভাত খাই।
He goes to school. সে বিদ্যালয়ে যায়।
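Before training, it can help to confirm that every line of the dataset really contains exactly one tab. The following is a minimal sketch using only the standard library; `read_pairs` is a hypothetical helper, not part of BENDeep, and the strict two-field rule is an assumption based on the format described above.

```python
def read_pairs(path):
    """Read tab-separated (input, target) sentence pairs from a text file.

    Blank lines are skipped. A line that does not split into exactly two
    non-empty fields raises ValueError with its line number.
    """
    pairs = []
    with open(path, encoding="utf-8") as handle:
        for line_no, line in enumerate(handle, start=1):
            line = line.rstrip("\r\n")
            if not line.strip():
                continue
            parts = [part.strip() for part in line.split("\t")]
            if len(parts) != 2 or not all(parts):
                raise ValueError(
                    f"line {line_no}: expected 2 tab-separated fields, got {len(parts)}"
                )
            pairs.append((parts[0], parts[1]))
    return pairs
```

For the two example lines above, `read_pairs` returns one `(English, Bengali)` tuple per line, in file order.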
from bendeep import translation
from bendeep.translation import EncoderRNN
from bendeep.translation import AttnDecoderRNN
data_path = "data/translation/eng-ben.txt"
translation.training(data_path, iteration=10000)
After training completes successfully, the encoder and decoder models are saved as encoder.pt and decoder.pt, and some random evaluation results are displayed.