Bangla Natural Language Processing Toolkit
Bangla NLTK
banglanltk is a Python package for Bengali natural language processing. It includes modules for text cleaning, word tokenization, sentence tokenization, stemming, synonym lookup, and part-of-speech (POS) tagging.
Installation
pip install banglanltk
Usage
Cleaning Text
import banglanltk as bn
s = 'আজ আকাশ পরিষ্কার!!! মনে হয় আজ আর বৃষ্টি হবে না .........!'
print(bn.clean_text(s))
Word Tokenization
import banglanltk as bn
s = 'প্রাচীন কালে মানুষ একসময় সংখ্যা বুঝানোর জন্য ঝিনুক, নুড়ি, দড়ির গিট ইত্যাদি ব্যবহার করত।'
print(bn.word_tokenize(s))
Sentence Tokenization
import banglanltk as bn
s = ''' কম্পিউটার শব্দটি গ্রিক "কম্পিউট" শব্দ থেকে এসেছে। Compute শব্দের অর্থ গণনা করা। আর কম্পিউটার শব্দের অর্থ গণনাকারী যন্ত্র। '''
print(bn.sent_tokenize(s))
Stemming
import banglanltk as bn
# Stem a single word
print(bn.stemmer('শান্তিনিকেতনে'))
# Stem multiple words: tokenize first, then stem each token
text = 'আজ বৃষ্টি হবে।'
words = bn.word_tokenize(text)
for w in words:
    print(bn.stemmer(w))
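The tokenize-then-stem loop above can be wrapped in a small helper. This is a minimal sketch, not part of banglanltk: `stem_tokens` is a hypothetical name, and it takes the tokenizer and stemmer as arguments so that it runs even without the package installed. The two stand-in functions are placeholders for illustration only and do not reproduce banglanltk's behaviour.

```python
from typing import Callable, Iterable, List, Tuple


def stem_tokens(
    text: str,
    tokenize: Callable[[str], Iterable[str]],
    stem: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Return (token, stem) pairs for every token produced from `text`."""
    return [(token, stem(token)) for token in tokenize(text)]


# Stand-ins so the sketch is runnable on its own (NOT banglanltk's algorithms).
def _demo_tokenize(text: str) -> List[str]:
    # Replace the danda with a space, then split on whitespace.
    return text.replace('।', ' ').split()


def _demo_stem(word: str) -> str:
    # Identity function: returns the word unchanged.
    return word


pairs = stem_tokens('আজ বৃষ্টি হবে।', _demo_tokenize, _demo_stem)
for token, stemmed in pairs:
    print(token, '->', stemmed)

# Assumed real usage, based on the calls shown above:
#   import banglanltk as bn
#   pairs = stem_tokens(text, bn.word_tokenize, bn.stemmer)
```

Passing the functions in as arguments keeps the helper independent of any particular tokenizer or stemmer, which also makes it easy to test.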
Synonym
import banglanltk as bn
print(bn.synonym('হাত'))
POS Tagging
import banglanltk as bn
# Tag a single word
print(bn.pos_tag('কম্পিউটার'))
# Tag multiple words: tokenize first, then tag each token
text = 'আজ বৃষ্টি হবে।'
words = bn.word_tokenize(text)
for w in words:
    print(bn.pos_tag(w))
List of POS tags
POS | Meaning |
---|---|
CC | Conjunction |
CD | Cardinal number |
DM | Demonstrative |
DT | Determiner |
EX | Existential there |
FW | Foreign word |
IN | Preposition |
JJ | Adjective |
JJR | Adjective, comparative |
JJS | Adjective, superlative |
MD | Modal |
NN | Noun, singular or mass |
NNP | Proper noun, singular |
NNS | Noun, plural |
NNV | Verbal Noun |
PR | Pronoun |
PRP | Personal pronoun |
PRP$ | Possessive pronoun |
PSP | Postposition |
RB | Adverb |
RBR | Adverb, comparative |
RP | Particles |
SYM | Symbol |
TO | to |
UH | Interjection |
UNK | Unknown tag |
VB | Verb, base form |
VBD | Verb, past tense |
VBG | Verb, present participle |
VBN | Verb, past participle |
VBP | Verb, non-3rd person singular present |
WDT | Wh-determiner |
WH | Wh words |
WP | Wh-pronoun |
WRB | Wh-adverb |
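To turn a tag into a readable label in your own code, the table above can be kept as a dictionary. This is a convenience sketch: `POS_MEANINGS` and `describe_tag` are hypothetical names and are not provided by banglanltk. The entries are copied from the table; falling back to "Unknown tag" for unlisted tags is a choice made here for illustration.

```python
# Tag-to-meaning mapping copied from the POS tag table.
POS_MEANINGS = {
    'CC': 'Conjunction',
    'CD': 'Cardinal number',
    'DM': 'Demonstrative',
    'DT': 'Determiner',
    'EX': 'Existential there',
    'FW': 'Foreign word',
    'IN': 'Preposition',
    'JJ': 'Adjective',
    'JJR': 'Adjective, comparative',
    'JJS': 'Adjective, superlative',
    'MD': 'Modal',
    'NN': 'Noun, singular or mass',
    'NNP': 'Proper noun, singular',
    'NNS': 'Noun, plural',
    'NNV': 'Verbal Noun',
    'PR': 'Pronoun',
    'PRP': 'Personal pronoun',
    'PRP$': 'Possessive pronoun',
    'PSP': 'Postposition',
    'RB': 'Adverb',
    'RBR': 'Adverb, comparative',
    'RP': 'Particles',
    'SYM': 'Symbol',
    'TO': 'to',
    'UH': 'Interjection',
    'UNK': 'Unknown tag',
    'VB': 'Verb, base form',
    'VBD': 'Verb, past tense',
    'VBG': 'Verb, present participle',
    'VBN': 'Verb, past participle',
    'VBP': 'Verb, non-3rd person singular present',
    'WDT': 'Wh-determiner',
    'WH': 'Wh words',
    'WP': 'Wh-pronoun',
    'WRB': 'Wh-adverb',
}


def describe_tag(tag: str) -> str:
    """Return the meaning of a POS tag, or 'Unknown tag' if it is not listed."""
    return POS_MEANINGS.get(tag, POS_MEANINGS['UNK'])


print(describe_tag('NNV'))
print(describe_tag('PSP'))
```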