Bangla Natural Language Processing Toolkit

Bangla NLTK

banglanltk is a Python package for Bengali natural language processing. It includes modules for text cleaning, word tokenization, sentence tokenization, stemming, synonym lookup, and part-of-speech (POS) tagging.

Installation

pip install banglanltk

Usage

Cleaning Text

import banglanltk as bn
s = 'আজ আকাশ পরিষ্কার!!! মনে হয় আজ আর বৃষ্টি হবে না .........!'

print(bn.clean_text(s))

Word Tokenization

import banglanltk as bn

s = 'প্রাচীন কালে মানুষ একসময় সংখ্যা বুঝানোর জন্য ঝিনুক, নুড়ি, দড়ির গিট ইত্যাদি ব্যবহার করত।'
print(bn.word_tokenize(s))

Sentence Tokenization

import banglanltk as bn

s = ''' কম্পিউটার শব্দটি গ্রিক "কম্পিউট" শব্দ থেকে এসেছে। Compute শব্দের অর্থ গণনা করা। আর কম্পিউটার শব্দের অর্থ গণনাকারী যন্ত্র। '''
print(bn.sent_tokenize(s))
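
The two tokenizers can be chained to get the words of each sentence separately. The following is a minimal sketch, not part of the package: `tokenize_document` is a hypothetical helper name, and it assumes that `bn.sent_tokenize` and `bn.word_tokenize` each return an iterable of strings, as the examples above suggest. The tokenizers are passed in as parameters so the helper can be tried without the package installed.

```python
from typing import Callable, Iterable, List


def tokenize_document(
    text: str,
    split_sentences: Callable[[str], Iterable[str]],
    split_words: Callable[[str], Iterable[str]],
) -> List[List[str]]:
    """Return one list of word tokens per sentence in `text`."""
    return [list(split_words(sentence)) for sentence in split_sentences(text)]


if __name__ == "__main__":
    try:
        import banglanltk as bn
    except ImportError:
        print("banglanltk is not installed; run: pip install banglanltk")
    else:
        sample = 'Compute শব্দের অর্থ গণনা করা। আর কম্পিউটার শব্দের অর্থ গণনাকারী যন্ত্র।'
        # Assumption: both functions return iterables of strings.
        for words in tokenize_document(sample, bn.sent_tokenize, bn.word_tokenize):
            print(words)
```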

Stemming

import banglanltk as bn

# For single word
print(bn.stemmer('শান্তিনিকেতনে'))

# For multiple words
text = 'আজ বৃষ্টি হবে।'
words = bn.word_tokenize(text)
for w in words:
    print(bn.stemmer(w))

Synonym

import banglanltk as bn

print(bn.synonym('হাত'))

POS Tagging

import banglanltk as bn

# For single word
print(bn.pos_tag('কম্পিউটার'))

# For multiple words
text = 'আজ বৃষ্টি হবে।'
words = bn.word_tokenize(text)
for w in words:
    print(bn.pos_tag(w))
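
To keep each word next to its tag instead of printing tags one by one, the loop above can be collected into pairs. This is a sketch under stated assumptions: `tag_words` is a hypothetical helper, and nothing is assumed about the type `bn.pos_tag` returns beyond it being the result for a single word, so the value is stored as-is.

```python
from typing import Any, Callable, Iterable, List, Tuple


def tag_words(
    text: str,
    tokenize: Callable[[str], Iterable[str]],
    tag: Callable[[str], Any],
) -> List[Tuple[str, Any]]:
    """Tokenize `text` and pair every token with whatever `tag` returns for it."""
    return [(word, tag(word)) for word in tokenize(text)]


if __name__ == "__main__":
    try:
        import banglanltk as bn
    except ImportError:
        print("banglanltk is not installed; run: pip install banglanltk")
    else:
        for word, pos in tag_words('আজ বৃষ্টি হবে।', bn.word_tokenize, bn.pos_tag):
            print(word, pos)
```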

List of POS tags

POS    Meaning
CC     Conjunction
CD     Cardinal number
DM     Demonstrative
DT     Determiner
EX     Existential there
FW     Foreign word
IN     Preposition
JJ     Adjective
JJR    Adjective, comparative
JJS    Adjective, superlative
MD     Modal
NN     Noun, singular or mass
NNP    Proper noun, singular
NNS    Noun, plural
NNV    Verbal Noun
PR     Pronoun
PRP    Personal pronoun
PRP$   Possessive pronoun
PSP    Postposition
RB     Adverb
RBR    Adverb, comparative
RP     Particles
SYM    Symbol
TO     to
UH     Interjection
UNK    Unknown tag
VB     Verb, base form
VBD    Verb, past tense
VBG    Verb, present participle
VBN    Verb, past participle
VBP    Verb, non-3rd person singular present
WDT    Wh-determiner
WH     Wh words
WP     Wh-pronoun
WRB    Wh-adverb
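
The table above can be kept in code as a lookup so that tags can be printed with their meanings. The sketch below is illustrative only: `POS_MEANINGS` and `describe_tag` are hypothetical names, the entries are copied from the table, and falling back to the `UNK` meaning for unlisted tags is a design choice made here, not documented package behavior. It also assumes the tag is available as a plain string.

```python
# Meanings copied from the POS tag list above.
POS_MEANINGS = {
    "CC": "Conjunction",
    "CD": "Cardinal number",
    "DM": "Demonstrative",
    "DT": "Determiner",
    "EX": "Existential there",
    "FW": "Foreign word",
    "IN": "Preposition",
    "JJ": "Adjective",
    "JJR": "Adjective, comparative",
    "JJS": "Adjective, superlative",
    "MD": "Modal",
    "NN": "Noun, singular or mass",
    "NNP": "Proper noun, singular",
    "NNS": "Noun, plural",
    "NNV": "Verbal Noun",
    "PR": "Pronoun",
    "PRP": "Personal pronoun",
    "PRP$": "Possessive pronoun",
    "PSP": "Postposition",
    "RB": "Adverb",
    "RBR": "Adverb, comparative",
    "RP": "Particles",
    "SYM": "Symbol",
    "TO": "to",
    "UH": "Interjection",
    "UNK": "Unknown tag",
    "VB": "Verb, base form",
    "VBD": "Verb, past tense",
    "VBG": "Verb, present participle",
    "VBN": "Verb, past participle",
    "VBP": "Verb, non-3rd person singular present",
    "WDT": "Wh-determiner",
    "WH": "Wh words",
    "WP": "Wh-pronoun",
    "WRB": "Wh-adverb",
}


def describe_tag(tag: str) -> str:
    """Return the meaning of `tag`; unlisted tags get the UNK meaning (a choice made here)."""
    return POS_MEANINGS.get(tag, POS_MEANINGS["UNK"])


if __name__ == "__main__":
    for tag in ("NNP", "PSP", "XYZ"):
        print(tag, "->", describe_tag(tag))
```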

Source Distributions

No source distribution files are available for this release.

Built Distribution

banglanltk-0.0.4-py3-none-any.whl (462.3 kB)

Uploaded Python 3

File details

Details for the file banglanltk-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: banglanltk-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 462.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.3 requests-toolbelt/0.9.1 tqdm/4.46.0 CPython/3.7.3

File hashes

Hashes for banglanltk-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 544ffd7e731582c4199c254c344ca8577f465e55f70b99667de633482693a184
MD5 05a4fef28fe0bcbb503c6790d772ca4d
BLAKE2b-256 7b5fc553766812c41489748255797b53b69267ee8b2e3548d0d4fcb62878dcf7
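
A downloaded wheel can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch: `sha256_of` and `matches_expected` are hypothetical helper names, and the local file path in the usage section is an assumption about where the wheel was saved.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for banglanltk-0.0.4-py3-none-any.whl.
EXPECTED_SHA256 = "544ffd7e731582c4199c254c344ca8577f465e55f70b99667de633482693a184"


def sha256_of(path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True if the file's SHA256 digest equals `expected` (case-insensitive)."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the wheel saved in the current directory.
    wheel = Path("banglanltk-0.0.4-py3-none-any.whl")
    if wheel.exists():
        print("OK" if matches_expected(wheel) else "MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

pip can perform the same check at install time when a requirements file pins the digest with `--hash=sha256:...` and pip is run with `--require-hashes`.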
