A simple Nepali stemmer

Nepali Stemmer

This is a simple Nepali stemmer. It iteratively separates suffixes (postpositions) from each word until no further separation is possible. The algorithm is based on hindi-stemmer.

Features:

  • Iterative separation
  • Carefully handles postpositions that are followed by punctuation
    • Example: नेपाललाई, -> नेपाल लाई,
  • Basic text cleaning
  • Cross-verification with Nepali dictionary
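The features above can be illustrated with a minimal sketch. This is not the package's actual implementation: the names `SUFFIXES`, `DICTIONARY`, `PUNCTUATION`, and `split_suffixes` are hypothetical, and the tiny suffix list and dictionary are made up for the example. It only shows the general idea of peeling suffixes off repeatedly, setting trailing punctuation aside, and checking the remaining stem against a word list.

```python
import string

# Hypothetical, tiny illustrative data; the real package ships its own lists.
SUFFIXES = ("हरु", "लाई", "को", "का")
DICTIONARY = {"नेपाल", "पार्टी", "मण्डले"}
PUNCTUATION = string.punctuation + "।"


def split_suffixes(word):
    """Peel known suffixes off the end of `word` until none can be separated."""
    # Set trailing punctuation aside so that it does not hide the suffix.
    core = word.rstrip(PUNCTUATION)
    trailing = word[len(core):]
    separated = []
    changed = True
    while changed:
        changed = False
        for suffix in SUFFIXES:
            if not core.endswith(suffix):
                continue
            stem = core[: -len(suffix)]
            # Cross-check: split only if the remainder is a dictionary word
            # or itself ends in another known suffix (to be peeled next).
            if stem and (stem in DICTIONARY or stem.endswith(SUFFIXES)):
                separated.insert(0, suffix)
                core = stem
                changed = True
                break
    return " ".join([core] + separated) + trailing


print(split_suffixes("नेपाललाई,"))
print(split_suffixes("मण्डलेहरुलाई"))
print(split_suffixes("अमेरिका"))
```

In this sketch, अमेरिका is left intact even though it ends in का, because the remainder is not in the toy dictionary. That is the kind of over-stemming a dictionary cross-check is meant to prevent.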

How to run

>>> from nepali_stemmer.stemmer import NepStemmer
>>> nepstem = NepStemmer()
>>> nepstem.stem("नेपालको एमाले पार्टीका झोले, मण्डलेहरु अमेरिका आउने रे !")
'नेपाल को एमाले पार्टी का झोले, मण्डले हरु अमेरिका आउने रे !'
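Since the separated suffixes come back as space-delimited pieces of a single string, a common next step is to turn that string into tokens. The sketch below works on the output string copied from the example above, so it runs without the package installed. `SEPARATED_SUFFIXES` and `content_tokens` are hypothetical names for this illustration, not part of nepali_stemmer.

```python
# Output string copied from the example above.
stemmed = "नेपाल को एमाले पार्टी का झोले, मण्डले हरु अमेरिका आउने रे !"

# Hypothetical list of separated suffixes to drop; not part of the package.
SEPARATED_SUFFIXES = {"को", "का", "हरु"}


def content_tokens(text):
    """Split on whitespace, strip punctuation, and drop separated suffixes."""
    tokens = [token.strip(",!?।") for token in text.split()]
    return [t for t in tokens if t and t not in SEPARATED_SUFFIXES]


tokens = content_tokens(stemmed)
print(tokens)
```

Whether to drop the separated suffixes or keep them as their own tokens depends on the downstream task. For search indexing they are often treated like stop words.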

To-do:

  • Word transformation with stemming process
  • IR evaluation
  • Code-mixed data

Contact

Email: oyashi

Note: This project was created during the COVID-19 quarantine, out of boredom and necessity.

Source distribution: nepali-stemmer-0.0.2.tar.gz (144.8 kB)
Built distribution: nepali_stemmer-0.0.2-py3-none-any.whl (149.0 kB, Python 3)