A simple Nepali stemmer
Nepali Stemmer
This is a simple Nepali stemmer. It iteratively strips suffixes (postpositions) from a word until no further suffix can be separated. The algorithm is based on hindi-stemmer.
Features:
- Iterative separation
- Carefully handles postpositions that are followed by punctuation
- Example: नेपाललाई, -> नेपाल लाई,
- Checks words against a Nepali dictionary
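The features above can be illustrated with a minimal sketch. It is not the code in nep_stemmer.py: the function names, the tiny suffix list, the tiny dictionary, and the rule "stop stripping once the remaining word is in the dictionary" are all assumptions made for illustration. The real suffix list and dictionary come from the projects listed under References.

```python
import string

# Hypothetical stand-ins for the real suffix list and dictionary.
SUFFIXES = ["लाई", "हरू", "को", "मा", "ले"]
DICTIONARY = {"नेपाल", "केटा", "घर"}

# ASCII punctuation plus the Devanagari danda.
PUNCTUATION = string.punctuation + "।"


def split_punctuation(token):
    """Separate trailing punctuation from a token."""
    core = token.rstrip(PUNCTUATION)
    return core, token[len(core):]


def stem(token):
    """Iteratively split suffixes off a token and return the pieces
    joined by spaces, with trailing punctuation re-attached."""
    core, trailing = split_punctuation(token)
    stripped = []  # suffixes in their original left-to-right order

    while True:
        # Assumed dictionary rule: a known word is not stemmed further.
        if core in DICTIONARY:
            break
        # Prefer the longest matching suffix; never strip the whole word.
        match = None
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            if core.endswith(suffix) and len(core) > len(suffix):
                match = suffix
                break
        if match is None:
            break  # no more separation possible
        core = core[: -len(match)]
        stripped.insert(0, match)

    return " ".join([core] + stripped) + trailing


if __name__ == "__main__":
    print(stem("नेपाललाई,"))
    print(stem("केटाहरूलाई"))
```

Splitting the punctuation off first keeps the suffix match simple, since the postposition is then always at the very end of the string. The dictionary check here only guards against over-stemming; the actual project may use its dictionary differently.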
How to run
python nep_stemmer.py
To-do:
- Word transformation when stemmed
- IR evaluation
References:
- Suffix list: https://github.com/birat-bade/NepaliStemmer
- Nepali dictionary: https://github.com/PraveshKoirala/stemmer
- Algorithm: https://github.com/sainimohit23/hindi-stemmer