Library for stemming Madurese text
mecs is a Python stemming library for Madurese (Bahasa Madura) that reduces words to their base form (lemma). It also reports the affixes found on each word (prefix, suffix, and nasal).
Installation
pip install mecs
Example Usage
# Import the stemmer module
from mecs import Stem
# Create a stemmer instance
st = Stem.Stemmer()
# Stem a word; the results are stored as attributes on the stemmer
term = "romana"
st.stemming(term)
print("lemma : ", st.lemma)
# roma
print("prefix : ", st.prefix)
# None
print("suffix : ", st.suffix)
# na
print("nasal : ", st.nasal)
# None
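Because the example above reads its results from attributes on the stemmer after each call, stemming several words means copying those attributes out before the next call. The following is a minimal sketch of one way to do that. The names StemResult, stem_all, and StandInStemmer are hypothetical and not part of mecs. The stand-in class exists only so the sketch runs without the package installed, and it reproduces nothing beyond the documented "romana" result. The sketch also assumes the stemmer overwrites all four attributes on every call.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class StemResult:
    """Hypothetical container for one stemming result (not part of mecs)."""
    term: str
    lemma: Optional[str]
    prefix: Optional[str]
    suffix: Optional[str]
    nasal: Optional[str]


def stem_all(stemmer, terms: Iterable[str]) -> List[StemResult]:
    """Stem each term and copy the stemmer's attributes into a StemResult."""
    results = []
    for term in terms:
        # Assumption: stemming() overwrites lemma/prefix/suffix/nasal each call.
        stemmer.stemming(term)
        results.append(
            StemResult(term, stemmer.lemma, stemmer.prefix,
                       stemmer.suffix, stemmer.nasal)
        )
    return results


class StandInStemmer:
    """Stand-in with the same attribute names as the example above.

    It only knows the documented word "romana"; any other word is
    returned unchanged. It is not the mECS algorithm.
    """

    def __init__(self):
        self.lemma = self.prefix = self.suffix = self.nasal = None

    def stemming(self, term):
        known = {"romana": ("roma", None, "na", None)}
        self.lemma, self.prefix, self.suffix, self.nasal = known.get(
            term, (term, None, None, None)
        )


results = stem_all(StandInStemmer(), ["romana"])
for r in results:
    print(r.term, "->", r.lemma, "| suffix:", r.suffix)
```

With mecs installed, passing Stem.Stemmer() in place of StandInStemmer() should work the same way, provided the real stemmer exposes the attributes shown in the example usage.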
Demo
Live demo : Click here!
References
- Rachman, F. H., Ifada, N., Wahyuni, S., Ramadani, G. D., & Pawitra, A. (2022, November). ModifiedECS (mECS) Algorithm for Madurese-Indonesian Rule-Based Machine Translation. In Proceedings of The 2022 International Conference of Science and Information Technology in Smart Administration (ICSINTESA) (pp. 51-56). IEEE. DOI: 10.1109/ICSINTESA56431.2022.10041470
- Ifada, N., Rachman, F. H., Syauqy, M. W. M. A., Wahyuni, S., & Pawitra, A. (2023). MadureseSet: Madurese-Indonesian Dataset. Data in Brief, 48, 109035. DOI: 10.1016/j.dib.2023.109035