Library for stemming Madurese text

Project description

mecs is a Python stemming library for Madurese (Bahasa Madura). It reduces a Madurese word to its base form (lemma) and also reports the affixes found on the word (prefix, suffix, and nasal).

Installation

pip install mecs

Example Usage

# import package
from mecs import Stem

# Create stemmer
st = Stem.Stemmer()

# Stem a word; the results are stored on the stemmer object
term = "romana"
st.stemming(term)

print("lemma : ", st.lemma)
# roma

print("prefix : ", st.prefix)
# None

print("suffix : ", st.suffix)
# na

print("nasal : ", st.nasal)
# None
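
To process several words at once, the calls above can be wrapped in a small helper. The following is only a sketch: `analyze_words` and `main` are hypothetical names that are not part of mecs, and the helper assumes that a single stemmer object can be reused, with `lemma`, `prefix`, `suffix` and `nasal` reflecting the most recent `stemming()` call.

```python
from typing import Dict, Iterable, List, Optional


def analyze_words(stemmer, words: Iterable[str]) -> List[Dict[str, Optional[str]]]:
    """Stem each word and collect its lemma and affixes in one dict per word.

    `stemmer` is expected to behave like the `Stem.Stemmer()` object shown
    above: calling `stemming(term)` fills in `lemma`, `prefix`, `suffix`
    and `nasal`. Reusing one stemmer for many words is an assumption.
    """
    results = []
    for word in words:
        stemmer.stemming(word)
        # Copy the attributes right away, before the next call overwrites them.
        results.append({
            "word": word,
            "lemma": stemmer.lemma,
            "prefix": stemmer.prefix,
            "suffix": stemmer.suffix,
            "nasal": stemmer.nasal,
        })
    return results


def main() -> None:
    # Imported here so the helper above stays importable without mecs installed.
    from mecs import Stem

    for row in analyze_words(Stem.Stemmer(), ["romana"]):
        print(row)
```

Call `main()` in an environment where mecs is installed to print one dict per word. Because the helper takes the stemmer as an argument, it can also be exercised with a stand-in object in tests.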

References

  • Rachman, F. H., Ifada, N., Wahyuni, S., Ramadani, G. D., & Pawitra, A. (2022, November). ModifiedECS (mECS) Algorithm for Madurese-Indonesian Rule-Based Machine Translation. In Proceedings of The 2022 International Conference of Science and Information Technology in Smart Administration (ICSINTESA) (pp. 51-56). IEEE. DOI: 10.1109/ICSINTESA56431.2022.10041470
  • Ifada, N., Rachman, F. H., Syauqy, M. W. M. A., Wahyuni, S., & Pawitra, A. (2023). MadureseSet: Madurese-Indonesian Dataset. Data in Brief, 48, 109035. DOI: 10.1016/j.dib.2023.109035
