A Python package for Balinese Text Preprocessing

Package for Balinese Text Preprocessing

This is the first package for preprocessing raw Balinese text. It provides several functions that you can use to prepare raw text and convert it into a clean version.

Installation

pip install balinese_textpreprocessor

Usage

from balinese_textpreprocessor import TextPreprocessor

# Raw sentence containing mixed case, symbols, and digits
sentence = "I Budi ngalahin **& I Lutunge 12354!!"
preprocessor = TextPreprocessor()

# Each step takes the output of the previous step as its input
preprocessed_sentence = preprocessor.case_folding(sentence)
preprocessed_sentence = preprocessor.remove_number(preprocessed_sentence)
preprocessed_sentence = preprocessor.remove_punctuation(preprocessed_sentence)
preprocessed_sentence = preprocessor.normalize_words(preprocessed_sentence)
preprocessed_sentence = preprocessor.lemmatize_text(preprocessed_sentence)
print(preprocessed_sentence)
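
The usage above chains each step by hand. As a rough sketch of the same idea, the steps can be passed as a list to a small helper that applies them in order. The helper `run_pipeline` and the three stand-in functions below are hypothetical and use only the standard library; they are not the package's implementations, and dictionary-based steps such as word normalization and lemmatization are deliberately left out.

```python
import re
import string
from functools import reduce
from typing import Callable, Iterable

Step = Callable[[str], str]


def run_pipeline(text: str, steps: Iterable[Step]) -> str:
    """Apply each step to the output of the previous one, in order."""
    return reduce(lambda current, step: step(current), steps, text)


# Hypothetical stand-ins for illustration only (assumed behavior,
# NOT taken from balinese_textpreprocessor).
def fold_case(text: str) -> str:
    # Lowercase the whole string.
    return text.lower()


def strip_digits(text: str) -> str:
    # Remove every run of ASCII digits.
    return re.sub(r"\d+", "", text)


def strip_punctuation(text: str) -> str:
    # Drop ASCII punctuation, then collapse leftover whitespace.
    without = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(without.split())


sentence = "I Budi ngalahin **& I Lutunge 12354!!"
cleaned = run_pipeline(sentence, [fold_case, strip_digits, strip_punctuation])
print(cleaned)  # i budi ngalahin i lutunge
```

With the package installed, the same helper should accept the bound methods shown in the usage section, for example `run_pipeline(sentence, [preprocessor.case_folding, preprocessor.remove_number, preprocessor.remove_punctuation])`, since each of them takes a string and returns a string in that example.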

Acknowledgement

Please cite these papers if you find this package useful:

[1] Arimbawa, I. G. A. P., & ER, N. A. S. (2017). Lemmatization in Balinese language. Jurnal Elektronik Ilmu Komputer Udayana, p-ISSN 2301-5373.

[2] Pradiptha, I. G. M. H., & ER, N. A. S. (2020). Building Balinese part-of-speech tagger using hidden Markov model (HMM). Jurnal Elektronik Ilmu Komputer Udayana, p-ISSN 2301-5373.
