A simple library for Khmer text processing, with optional dependencies for different features.

Khmer Easy Tools

A simple, user-friendly Python library for common Khmer Natural Language Processing (NLP) tasks. This package uses optional dependencies to provide different features.

Installation

Install the base package (which includes is_khmer and stop word utilities):

pip install khmereasytools

Installing Optional Features

Install only the extras you need. This helps when one of the dependencies fails to install on your system.

# To install support for khmercut (khfilter)
# The quotes keep shells such as zsh from treating the brackets as a glob pattern.
pip install "khmereasytools[khmercut]"

# To install support for khmernltk (khseg, pos_tag, syllable_segment)
pip install "khmereasytools[khmernltk]"

# To install support for OCR
pip install "khmereasytools[ocr]"

# To install everything
pip install "khmereasytools[all]"

For OCR functionality, you must also install Google's Tesseract OCR engine on your system.
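
Before using the OCR features, it can help to confirm that a Tesseract executable is reachable. The following is a minimal sketch, not part of khmereasytools: the helper name tesseract_available is hypothetical, and it assumes the engine is installed under the usual command name tesseract and is on your PATH.

```python
import shutil


def tesseract_available(binary: str = "tesseract") -> bool:
    """Return True if an executable named `binary` is found on PATH."""
    # shutil.which returns the full path to the executable, or None if absent.
    return shutil.which(binary) is not None


if tesseract_available():
    print("Tesseract found on PATH")
else:
    print("Tesseract not found; install it before using the OCR features")
```

If Tesseract is installed in a non-standard location, this check reports it as missing even though the OCR tooling could be configured to find it.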

How to Use

Khmer Character Validation (is_khmer)

import khmereasytools as ket
print(ket.is_khmer("សួស្តី"))  # True
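
To illustrate what a Khmer character check can look like, here is a rough sketch based on the Unicode Khmer block (U+1780 to U+17FF). It is not the library's implementation: looks_khmer is a hypothetical helper, and the rules it applies (ignore whitespace, reject empty strings, require every remaining character to be in the block) are assumptions that may differ from how is_khmer behaves.

```python
# Unicode "Khmer" block; the separate "Khmer Symbols" block is not covered here.
KHMER_START, KHMER_END = 0x1780, 0x17FF


def looks_khmer(text: str) -> bool:
    """Hypothetical helper: True if every non-space character is in the Khmer block."""
    chars = [ch for ch in text if not ch.isspace()]
    # An empty or whitespace-only string is treated as "not Khmer".
    return bool(chars) and all(KHMER_START <= ord(ch) <= KHMER_END for ch in chars)


print(looks_khmer("សួស្តី"))
print(looks_khmer("hello"))
```

A stricter or looser rule (for example, accepting text that is only mostly Khmer) is a design choice; this sketch takes the strict route so that mixed-script input is rejected.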

Keyword Extraction (khfilter)

Requires khmercut to be installed.

import khmereasytools as ket
# pip install khmereasytools[khmercut]
text = "នេះគឺជាប្រាសាទអង្គរវត្តស្ថិតនៅក្នុងខេត្តសៀមរាប"
keywords = ket.khfilter(text)
print(f"Keywords: '{keywords}'")
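
Because khfilter depends on an optional package, a script may want to check for it before calling the function. The sketch below uses only the standard library; has_module is a hypothetical helper, and it assumes the optional dependency is importable under the module name khmercut. How khmereasytools itself reacts when the dependency is missing is not covered here.

```python
from importlib.util import find_spec


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found for import."""
    # find_spec returns None when no such top-level module is installed.
    return find_spec(name) is not None


if has_module("khmercut"):
    print("khmercut is installed; khfilter should be usable")
else:
    print('khmercut is missing; try: pip install "khmereasytools[khmercut]"')
```

The same check works for the other extras by passing the relevant module name, for example khmernltk before calling khseg.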

Text Segmentation (khseg)

Requires khmernltk to be installed.

import khmereasytools as ket
# pip install khmereasytools[khmernltk]
text = "នេះគឺជាប្រាសាទអង្គរវត្ត"
words = ket.khseg(text)
print(f"Segmented Words: {words}")
