A simple library for Khmer text processing, with optional dependencies for different features.

Khmer Easy Tools

A simple, user-friendly Python library for common Khmer Natural Language Processing (NLP) tasks. This package uses optional dependencies to provide different features.

Installation

Install the base package (which includes is_khmer and stop word utilities):

pip install khmereasytools

Installing Optional Features

Each optional feature is a separate extra, so you can install only the ones you need. This helps if one of the dependencies fails to install on your system.

# To install support for khmercut (for khfilter)
pip install "khmereasytools[khmercut]"

# To install support for khmernltk (for khseg, khpos, khsyllable)
pip install "khmereasytools[khmernltk]"

# To install support for OCR (for khocr)
pip install "khmereasytools[ocr]"

# To install everything
pip install "khmereasytools[all]"
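
To see which optional features are usable in the current environment, one option is to check whether the underlying modules can be found. The sketch below is only an illustration: the module names in `ASSUMED_MODULES` are guesses based on the names of the extras (in particular, `pytesseract` for the `ocr` extra is an assumption), and `installed_extras` is a hypothetical helper, not part of khmereasytools.

```python
from importlib.util import find_spec

# Assumed mapping from each extra to a top-level module it provides.
# These names are guesses based on the extras' names, not confirmed
# by the package documentation.
ASSUMED_MODULES = {
    "khmercut": "khmercut",
    "khmernltk": "khmernltk",
    "ocr": "pytesseract",
}


def installed_extras(modules=ASSUMED_MODULES):
    """Return {extra_name: bool}, True when the module can be located."""
    return {
        extra: find_spec(module) is not None
        for extra, module in modules.items()
    }


for extra, available in installed_extras().items():
    status = "available" if available else "missing"
    print(f"{extra}: {status}")
```

`find_spec` only locates a module without importing it, so the check stays cheap even when a dependency is slow to load.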

For OCR functionality, you must also install Google's Tesseract OCR engine on your system.
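
A quick way to confirm that the engine is installed is to look for its executable on the `PATH`. This is a minimal sketch that assumes the binary is named `tesseract` and is expected on the `PATH`; `find_tesseract` is a hypothetical helper, not a function from this package.

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path to the executable, or None if it is not on PATH."""
    return shutil.which(executable)


path = find_tesseract()
if path is None:
    print("Tesseract was not found on PATH; install it before using OCR.")
else:
    print(f"Tesseract found at {path}")
```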

How to Use

Khmer Character Validation (is_khmer)

import khmereasytools as ket
print(ket.is_khmer("សួស្តី"))  # True
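
For intuition, a check like this can be built on Unicode code point ranges. The sketch below is not the library's implementation, and it is not known whether `is_khmer` tests for any Khmer character or for all of them, so both variants are shown. The function names are hypothetical; the ranges are the Unicode Khmer block (U+1780 to U+17FF) and the Khmer Symbols block (U+19E0 to U+19FF).

```python
def is_khmer_char(ch):
    """True if ch falls in the Khmer or Khmer Symbols Unicode block."""
    code = ord(ch)
    return 0x1780 <= code <= 0x17FF or 0x19E0 <= code <= 0x19FF


def contains_khmer(text):
    """True if at least one character is Khmer."""
    return any(is_khmer_char(ch) for ch in text)


def is_all_khmer(text):
    """True if the text has at least one non-whitespace character
    and every non-whitespace character is Khmer."""
    chars = [ch for ch in text if not ch.isspace()]
    return bool(chars) and all(is_khmer_char(ch) for ch in chars)


print(contains_khmer("សួស្តី"))        # True
print(contains_khmer("hello"))        # False
print(is_all_khmer("សួស្តី world"))    # False
```

Khmer text often contains zero-width spaces (U+200B), which `str.isspace()` does not treat as whitespace, so a stricter check like `is_all_khmer` may need to strip them first.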

Keyword Extraction (khfilter)

Requires khmercut to be installed.

import khmereasytools as ket
# pip install khmereasytools[khmercut]
text = "នេះគឺជាប្រាសាទអង្គរវត្តស្ថិតនៅក្នុងខេត្តសៀមរាប"
keywords = ket.khfilter(text)
print(f"Keywords: '{keywords}'")
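
Since the base package ships stop word utilities and `khfilter` depends on a segmenter, keyword extraction of this kind is plausibly segmentation followed by stop word removal. That is an assumption about the approach, not a description of what `khfilter` actually does. The sketch below illustrates only the filtering step, with a hand-written token list and a tiny hypothetical stop word set; the library's own list will differ.

```python
# Hypothetical stop word set for illustration only.
EXAMPLE_STOP_WORDS = {"នេះ", "គឺជា", "នៅ", "ក្នុង"}


def filter_keywords(tokens, stop_words=EXAMPLE_STOP_WORDS):
    """Keep tokens that are not whitespace-only and not in stop_words."""
    return [t for t in tokens if t.strip() and t not in stop_words]


# Hand-written tokens standing in for segmenter output.
tokens = ["នេះ", "គឺជា", "ប្រាសាទ", "អង្គរវត្ត"]
print(filter_keywords(tokens))
```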

Text Segmentation (khseg)

Requires khmernltk to be installed.

import khmereasytools as ket
# pip install khmereasytools[khmernltk]
text = "នេះគឺជាប្រាសាទអង្គរវត្ត"
words = ket.khseg(text)
print(f"Segmented Words: {words}")
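
Segmented output is usually a starting point for further processing, such as counting word frequencies. The sketch below assumes the segmenter returns a list of strings that may include whitespace-only tokens; that return shape is an assumption, and `word_frequencies` is a hypothetical helper. A hand-written list stands in for real `khseg` output.

```python
from collections import Counter


def word_frequencies(words):
    """Count tokens, ignoring whitespace-only entries."""
    return Counter(w for w in words if w.strip())


# Hand-written token list standing in for segmenter output.
words = ["ខ្ញុំ", " ", "ស្រឡាញ់", " ", "ភាសាខ្មែរ", " ", "ខ្ញុំ"]
for word, count in word_frequencies(words).most_common():
    print(word, count)
```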

Syllable Segmentation (khsyllable)

Requires khmernltk to be installed.

import khmereasytools as ket
text = "សាលារៀន"
syllables = ket.khsyllable(text)
print(f"Syllables: {syllables}")

Part-of-Speech Tagging (khpos)

Requires khmernltk to be installed.

import khmereasytools as ket
text = "ខ្ញុំ ស្រឡាញ់ ភាសាខ្មែរ"
tags = ket.khpos(text)
print(f"POS Tags: {tags}")

OCR from Image (khocr)

Requires the ocr dependencies and the Tesseract OCR engine to be installed.

import khmereasytools as ket
# pip install khmereasytools[ocr]
# Make sure you have an image file e.g., 'khmer_text.png'
# text_from_image = ket.khocr('khmer_text.png')
# print(f"Text from OCR: {text_from_image}")
