LEKCut (เล็ก คัด) is a Thai tokenization library that ports deep learning models to ONNX.

Project description

LEKCut

Install

pip install lekcut

How to use

from lekcut import word_tokenize
word_tokenize("ทดสอบการตัดคำ")
# output: ['ทดสอบ', 'การ', 'ตัด', 'คำ']

API

word_tokenize(text: str, model: str="deepcut", path: str="default") -> List[str]
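The signature above can be exercised with explicit keyword arguments. The following is a minimal sketch, not part of LEKCut: `tokenize_batch` is a hypothetical helper that accepts any callable with the documented `word_tokenize` parameters (`text`, `model`, `path`) and applies it to several strings, passing the documented defaults explicitly.

```python
from typing import Callable, Iterable, List


def tokenize_batch(
    texts: Iterable[str],
    tokenizer: Callable[..., List[str]],
    model: str = "deepcut",   # documented default for word_tokenize
    path: str = "default",    # documented default for word_tokenize
) -> List[List[str]]:
    """Tokenize each text with the given tokenizer (hypothetical helper).

    `tokenizer` is expected to follow the documented signature
    word_tokenize(text, model=..., path=...).
    """
    return [tokenizer(text, model=model, path=path) for text in texts]


# Usage with LEKCut installed:
# from lekcut import word_tokenize
# tokenize_batch(["ทดสอบการตัดคำ"], word_tokenize)
```

Passing the tokenizer in as an argument keeps the helper independent of how the model is loaded, so the same code works with the default model or a custom one.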

Model

  • deepcut - The deepcut model, ported from tensorflow.keras to ONNX. The model and code come from Deepcut's GitHub repository.

Load custom model

If you have trained a custom model with deepcut, or with another model that LEKCut supports, you can load it by passing its path to word_tokenize after porting the model to ONNX.

  • How to train a custom model with your dataset using deepcut - Notebook (deepcut/train.py needs to be updated before training the model)
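Loading a custom model by path might look like the sketch below. It assumes that the `path` argument of `word_tokenize` accepts the location of the ported ONNX file. Both `tokenize_with_custom_model` and the file name `my_deepcut.onnx` are hypothetical and used only for illustration.

```python
from pathlib import Path
from typing import Callable, List


def tokenize_with_custom_model(
    text: str,
    model_path: str,
    tokenizer: Callable[..., List[str]],
) -> List[str]:
    """Tokenize `text` with a custom ported model (hypothetical helper).

    Checks that the model file exists before handing its path to the
    tokenizer, so a wrong path fails early with a clear error.
    """
    model_file = Path(model_path)
    if not model_file.is_file():
        raise FileNotFoundError(f"Custom model not found: {model_file}")
    # Assumption: `path` takes the location of the ported ONNX model file.
    return tokenizer(text, model="deepcut", path=str(model_file))


# Usage with LEKCut installed ("my_deepcut.onnx" is a placeholder name):
# from lekcut import word_tokenize
# tokenize_with_custom_model("ทดสอบการตัดคำ", "my_deepcut.onnx", word_tokenize)
```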

How to port a model?

See notebooks/

This version

0.1
