Blazing-fast Thai text processing library powered by Rust

Project description

Thongna 🌾

Thongna (ท้องนา) is a high-performance text processing library for the Thai language, built with Rust and exposed as a Python package. Inspired by PyThaiNLP, Thongna aims to provide efficient Thai language processing tools with the speed and reliability of Rust.

Features

  • Efficient Thai word segmentation: Break Thai text into meaningful tokens using the NewMM algorithm.
  • Fast and reliable: Built with Rust, Thongna offers the performance you need for large-scale text processing.
  • Python integration: Easily use Thongna in your Python projects with its simple and intuitive API.
  • Custom dictionary support: Load and use custom dictionaries for specialized segmentation tasks.
  • Text normalization: Standardize Thai text by handling common inconsistencies and variations.
  • Parallel processing: Use multiple CPU cores to speed up processing of large texts.
  • Safe mode: Prevent infinite loops in tokenization for extra reliability.
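To make the segmentation and custom dictionary features above more concrete, here is a minimal pure-Python sketch of dictionary-based maximal matching. It is not Thongna's implementation or API: the function name `segment`, the tiny word lists, and the cost rule are assumptions made for illustration. NewMM as described by PyThaiNLP also constrains matches to Thai Character Cluster boundaries, which this sketch leaves out.

```python
from typing import Iterable, List, Tuple


def segment(text: str, dictionary: Iterable[str]) -> List[str]:
    """Split text into dictionary words, preferring few unknown characters.

    Hypothetical helper for illustration only; not part of Thongna.
    """
    words = {w for w in dictionary if w}
    longest = max((len(w) for w in words), default=1)
    n = len(text)

    # best[i] = (unknown_chars, token_count) for the cheapest split of text[:i]
    worst = (n + 1, n + 1)
    best: List[Tuple[int, int]] = [(0, 0)] + [worst] * n
    back: List[int] = [0] * (n + 1)

    for start in range(n):
        unknown, count = best[start]
        for end in range(start + 1, min(n, start + longest) + 1):
            piece = text[start:end]
            if piece in words:
                candidate = (unknown, count + 1)
            elif end == start + 1:
                # Fallback: an out-of-dictionary character becomes its own token.
                candidate = (unknown + 1, count + 1)
            else:
                continue
            if candidate < best[end]:
                best[end] = candidate
                back[end] = start

    # Walk the back-pointers from the end of the text to recover the tokens.
    tokens: List[str] = []
    pos = n
    while pos > 0:
        tokens.append(text[back[pos]:pos])
        pos = back[pos]
    tokens.reverse()
    return tokens


# Assumed toy data: a base word list plus one "custom dictionary" entry.
base_words = ["ฉัน", "กิน", "ข้าว"]
custom_words = ["ข้าวผัด"]

print(segment("ฉันกินข้าวผัด", base_words))
print(segment("ฉันกินข้าวผัด", base_words + custom_words))
```

Without the custom entry, the characters of "ผัด" fall back to single-character tokens; with it, "ข้าวผัด" is kept as one token. That is the kind of effect a specialized dictionary is meant to have.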

Project Details

  • Version: 0.2.3 (as of the latest release)
  • License: Apache-2.0
  • Supported Python versions: 3.8+
  • Rust edition: 2021
  • Key dependencies:
    • PyO3 for Rust-Python interoperability
    • Rayon for parallel processing
    • Regex for text manipulation
  • CI/CD: GitHub Actions runs automated tests and builds on Linux, macOS, and Windows
  • Package distribution: Available on PyPI, with pre-built wheels for various platforms and architectures
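The Features list mentions text normalization and parallel processing, and the dependencies above include Regex and Rayon. As a rough illustration of what regex-based normalization of Thai text can look like, here is a small Python sketch. The rule set and the names `normalize` and `normalize_many` are assumptions for this example; Thongna's actual rules and API may differ. The thread pool only mirrors the shape of a batch call: CPython threads do not speed up CPU-bound pure-Python work the way Rayon does in Rust.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List

# Zero-width characters that often sneak into copied text.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\ufeff]")
# The same tone mark (mai ek .. mai chattawa) typed more than once in a row.
REPEATED_TONE = re.compile(r"([\u0e48-\u0e4b])\1+")
WHITESPACE = re.compile(r"\s+")


def normalize(text: str) -> str:
    """Apply a few common Thai clean-up rules (illustrative, not exhaustive)."""
    text = ZERO_WIDTH.sub("", text)
    # Two SARA E typed in place of one SARA AE.
    text = text.replace("\u0e40\u0e40", "\u0e41")
    # NIKHAHIT followed by SARA AA is the decomposed form of SARA AM.
    text = text.replace("\u0e4d\u0e32", "\u0e33")
    text = REPEATED_TONE.sub(r"\1", text)
    return WHITESPACE.sub(" ", text).strip()


def normalize_many(texts: Iterable[str], max_workers: int = 4) -> List[str]:
    """Normalize a batch of texts; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(normalize, texts))


print(normalize_many(["\u0e40\u0e40\u0e01", "a\u200bb   c "]))
```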

Installation

To install Thongna, ensure you have Python 3.8+ installed, then use pip:

    pip install thongna
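Once pip has finished, one way to confirm that the interpreter meets the 3.8+ requirement and that the distribution is visible is the standard-library check sketched below. The helper name `check_environment` is hypothetical. The sketch relies only on `sys.version_info` and `importlib.metadata` and does not call any Thongna function.

```python
import sys
from importlib import metadata  # available in the standard library from Python 3.8
from typing import Optional, Tuple

MIN_PYTHON = (3, 8)


def check_environment(dist_name: str = "thongna") -> Tuple[bool, Optional[str]]:
    """Return (python_is_new_enough, installed_version_or_None)."""
    python_ok = sys.version_info >= MIN_PYTHON
    try:
        installed = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        installed = None
    return python_ok, installed


python_ok, installed = check_environment()
print("Python 3.8+:", python_ok)
print("thongna version:", installed if installed else "not installed")
```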

Why Thongna? 🌾

The name "Thongna" (ท้องนา) means "rice field" in Thai, symbolizing growth, nourishment, and the foundational aspects of life. Just like a rice field sustains life, Thongna provides the essential tools for working with Thai text, ensuring that your applications can grow and thrive.

Contributing

We welcome contributions from the community! If you’d like to contribute to Thongna, please follow these steps:

  • Fork the repository.
  • Create a new branch for your feature or bugfix.
  • Submit a pull request with a clear explanation of your changes.

License

Thongna is licensed under the Apache License 2.0. See the LICENSE file for more details.

Contact

For any questions, suggestions, or issues, feel free to open an issue or contact the maintainers directly.

Download files

Source Distribution

  • thongna-0.2.3.tar.gz (339.3 kB): source

Built Distributions

  • thongna-0.2.3-pp310-pypy310_pp73-musllinux_1_2_x86_64.whl (1.5 MB): PyPy, musllinux (musl 1.2+), x86-64
  • thongna-0.2.3-pp310-pypy310_pp73-musllinux_1_2_i686.whl (1.4 MB): PyPy, musllinux (musl 1.2+), i686
  • thongna-0.2.3-pp310-pypy310_pp73-musllinux_1_2_armv7l.whl (1.5 MB): PyPy, musllinux (musl 1.2+), ARMv7l
  • thongna-0.2.3-pp310-pypy310_pp73-musllinux_1_2_aarch64.whl (1.5 MB): PyPy, musllinux (musl 1.2+), ARM64
  • thongna-0.2.3-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB): PyPy, manylinux (glibc 2.17+), x86-64
  • thongna-0.2.3-pp310-pypy310_pp73-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.4 MB): PyPy, manylinux (glibc 2.17+), s390x
  • thongna-0.2.3-pp310-pypy310_pp73-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.4 MB): PyPy, manylinux (glibc 2.17+), ppc64le
  • thongna-0.2.3-pp310-pypy310_pp73-manylinux_2_17_i686.manylinux2014_i686.whl (1.3 MB): PyPy, manylinux (glibc 2.17+), i686
  • thongna-0.2.3-pp310-pypy310_pp73-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.2 MB): PyPy, manylinux (glibc 2.17+), ARMv7l
  • thongna-0.2.3-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): PyPy, manylinux (glibc 2.17+), ARM64
  • thongna-0.2.3-pp39-pypy39_pp73-musllinux_1_2_x86_64.whl (1.5 MB): PyPy, musllinux (musl 1.2+), x86-64
  • thongna-0.2.3-pp39-pypy39_pp73-musllinux_1_2_i686.whl (1.4 MB): PyPy, musllinux (musl 1.2+), i686
  • thongna-0.2.3-pp39-pypy39_pp73-musllinux_1_2_armv7l.whl (1.5 MB): PyPy, musllinux (musl 1.2+), ARMv7l
  • thongna-0.2.3-pp39-pypy39_pp73-musllinux_1_2_aarch64.whl (1.5 MB): PyPy, musllinux (musl 1.2+), ARM64
  • thongna-0.2.3-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB): PyPy, manylinux (glibc 2.17+), x86-64
  • thongna-0.2.3-pp39-pypy39_pp73-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.4 MB): PyPy, manylinux (glibc 2.17+), s390x
  • thongna-0.2.3-pp39-pypy39_pp73-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.4 MB): PyPy, manylinux (glibc 2.17+), ppc64le
  • thongna-0.2.3-pp39-pypy39_pp73-manylinux_2_17_i686.manylinux2014_i686.whl (1.3 MB): PyPy, manylinux (glibc 2.17+), i686
  • thongna-0.2.3-pp39-pypy39_pp73-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.2 MB): PyPy, manylinux (glibc 2.17+), ARMv7l
  • thongna-0.2.3-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): PyPy, manylinux (glibc 2.17+), ARM64
  • thongna-0.2.3-pp38-pypy38_pp73-musllinux_1_2_x86_64.whl (1.5 MB): PyPy, musllinux (musl 1.2+), x86-64
  • thongna-0.2.3-pp38-pypy38_pp73-musllinux_1_2_i686.whl (1.4 MB): PyPy, musllinux (musl 1.2+), i686
  • thongna-0.2.3-pp38-pypy38_pp73-musllinux_1_2_armv7l.whl (1.5 MB): PyPy, musllinux (musl 1.2+), ARMv7l
  • thongna-0.2.3-pp38-pypy38_pp73-musllinux_1_2_aarch64.whl (1.5 MB): PyPy, musllinux (musl 1.2+), ARM64
  • thongna-0.2.3-pp38-pypy38_pp73-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.4 MB): PyPy, manylinux (glibc 2.17+), s390x
  • thongna-0.2.3-pp38-pypy38_pp73-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.4 MB): PyPy, manylinux (glibc 2.17+), ppc64le
  • thongna-0.2.3-pp38-pypy38_pp73-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.2 MB): PyPy, manylinux (glibc 2.17+), ARMv7l
  • thongna-0.2.3-pp38-pypy38_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): PyPy, manylinux (glibc 2.17+), ARM64
  • thongna-0.2.3-cp312-none-win_amd64.whl (986.4 kB): CPython 3.12, Windows, x86-64
  • thongna-0.2.3-cp312-none-win32.whl (891.0 kB): CPython 3.12, Windows, x86
  • thongna-0.2.3-cp312-cp312-musllinux_1_2_x86_64.whl (1.5 MB): CPython 3.12, musllinux (musl 1.2+), x86-64
  • thongna-0.2.3-cp312-cp312-musllinux_1_2_i686.whl (1.4 MB): CPython 3.12, musllinux (musl 1.2+), i686
  • thongna-0.2.3-cp312-cp312-musllinux_1_2_armv7l.whl (1.5 MB): CPython 3.12, musllinux (musl 1.2+), ARMv7l
  • thongna-0.2.3-cp312-cp312-musllinux_1_2_aarch64.whl (1.5 MB): CPython 3.12, musllinux (musl 1.2+), ARM64
  • thongna-0.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB): CPython 3.12, manylinux (glibc 2.17+), x86-64
  • thongna-0.2.3-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.4 MB): CPython 3.12, manylinux (glibc 2.17+), s390x
  • thongna-0.2.3-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.4 MB): CPython 3.12, manylinux (glibc 2.17+), ppc64le
  • thongna-0.2.3-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl (1.3 MB): CPython 3.12, manylinux (glibc 2.17+), i686
  • thongna-0.2.3-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.2 MB): CPython 3.12, manylinux (glibc 2.17+), ARMv7l
  • thongna-0.2.3-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.12, manylinux (glibc 2.17+), ARM64
  • thongna-0.2.3-cp312-cp312-macosx_11_0_arm64.whl (1.1 MB): CPython 3.12, macOS 11.0+, ARM64
  • thongna-0.2.3-cp312-cp312-macosx_10_12_x86_64.whl (1.2 MB): CPython 3.12, macOS 10.12+, x86-64
  • thongna-0.2.3-cp311-none-win_amd64.whl (985.8 kB): CPython 3.11, Windows, x86-64
  • thongna-0.2.3-cp311-none-win32.whl (891.2 kB): CPython 3.11, Windows, x86
  • thongna-0.2.3-cp311-cp311-musllinux_1_2_x86_64.whl (1.5 MB): CPython 3.11, musllinux (musl 1.2+), x86-64
  • thongna-0.2.3-cp311-cp311-musllinux_1_2_i686.whl (1.4 MB): CPython 3.11, musllinux (musl 1.2+), i686
  • thongna-0.2.3-cp311-cp311-musllinux_1_2_armv7l.whl (1.5 MB): CPython 3.11, musllinux (musl 1.2+), ARMv7l
  • thongna-0.2.3-cp311-cp311-musllinux_1_2_aarch64.whl (1.5 MB): CPython 3.11, musllinux (musl 1.2+), ARM64
  • thongna-0.2.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB): CPython 3.11, manylinux (glibc 2.17+), x86-64
  • thongna-0.2.3-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.4 MB): CPython 3.11, manylinux (glibc 2.17+), s390x
  • thongna-0.2.3-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.4 MB): CPython 3.11, manylinux (glibc 2.17+), ppc64le
  • thongna-0.2.3-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl (1.3 MB): CPython 3.11, manylinux (glibc 2.17+), i686
  • thongna-0.2.3-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.2 MB): CPython 3.11, manylinux (glibc 2.17+), ARMv7l
  • thongna-0.2.3-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.11, manylinux (glibc 2.17+), ARM64
  • thongna-0.2.3-cp311-cp311-macosx_11_0_arm64.whl (1.1 MB): CPython 3.11, macOS 11.0+, ARM64
  • thongna-0.2.3-cp311-cp311-macosx_10_12_x86_64.whl (1.2 MB): CPython 3.11, macOS 10.12+, x86-64
  • thongna-0.2.3-cp310-none-win_amd64.whl (985.7 kB): CPython 3.10, Windows, x86-64
  • thongna-0.2.3-cp310-none-win32.whl (891.0 kB): CPython 3.10, Windows, x86
  • thongna-0.2.3-cp310-cp310-musllinux_1_2_x86_64.whl (1.5 MB): CPython 3.10, musllinux (musl 1.2+), x86-64
  • thongna-0.2.3-cp310-cp310-musllinux_1_2_i686.whl (1.4 MB): CPython 3.10, musllinux (musl 1.2+), i686
  • thongna-0.2.3-cp310-cp310-musllinux_1_2_armv7l.whl (1.5 MB): CPython 3.10, musllinux (musl 1.2+), ARMv7l
  • thongna-0.2.3-cp310-cp310-musllinux_1_2_aarch64.whl (1.5 MB): CPython 3.10, musllinux (musl 1.2+), ARM64
  • thongna-0.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB): CPython 3.10, manylinux (glibc 2.17+), x86-64
  • thongna-0.2.3-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.4 MB): CPython 3.10, manylinux (glibc 2.17+), s390x
  • thongna-0.2.3-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.4 MB): CPython 3.10, manylinux (glibc 2.17+), ppc64le
  • thongna-0.2.3-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl (1.3 MB): CPython 3.10, manylinux (glibc 2.17+), i686
  • thongna-0.2.3-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.2 MB): CPython 3.10, manylinux (glibc 2.17+), ARMv7l
  • thongna-0.2.3-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.10, manylinux (glibc 2.17+), ARM64
  • thongna-0.2.3-cp310-cp310-macosx_11_0_arm64.whl (1.1 MB): CPython 3.10, macOS 11.0+, ARM64
  • thongna-0.2.3-cp39-none-win_amd64.whl (985.9 kB): CPython 3.9, Windows, x86-64
  • thongna-0.2.3-cp39-none-win32.whl (891.1 kB): CPython 3.9, Windows, x86
  • thongna-0.2.3-cp39-cp39-musllinux_1_2_x86_64.whl (1.5 MB): CPython 3.9, musllinux (musl 1.2+), x86-64
  • thongna-0.2.3-cp39-cp39-musllinux_1_2_i686.whl (1.4 MB): CPython 3.9, musllinux (musl 1.2+), i686
  • thongna-0.2.3-cp39-cp39-musllinux_1_2_armv7l.whl (1.5 MB): CPython 3.9, musllinux (musl 1.2+), ARMv7l
  • thongna-0.2.3-cp39-cp39-musllinux_1_2_aarch64.whl (1.5 MB): CPython 3.9, musllinux (musl 1.2+), ARM64
  • thongna-0.2.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB): CPython 3.9, manylinux (glibc 2.17+), x86-64
  • thongna-0.2.3-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.4 MB): CPython 3.9, manylinux (glibc 2.17+), s390x
  • thongna-0.2.3-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.4 MB): CPython 3.9, manylinux (glibc 2.17+), ppc64le
  • thongna-0.2.3-cp39-cp39-manylinux_2_17_i686.manylinux2014_i686.whl (1.3 MB): CPython 3.9, manylinux (glibc 2.17+), i686
  • thongna-0.2.3-cp39-cp39-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.2 MB): CPython 3.9, manylinux (glibc 2.17+), ARMv7l
  • thongna-0.2.3-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.9, manylinux (glibc 2.17+), ARM64
  • thongna-0.2.3-cp39-cp39-macosx_11_0_arm64.whl (1.1 MB): CPython 3.9, macOS 11.0+, ARM64
  • thongna-0.2.3-cp38-none-win_amd64.whl (985.6 kB): CPython 3.8, Windows, x86-64
  • thongna-0.2.3-cp38-none-win32.whl (890.9 kB): CPython 3.8, Windows, x86
  • thongna-0.2.3-cp38-cp38-musllinux_1_2_x86_64.whl (1.5 MB): CPython 3.8, musllinux (musl 1.2+), x86-64
  • thongna-0.2.3-cp38-cp38-musllinux_1_2_i686.whl (1.4 MB): CPython 3.8, musllinux (musl 1.2+), i686
  • thongna-0.2.3-cp38-cp38-musllinux_1_2_armv7l.whl (1.5 MB): CPython 3.8, musllinux (musl 1.2+), ARMv7l
  • thongna-0.2.3-cp38-cp38-musllinux_1_2_aarch64.whl (1.5 MB): CPython 3.8, musllinux (musl 1.2+), ARM64
  • thongna-0.2.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB): CPython 3.8, manylinux (glibc 2.17+), x86-64
  • thongna-0.2.3-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl (1.4 MB): CPython 3.8, manylinux (glibc 2.17+), s390x
  • thongna-0.2.3-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (1.4 MB): CPython 3.8, manylinux (glibc 2.17+), ppc64le
  • thongna-0.2.3-cp38-cp38-manylinux_2_17_i686.manylinux2014_i686.whl (1.3 MB): CPython 3.8, manylinux (glibc 2.17+), i686
  • thongna-0.2.3-cp38-cp38-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (1.2 MB): CPython 3.8, manylinux (glibc 2.17+), ARMv7l
  • thongna-0.2.3-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.3 MB): CPython 3.8, manylinux (glibc 2.17+), ARM64
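The wheel names above follow the standard wheel file naming convention (PEP 427): distribution, version, an optional build tag, then the Python, ABI, and platform tags. The sketch below shows one way to split such a name to see which interpreter and platform a file targets. The helper `parse_wheel_filename` and the `WheelTags` type are written for this example only; the `packaging` library offers a more complete `packaging.utils.parse_wheel_filename`.

```python
from typing import NamedTuple, Tuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tags: Tuple[str, ...]


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its PEP 427 components (simplified)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        parts = parts[:2] + parts[3:]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    name, version, python_tag, abi_tag, platform = parts
    # Several platform tags can be joined with "." in one file name.
    return WheelTags(name, version, python_tag, abi_tag, tuple(platform.split(".")))


tags = parse_wheel_filename(
    "thongna-0.2.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(tags.python_tag, tags.abi_tag, tags.platform_tags)
```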
