A tool for normalizing Korean text

Project description

Korean Text Normalizer

Korean Text Normalizer is a Python package for normalizing Korean text. It provides functions for cleaning up Korean text data, from expanding abbreviations to handling jamo.

Features

  • Expand common Korean abbreviations
  • Perform basic spell checking
  • Normalize emoticons
  • Detect and correct sentence boundaries
  • Separate and combine Korean jamo (the individual consonant and vowel letters that make up Hangul syllable blocks)
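To make the jamo feature concrete, here is a minimal standalone sketch of how a precomposed Hangul syllable can be split into its jamo and reassembled using Unicode code point arithmetic. It does not use this package: the names `decompose` and `compose` are hypothetical and are not claimed to match the package's own API.

```python
# Standalone sketch; NOT the korean-text-normalizer API.
# Precomposed Hangul syllables occupy U+AC00..U+D7A3 and are laid out as
# (lead * 21 + vowel) * 28 + tail, which lets us split them arithmetically.
HANGUL_BASE = 0xAC00
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 leading consonants
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"     # 21 vowels
JONGSEONG = ["", *"ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ"]  # 28 tails ("" = none)


def decompose(syllable):
    """Split one Hangul syllable into (lead, vowel, tail) jamo."""
    index = ord(syllable) - HANGUL_BASE
    if not 0 <= index < 19 * 21 * 28:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    lead, rest = divmod(index, 21 * 28)
    vowel, tail = divmod(rest, 28)
    return CHOSEONG[lead], JUNGSEONG[vowel], JONGSEONG[tail]


def compose(lead, vowel, tail=""):
    """Combine jamo back into a single Hangul syllable."""
    index = (CHOSEONG.index(lead) * 21 + JUNGSEONG.index(vowel)) * 28
    index += JONGSEONG.index(tail)
    return chr(HANGUL_BASE + index)


print(decompose("한"))            # ('ㅎ', 'ㅏ', 'ㄴ')
print(compose("ㄱ", "ㅡ", "ㄹ"))   # 글
```

The sketch works on one syllable at a time; a caller would pass non-Hangul characters through unchanged.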

Installation

You can install the package using pip:

pip install korean-text-normalizer

Usage

Here's a basic example of how to use the Korean Text Normalizer:

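from korean_text_normalizer import KoreanTextNormalizer

# Create a normalizer instance
normalizer = KoreanTextNormalizer()

# Sample input containing chat abbreviations (ㅎㅇ, ㄱㅅ) and an emoticon (^_^)
text = "ㅎㅇ! 오늘 날씨가 좋네요ㄱㅅ ^_^ 내일도 날씨가 좋았으면"
normalized_text = normalizer.normalize(text)

# Print the normalized result
print(normalized_text)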
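The package's internal rules are not documented here, so the following is only an illustrative sketch of what dictionary-based abbreviation expansion can look like. The mapping table and the `expand_abbreviations` function are assumptions made for this example and are not taken from the package.

```python
import re

# Hypothetical mapping for illustration only; not the package's own table.
# ㅎㅇ is common chat shorthand for 하이 ("hi"), ㄱㅅ for 감사 ("thanks").
ABBREVIATIONS = {
    "ㅎㅇ": "하이",
    "ㄱㅅ": "감사",
}

# Try longer keys first so a short key never shadows a longer one.
_PATTERN = re.compile(
    "|".join(re.escape(key) for key in sorted(ABBREVIATIONS, key=len, reverse=True))
)


def expand_abbreviations(text):
    """Replace every known abbreviation in text with its expansion."""
    return _PATTERN.sub(lambda match: ABBREVIATIONS[match.group(0)], text)


print(expand_abbreviations("ㅎㅇ! ㄱㅅ"))  # 하이! 감사
```

A plain substring match like this can over-expand when the same jamo sequence appears for other reasons, so a real normalizer would likely need context checks.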

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License.

Download files

Source Distribution

korean-text-normalizer-0.1.1.tar.gz (2.5 kB view details)

Uploaded Source

Built Distribution

korean_text_normalizer-0.1.1-py3-none-any.whl (2.7 kB view details)

Uploaded Python 3

File details

Details for the file korean-text-normalizer-0.1.1.tar.gz.

File metadata

  • Download URL: korean-text-normalizer-0.1.1.tar.gz
  • Size: 2.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.11.7

File hashes

Hashes for korean-text-normalizer-0.1.1.tar.gz
Algorithm Hash digest
SHA256 b35e1d2e98128584cda1c1390b3b05d70416f6625740e81bf6cb0a08ee561e6e
MD5 fc1695969dfb3fb36c77f710c61ffad3
BLAKE2b-256 aac64c9feff94806f4a6f5cc751e2c25329dd6d605a289bc94a876d35793fbcb

File details

Details for the file korean_text_normalizer-0.1.1-py3-none-any.whl.

File hashes

Hashes for korean_text_normalizer-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 11bdef3b67145e596d717bde5fb0a7f27e6b97a085cdfe6a61012eb5d77c59c3
MD5 077210388ea34a2445376e0362f25b20
BLAKE2b-256 32370c9486ca934be0eb35e6d46de5af573115fe783b16d847b424456a8178fc
