
Normalizer for Persian texts based on hazm

Project description

IBITNormalizer

A simple Persian text normalizer based on the hazm library.

install

pip install IBITNormalizer --upgrade

import

from IBITNormalizer.normalizer import IBITNormalizer

for LM tasks

text = """
سلام خوبی
از بیرون چخبر
چیکارا میکنی
تازگیا هوا  چقدر سرد شده نه ؟
"""

normalizer = IBITNormalizer.forLM()
print("forLM -> ", normalizer.normalize(text))

output:
forLM ->  سلام خوبی
از چخبر
چیکارا می‌کنی
تازگیا هوا سرد نه؟

for LLM tasks

text = """
سلام خوبی
از بیرون چخبر
چیکارا میکنی
تازگیا هوا  چقدر سرد شده نه ؟
"""

normalizer = IBITNormalizer.forLLM()
print("forLLM -> ", normalizer.normalize(text))

output:
forLLM ->  سلام خوبی
از بیرون چخبر
چیکارا می‌کنی
تازگیا هوا چقدر سرد‌شده‌نه؟
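
The outputs above show a few visible effects: repeated spaces are collapsed, the space before "؟" is removed, and "میکنی" becomes "می‌کنی" with a zero-width non-joiner (ZWNJ) after the prefix. The following is a minimal standalone sketch of those three effects using only Python's `re` module. It is not IBITNormalizer's or hazm's implementation; the function name `sketch_normalize` and all three patterns are assumptions made for illustration. It does not reproduce the word removal seen in the forLM output or the extra joining seen in the last line of the forLLM output.

```python
import re

ZWNJ = "\u200c"  # zero-width non-joiner

# Hypothetical patterns, chosen only to mirror the sample outputs above.
SPACES = re.compile(r"[ \t]+")
# A space followed by Persian (؟ ، ؛) or ASCII punctuation.
SPACE_BEFORE_PUNCT = re.compile(" +([\u061f\u060c\u061b!?.,:])")
# Word-initial "می" (with Persian or Arabic ye) directly followed by more characters.
MI_PREFIX = re.compile("(?<!\\S)(\u0645[\u06cc\u064a])(?=[^\\s" + ZWNJ + "])")


def sketch_normalize(text: str) -> str:
    """Illustrative only: a naive approximation, not the library's logic."""
    out = []
    for line in text.strip().splitlines():
        # Collapse runs of spaces and tabs into one space.
        line = SPACES.sub(" ", line).strip()
        # Attach punctuation to the preceding word.
        line = SPACE_BEFORE_PUNCT.sub(r"\1", line)
        # Insert a ZWNJ after a word-initial "می".
        line = MI_PREFIX.sub(lambda m: m.group(1) + ZWNJ, line)
        out.append(line)
    return "\n".join(out)


text = """
سلام خوبی
از بیرون چخبر
چیکارا میکنی
تازگیا هوا  چقدر سرد شده نه ؟
"""

print(sketch_normalize(text))
```

The prefix rule is deliberately naive: it also splits ordinary words that merely begin with "می" (for example "میز"), which is one reason to rely on the library rather than on hand-written patterns like these.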

Download files

Source Distribution

IBITNormalizer-1.2.8.tar.gz (23.8 kB)

Built Distribution

IBITNormalizer-1.2.8-py3-none-any.whl (26.1 kB, Python 3)
