Normalizer for Persian texts based on hazm
IBITNormalizer
A simple Persian text normalizer based on the hazm library.
Install
pip install IBITNormalizer --upgrade
Import
from IBITNormalizer.normalizer import IBITNormalizer
For LM tasks
text = """
سلام خوبی
از بیرون چخبر
چیکارا میکنی
تازگیا هوا چقدر سرد شده نه ؟
"""
normalizer = IBITNormalizer.forLM()
print("forLM -> ", normalizer.normalize(text))
output:
forLM -> سلام خوبی
از چخبر
چیکارا میکنی
تازگیا هوا سرد نه؟
For LLM tasks
text = """
سلام خوبی
از بیرون چخبر
چیکارا میکنی
تازگیا هوا چقدر سرد شده نه ؟
"""
normalizer = IBITNormalizer.forLLM()
print("forLLM -> ", normalizer.normalize(text))
output:
forLLM -> سلام خوبی
از بیرون چخبر
چیکارا میکنی
تازگیا هوا چقدر سردشدهنه؟
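The two presets above can be compared side by side on the same input. The following is a minimal sketch, not part of the package: `compare_presets` is a hypothetical helper introduced here for illustration, and the only library calls it relies on are `IBITNormalizer.forLM()`, `IBITNormalizer.forLLM()`, and `normalize()` as shown above.

```python
from typing import Callable, Dict


def compare_presets(text: str, presets: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Run the same text through each named normalize function.

    Hypothetical helper (not part of IBITNormalizer): it accepts any
    callables that map a string to a string.
    """
    return {name: normalize(text) for name, normalize in presets.items()}


if __name__ == "__main__":
    try:
        from IBITNormalizer.normalizer import IBITNormalizer
    except ImportError:
        print("IBITNormalizer is not installed; run: pip install IBITNormalizer")
    else:
        presets = {
            "forLM": IBITNormalizer.forLM().normalize,
            "forLLM": IBITNormalizer.forLLM().normalize,
        }
        text = "تازگیا هوا چقدر سرد شده نه ؟"
        for name, output in compare_presets(text, presets).items():
            print(f"{name} -> {output}")
```

Judging by the sample outputs above, the forLM preset drops more words than the forLLM preset, so printing both results for the same sentence is a quick way to check which one suits a given task.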
Download files
Source Distribution
IBITNormalizer-1.2.8.tar.gz (23.8 kB)
Built Distribution
IBITNormalizer-1.2.8-py3-none-any.whl (26.1 kB)
File details
Details for the file IBITNormalizer-1.2.8.tar.gz.
File metadata
- Download URL: IBITNormalizer-1.2.8.tar.gz
- Upload date:
- Size: 23.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ce880d9d0b6b98a53431ee4ff84d1f09c76cc44e219da4c43bd4ccc156b35ccf |
| MD5 | 924ffd617278127baaaac62d03670651 |
| BLAKE2b-256 | 045772aa2dc314229c0f8c8b03e08788afda174f686671fa485bec86c1ba6581 |
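A downloaded archive can be checked against the SHA256 digest listed for it. The sketch below uses only the Python standard library; the helper names `sha256_of` and `matches_expected` are illustrative, and the local path of the downloaded file is an assumption.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for IBITNormalizer-1.2.8.tar.gz
EXPECTED_SHA256 = "ce880d9d0b6b98a53431ee4ff84d1f09c76cc44e219da4c43bd4ccc156b35ccf"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed location: the sdist was saved in the current directory.
    archive = Path("IBITNormalizer-1.2.8.tar.gz")
    if archive.exists():
        print("OK" if matches_expected(archive, EXPECTED_SHA256) else "MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

pip can perform a similar check itself when a requirements file pins the package with a `--hash=sha256:...` option.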
File details
Details for the file IBITNormalizer-1.2.8-py3-none-any.whl.
File metadata
- Download URL: IBITNormalizer-1.2.8-py3-none-any.whl
- Upload date:
- Size: 26.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.11.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6e30fcd782ce0d155aa0a2d35b74e3d1c519bc72f519bd6b631fef539a3657f6 |
| MD5 | 05cc27c31ca969a28f4b6ce0483661cc |
| BLAKE2b-256 | 5441958b15b0ecb886d3e6640dbc335ed11dd78f5059379bd5bb049b31294c51 |