Persian normalizer for text processing

Pormalizer

Many Persian characters have more than one Unicode representation, and software treats each variant as a different character. Before any NLP task, the text therefore needs to be normalized so that every character has a single canonical form. Pormalizer also removes non-alphabetic characters and replaces all whitespace characters with a single space.
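
To make the idea concrete, here is a minimal sketch of the three steps described above. It is not Pormalizer's actual implementation: the function name `normalize_sketch`, the three-entry character table, and the choices to treat "alphabet" as `str.isalpha()` and to collapse whitespace runs are all assumptions made for illustration.

```python
# Hypothetical mapping from Arabic code points to their Persian counterparts.
# Illustrative only; this is not the table Pormalizer actually uses.
CHAR_MAP = str.maketrans({
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0649": "\u06CC",  # ARABIC LETTER ALEF MAKSURA -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def normalize_sketch(text: str) -> str:
    """Unify character variants, drop non-letters, collapse whitespace."""
    # Step 1: map each known variant to a single canonical code point.
    text = text.translate(CHAR_MAP)
    # Step 2: replace every character that is neither a letter nor
    # whitespace with a space (assumption: "alphabet" means str.isalpha()).
    text = "".join(ch if ch.isalpha() or ch.isspace() else " " for ch in text)
    # Step 3: collapse whitespace runs into one space and trim both ends.
    return " ".join(text.split())


# Arabic KAF and YEH are rewritten, "!" is dropped, "\t\n" becomes one space.
print(normalize_sketch("\u0643\u062A\u0627\u0628\t\n\u0639\u0644\u064A!"))
```

Note that this sketch also turns the zero-width non-joiner (U+200C) into a space, because it is not a letter. A real normalizer would likely need to handle that character separately.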

Installation

You can install it from PyPI with the following command:

pip install -U pormalizer

Or, if you prefer the latest development version, you can install it from source:

git clone https://github.com/xurvan/pormalizer.git
cd pormalizer
python setup.py install

Quickstart

A minimal usage example:

from pormalizer import Pormalizer

pormalizer = Pormalizer()

pormalizer.normalize("متن امتحانی")


