A Python package to check if a text is informal Persian.
Project description
Persian-Informal-Text-Detector
Persian Informal Text Detector is a rule-based detector built on regular expressions. It identifies informal Persian text by matching indicators such as informal words and informal verb forms.
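To make the rule-based idea concrete, here is a minimal sketch of how a regex-driven detector could count indicators and compare the count against a threshold. The three patterns and the names `INFORMAL_PATTERNS`, `count_informal_indicators`, and `is_informal_sketch` are hypothetical, chosen only for illustration; the package's real rule set and counting logic are not reproduced here and may differ.

```python
import re

# Hypothetical indicator patterns, chosen only for illustration;
# the package's actual rules are not reproduced here.
INFORMAL_PATTERNS = [
    re.compile(r"\bخونه\b"),    # colloquial form of خانه ("home")
    re.compile(r"\bمیخواد\b"),  # colloquial form of می‌خواهد ("wants")
    re.compile(r"\bبرم\b"),     # colloquial form of بروم ("that I go")
]


def count_informal_indicators(text: str) -> int:
    """Count every match of every illustrative pattern in the text."""
    return sum(len(pattern.findall(text)) for pattern in INFORMAL_PATTERNS)


def is_informal_sketch(text: str, threshold: int = 1) -> bool:
    """Flag the text once the match count reaches the threshold."""
    return count_informal_indicators(text) >= threshold


print(is_informal_sketch("دلم میخواد برم خونه", threshold=1))  # True
print(is_informal_sketch("نباید به خانه بروم", threshold=1))   # False
```

The word boundaries (`\b`) keep a pattern such as `برم` from matching inside the formal verb `بروم`, which is why the second sentence yields no matches in this sketch.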
Source of Informal Text Indicators
Some of the informal text indicators, such as informal words and verb formats, are derived from this Wikipedia page.
Installation
You can install Persian Informal Text Detector using pip:
pip install informal_detector
Example Usage
from informal_detector import is_informal
# Returns True since the text contains at least one informal indicator
result1 = is_informal("دلم میخواد برم خونه", threshold=1)
print(result1) # Output: True
# Returns False since the text does not contain enough informal indicators
result2 = is_informal("نباید به خانه بروم", threshold=1)
print(result2) # Output: False
The threshold Argument
The threshold keyword argument controls how strict the detector is. It sets the number of informal Persian indicators, such as informal words and verbs, required to classify a text as informal.
A lower threshold is suitable for shorter texts. A higher threshold is more appropriate for longer texts, which may include some formal sentences but should still be marked as informal if they contain a significant number of informal indicators. A threshold of 1 means that a text is considered informal if it contains at least one informal word or verb.
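One way to apply this guidance is to scale the threshold with the length of the text. The helper below is a hypothetical sketch, not part of the package: the name `suggest_threshold` and the default ratio of one required indicator per 50 words are assumptions that you would tune for your own data.

```python
import math


def suggest_threshold(text: str, words_per_indicator: int = 50) -> int:
    """Require roughly one indicator per N words, never fewer than one.

    The ratio is an illustrative assumption, not a value from the package.
    """
    word_count = len(text.split())
    return max(1, math.ceil(word_count / words_per_indicator))


short_text = "دلم میخواد برم خونه"          # 4 words
long_text = " ".join([short_text] * 100)   # 400 words

print(suggest_threshold(short_text))  # 1
print(suggest_threshold(long_text))   # 8
```

The result could then be passed to the detector, for example `is_informal(text, threshold=suggest_threshold(text))`.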
Source Distribution
File details
Details for the file informal_detector-0.1.1.tar.gz.
File metadata
- Download URL: informal_detector-0.1.1.tar.gz
- Upload date:
- Size: 16.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.10.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | fc70488c0e6ab968c18b4a26e1bfa09ada9ea505a2876bb10f9e12d31809f4a0
MD5 | 30a0ae62a221b8127898654790a6754f
BLAKE2b-256 | d75d1c2169a49486cc61c6777677dc4c197995474f45965de99a440131d4d3f3