An open-source Python library for data cleaning tasks. Includes profanity detection and removal. Now also includes offensive language and hate speech detection using an AI model.

ValX

An open-source Python library for data cleaning tasks. It includes functions for detecting and removing profanity and personal information. It also detects and removes hate speech and offensive language using AI.

[!IMPORTANT] Please downgrade to numpy version 1.26.4. The ValX DecisionTreeClassifier AI model relies on older versions of numpy because it was trained with them. For more information see: https://techoverflow.net/2024/07/23/how-to-fix-numpy-dtype-size-changed-may-indicate-binary-incompatibility-expected-96-from-c-header-got-88-from-pyobject/
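
To confirm which numpy version is installed before importing ValX, a small check such as the following may help. This is a sketch and not part of ValX: RECOMMENDED_NUMPY and numpy_status are hypothetical names, and the only fact taken from the note above is the recommended version 1.26.4.

```python
from importlib import metadata

# Version recommended by the note above.
RECOMMENDED_NUMPY = "1.26.4"

def numpy_status(installed=None):
    """Describe whether the installed numpy matches the recommended version.

    Pass a version string to check it directly; otherwise the installed
    numpy distribution is looked up.
    """
    if installed is None:
        try:
            installed = metadata.version("numpy")
        except metadata.PackageNotFoundError:
            return "numpy is not installed"
    if installed == RECOMMENDED_NUMPY:
        return "ok"
    return f"found numpy {installed}; consider: pip install numpy=={RECOMMENDED_NUMPY}"

print(numpy_status())
```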

Changes in 0.2.4

Fixed a major incompatibility with scikit-learn. Changes introduced in scikit-learn v1.3.0 prevented ValX from working with versions later than 1.2.2. ValX can now be used with scikit-learn versions both earlier and later than 1.3.0.

We've also removed scikit-learn==1.2.2 as a dependency, as most versions of scikit-learn will now work.

Changes in 0.2.3

We have introduced a new optional info_type parameter to the detect_sensitive_information and remove_sensitive_information functions, giving you fine-grained control over which types of sensitive information to detect or remove.

We have also introduced detection patterns for more types of sensitive information, including:

"iban": International Bank Account Number.
"mrn": Medical Record Number (may not work correctly, depending on provider and country).
"icd10": International Classification of Diseases, Tenth Revision.
"geo_coords": Geo-coordinates (latitude and longitude in decimal degrees format).
"username": Username handles (@username).
"file_path": File paths (general patterns for both Windows and Unix paths).
"bitcoin_wallet": Cryptocurrency wallet address.
"ethereum_wallet": Cryptocurrency wallet addresses.

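The idea of filtering pattern-based detection by type can be sketched as follows. This is an illustration only and does not use ValX: the regular expressions, the PATTERNS table, and the find_sensitive function are assumptions made for this example, and ValX's real patterns and return format may differ.

```python
import re

# Hypothetical patterns for illustration only; ValX's real patterns may differ.
PATTERNS = {
    "username": re.compile(r"(?<!\w)@\w{1,30}"),
    "ethereum_wallet": re.compile(r"\b0x[0-9a-fA-F]{40}\b"),
    "geo_coords": re.compile(r"-?\d{1,2}\.\d+,\s*-?\d{1,3}\.\d+"),
}

def find_sensitive(text, info_type=None):
    """Return (type, match) pairs; restrict to one type when info_type is given."""
    if info_type is None:
        selected = PATTERNS
    else:
        selected = {info_type: PATTERNS[info_type]}
    return [
        (name, match.group())
        for name, regex in selected.items()
        for match in regex.finditer(text)
    ]

sample = "Ping @alice at 51.5074, -0.1278"
print(find_sensitive(sample))                        # every known type
print(find_sensitive(sample, info_type="username"))  # usernames only
```

Keeping the patterns in a dictionary keyed by type name makes the filter a simple lookup, which is one plausible way an info_type argument could be implemented.
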
Changes in 0.2.2

We have refactored and changed the detect_profanity function:

Removed unnecessary printing
Now returns more information about each found profanity, including Line, Column, Word, and Language.
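
To give a feel for what per-match information with Line, Column, Word, and Language can look like, here is a minimal word-list scanner. It is a sketch and not ValX's implementation: scan_for_terms is a hypothetical function, the dictionary layout and the 1-based positions are assumptions, and the word list is a harmless placeholder.

```python
import re

def scan_for_terms(lines, terms, language="English"):
    """Report each listed term found, with 1-based line and column positions."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, terms)) + r")\b", re.IGNORECASE
    )
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for match in pattern.finditer(line):
            hits.append({
                "Line": line_no,
                "Column": match.start() + 1,  # assumed 1-based
                "Word": match.group(),
                "Language": language,
            })
    return hits

hits = scan_for_terms(["all clear here", "what a darn mess"], ["darn"])
print(hits)
```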

[!NOTE] You can view ValX's package documentation for more information on changes.

Changes in 0.2.1

Using the AI models in ValX, you can now automatically remove hate speech or offensive speech from your text data, without running detection first and writing your own removal logic.

Installation

You can install ValX using pip:

pip install valx

Supported Python Versions

ValX supports the following Python versions:

Python 3.6
Python 3.7
Python 3.8
Python 3.9
Python 3.10
Python 3.11 or later (preferred)

Please ensure that you have one of these Python versions installed before using ValX. ValX may not work as expected on Python versions older than those listed.
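
A script that depends on ValX can check the interpreter version up front. The sketch below is not part of ValX: MIN_SUPPORTED and is_supported are hypothetical names, and the minimum of 3.6 is taken from the list above.

```python
import sys

# Lowest Python version in the supported list above.
MIN_SUPPORTED = (3, 6)

def is_supported(version_info=sys.version_info):
    """Return True when the (major, minor) version is at least MIN_SUPPORTED."""
    return tuple(version_info[:2]) >= MIN_SUPPORTED

if not is_supported():
    raise SystemExit("This Python version is older than the versions ValX supports.")
```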

Features

Profanity Detection: Detect profane and NSFW words or terms.
Remove Profanity: Remove profane and NSFW words or terms.
Detect Sensitive Information: Detect sensitive information in text data.
Remove Sensitive Information: Remove sensitive information from text data.
Detect Hate Speech: Detect hate speech or offensive speech in text, using AI.
Remove Hate Speech: Remove hate speech or offensive speech in text, using AI.

List of supported languages for profanity detection and removal

The following languages are supported by ValX's profanity detection and removal functions. Each one is a valid value for the language parameter:

All
Arabic
Czech
Danish
German
English
Esperanto
Persian
Finnish
Filipino
French
French (CA)
Hindi
Hungarian
Italian
Japanese
Kabyle
Korean
Dutch
Norwegian
Polish
Portuguese
Russian
Swedish
Thai
Klingon
Turkish
Chinese

Usage

Detect Profanity

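from valx import detect_profanity

# sample_text is the text data you want to scan; define it before this call
# (see the package documentation for the expected input type).

# Detect profanity
results = detect_profanity(sample_text, language='English')
print("Profanity Evaluation Results", results)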

Remove Profanity

from valx import remove_profanity

# Remove profanity
removed = remove_profanity(sample_text, "text_cleaned.txt", language="English")

Detect Sensitive Information

from valx import detect_sensitive_information

# Detect sensitive information
detected_sensitive_info = detect_sensitive_information(sample_text)

[!NOTE] This function now accepts an optional info_type argument, which restricts detection to specific types of sensitive information. The same argument is available in remove_sensitive_information.

Remove Sensitive Information

from valx import remove_sensitive_information

# Remove sensitive information
cleaned_text = remove_sensitive_information(sample_text)

Detect Hate Speech And Offensive Language

from valx import detect_hate_speech

# Detect hate speech or offensive language
outcome_of_detection = detect_hate_speech("You are stupid.")

[!IMPORTANT] The model's possible outputs are:

['Hate Speech']: The text was flagged and contained hate speech.

['Offensive Speech']: The text was flagged and contained offensive speech.

['No Hate and Offensive Speech']: The text was not flagged for any hate speech or offensive speech.
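
Because each result is a one-element list holding a label, filtering text on the model's output comes down to comparing the first element. The sketch below shows that pattern with a stand-in classifier so it runs without ValX. keep_clean and fake_classify are hypothetical names, and the keyword rule in fake_classify is not how the ValX model works. In real use you would pass detect_hate_speech as the classify argument, assuming it returns one of the lists shown above.

```python
# Label the model returns for unflagged text, as listed above.
CLEAN_LABEL = "No Hate and Offensive Speech"

def keep_clean(texts, classify):
    """Keep only the texts whose first predicted label is the clean label."""
    return [text for text in texts if classify(text)[0] == CLEAN_LABEL]

def fake_classify(text):
    """Stand-in for a real classifier; flags one keyword for the demo."""
    if "stupid" in text.lower():
        return ["Offensive Speech"]
    return [CLEAN_LABEL]

cleaned = keep_clean(["You are stupid.", "Have a nice day."], fake_classify)
print(cleaned)
```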

[!NOTE] See our official documentation for more examples on how to use ValX.

Contributing

Contributions are welcome! If you encounter any issues, have suggestions, or want to contribute to ValX, please open an issue or submit a pull request on GitHub.

License

ValX is released under the terms of the MIT License (Modified). Please see the LICENSE file for the full text.

Derived licenses

Creative Commons Attribution 4.0 International License: https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words/blob/master/LICENSE

Modified License Clause

The modified license clause grants users the permission to make derivative works based on the ValX software. However, it requires any substantial changes to the software to be clearly distinguished from the original work and distributed under a different name.
