RegXon

RegXon is a validator, sanitizer, and content parser for common input types such as email addresses, domains, URLs, IPv4 addresses, phone numbers, and HTML.

Installation

pip install regxon

Usage

from regxon.common import Regxon
regxon = Regxon()

General Validation

General validation covers email addresses, domains, URLs, and IPv4 addresses.

Validate Email

from regxon.common import Regxon

regxon = Regxon()
regxon.is_email('xyz@.com')  # None
regxon.is_email('xyz@cpx.com')  # returns a proper Match object; you can grab the match with `.string`
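
Because the validators return either a Match object or None, the result can be used directly in a conditional. The sketch below illustrates that calling pattern with the standard library only. The `is_email_sketch` function and its regular expression are hypothetical stand-ins, not RegXon's actual pattern, which may accept or reject different inputs.

```python
import re

# Hypothetical stand-in pattern; this is NOT the expression RegXon uses.
_EMAIL_SKETCH = re.compile(r"[^@\s]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")


def is_email_sketch(value):
    """Return a re.Match if the whole value looks like an email, else None."""
    return _EMAIL_SKETCH.fullmatch(value)


def describe(value):
    """Show the Match-or-None handling a caller would typically write."""
    match = is_email_sketch(value)
    if match is None:
        return "invalid"
    # `.string` is the original input that was passed to the pattern.
    return "valid: " + match.string


print(describe("xyz@.com"))
print(describe("xyz@cpx.com"))
```

The same `if match is None` check should apply to any validator that follows the Match-or-None convention shown above.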

Validate Domain

from regxon.common import Regxon

regxon = Regxon()
regxon.is_domain('xyzcom')  # None
regxon.is_domain('xyz.com')  # returns a proper Match object; you can grab the match with `.string`

Validate URL

from regxon.common import Regxon

regxon = Regxon()
regxon.is_url('xyz.com')  # None
regxon.is_url('https://xyz.com')  # returns a proper Match object

Validate HTTP URL

from regxon.common import Regxon

regxon = Regxon()
regxon.is_http_url('xyz.com')  # None; the URL has no http(s) scheme
regxon.is_http_url('ftp://xyz.com')  # None; the scheme is not http(s)
regxon.is_http_url('http://django.c')  # None; `.c` is not a valid domain
regxon.is_http_url('https://xyz.com')  # returns a proper Match object; you can grab the match with `.string`

Validate IP

from regxon.common import Regxon

regxon = Regxon()

# Calls 1 and 2 are equivalent and both return a proper Match, because the default schema is ""
regxon.is_ipv4('127.0.0.1')                 # 1
regxon.is_ipv4('127.0.0.1', schema='')      # 2; matches because 127.0.0.1 has no schema

regxon.is_ipv4('http://127.0.0.1')  # returns None as schema is not matched; "http" != ""
regxon.is_ipv4('http://127.0.0.1', schema='')  # returns None as schema is not matched; "http" != ""

regxon.is_ipv4('http://127.0.0.1', schema='http')  # returns a proper Match
regxon.is_ipv4('https://127.0.0.1', schema='http')  # returns None as schema is not matched; "https" != "http"
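
The schema rule in the calls above (the part before `://` must equal the `schema` argument exactly) can be sketched with the standard library. This is an illustration of the described behavior under stated assumptions, not RegXon's implementation: `ipv4_matches_schema` is a hypothetical name, and it returns a bool rather than a Match object.

```python
import ipaddress


def ipv4_matches_schema(value, schema=""):
    """Hypothetical helper mirroring the schema rule described above."""
    # "http://127.0.0.1" splits into ("http", "://", "127.0.0.1");
    # a bare "127.0.0.1" yields an empty schema and the full value as host.
    found_schema, _, host = value.rpartition("://")
    if found_schema != schema:
        return False
    try:
        ipaddress.IPv4Address(host)
    except ValueError:
        return False
    return True


print(ipv4_matches_schema("127.0.0.1"))
print(ipv4_matches_schema("http://127.0.0.1", schema="http"))
print(ipv4_matches_schema("https://127.0.0.1", schema="http"))
```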

Validate Phone Number

from regxon.common import Regxon

regxon = Regxon()
regxon.is_phone('+91 1234567890')  # returns a proper Match object; you can grab the match with `.string`

HTML Sanitization and Validation

RegXon provides an HTML sanitizer and validator built on html5lib and beautifulsoup4.

It addresses two common problems: removing an attribute from an HTML tag, and removing a tag from HTML altogether.

from regxon.html import RegxonHTML

regxon_html = RegxonHTML()
html_content = """
<img onload="alert(1)" onerror="hey" src="http://example.com" />
<script>alert(1)</script>
"""
html = regxon_html.get_sanitized_content(html_content)

print(html)

The above code prints the following output:

<img onerror="hey"/>

Add custom excluded attributes for any tag you want

from regxon.html import RegxonHTML

regxon_html = RegxonHTML()
html_content = """
<img onload="alert(1)" onerror="hey" src="http://example.com" />
<script>alert(1)</script>
"""

# Add custom excluded attributes for any tag you want
regxon_html.excluded_attributes.update({
    'img': regxon_html.excluded_attributes['img'] + ['onerror'],
})
html = regxon_html.get_sanitized_content(html_content)

print(html)

The above code prints the following output:

<img/>
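
The idea of a per-tag mapping of excluded attributes can be illustrated with the standard library alone. The sketch below is not RegXon's implementation (which, per the description above, combines html5lib and beautifulsoup4); `AttributeStripper` and `strip_html` are hypothetical names, and the excluded tags and attributes passed in are assumptions chosen to reproduce the outputs shown above. It is a teaching aid rather than a hardened sanitizer.

```python
from html import escape
from html.parser import HTMLParser


class AttributeStripper(HTMLParser):
    """Illustrative only: drops excluded tags and per-tag attributes."""

    def __init__(self, excluded_tags, excluded_attributes):
        super().__init__(convert_charrefs=True)
        self.excluded_tags = set(excluded_tags)
        self.excluded_attributes = excluded_attributes
        self.parts = []
        self._skip_depth = 0  # greater than 0 while inside an excluded tag

    def _open_tag(self, tag, attrs, self_closing):
        # Keep only the attributes that are not excluded for this tag.
        blocked = self.excluded_attributes.get(tag, [])
        rendered = "".join(
            f" {name}" if value is None else f' {name}="{escape(value)}"'
            for name, value in attrs
            if name not in blocked
        )
        return f"<{tag}{rendered}{'/' if self_closing else ''}>"

    def handle_starttag(self, tag, attrs):
        if tag in self.excluded_tags:
            self._skip_depth += 1
        elif self._skip_depth == 0:
            self.parts.append(self._open_tag(tag, attrs, False))

    def handle_startendtag(self, tag, attrs):
        if tag not in self.excluded_tags and self._skip_depth == 0:
            self.parts.append(self._open_tag(tag, attrs, True))

    def handle_endtag(self, tag):
        if tag in self.excluded_tags:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif self._skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text inside an excluded tag (e.g. script bodies) is dropped.
        if self._skip_depth == 0:
            self.parts.append(escape(data, quote=False))


def strip_html(content, excluded_tags, excluded_attributes):
    parser = AttributeStripper(excluded_tags, excluded_attributes)
    parser.feed(content)
    parser.close()
    return "".join(parser.parts).strip()


sample = """
<img onload="alert(1)" onerror="hey" src="http://example.com" />
<script>alert(1)</script>
"""
print(strip_html(sample, ["script"], {"img": ["onload", "src"]}))
print(strip_html(sample, ["script"], {"img": ["onload", "src", "onerror"]}))
```

Extending the list for `img` with `onerror` in the second call plays the same role as the `excluded_attributes.update(...)` call above.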

Purpose of RegXon

  • Sanitize HTML; remove unwanted tags and attributes; XSS prevention
  • Validate IP, URL, Domain; SSRF prevention
  • Validate Email; Email spoofing prevention
  • Validate Phone Number; Phone number spoofing prevention
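
For the SSRF case, syntactic validation is usually combined with a check that the address does not point at an internal network. The sketch below shows one way to do that with the standard library `ipaddress` module. The helper name `is_public_ipv4` is hypothetical and not part of RegXon, and whether RegXon performs any such check itself is not stated here.

```python
import ipaddress


def is_public_ipv4(value):
    """Hypothetical helper: True only for globally routable IPv4 addresses."""
    try:
        address = ipaddress.IPv4Address(value)
    except ValueError:
        # Not a syntactically valid IPv4 address at all.
        return False
    # is_global is False for loopback, private, and link-local ranges.
    return address.is_global


for candidate in ("8.8.8.8", "127.0.0.1", "10.0.0.1", "169.254.169.254"):
    print(candidate, is_public_ipv4(candidate))
```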

License

MIT

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Authors

Acknowledgements
