Python bindings for the bluemonday HTML sanitizer

pybluemonday

pybluemonday is a library for sanitizing HTML very quickly via bluemonday.

pybluemonday takes untrusted user-generated content as input and returns HTML that has been sanitized against an allowlist of approved HTML elements and attributes, so that you can safely include the content in your web page.

Note: This library is in a usable state but is still experimental. It may not have feature parity with bluemonday itself and is likely to sanitize HTML slightly differently than other libraries. PRs and feedback to improve this library are welcome.

Installation

pip install pybluemonday

Examples

from pybluemonday import UGCPolicy
s = UGCPolicy()

print(s.sanitize("<script>alert(1)</script><b class='stuff'>testing</b>"))
# <b>testing</b>

s.AllowAttrs("class", "style").Globally()
print(s.sanitize("<script>alert(1)</script><b class='stuff' style='color: red;'>testing</b>"))
# <b class="stuff" style="color: red;">testing</b>

from pybluemonday import StrictPolicy
s = StrictPolicy()

print(s.sanitize("<center><b>Blog Post Title</b></center>"))
# Blog Post Title

How does this work?

pybluemonday is a binding to bluemonday through a shared library built with cgo. However, instead of replicating the entire API, pybluemonday uses reflection on the Go side and some type checking on the Python side to call the right bluemonday function when you call a method.

Essentially, you create a Policy with one of the provided pybluemonday.UGCPolicy, pybluemonday.StrictPolicy, or pybluemonday.NewPolicy classes and then call methods that map to the corresponding bluemonday struct methods.

This approach is an open area for improvement, but it covers a reasonable portion of the original bluemonday interface.

Also, because it's difficult to share Go structs with Python, pybluemonday keeps an ID reference to the struct on the Go side and passes that reference with every Go call. This means that if you corrupt or change the ID for some reason, you will likely end up with a memory leak. This is also an open area for improvement.
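The two ideas above (forwarding method calls by name, and referring to the real object by an ID) can be illustrated with a small pure-Python sketch. This is not pybluemonday's actual code: FakeBackend stands in for the compiled Go library, and every name in it (FakeBackend, PolicyProxy, close, allowed_elements, live_count) is hypothetical and chosen only for illustration.

```python
import itertools
from typing import Any, Callable, Dict, List


class FakeBackend:
    """Hypothetical stand-in for the Go side: owns the real objects, keyed by ID."""

    def __init__(self) -> None:
        self._objects: Dict[int, Dict[str, Any]] = {}
        self._next_id = itertools.count(1)

    def new_policy(self) -> int:
        # Create the "struct" and hand back only an integer reference to it.
        policy_id = next(self._next_id)
        self._objects[policy_id] = {"allowed": set()}
        return policy_id

    def call(self, policy_id: int, method: str, *args: str) -> None:
        # Look the object up by ID, then dispatch on the method name,
        # loosely imitating reflection-based dispatch.
        state = self._objects[policy_id]  # KeyError if the ID is unknown
        handler = getattr(self, "_do_" + method, None)
        if handler is None:
            raise AttributeError(f"no such policy method: {method}")
        handler(state, *args)

    def _do_AllowElements(self, state: Dict[str, Any], *names: str) -> None:
        state["allowed"].update(names)

    def allowed_elements(self, policy_id: int) -> List[str]:
        return sorted(self._objects[policy_id]["allowed"])

    def destroy(self, policy_id: int) -> None:
        # Unknown IDs are ignored, so a corrupted ID frees nothing.
        self._objects.pop(policy_id, None)

    def live_count(self) -> int:
        return len(self._objects)


class PolicyProxy:
    """Python-side wrapper that holds only the ID and forwards calls by name."""

    def __init__(self, backend: FakeBackend) -> None:
        self._backend = backend
        self._id = backend.new_policy()

    def __getattr__(self, name: str) -> Callable[..., "PolicyProxy"]:
        # Only reached for attributes that are not found normally.
        if name.startswith("_"):
            raise AttributeError(name)

        def forward(*args: Any) -> "PolicyProxy":
            # Light type checking before crossing the language boundary.
            for arg in args:
                if not isinstance(arg, str):
                    raise TypeError(f"{name} expects str arguments, got {type(arg).__name__}")
            self._backend.call(self._id, name, *args)
            return self  # allow chaining

        return forward

    def close(self) -> None:
        self._backend.destroy(self._id)


if __name__ == "__main__":
    backend = FakeBackend()
    policy = PolicyProxy(backend)
    policy.AllowElements("b").AllowElements("i")
    print(backend.allowed_elements(policy._id))  # ['b', 'i']

    # Corrupting the ID means the original object is never released.
    policy._id = 999
    policy.close()
    print(backend.live_count())  # 1
```

The last few lines show why a changed ID leaks in this sketch: the cleanup call targets an ID the backend does not know, so the original entry stays in the registry.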

Performance

Most Python-based HTML sanitizing libraries need to rely on html5lib to parse HTML in a reasonable way. Because of this, you will likely see performance hits when using these libraries.

Since pybluemonday is just a set of bindings for bluemonday, it performs very well: all parsing and processing is done in Go by bluemonday. Go also ships an HTML5 parser, which means we avoid html5lib but still process HTML pretty well.

Always take benchmarks with a grain of salt, but when compared to other similar Python sanitizing libraries, pybluemonday executes far faster:

❯ python benchmarks.py
bleach (20000 sanitizations): 37.613802053
html_sanitizer (20000 sanitizations): 17.645683948
lxml Cleaner (20000 sanitizations): 10.500760227999997
pybluemonday (20000 sanitizations): 0.6188559669999876

Benchmarks taken on a MacBook Pro 15-inch, 2016 (2.7 GHz Intel Core i7, 16 GB RAM)
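The contents of benchmarks.py are not shown here, so the following is only a minimal sketch of how such a comparison could be timed with the standard library's timeit module. The names time_sanitizer and run_benchmarks are hypothetical, and html.escape is used purely as a dependency-free stand-in (it escapes markup rather than sanitizing it) so the script runs without any third-party package installed.

```python
import html
import timeit
from typing import Callable, Dict

# Same style of input as the examples above.
SAMPLE = "<script>alert(1)</script><b class='stuff'>testing</b>"


def time_sanitizer(sanitize: Callable[[str], str], runs: int = 20000, sample: str = SAMPLE) -> float:
    """Return the total wall-clock seconds taken to call sanitize(sample) `runs` times."""
    return timeit.timeit(lambda: sanitize(sample), number=runs)


def run_benchmarks(sanitizers: Dict[str, Callable[[str], str]], runs: int = 20000) -> Dict[str, float]:
    """Time every callable in `sanitizers` on the same input and run count."""
    return {name: time_sanitizer(fn, runs) for name, fn in sanitizers.items()}


if __name__ == "__main__":
    runs = 20000
    # Stand-in only: html.escape is not an HTML sanitizer.
    candidates = {"html.escape (stand-in)": html.escape}
    for name, seconds in run_benchmarks(candidates, runs).items():
        print(f"{name} ({runs} sanitizations): {seconds}")
```

To compare real libraries, each entry in candidates would be replaced with a callable that takes a string and returns the cleaned string, for example UGCPolicy().sanitize with the policy created once outside the timed loop.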
