Easily accept HTML input with Django models, templates and DRF serializers

Version: 1.1.0
Source: https://gitlab.routh.io/open-source/python/django_tidyfields
Keywords: django lxml input html fields
PythonVersion: 3.6+

Sanitise HTML input from API Endpoints or Views

1   Features

  • Leverages the power of lxml to filter model fields
  • Supports input from any source as the filtering is triggered by model save

2   Installation

2.1   Requirements

  • Python 3.6 or above
  • setuptools 30.3.0 or above
  • Django 2.0 or above

2.2   Install

Install django_tidyfields via pip:

pip install django-tidyfields

Add Django TidyFields to INSTALLED_APPS:

INSTALLED_APPS = [
    # ...
    'django_tidyfields',
    # ...
]

2.3   Configure

These fields pass the arguments expected by the lxml.html.clean.Cleaner class straight through to the Cleaner instance. We will try to keep the docs in line with the latest integrated lxml version; however, these parameters are subject to change as lxml develops. More at https://lxml.de

3   Usage

The Django TidyFields field classes subclass Django's TextField and CharField and accept any parameters available to them.

You can optionally configure the field globally in your Django Settings:

"""
Example dict showing all available parameters at their defaults.
"""
TIDYFIELDS = {
    'processing_instructions': True,
    'javascript': True,
    'comments': True,
    'style': True,
    'allow_tags': [],
    'remove_unknown_tags': False,
    'kill_tags': ['script', 'style'],
    'safe_attrs_only': True,
    'safe_attrs': [],
    'add_nofollow': True,
    'scripts': True,
    'inline_style': None,
    'links': True,
    'meta': True,
    'page_structure': True,
    'embedded': True,
    'frames': True,
    'forms': True,
    'annoying_tags': True,
    'remove_tags': None,
    'host_whitelist': [],
    'whitelist_tags': {}
}

You can also override specific parameters on each model field that uses Django TidyFields. Parameters not set there inherit from the global settings, or from lxml.html.clean.Cleaner itself. Review the lxml documentation for the Cleaner default arguments.

models.py:

"""
A minimal models.py usage example
"""

from django.db.models import Model
from django_tidyfields.fields import TidyTextField, TidyCharField

class UserSubmission(Model):
    title = TidyCharField()
    description = TidyTextField()
    body = TidyTextField()

4   Advanced Usage

Django TidyFields can be used however you like, but we recommend that your global defaults be a minimum allowed set of tags, or simply be set up to strip everything. If your project only allows HTML tags in certain TextFields, for example, it implies that you’ll have a number of CharFields and TextFields where you want HTML to be stripped out.

You can define allowed tags directly on a field in the model. Alternatively, you can define additional defaults under unique variable names in your Django Settings and use that variable on any TextField that allows those tags. The fields check whether any arguments are set in the field_args parameter, and override a default argument only if you’ve passed the same argument again. So you can use additive and subtractive magic to simplify your code as much as possible. Just remember the Wizard’s Second Rule! (Especially when using subtractive magic.)

“The Second Rule is that the greatest harm can result from the best intentions. It sounds a paradox, but kindness and good intentions can be an insidious path to destruction. Sometimes doing what seems right is wrong and can cause harm. The only counter to it is knowledge, wisdom, forethought, and understanding the First Rule. Even then, that is not always enough.”

– Zedd Zu’l Zorander
Stone of Tears, Terry Goodkind

4.1   An Additive example

settings.py:

"""
Default dict that strips all HTML, with a permissive dict for certain fields.
"""
TIDYFIELDS = {
    'processing_instructions': True,
    'javascript': True,
    'comments': True,
    'style': True,
    'allow_tags': [''],
    'remove_unknown_tags': False,
    'kill_tags': ['script', 'style'],
    'safe_attrs_only': True,
    'safe_attrs': [''],
    'add_nofollow': True
}

PERMISSIVE_TIDYFIELDS = {
    'allow_tags': ['b', 'em', 'i', 'strong', 'span', 'p', 'pagebreak'],
    'safe_attrs': ['style'],
    'style': False
}

models.py:

"""
A models.py usage example with Additive magic
"""

from django.db.models import Model
from django.conf import settings
from django_tidyfields.fields import TidyTextField, TidyCharField

class UserSubmission(Model):
    title = TidyCharField()
    description = TidyTextField()
    body = TidyTextField(field_args=settings.PERMISSIVE_TIDYFIELDS)
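The override behaviour described above can be sketched as a plain dict merge. This is only an illustration of the documented rule (an argument passed in field_args replaces the global default with the same name, and everything else is inherited); effective_cleaner_args is a hypothetical helper and is not part of django_tidyfields, whose internals may differ. The subtractive case at the end is likewise an assumed example.

```python
# Illustrative sketch only: effective_cleaner_args is a hypothetical helper,
# not part of django_tidyfields. It mimics the documented rule that a key
# passed in field_args replaces the global default of the same name.

TIDYFIELDS = {
    'processing_instructions': True,
    'javascript': True,
    'comments': True,
    'style': True,
    'allow_tags': [''],
    'remove_unknown_tags': False,
    'kill_tags': ['script', 'style'],
    'safe_attrs_only': True,
    'safe_attrs': [''],
    'add_nofollow': True,
}

PERMISSIVE_TIDYFIELDS = {
    'allow_tags': ['b', 'em', 'i', 'strong', 'span', 'p', 'pagebreak'],
    'safe_attrs': ['style'],
    'style': False,
}


def effective_cleaner_args(global_defaults, field_args=None):
    """Return a copy of the defaults with any field_args keys replacing them."""
    merged = dict(global_defaults)      # copy, so the settings dict is untouched
    merged.update(field_args or {})     # only keys passed again are overridden
    return merged


# Additive: start from the strict defaults and open up one field.
title_args = effective_cleaner_args(TIDYFIELDS)
body_args = effective_cleaner_args(TIDYFIELDS, PERMISSIVE_TIDYFIELDS)

# Subtractive (assumed example): start from the permissive set and
# close a single field back down by passing the same keys again.
permissive_defaults = effective_cleaner_args(TIDYFIELDS, PERMISSIVE_TIDYFIELDS)
strict_args = effective_cleaner_args(
    permissive_defaults,
    {'allow_tags': [''], 'safe_attrs': [''], 'style': True},
)

print(title_args['allow_tags'])  # ['']
print(body_args['kill_tags'])    # ['script', 'style'] (inherited)
print(strict_args['style'])      # True
```

With the subtractive direction, a field that is missing its override silently keeps the permissive defaults, which is why strict global defaults are the safer starting point.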

5   History

This module was originally named Django-Bleachfields and was intended to be a spiritual successor to the now defunct django-bleachfield module. An alpha version had been uploaded to PyPI; however, it has been pulled in favour of this module. During initial testing it was found that bleach only removes tags; its developers consider removing the content within them a matter of beautifying HTML rather than a security concern. It was found that this opened the door for some of the more creative XSS filter attacks. As a result, lxml was chosen to replace bleach in this module, as it allows the complete removal of specified tags and their content.

6   Testing

This module is tested to ensure that it does not strip allowed HTML or CSS, but that it does either strip XSS attacks or leave them inert. Nearly 30 attacks from the OWASP XSS Filter Evasion cheat sheet are tested. More will be added in the next version.

Disclaimer: Allowing JavaScript will compromise the XSS filtering. Do so with the utmost caution and only give such privileges to trusted persons.
