Removes tags and attributes from HTML that are not in the whitelist. The content of removed tags is kept.
Python HTML purifier
====================
About
-----
Removes tags and attributes from HTML that are not in the whitelist.
The content of removed tags is kept. The whitelist has the following form:
```python
{
'enabled tag name' : ['list of enabled tag\'s attributes']
}
```
You can use the symbol ``*`` to allow all tags and/or all attributes.
Note that the ``script`` and ``style`` tags are removed together with their content.
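
The whitelist rules above can be read as two simple lookups. The sketch below is only one possible interpretation, written for Python 3: the helper names `tag_allowed` and `attr_allowed` are hypothetical and are not part of the module, and the assumption that ``*`` used as a dictionary key means "any tag" is not confirmed by this description.

```python
# Hypothetical helpers illustrating one reading of the whitelist rules.
# They are NOT the module's API.

def tag_allowed(whitelist, tag):
    """A tag is kept if it is listed by name, or if '*' is a key (assumed)."""
    return tag in whitelist or '*' in whitelist


def attr_allowed(whitelist, tag, attr):
    """An attribute is kept if the tag's list contains it or contains '*'."""
    allowed = whitelist.get(tag, whitelist.get('*', []))
    return '*' in allowed or attr in allowed


whitelist = {'div': ['*'], 'span': ['attr-2']}
print(tag_allowed(whitelist, 'b'))                # False
print(attr_allowed(whitelist, 'div', 'id'))       # True
print(attr_allowed(whitelist, 'span', 'attr-1'))  # False
```
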
The module is based on the
[HTMLParser](http://docs.python.org/2/library/htmlparser.html)
class from the Python standard library,
so it pulls in no external dependencies, which can be an advantage.
See also [my blog post](http://pixxxxxel.blogspot.ru/2013/07/html-purifier-python.html)
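
To give a rough idea of how a whitelist filter can be built on the standard-library parser, here is a minimal sketch. It is not the module's actual implementation: the class `WhitelistSketch` and the function `purify` are hypothetical names, the handling of ``*`` as a tag key is an assumption, and the code targets Python 3, where the parser lives in `html.parser` rather than in the `HTMLParser` module.

```python
from html import escape
from html.parser import HTMLParser  # Python 2: from HTMLParser import HTMLParser


class WhitelistSketch(HTMLParser):
    """Illustrative whitelist filter; hypothetical, not the module's code."""

    DROP_WITH_CONTENT = ('script', 'style')

    def __init__(self, whitelist):
        super().__init__()  # convert_charrefs=True by default
        self.whitelist = whitelist
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def _attrs_for(self, tag):
        # Allowed attribute list for the tag, or None if the tag is not allowed.
        if tag in self.whitelist:
            return self.whitelist[tag]
        return self.whitelist.get('*')  # assumed meaning: "any tag"

    def handle_starttag(self, tag, attrs):
        if tag in self.DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        allowed = self._attrs_for(tag)
        if self.skip_depth or allowed is None:
            return  # drop the tag itself; its text still reaches handle_data
        kept = [(k, v) for k, v in attrs if '*' in allowed or k in allowed]
        rendered = ''.join(
            ' %s="%s"' % (k, escape(v or '', quote=True)) for k, v in kept)
        self.out.append('<%s%s>' % (tag, rendered))

    def handle_endtag(self, tag):
        if tag in self.DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and self._attrs_for(tag) is not None:
            self.out.append('</%s>' % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            # The parser has already decoded character references,
            # so re-escape the text before writing it out.
            self.out.append(escape(data, quote=False))


def purify(html, whitelist):
    """Run the sketch over an HTML string and return the filtered markup."""
    parser = WhitelistSketch(whitelist)
    parser.feed(html)
    parser.close()
    return ''.join(parser.out)
```

Text is re-escaped on output because the parser decodes character references before calling `handle_data`; writing the decoded text back verbatim could turn `&lt;b&gt;` into a live tag.
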
Basic Use
---------
```python
>>> purifier = HTMLPurifier({
...     'div': ['*'],  # allow all attributes on the div tag
...     'span': ['attr-2'],  # allow only the attr-2 attribute on the span tag
...     # all other tags are removed, but their content is kept
... })
>>> print purifier.feed('<div class="e1" id="e1">Some <b>HTML</b> for <span attr-1="1" attr-2="2">purifying</span></div>')
<div class="e1" id="e1">Some HTML for <span attr-2="2">purifying</span></div>
```