Removes from HTML the tags and attributes that are not in the whitelist. The content of removed tags is kept.
<h1>Python HTML purifier</h1>
<h2>About</h2>
<p>Removes from HTML the tags and attributes that are not in the whitelist.
The content of removed tags is kept. The whitelist has the following structure:
<code>python
{
'enabled tag name' : ['list of enabled tag\'s attributes']
}
</code>
Use the symbol <code>*</code> to allow all tags and/or all attributes.</p>
<p>Note that the <code>script</code> and <code>style</code> tags are removed together with their content.</p>
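<p>The whitelist rules above can be illustrated with a small lookup sketch. The helper names <code>is_tag_allowed</code> and <code>allowed_attrs</code> are hypothetical and not part of the package. Treating <code>'*'</code> as a dictionary key that allows every tag, and letting a tag-specific entry take precedence over it, are assumptions made for illustration.</p>

```python
def is_tag_allowed(whitelist, tag):
    """Return True if the tag is listed, or if the whitelist has a '*' key.

    Assumption: '*' used as a key means "allow every tag".
    """
    return tag in whitelist or '*' in whitelist


def allowed_attrs(whitelist, tag, attrs):
    """Keep only the (name, value) pairs that the whitelist permits for tag."""
    # Assumption: a tag-specific entry takes precedence over the '*' entry.
    rules = whitelist.get(tag, whitelist.get('*', []))
    if '*' in rules:
        return list(attrs)
    return [(name, value) for name, value in attrs if name in rules]


whitelist = {
    'div': ['*'],        # every attribute of div is kept
    'span': ['attr-2'],  # only attr-2 is kept on span
}

span_attrs = allowed_attrs(whitelist, 'span', [('attr-1', '1'), ('attr-2', '2')])
```

<p>In this sketch a tag that is missing from the whitelist is simply reported as not allowed; dropping the tag while keeping its text is left to the code that walks the document.</p>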
<p>The module is based on the
<a href="http://docs.python.org/2/library/htmlparser.html">HTMLParser</a>
class from the Python standard library,
so it pulls in no external dependencies, which can be an advantage.</p>
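<p>To show how a whitelist filter can be built on the standard-library parser alone, here is a minimal sketch. It is not the source of this package: the class <code>WhitelistFilter</code> and the function <code>purify</code> are hypothetical names, the code targets Python 3 (where the class lives in <code>html.parser</code>), and self-closing tags and comments are not treated specially.</p>

```python
from html import escape
from html.parser import HTMLParser  # in Python 2 the module is named HTMLParser


class WhitelistFilter(HTMLParser):
    """Hypothetical whitelist filter; an illustration, not the package source."""

    DROP_WITH_CONTENT = ('script', 'style')

    def __init__(self, whitelist):
        # Keep character references as written instead of decoding them.
        HTMLParser.__init__(self, convert_charrefs=False)
        self.whitelist = whitelist
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside <script> or <style>

    def _rules_for(self, tag):
        # Assumption: '*' used as a key allows every tag.
        if tag in self.whitelist:
            return self.whitelist[tag]
        return self.whitelist.get('*')

    def handle_starttag(self, tag, attrs):
        if tag in self.DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        rules = self._rules_for(tag)
        if self.skip_depth or rules is None:
            return  # the tag is dropped; its text still reaches handle_data
        kept = [(k, v) for k, v in attrs if '*' in rules or k in rules]
        rendered = ''.join(
            ' %s' % k if v is None else ' %s="%s"' % (k, escape(v, quote=True))
            for k, v in kept
        )
        self.out.append('<%s%s>' % (tag, rendered))

    def handle_endtag(self, tag):
        if tag in self.DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and self._rules_for(tag) is not None:
            self.out.append('</%s>' % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append('&%s;' % name)

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append('&#%s;' % name)


def purify(html, whitelist):
    """Run the sketch filter over an HTML string and return the result."""
    parser = WhitelistFilter(whitelist)
    parser.feed(html)
    parser.close()
    return ''.join(parser.out)
```

<p>The design choice worth noting is the <code>skip_depth</code> counter: start and end tags that are not whitelisted are dropped while their text passes through, whereas everything between <code>script</code> or <code>style</code> tags is suppressed.</p>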
<p><a href="http://pixxxxxel.blogspot.ru/2013/07/html-purifier-python.html">In my blog</a></p>
<h2>Basic Use</h2>
<pre><code>&gt;&gt;&gt; purifier = HTMLPurifier({
...     'div': ['*'],  # allow all attributes on the div tag
...     'span': ['attr-2'],  # allow only the attr-2 attribute on the span tag
...     # all other tags are removed, but their content is kept
... })
&gt;&gt;&gt; print(purifier.feed('&lt;div class="e1" id="e1"&gt;Some &lt;b&gt;HTML&lt;/b&gt; for &lt;span attr-1="1" attr-2="2"&gt;purifying&lt;/span&gt;&lt;/div&gt;'))
&lt;div class="e1" id="e1"&gt;Some HTML for &lt;span attr-2="2"&gt;purifying&lt;/span&gt;&lt;/div&gt;
</code></pre>