Convert HTML to plain text
htmltextconvert renders HTML to plain text, for example to autogenerate plain text versions of HTML emails, or to index HTML documents for search.
It differs from other packages in these ways:
Pure Python, no dependencies
High quality, well tested code
Permissive license (Apache)
Renders HTML to text suitable for a text/plain email body. It doesn't aim to convert to a structured text format like Markdown, but rather to give a readable text-only representation of the rendered HTML.
Usage:
>>> import htmltextconvert
>>> print(
...     htmltextconvert.html_to_text(
...         """
...         <p>This is a paragraph.</p>
...         <p>This is another paragraph.</p>
...         """
...     )
... )
This is a paragraph
This is another paragraph
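For the HTML email use case, the converted text would typically become the text/plain part of a multipart/alternative message. The following is a minimal sketch using only the standard library's `email.message.EmailMessage`. The helper name `build_email` and its `to_text` parameter are hypothetical and not part of htmltextconvert; the converter is passed in as a callable so the sketch does not assume anything about the library beyond the `html_to_text` function shown above.

```python
from email.message import EmailMessage
from typing import Callable


def build_email(
    subject: str,
    sender: str,
    recipient: str,
    html_body: str,
    to_text: Callable[[str], str],
) -> EmailMessage:
    """Build a multipart/alternative message with text and HTML parts.

    `to_text` is any function that turns an HTML string into plain text
    (a hypothetical parameter for this sketch).
    """
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    # The text/plain part goes first; clients that cannot render HTML show it.
    msg.set_content(to_text(html_body))
    # Adding an alternative turns the message into multipart/alternative.
    msg.add_alternative(html_body, subtype="html")
    return msg
```

With the package installed, passing `to_text=htmltextconvert.html_to_text` should generate the plain text part from the same HTML that is sent as the HTML part.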
htmltextconvert handles the following HTML tags and constructs:
Character entity references (&name;, &#nnnn;, &#xhhhh;)
Unordered lists (<ul>)
Ordered lists (<ol>)
Paragraphs (<p>)
Block quotes (<blockquote>)
Linebreaks (<br>)
Links (<a href="…">)
Bold (<strong>)
Italic (<em>)
Code (<code>)
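To illustrate the three character reference forms in the list above (named, decimal, and hexadecimal), here is a short sketch using the standard library's `html.unescape`. It shows what decoding these references means in general; it is not a description of how htmltextconvert implements it internally.

```python
import html

# One named reference, one decimal reference, one hexadecimal reference.
sample = "Fish &amp; chips &#169; &#xA9;"

# html.unescape replaces each reference with the character it stands for.
decoded = html.unescape(sample)
print(decoded)  # Fish & chips © ©
```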
Built Distribution
Hashes for htmltextconvert-0.1.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 305f3d815c3ec32f885aee45f5a0dbf0aaa53775121647f434592eafab355ce1
MD5 | e29db81882320d1d0c3dcdb2db97cb85
BLAKE2b-256 | 8c43493c26e96ac936305810a618ccf10fc5350c969f31c502eb3c141af82efe