Conservatively convert html to markdown
Project description
Experimental
Purpose: Converts HTML to Markdown while preserving any HTML markup that Markdown does not support. The goal is to generate Markdown that can be converted back into HTML. This is the major difference between html2markdown and html2text: the latter doesn’t purport to be reversible.
Usage example
    import html2markdown
    print(html2markdown.convert('<h2>Test</h2><pre><code>Here is some code</code></pre>'))
Output:
    ## Test

        Here is some code
Information and caveats
Attributes not supported by Markdown are kept: a tag carrying such an attribute is left as HTML, while its contents are still converted.
Example: <a href="http://myaddress" title="click me"><strong>link</strong></a>
Result: [__link__](http://myaddress "click me")
Example: <a onclick="javascript:dostuff()" href="http://myaddress" title="click me"><strong>link</strong></a>
Result: <a onclick="javascript:dostuff()" href="http://myaddress" title="click me">__link__</a> (the attribute onclick is not supported, so the tag is left alone)
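The rule illustrated by the two examples above can be sketched with the standard library alone. This is not html2markdown's implementation, only a minimal illustration under stated assumptions: the names `MARKDOWN_LINK_ATTRS`, `AnchorAudit`, and `audit_anchors` are hypothetical, and the set of attributes that Markdown link syntax can express is assumed to be `href` and `title`. The `href` requirement mirrors the 0.1.7 change listed under Changes.

```python
from html.parser import HTMLParser

# Assumption: the only <a> attributes Markdown link syntax can express.
MARKDOWN_LINK_ATTRS = {"href", "title"}


class AnchorAudit(HTMLParser):
    """Record, for each <a> tag, whether it could become a Markdown link."""

    def __init__(self):
        super().__init__()
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        names = {name for name, _ in attrs}
        # Convertible only if href is present and nothing else would be lost.
        convertible = "href" in names and names <= MARKDOWN_LINK_ATTRS
        self.results.append(convertible)


def audit_anchors(html):
    """Return one boolean per <a> tag, in document order."""
    parser = AnchorAudit()
    parser.feed(html)
    parser.close()
    return parser.results
```

Calling `audit_anchors` on the first example reports the link as convertible, while the `onclick` variant is reported as one that would have to stay as HTML.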
Limitations
Tables are not converted; they are kept as HTML.
Changes
0.1.7:
Improved handling of inline tags.
Fix: Ignore <a> tags without an href attribute.
Improve escaping.
0.1.6: Added tests and support for Python versions below 2.7.
0.1.5: Fix Unicode issue in Python 3.
0.1.0: First version.