Transform your HTML into clean, easy-to-read markdown with pyhtml2md.

pyhtml2md

pyhtml2md provides a way to use the html2md C++ library in Python. html2md is a fast and reliable library for converting HTML content into markdown.

Installation

You can install using pip:

pip3 install pyhtml2md

Basic usage

Here is an example of how to use pyhtml2md to convert HTML to markdown:

import pyhtml2md

markdown = pyhtml2md.convert("<h1>Hello, world!</h1>")
print(markdown)

The convert function takes an HTML string as input and returns a markdown string.
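Because convert is a plain string-to-string function, it is easy to apply to several documents at once. The following is a minimal sketch, not part of pyhtml2md: the helper name convert_pages and its convert parameter are assumptions made for illustration. The parameter lets a stand-in converter be passed in, and pyhtml2md.convert is used when none is given.

```python
from typing import Callable, Dict, Optional


def convert_pages(
    pages: Dict[str, str],
    convert: Optional[Callable[[str], str]] = None,
) -> Dict[str, str]:
    """Map page name -> HTML to page name -> markdown.

    Hypothetical helper: only pyhtml2md.convert comes from the library.
    """
    if convert is None:
        # Imported lazily so the helper can also be exercised with a
        # stand-in converter where pyhtml2md is not installed.
        import pyhtml2md

        convert = pyhtml2md.convert
    # Each HTML string is converted independently of the others.
    return {name: convert(html) for name, html in pages.items()}
```

Calling convert_pages({"index": "<h1>Hello, world!</h1>"}) returns a dictionary with the same keys and the converted markdown as values.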

Advanced usage

pyhtml2md provides an Options class to customize the generation process.
All available options are described in the html2md C++ documentation.

Here is an example:

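import pyhtml2md

# Start from the default options and override a single setting.
options = pyhtml2md.Options()
options.splitLines = False

# A Converter is created for one HTML string together with its options.
converter = pyhtml2md.Converter("<h1>Hello Python!</h1>", options)
markdown = converter.convert()
print(markdown)
# Print the converter's status after the conversion has run.
print(converter.ok())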
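The status returned by ok() can be turned into an explicit error instead of being printed. This is a hedged sketch: build_converter and convert_or_raise are hypothetical helpers, and it assumes that a false result from ok() means the input was not processed cleanly, which the C++ documentation should confirm. Only Options, splitLines, Converter, convert and ok are taken from the example above.

```python
def build_converter(html: str, split_lines: bool = False):
    """Create a pyhtml2md.Converter with the splitLines option set."""
    # Imported lazily so convert_or_raise stays usable on its own.
    import pyhtml2md

    options = pyhtml2md.Options()
    options.splitLines = split_lines
    return pyhtml2md.Converter(html, options)


def convert_or_raise(converter) -> str:
    """Run the conversion and raise ValueError if ok() returns a false value.

    Works with any object that offers convert() and ok().
    """
    markdown = converter.convert()
    # Assumption: a false ok() means the HTML was not processed cleanly.
    if not converter.ok():
        raise ValueError("converter reported a problem with the HTML input")
    return markdown
```

With pyhtml2md installed, the two helpers combine as convert_or_raise(build_converter("<h1>Hello Python!</h1>")).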

Supported Tags

pyhtml2md supports the following HTML tags:

| Tag | Description | Comment |
| --- | --- | --- |
| a | Anchor or link | Supports the href, name and title attributes. |
| b | Bold | |
| blockquote | Indented paragraph | |
| br | Line break | |
| cite | Inline citation | Same as i. |
| code | Code | |
| dd | Definition data | |
| del | Strikethrough | |
| dfn | Definition | Same as i. |
| div | Document division | |
| em | Emphasized | Same as i. |
| h1 | Level 1 heading | |
| h2 | Level 2 heading | |
| h3 | Level 3 heading | |
| h4 | Level 4 heading | |
| h5 | Level 5 heading | |
| h6 | Level 6 heading | |
| head | Document header | Ignored. |
| hr | Horizontal line | |
| i | Italic | |
| img | Image | Supports src, alt, title attributes. |
| li | List item | |
| meta | Meta-information | Ignored. |
| ol | Ordered list | |
| p | Paragraph | |
| pre | Preformatted text | Works only with code. |
| s | Strikethrough | Same as del. |
| span | Grouped elements | Does nothing. |
| strong | Strong | Same as b. |
| table | Table | Tables are formatted! |
| tbody | Table body | Does nothing. |
| td | Table data cell | Uses align from th. |
| tfoot | Table footer | Does nothing. |
| th | Table header cell | Supports the align attribute. |
| thead | Table header | Does nothing. |
| title | Document title | Same as h1. |
| tr | Table row | |
| u | Underlined | Uses HTML. |
| ul | Unordered list | |
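Before converting a document, it can be useful to know which of its tags fall outside the list above. The sketch below uses only the Python standard library and does not call pyhtml2md. The names SUPPORTED_TAGS and unsupported_tags are hypothetical, the tag set is copied from the table, and how pyhtml2md treats tags that are not listed is not stated here.

```python
from html.parser import HTMLParser
from typing import List, Set

# Tag names copied from the supported-tags table.
SUPPORTED_TAGS = frozenset({
    "a", "b", "blockquote", "br", "cite", "code", "dd", "del", "dfn",
    "div", "em", "h1", "h2", "h3", "h4", "h5", "h6", "head", "hr", "i",
    "img", "li", "meta", "ol", "p", "pre", "s", "span", "strong", "table",
    "tbody", "td", "tfoot", "th", "thead", "title", "tr", "u", "ul",
})


class _TagCollector(HTMLParser):
    """Record the name of every start tag seen in the input."""

    def __init__(self) -> None:
        super().__init__()
        self.seen: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        self.seen.add(tag)


def unsupported_tags(html: str) -> List[str]:
    """Return the sorted tag names in html that are not in the table."""
    collector = _TagCollector()
    collector.feed(html)
    collector.close()
    return sorted(collector.seen - SUPPORTED_TAGS)
```

Note that wrapper tags such as html and body are not in the table, so this check reports them for complete pages.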

License

pyhtml2md is licensed under the MIT License (MIT).

Download files

Source Distribution

pyhtml2md-1.5.4.tar.gz (223.2 kB view details)

Uploaded Source

File details

Details for the file pyhtml2md-1.5.4.tar.gz.

File metadata

  • Download URL: pyhtml2md-1.5.4.tar.gz
  • Upload date:
  • Size: 223.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for pyhtml2md-1.5.4.tar.gz
Algorithm Hash digest
SHA256 492bb445baec22ed0fb205eec790d33ba4b077fa84efca6de6aa3837104131fd
MD5 6448a663b738a384716b7ad775ec4d9a
BLAKE2b-256 68e6689a73bd4df193595761fd604aaaeec744f78f29c4afeeaee3bcf814bd01
