A Python library to find HTML in strings

markdown-html-finder

A Python library to locate HTML spans in markdown text. This library is written in Rust with bindings for Python.

why?

For a separate project I needed to locate HTML comments in markdown documents. Sadly the markdown parsers I found for Python didn't provide span information for nodes.

While it wouldn't be too hard to add some features to existing Python markdown parsers, I thought it would be interesting to see how Rust can be used from Python. The excellent pulldown-cmark crate provides span information for HTML elements, so that's what we use here.

pyo3 and maturin do the hard work of providing bindings to Python and building wheels to distribute on PyPI.

install

# poetry
poetry add markdown-html-finder

# pip
pip install markdown-html-finder

usage

from markdown_html_finder import find_html_positions

DOCUMENT = """\
# example markdown document

Amet nobis et numquam qui. Animi perferendis quia qui ut aut expedita. Ut eveniet quia quaerat.
<!-- hello world -->
Quisquam et et velit soluta quia.
"""

# NOTE: find_html_positions raises a ValueError if passed carriage returns `\r`
stripped_document = DOCUMENT.replace('\r', '')
html_positions = find_html_positions(stripped_document)
assert html_positions == [(125, 145)]
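
The returned tuples can be used to slice the matched HTML back out of the document. The sketch below does not import the library. It reuses the span from the assertion above and assumes the tuples are half-open (start, end) offsets into the string, which is consistent with that example for ASCII text. Whether the offsets count bytes or characters for non-ASCII input is not stated here, so treat that as an open assumption. `extract_spans`, `strip_cr_with_map` and `to_original_span` are hypothetical helpers, not part of markdown-html-finder.

```python
DOCUMENT = """\
# example markdown document

Amet nobis et numquam qui. Animi perferendis quia qui ut aut expedita. Ut eveniet quia quaerat.
<!-- hello world -->
Quisquam et et velit soluta quia.
"""


def extract_spans(document, positions):
    """Return the substring for each (start, end) span, end exclusive."""
    return [document[start:end] for start, end in positions]


def strip_cr_with_map(document):
    """Remove carriage returns and remember where each kept character came from.

    Returns the stripped text plus a list whose i-th entry is the index of
    stripped character i in the original document.
    """
    kept = [i for i, ch in enumerate(document) if ch != "\r"]
    stripped = "".join(document[i] for i in kept)
    return stripped, kept


def to_original_span(span, kept):
    """Map a non-empty span in the stripped text back to the original text."""
    start, end = span
    return kept[start], kept[end - 1] + 1


# Span taken from the assertion in the usage example above.
html_positions = [(125, 145)]
print(extract_spans(DOCUMENT, html_positions))

# The same document with Windows line endings: strip, then map the span back.
crlf_document = DOCUMENT.replace("\n", "\r\n")
stripped, kept = strip_cr_with_map(crlf_document)
original_span = to_original_span(html_positions[0], kept)
print(crlf_document[original_span[0]:original_span[1]])
```

Stripping `\r` shifts every offset after the first carriage return, so spans computed on the stripped text only line up with the original file after a mapping step like the one above.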

dev

# install build dependencies
poetry install

# build for python development
poetry run maturin develop

building wheels

We need a wheel per Python version and platform. To support Python 3.7, 3.8, and 3.9, all three versions must be installed on both macOS and Linux. For macOS we can use pyenv. For Linux we can use a Docker container.
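
Each wheel's filename records the Python version and platform it targets, which is why one wheel per combination is needed. A minimal sketch, assuming the standard wheel naming convention without the optional build tag. `parse_wheel_name` and `platforms_by_python` are hypothetical helpers, and the filenames are a sample from the 0.2.5 release files listed on this page.

```python
from collections import defaultdict

# A sample of wheel filenames from the 0.2.5 release.
WHEELS = [
    "markdown_html_finder-0.2.5-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.whl",
    "markdown_html_finder-0.2.5-cp310-cp310-macosx_11_0_arm64.whl",
    "markdown_html_finder-0.2.5-cp37-cp37m-macosx_11_0_arm64.whl",
]


def parse_wheel_name(filename):
    """Split a wheel filename into its tags.

    Assumes the five-part form without a build tag:
    {distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    stem = filename[: -len(".whl")]
    distribution, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "distribution": distribution,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        # A compressed tag set joins several platform tags with ".".
        "platforms": platform_tag.split("."),
    }


def platforms_by_python(filenames):
    """Group the platform tags of the given wheels by Python tag."""
    matrix = defaultdict(list)
    for name in filenames:
        tags = parse_wheel_name(name)
        matrix[tags["python"]].extend(tags["platforms"])
    return dict(matrix)


matrix = platforms_by_python(WHEELS)
for python_tag, platforms in matrix.items():
    print(python_tag, platforms)
```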

macos

  1. install pyenv
  2. install each python version we want to support via pyenv install. Use pyenv install --list to see the available options.
  3. add your new Python installs globally via pyenv global 3.8.7 3.9.0
  4. configure your $PATH with the .pyenv python versions. Use pyenv shims to find the binary paths and add them, e.g. PATH=/Users/chris/.pyenv/shims/:$PATH
  5. verify your Python versions are accessible via python3.9 and verify maturin can find your python versions via ./.venv/bin/maturin list-python
  6. build the macOS wheels via ./.venv/bin/maturin build
  7. upload wheels to PyPI via ./.venv/bin/twine upload --skip-existing target/wheels/*
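
The first half of step 5 can also be checked from Python. This is only a sketch: `find_interpreters` is a hypothetical helper that looks for `python3.7`, `python3.8` and `python3.9` on the current `PATH`, and it says nothing about what maturin itself will detect, so `maturin list-python` remains the check that matters.

```python
import shutil

# Versions named in the "building wheels" section above.
VERSIONS = ["3.7", "3.8", "3.9"]


def find_interpreters(versions):
    """Map each version to the path of python<version> on PATH, or None."""
    return {v: shutil.which(f"python{v}") for v in versions}


found = find_interpreters(VERSIONS)
for version, path in found.items():
    print(f"python{version}: {path or 'not found on PATH'}")
```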

linux

  1. use the docker container to build all the Linux Python wheels via docker run --rm -v $(pwd):/io cdignam/markdown-html-finder-builder:0.3.0 build --release
  2. upload wheels to PyPI via ./.venv/bin/twine upload --skip-existing target/wheels/*

markdown-html-finder-builder

This container extends the quay.io/pypa/manylinux2014_x86_64 docker image and is based on the konstin2/maturin image, with Python 2 support removed.

This image is built and uploaded manually to Docker Hub when necessary.

# build and publish a new version
VERSION='0.2.0'
docker build -f build.Dockerfile . --tag cdignam/markdown-html-finder-builder:$VERSION
docker push cdignam/markdown-html-finder-builder:$VERSION

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

markdown_html_finder-0.2.5.tar.gz (37.8 kB)

Uploaded Source

Built Distributions

markdown_html_finder-0.2.5-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.1 MB)

Uploaded CPython 3.11 manylinux: glibc 2.5+ x86-64

markdown_html_finder-0.2.5-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.1 MB)

Uploaded CPython 3.10 manylinux: glibc 2.5+ x86-64

markdown_html_finder-0.2.5-cp310-cp310-macosx_11_0_arm64.whl (660.4 kB)

Uploaded CPython 3.10 macOS 11.0+ ARM64

markdown_html_finder-0.2.5-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.1 MB)

Uploaded CPython 3.9 manylinux: glibc 2.5+ x86-64

markdown_html_finder-0.2.5-cp39-cp39-macosx_11_0_arm64.whl (660.3 kB)

Uploaded CPython 3.9 macOS 11.0+ ARM64

markdown_html_finder-0.2.5-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.1 MB)

Uploaded CPython 3.8 manylinux: glibc 2.5+ x86-64

markdown_html_finder-0.2.5-cp38-cp38-macosx_11_0_arm64.whl (660.1 kB)

Uploaded CPython 3.8 macOS 11.0+ ARM64

markdown_html_finder-0.2.5-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.1 MB)

Uploaded CPython 3.7m manylinux: glibc 2.5+ x86-64

markdown_html_finder-0.2.5-cp37-cp37m-macosx_11_0_arm64.whl (660.1 kB)

Uploaded CPython 3.7m macOS 11.0+ ARM64

File details

Details for the file markdown_html_finder-0.2.5.tar.gz.

File metadata

  • Download URL: markdown_html_finder-0.2.5.tar.gz
  • Upload date:
  • Size: 37.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.10.5

File hashes

Hashes for markdown_html_finder-0.2.5.tar.gz
Algorithm Hash digest
SHA256 12188883422f6342a78f8acf60e2c753e09946d00696b1f9ba86d8f532192f1c
MD5 184c8ee4e817665d41d72c3ecad82973
BLAKE2b-256 9e73d82a6056090900c05cd31f9a532742a63c365345beb836c9b719d270129b

See more details on using hashes here.
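
A downloaded file can be checked against a published SHA256 digest with the standard library. The sketch below is illustrative only: `sha256_of_file` and `matches_published_hash` are hypothetical helpers, and because the real sdist is not downloaded here, the demonstration hashes a throwaway file instead.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large wheels need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# SHA256 listed above for markdown_html_finder-0.2.5.tar.gz.
SDIST_SHA256 = "12188883422f6342a78f8acf60e2c753e09946d00696b1f9ba86d8f532192f1c"

# Throwaway file standing in for a real download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
    demo_path = tmp.name

demo_digest = sha256_of_file(demo_path)
demo_matches_sdist = matches_published_hash(demo_path, SDIST_SHA256)
os.remove(demo_path)

print(demo_digest)
print(demo_matches_sdist)
```

With a real download, pass the path of the `.tar.gz` or `.whl` file and the matching SHA256 value from its "File hashes" entry.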

File details

Details for the file markdown_html_finder-0.2.5-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.whl.

File hashes

Hashes for markdown_html_finder-0.2.5-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 0a2637763a23d4bad98ad40e1d6563e28cea19faa9508dfaa69c9b8cf0ddda78
MD5 9290c78ea91ab1522dacdf2e120d4d4c
BLAKE2b-256 524605443089368a6739a9a56b8ecb2e21a129c7b2596a12325732326532e7d2

File details

Details for the file markdown_html_finder-0.2.5-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.whl.

File hashes

Hashes for markdown_html_finder-0.2.5-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 6ac8de530f5209de88a056287c3836266db02f2ad27e5f666e65dcb624852371
MD5 08b0cab31f391216aee35600e099e362
BLAKE2b-256 345119408d31f835486299d48e7f3f13097a2d3ef5cec52dd3fc231b69545439

File details

Details for the file markdown_html_finder-0.2.5-cp310-cp310-macosx_11_0_arm64.whl.

File hashes

Hashes for markdown_html_finder-0.2.5-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 c7b893f56650d0013ed1061cc883b4f67185b16113d02a3b96a688d47db34016
MD5 8f5768e611e9ce447efb0b70e93925b2
BLAKE2b-256 d22e214cd7959802bdf91be9efdc95818453b1454ac0025737794c0b676fec7c

File details

Details for the file markdown_html_finder-0.2.5-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl.

File hashes

Hashes for markdown_html_finder-0.2.5-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 86afbb199c139e7eef27e9ad134c0781867b54124461721db261bdb3bb82f6ce
MD5 4e9d309d9b85a2e4f9bf18de57088dc1
BLAKE2b-256 8866a61fb3c61b3cb1f15451fbf583a4651aaa5164846fc34d8cf4d08831bc5e

File details

Details for the file markdown_html_finder-0.2.5-cp39-cp39-macosx_11_0_arm64.whl.

File hashes

Hashes for markdown_html_finder-0.2.5-cp39-cp39-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 fbecdd05ac013b27a1ed130cdbc9d0ff1fc367a9f242697f4c3897854440b877
MD5 6e427ee66c72d3a90ce8eddef503a5ae
BLAKE2b-256 dcadb2e165ac819614be3cd3ada316462a5e1628b112c3ae616473f4b2324eff

File details

Details for the file markdown_html_finder-0.2.5-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl.

File hashes

Hashes for markdown_html_finder-0.2.5-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 2ea98e46c67747190620afeb5ccfffef3b8090df281465917d8d4a025a2c39cc
MD5 1928ac99bc5e40527bbd81fe30c79c44
BLAKE2b-256 4b96e12602d01057126eff0550309dff00293fc810111019141c5e9b6cc3a10c

File details

Details for the file markdown_html_finder-0.2.5-cp38-cp38-macosx_11_0_arm64.whl.

File hashes

Hashes for markdown_html_finder-0.2.5-cp38-cp38-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 dfeca5ae00b1e5e19c1c4b93e63976c8279c98aca74d525f0cdf85c5501c3c03
MD5 63a19cf3ae41d9b2edbeb9bbc84ff6db
BLAKE2b-256 adec46557380115ed67cfe194af7b7b6884719f8b927c133171cb5fa8156f5e0

File details

Details for the file markdown_html_finder-0.2.5-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl.

File hashes

Hashes for markdown_html_finder-0.2.5-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 a24e093b21dc699c638283a58598cbbb2489e029bcc8d90fddc608ddd8e78f9a
MD5 127644f1d79770906b351c73be603e1a
BLAKE2b-256 6c69b18e9743323836afc49751e38257dc2f01cb764bdd2d6bf6e21f0c92e1b6

File details

Details for the file markdown_html_finder-0.2.5-cp37-cp37m-macosx_11_0_arm64.whl.

File hashes

Hashes for markdown_html_finder-0.2.5-cp37-cp37m-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 5588c0dcba92c8425d5e7d17a57fe6db22a94d78313094b701b6b3a8ce1d2c23
MD5 4115d911c5008e113426754fe642426f
BLAKE2b-256 f13df5865fa11736a62dd93db7c81965953dc792eabbfa3d81b3e9db36a7dd94
