A simple and flexible HTML-to-Markdown converter pipeline for Scrapy, written in Python.

Project description

Pyh2m for Scrapy Pipeline


settings.py:

...
# pyh2m settings for crawler project
RAW_TEXT = True  # raw text or markdown text
HTML_DICT_NAME = "html"        # item["html"]
MARK_DICT_NAME = "markdown"    # item["markdown"]
...
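
To illustrate how the three settings above might be consumed, here is a minimal sketch of a Scrapy-style item pipeline. It is not the package's actual code: the class name `HypotheticalH2MPipeline`, the `to_markdown` hook, the default values, and the reading of `RAW_TEXT` (True meaning plain text, False meaning Markdown) are all assumptions, and the converter is a standard-library stand-in that only strips tags. Only the setting names come from the snippet above.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Stand-in converter: keeps text nodes and drops every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Return the text of an HTML fragment with whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


class HypotheticalH2MPipeline:
    """Hypothetical pipeline; NOT the real scrapy-pyh2m implementation."""

    def __init__(self, raw_text=True, html_key="html",
                 mark_key="markdown", to_markdown=None):
        self.raw_text = raw_text      # assumed meaning of RAW_TEXT
        self.html_key = html_key      # HTML_DICT_NAME
        self.mark_key = mark_key      # MARK_DICT_NAME
        # Hypothetical hook for a real HTML-to-Markdown converter;
        # falls back to the plain-text stand-in when none is given.
        self.to_markdown = to_markdown or html_to_text

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook and exposes the project settings on the crawler.
        settings = crawler.settings
        return cls(
            raw_text=settings.getbool("RAW_TEXT", True),
            html_key=settings.get("HTML_DICT_NAME", "html"),
            mark_key=settings.get("MARK_DICT_NAME", "markdown"),
        )

    def process_item(self, item, spider):
        html = item.get(self.html_key)
        if html is not None:
            convert = html_to_text if self.raw_text else self.to_markdown
            item[self.mark_key] = convert(html)
        return item
```

With the defaults, an item such as `{"html": "<p>Hello <b>world</b></p>"}` would come out with an extra `"markdown"` key holding the converted text. The pipeline's real import path is not given here, so the `ITEM_PIPELINES` entry needed to enable it is left out.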

History

0.1.0

  • Works with Scrapy

Download files

Source Distribution

scrapy-pyh2m-0.1.0.tar.gz (1.9 kB view details)

Uploaded Source

Built Distribution

scrapy_pyh2m-0.1.0-py2.py3-none-any.whl (1.5 kB view details)

Uploaded Python 2, Python 3

File details

Details for the file scrapy-pyh2m-0.1.0.tar.gz.

File metadata

  • Download URL: scrapy-pyh2m-0.1.0.tar.gz
  • Upload date:
  • Size: 1.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.41.0 CPython/3.7.3

File hashes

Hashes for scrapy-pyh2m-0.1.0.tar.gz
Algorithm Hash digest
SHA256 498172474be3f2c4b1656073940475c439c294cc1c8e86d070a2d73c18909af8
MD5 ecddd24f8d1c5692c60bc69ef9d19ddf
BLAKE2b-256 596386f10e283e017a25c42db8962d8e2d0dae77b4df5c6f5d13a6096b1b00c1

File details

Details for the file scrapy_pyh2m-0.1.0-py2.py3-none-any.whl.

File metadata

  • Download URL: scrapy_pyh2m-0.1.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 1.5 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.41.0 CPython/3.7.3

File hashes

Hashes for scrapy_pyh2m-0.1.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 81c05c61d8c352d340a043debfd44785919f6c19739f343da89ff86b0574266f
MD5 735801d0f0af1406c81ccbab51318437
BLAKE2b-256 b36ba2756f1b5353405ad796ae2b153d0686581d8de1845d2ff47c10a76d4d1b
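
A downloaded file can be checked against the SHA256 digests listed above. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `matches_expected` are illustrative and not part of this package.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

For example, `matches_expected("scrapy-pyh2m-0.1.0.tar.gz", "498172474be3f2c4b1656073940475c439c294cc1c8e86d070a2d73c18909af8")` should return True for an intact copy of the source distribution.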
