A simple and flexible HTML-to-Markdown converter pipeline for Scrapy, written in Python.
Project description
Pyh2m for Scrapy Pipeline
settings.py:
...
# pyh2m settings for crawler project
RAW_TEXT = True  # choose raw text or Markdown text output
HTML_DICT_NAME = "html"  # item key holding the HTML: item["html"]
MARK_DICT_NAME = "markdown"  # item key holding the Markdown: item["markdown"]
...
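To illustrate how a pipeline could consume these three settings, here is a minimal sketch. It is not the code shipped in scrapy-pyh2m: the class name SketchMarkdownPipeline and the toy converter are hypothetical, and it assumes that RAW_TEXT = True means "strip tags and keep plain text" while False means "emit Markdown". The real converter handles far more than the two tags shown here.

```python
from html.parser import HTMLParser


class _ToyConverter(HTMLParser):
    """Toy stand-in for the real converter: handles only <h1> and <p>."""

    def __init__(self, raw_text):
        super().__init__()
        self.raw_text = raw_text
        self.parts = []
        self._prefix = ""

    def handle_starttag(self, tag, attrs):
        # Assumption: in Markdown mode an <h1> becomes a "# " heading.
        self._prefix = "# " if (tag == "h1" and not self.raw_text) else ""

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.parts.append(self._prefix + text)
            self._prefix = ""


class SketchMarkdownPipeline:
    """Hypothetical pipeline; not the class provided by scrapy-pyh2m."""

    def __init__(self, settings):
        # Same names and defaults as in the settings.py excerpt above.
        self.raw_text = settings.get("RAW_TEXT", True)
        self.html_key = settings.get("HTML_DICT_NAME", "html")
        self.mark_key = settings.get("MARK_DICT_NAME", "markdown")

    def process_item(self, item, spider=None):
        # Read HTML from item[HTML_DICT_NAME], write to item[MARK_DICT_NAME].
        parser = _ToyConverter(self.raw_text)
        parser.feed(item[self.html_key])
        parser.close()
        item[self.mark_key] = "\n\n".join(parser.parts)
        return item


settings = {"RAW_TEXT": False, "HTML_DICT_NAME": "html", "MARK_DICT_NAME": "markdown"}
item = {"html": "<h1>Title</h1><p>Body text.</p>"}
result = SketchMarkdownPipeline(settings).process_item(item)
print(result["markdown"])
```

In a real Scrapy project the settings would come from crawler.settings rather than a plain dict, and the pipeline would be enabled through ITEM_PIPELINES.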
History
0.1.0
- Can be used with Scrapy.
Download files
Source Distribution
scrapy-pyh2m-0.1.0.tar.gz (1.9 kB)
Built Distribution
Hashes for scrapy_pyh2m-0.1.0-py2.py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 81c05c61d8c352d340a043debfd44785919f6c19739f343da89ff86b0574266f
MD5 | 735801d0f0af1406c81ccbab51318437
BLAKE2b-256 | b36ba2756f1b5353405ad796ae2b153d0686581d8de1845d2ff47c10a76d4d1b