This package extracts/parses information from source HTML.
# HTML Parser
Extracts and parses information from source HTML.
# build a PyPI package
python3 setup.py sdist bdist_wheel
twine upload dist/*
# install package
python3 -m pip install htmlparsingbs4based
# install the CLI from a local dist archive
python3 -m pip install /home/yaxiong/html_parsing/dist/htmlparsingbs4based-0.1.0.tar.gz
# run CLI (examples)
mode1: elasticsearch
PARSE -i 'http://www.mineracamargo.com/MCA_Investors.html' -gpf elasticsearch -esusr readwrite -espw ''
mode2: local
PARSE -i 'https://bryansfuel.on.ca/about/' -f /home/yaxiong/data_crawled_websites/crawled_websites_first_batch
PARSE -i 'http://www.mineracamargo.com/MCA_Investors.html' -f /home/yaxiong/data_crawled_websites/crawled_websites_first_batch
PARSE -i 'https://www.conpak.com/About-Conpak/' -f /home/yaxiong/data_crawled_websites/crawled_websites_first_batch
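The local-mode calls above differ only in the URL, so they can be generated from a list. The following is a minimal sketch, not part of the package: `build_parse_command` is a hypothetical helper, and it assumes only the `-i` and `-f` flags exactly as they appear in the examples above (the meaning of `-f` as a target folder is inferred from those examples).

```python
import shlex

# Folder taken from the local-mode examples above; adjust for your machine.
FOLDER = "/home/yaxiong/data_crawled_websites/crawled_websites_first_batch"

URLS = [
    "https://bryansfuel.on.ca/about/",
    "http://www.mineracamargo.com/MCA_Investors.html",
    "https://www.conpak.com/About-Conpak/",
]


def build_parse_command(url, folder):
    """Hypothetical helper: argument list for one local-mode PARSE call."""
    return ["PARSE", "-i", url, "-f", folder]


# One argument list per URL; nothing is executed here.
commands = [build_parse_command(url, FOLDER) for url in URLS]

for argv in commands:
    # shlex.join renders the list as a single shell-safe command line.
    print(shlex.join(argv))
```

To actually run the commands, each list could be passed to `subprocess.run(argv, check=True)`, assuming the `PARSE` entry point is installed and on the `PATH`.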
Hashes for htmlparsingbs4based-1.0.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 656750e85ba534e7252196dfb1614754303ae8c94f91986de761d1f65610b899
MD5 | 0b319c3f0f3f776d9f8ff84ef66e8aec
BLAKE2b-256 | 842a6c31fae948e9ee139cd687a3c6ab3b287174027304fc6d1ca2ad8a7e18c0
Hashes for htmlparsingbs4based-1.0.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 133aa7c4e4c6f8c21551e42877f0bd40f23d1742a23eaa35ebf58c04c7e3bf0a
MD5 | 80687d50254e1faf50ff96b7dbe8cc00
BLAKE2b-256 | 3cdd42da9f2819e4b5a7cdb0341a271c573e88a7b1cb6f98e421242261e26c3c