This package extracts/parses information from source HTML.

Project description

# HTML Parser

Extracts and parses information from source HTML.

# build a PyPI package

  • python3 setup.py sdist bdist_wheel

  • twine upload dist/*

# install the CLI from a local dist file (if you have one)

  • python3 -m pip install /home/yaxiong/html_parsing/dist/htmlparsingbs4based-1.1.0.tar.gz

# install the package and CLI

  • pip install htmlparsingbs4based

  • OR python3 -m pip install htmlparsingbs4based

# run from script

  • from htmlparsingbs4based.html_parsing.html_parser_custombs4_script import parse_single_page

  • parse_single_page(input_url='https://bryansfuel.on.ca/about/', path_to_crawled_files='/home/yaxiong/data_crawled_websites/crawled_websites_first_batch', min_length=1, prefix="")
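
To run the parser over several pages, the call above can be wrapped in a small loop. This is a minimal sketch, not part of the package: `parse_pages` and its `parser` argument are hypothetical helpers introduced here for illustration. The README does not say what `parse_single_page` returns, so the sketch simply stores whatever comes back for each URL.

```python
try:
    # Import path taken from the README; available only once the package is installed.
    from htmlparsingbs4based.html_parsing.html_parser_custombs4_script import (
        parse_single_page,
    )
except ImportError:
    parse_single_page = None


def parse_pages(urls, path_to_crawled_files, min_length=1, prefix="", parser=None):
    """Call the parser once per URL and collect the results keyed by URL.

    `parser` is a hypothetical hook (not part of the package) that lets a
    different callable be substituted, for example in tests.
    """
    parser = parser or parse_single_page
    if parser is None:
        raise RuntimeError("htmlparsingbs4based is not installed")
    results = {}
    for url in urls:
        # Keyword names match the README example.
        results[url] = parser(
            input_url=url,
            path_to_crawled_files=path_to_crawled_files,
            min_length=min_length,
            prefix=prefix,
        )
    return results
```

With the package installed, `parse_pages(['https://bryansfuel.on.ca/about/'], '/home/yaxiong/data_crawled_websites/crawled_websites_first_batch')` would make the same call as the README example.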

# run CLI (examples)


Download files

Source Distribution

htmlparsingbs4based-1.1.0.tar.gz (56.3 kB view details)

Uploaded Source

Built Distribution

htmlparsingbs4based-1.1.0-py3-none-any.whl (72.5 kB view details)

Uploaded Python 3

File details

Details for the file htmlparsingbs4based-1.1.0.tar.gz.

File metadata

  • Download URL: htmlparsingbs4based-1.1.0.tar.gz
  • Upload date:
  • Size: 56.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.10.0

File hashes

Hashes for htmlparsingbs4based-1.1.0.tar.gz
Algorithm Hash digest
SHA256 9b7b9cebb0be84fab1358213e760f6598038aec5d50ebf987d4e0261e8ce2ff8
MD5 4667ddefdf38b65b0137426fcc1edd90
BLAKE2b-256 8aaf5ef26eec33ebb2ef690026c6b96a8f536ef2379076459334cb7df1f17653
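
A downloaded copy of the source distribution can be checked against the SHA256 digest listed above. The following is a minimal sketch using only Python's standard `hashlib`. The helper names `sha256_of_file` and `matches_expected` are illustrative, and the file path you pass in is assumed to be wherever you saved the archive.

```python
import hashlib

# SHA256 digest listed above for htmlparsingbs4based-1.1.0.tar.gz.
EXPECTED_SHA256 = "9b7b9cebb0be84fab1358213e760f6598038aec5d50ebf987d4e0261e8ce2ff8"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected("htmlparsingbs4based-1.1.0.tar.gz")` should return `True` for an intact download of that file.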

File details

Details for the file htmlparsingbs4based-1.1.0-py3-none-any.whl.

File hashes

Hashes for htmlparsingbs4based-1.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 bd63e5250acfb85e0088faa75965d6dc481104e3d3b92423aad9b192b7f856e4
MD5 f5a6e0cc4c5c47bd3caa8c674ee74d7d
BLAKE2b-256 bd81534aa32f4d8e0de77fdee3da28f80028c9d49db09c4b5d89f60ef8d2001f
