A simple package to fetch HTML data from a website

Project description

htmlExtractor

htmlExtractor is a small Python package that fetches a web page with the requests library and parses it with BeautifulSoup, so you can go from a URL to a parsed HTML document in a single call.

Usage

Here's a quick example of how to use the htmlExtractor package:

# Fetch and parse HTML data from a website
from htmlExtractor import extractHtml

soup = extractHtml('http://example.com', 'html.parser')
print(soup)
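Printing the whole document is rarely the end goal. The sketch below shows one way to pull specific pieces out of a BeautifulSoup object such as the one returned in the example above. To keep it runnable offline, it parses a hard-coded sample string instead of fetching a page; `SAMPLE_HTML` and the helper `summarize` are hypothetical names for illustration and are not part of htmlExtractor.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used here instead of a live request.
SAMPLE_HTML = """
<html>
  <head><title>Example Domain</title></head>
  <body>
    <h1>Example Domain</h1>
    <p><a href="https://www.iana.org/domains/example">More information...</a></p>
  </body>
</html>
"""


def summarize(soup):
    """Return the page title and every link target found in the document."""
    # soup.title is None when the page has no <title> element.
    title = soup.title.string if soup.title else None
    # href=True skips anchors that have no href attribute.
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return {"title": title, "links": links}


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
summary = summarize(soup)
print(summary["title"])
print(summary["links"])
```

The same `summarize` call should work on any BeautifulSoup object, including one built from a fetched page.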

Details

extractHtml(website, parser)

  • Fetches and parses HTML data from the specified website.

Parameters:

  • website (str): The URL of the website to fetch HTML from.
  • parser (str): The parser to use with BeautifulSoup (e.g., 'html.parser', 'lxml').

Returns:

  • soup_data (BeautifulSoup): The parsed HTML data.
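The package's own source is not reproduced here, so the following is only a minimal sketch of how a fetch-and-parse function with the parameters and return value described above could be written with requests and BeautifulSoup. The name `fetch_soup`, the `timeout` and `session` parameters, and the error handling are assumptions for illustration, not the package's actual implementation.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(website, parser="html.parser", timeout=10.0, session=None):
    """Fetch `website` and return its HTML parsed as a BeautifulSoup object.

    Hypothetical helper: `timeout` and `session` are additions for this
    sketch and are not documented parameters of htmlExtractor.
    """
    # Use the caller's session if one is given, else the requests module itself.
    http = session if session is not None else requests
    response = http.get(website, timeout=timeout)
    # Raise an exception on 4xx/5xx responses instead of parsing an error page.
    response.raise_for_status()
    return BeautifulSoup(response.text, parser)
```

Calling `fetch_soup('http://example.com', 'html.parser')` would then return a parsed document in the same way as the usage example. Passing the parser explicitly matters because BeautifulSoup only supports 'lxml' when the lxml package is installed, while 'html.parser' ships with Python.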

Dependencies

This package requires the following libraries:

  • requests
  • beautifulsoup4

These dependencies are installed automatically when you install the htmlExtractor package.

License

This project is licensed under the MIT License.

Contact

If you have questions, suggestions, or problems, feel free to open an issue on GitHub or contact me directly at anisurrahman06046@gmail.com or on LinkedIn.

Acknowledgements

  • This package uses the requests library to make HTTP requests.
  • It also uses BeautifulSoup for parsing HTML and XML documents.
