Scrapetools

A collection of tools to aid in web scraping.

Install using:

pip install scrapetools

Scrapetools contains three functions (scrape_emails, scrape_phone_numbers, scrape_inputs) and one class (LinkScraper).
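
To give a feel for what an email-scraping helper involves, here is a minimal standard-library sketch. It is not the scrapetools implementation: the name find_emails_sketch, the regular expression, and the lowercasing and de-duplication behaviour are all assumptions made for illustration, and scrape_emails may behave differently.

```python
import re

# Deliberately simple pattern (an assumption, not the library's);
# real-world address syntax is far more permissive than this.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_emails_sketch(text: str) -> list[str]:
    """Return unique email-like strings, lowercased, in order of first appearance."""
    seen = set()
    results = []
    for match in EMAIL_PATTERN.finditer(text):
        email = match.group(0).lower()
        if email not in seen:
            seen.add(email)
            results.append(email)
    return results


sample_html = (
    '<a href="mailto:Jane.Doe@example.com">Jane</a> or '
    "sales@example.co.uk, jane.doe@example.com"
)
found = find_emails_sketch(sample_html)
print(found)
```

A pattern this simple will miss unusual but valid addresses and can match strings that are not deliverable, so treat the results as candidates rather than verified addresses.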

Basic usage

import scrapetools
import requests

url = 'https://somewebsite.com'
source = requests.get(url).text

emails = scrapetools.scrape_emails(source)

phone_numbers = scrapetools.scrape_phone_numbers(source)

scraper = scrapetools.LinkScraper(source, url)
scraper.scrape_page()
# links can be accessed and filtered via the get_links() method
same_site_links = scraper.get_links(same_site_only=True)
same_site_image_links = scraper.get_links(link_type='img', same_site_only=True)
external_image_links = scraper.get_links(link_type='img', excluded_links=same_site_image_links)

# scrape_inputs() returns a tuple of BeautifulSoup Tag elements for various user input elements
forms, inputs, buttons, selects, text_areas = scrapetools.scrape_inputs(source)
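
The same_site_only filter above distinguishes links on the scraped site from external ones. How LinkScraper makes that decision is not documented here, so the following is only a sketch of one plausible approach using the standard library: resolve each link against the page URL, then compare hostnames. The helper name is_same_site and the sample links are hypothetical.

```python
from urllib.parse import urljoin, urlparse


def is_same_site(link: str, page_url: str) -> bool:
    """True when link, resolved against page_url, has the same hostname."""
    absolute = urljoin(page_url, link)
    return urlparse(absolute).hostname == urlparse(page_url).hostname


page_url = "https://somewebsite.com/about"
links = [
    "/contact",                            # root-relative
    "images/logo.png",                     # relative to the current path
    "https://cdn.example.net/banner.jpg",  # different host
    "https://somewebsite.com/blog",        # absolute, same host
]

# Resolve every link to an absolute URL and split by host.
same_site = [urljoin(page_url, link) for link in links if is_same_site(link, page_url)]
external = [urljoin(page_url, link) for link in links if not is_same_site(link, page_url)]

print(same_site)
print(external)
```

Comparing exact hostnames treats subdomains such as www.somewebsite.com as a different site; whether that matches the behaviour of same_site_only is an open question worth checking against the library itself.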

Download files

Source Distribution

scrapetools-1.1.9.tar.gz (7.8 kB)

Uploaded Source

Built Distribution

scrapetools-1.1.9-py3-none-any.whl (9.7 kB)

Uploaded Python 3

File details

Details for the file scrapetools-1.1.9.tar.gz.

File metadata

  • Download URL: scrapetools-1.1.9.tar.gz
  • Upload date:
  • Size: 7.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.0

File hashes

Hashes for scrapetools-1.1.9.tar.gz
Algorithm Hash digest
SHA256 11d9a694466f7054a2f87916f55c562348b9753ed45a4129ab10447fe5453dcc
MD5 9285fa4a2f8ce64f173a1d712552852f
BLAKE2b-256 b8c1d69cce44217659f00270a67df065c013ddf4601b4976685a395b89e389e7

File details

Details for the file scrapetools-1.1.9-py3-none-any.whl.

File metadata

  • Download URL: scrapetools-1.1.9-py3-none-any.whl
  • Upload date:
  • Size: 9.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.0

File hashes

Hashes for scrapetools-1.1.9-py3-none-any.whl
Algorithm Hash digest
SHA256 03a4ff71507be0f5b402299120913777f6914c33347008367cf5c441b0ba05fa
MD5 3da01baff04de7189ea77b8086dc9a1b
BLAKE2b-256 551a3ce12edacb8df6ec1ca1d0f6921944dff2832fdf9a4870560e42b4f05d3d
