Scrapes pages such as Wikipedia for URLs, descriptions, and images

Project description

Page Scrapers

Description

Scrapes pages for URLs, descriptions, and image data.

Installation

pip install page-scrapers

or

pipenv install page-scrapers

Methods

  1. wikipedia.scrape_page(url) - Scrapes the page at the given URL and returns its data as a dictionary.
  2. wikipedia.scrape_pages(string, limit) - Finds URLs of pages matching the search string, then scrapes each page for its data. Results are limited to the number specified, or 10 if no limit is given.
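
To make the idea of "scraping a page into a dictionary" concrete, here is a minimal sketch that uses only the Python standard library. It is not the package's implementation. The class PageDataParser, the function scrape_html, and the dictionary keys ("url", "description", "urls", "images") are hypothetical names chosen for illustration, and the sketch works on an HTML string rather than fetching anything over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageDataParser(HTMLParser):
    """Hypothetical parser: collects links, image sources, and the first
    non-empty paragraph of a page. Not part of page-scrapers."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []
        self.images = []
        self.description = ""
        self._in_paragraph = False
        self._description_done = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.urls.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "img" and attrs.get("src"):
            self.images.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "p" and not self._description_done:
            self._in_paragraph = True

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            self._in_paragraph = False
            # Stop collecting once a paragraph with real text has been seen.
            if self.description.strip():
                self._description_done = True

    def handle_data(self, data):
        if self._in_paragraph:
            self.description += data


def scrape_html(html, base_url):
    """Return the collected page data as a dictionary (assumed key names)."""
    parser = PageDataParser(base_url)
    parser.feed(html)
    parser.close()
    return {
        "url": base_url,
        "description": parser.description.strip(),
        "urls": parser.urls,
        "images": parser.images,
    }
```

The keys actually returned by wikipedia.scrape_page are not documented here, so inspect the returned dictionary before relying on any particular key.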

Usage

from page_scrapers import wikipedia

# Scrape a single page; the result is a dictionary.
url = "https://en.wikipedia.org/wiki/Batman_Begins"
scraped_data = wikipedia.scrape_page(url)

# Find pages matching a search string and scrape at most 3 of them.
scraped_data = wikipedia.scrape_pages("american gangster film", 3)
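
If you already have your own list of URLs, a small wrapper can apply the same "limit, defaulting to 10" behavior around a single-page scraper. The sketch below is an assumption-laden illustration: scrape_many is a hypothetical helper, not part of the package, and it takes the scraping function as an argument so it can be tried without network access. Whether wikipedia.scrape_page raises an exception on failure is not documented here, so the error handling is a guess.

```python
def scrape_many(scrape_page, urls, limit=10):
    """Hypothetical helper: scrape up to `limit` URLs with `scrape_page`,
    skipping any URL whose scrape raises an exception."""
    results = []
    for url in urls:
        if len(results) >= limit:
            break
        try:
            results.append(scrape_page(url))
        except Exception as error:
            # Assumption: a failed scrape raises; report it and move on.
            print(f"skipped {url}: {error}")
    return results
```

With the package installed, this could be called as scrape_many(wikipedia.scrape_page, urls, 3), where urls is your own list of page URLs.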

Download files

Filename (size) | File type | Python version
page_scrapers-1.1.0-py3-none-any.whl (3.8 kB) | Wheel | py3
page_scrapers-1.1.0.tar.gz (7.2 kB) | Source | None