Scrapes pages such as Wikipedia for URLs, descriptions, and images.
Page Scrapers
Description
Scrapes pages for URL, description, and image data.
Installation
pip install page-scrapers
or
pipenv install page-scrapers
Methods
- wikipedia.scrape_page(url) - Scrapes the given page for data and returns it as a dictionary.
- wikipedia.scrape_pages(string, limit) - Finds URLs of pages matching the string, then scrapes each page for its data, limiting results to the number specified, or 10 if not provided.
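As a rough illustration of what a scrape_page-style function does under the hood, here is a minimal sketch (not the library's actual implementation) that pulls a title, a meta description, and image URLs out of raw HTML using only the standard library:

```python
# Hypothetical sketch of page scraping: collect a page's title,
# meta description, and image sources into a dictionary, the kind
# of structure scrape_page is described as returning.
from html.parser import HTMLParser


class PageDataParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.data = {"title": "", "description": "", "images": []}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") == "description":
            self.data["description"] = attrs.get("content", "")
        elif tag == "img" and "src" in attrs:
            self.data["images"].append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, text):
        if self._in_title:
            self.data["title"] += text


html = """<html><head><title>Batman Begins</title>
<meta name="description" content="2005 superhero film"></head>
<body><img src="/poster.jpg"></body></html>"""

parser = PageDataParser()
parser.feed(html)
print(parser.data)
```

In practice the library also fetches the page over HTTP; the sketch only shows the extraction step on an inline HTML sample.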
Usage
from page_scrapers import wikipedia

# Scrape a single page
url = "https://en.wikipedia.org/wiki/Batman_Begins"
scraped_data = wikipedia.scrape_page(url)

# Find and scrape up to 3 matching pages
scraped_data = wikipedia.scrape_pages("american gangster film", 3)
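The default-limit behaviour described for scrape_pages (10 results if no limit is passed) can be sketched as follows; limit_results is a hypothetical helper, not part of the library:

```python
# Hypothetical sketch of the limit behaviour: cap the result list at
# the given limit, defaulting to 10 when the caller passes nothing.
def limit_results(urls, limit=10):
    """Return at most `limit` URLs (10 if no limit is given)."""
    return urls[:limit]


pages = [f"https://en.wikipedia.org/wiki/Page_{i}" for i in range(25)]
print(len(limit_results(pages)))     # capped at the default of 10
print(len(limit_results(pages, 3)))  # capped at the explicit limit of 3
```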
Download files
Source Distribution
page_scrapers-1.1.0.tar.gz (7.2 kB)
Built Distribution
page_scrapers-1.1.0-py3-none-any.whl
Hashes for page_scrapers-1.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | acadb1bdeccbdb926637838406cb3c125cc64b65176484e5d06e0acd47bba8b8
MD5 | 65c1855867cb0b38706fda5bfff28331
BLAKE2b-256 | 95427f7557df923b5a4b8bcac01793b45c99295963e2b093aa2f0c340550a328