
scraped

Tools for scraping

To install: pip install scraped

Examples

download_site

download_site('http://www.example.com')

will download only the page the URL points to, storing it under the default rootdir (on Unix/macOS, for example, ~/.config/scraped/data), which can be configured through the SCRAPED_DFLT_ROOTDIR environment variable.
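To change where files land, you can set that environment variable before running your script. SCRAPED_DFLT_ROOTDIR is the variable named above; the directory path below is just an example:

```shell
# Point scraped at a custom data directory (the path is an example,
# not a requirement -- any writable directory works).
export SCRAPED_DFLT_ROOTDIR="$HOME/my_scraped_data"
echo "$SCRAPED_DFLT_ROOTDIR"
```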

The depth argument will enable you to download more content starting from the url:

download_site('http://www.example.com', depth=3)

There are more arguments:

  • start_url: The URL to start downloading from.
  • url_to_filepath: The function to convert URLs to local filepaths.
  • depth: The maximum depth to follow links.
  • filter_urls: A function to filter URLs to download.
  • mk_missing_dirs: Whether to create missing directories.
  • verbosity: The verbosity level.
  • rootdir: The root directory to save the downloaded files.
  • extra_kwargs: Extra keyword arguments to pass to the Scrapy spider.
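To illustrate how the url_to_filepath and filter_urls arguments might be used, here is a minimal sketch. The helper functions below are hypothetical examples of the kind of callables these arguments accept, not part of the library itself, and the download_site call is shown commented out since it requires network access:

```python
import os
from urllib.parse import urlparse

def url_to_filepath(url, rootdir='~/.config/scraped/data'):
    """Map a URL to a local filepath under rootdir.
    (Hypothetical converter, illustrating the kind of function
    that could be passed as the url_to_filepath argument.)"""
    parsed = urlparse(url)
    path = parsed.path.lstrip('/') or 'index.html'
    return os.path.join(os.path.expanduser(rootdir), parsed.netloc, path)

def only_same_domain(url, domain='www.example.com'):
    """Keep only URLs on the given domain.
    (Hypothetical predicate, illustrating the kind of function
    that could be passed as the filter_urls argument.)"""
    return urlparse(url).netloc == domain

# Hedged usage (requires network access and the scraped package):
# from scraped import download_site
# download_site(
#     'http://www.example.com',
#     depth=2,
#     filter_urls=only_same_domain,
#     url_to_filepath=url_to_filepath,
# )
```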

Download files

Source Distribution: scraped-0.0.5.tar.gz (5.6 kB)

Built Distribution: scraped-0.0.5-py3-none-any.whl (6.1 kB)
