A web bot to scrape images from websites.


  • Supported platforms: Linux / Python 2.7.
  • Uses the Scrapy web crawling framework.
  • Maintains a database of all downloaded images to avoid duplicate downloads.
  • Optionally, it can restrict scraping to pages under a particular URL, e.g. scraping “” with this option will only download from new album.
  • You can specify a minimum size for images to be downloaded.
  • Follows JavaScript popup links while scraping.
  • Live monitor window for displaying images as they are scraped.
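
The duplicate-avoidance feature above can be pictured with a small sketch. This is not imagebot's actual schema or code: the table name `images`, the helper names `open_store` and `mark_if_new`, and the choice of hashing the URL are all assumptions made for illustration, and the sketch uses the Python 3 standard library rather than the Python 2.7 the project targets.

```python
import hashlib
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a small table of already-downloaded image URLs.

    Hypothetical schema, for illustration only.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS images ("
        "url_hash TEXT PRIMARY KEY, url TEXT NOT NULL)"
    )
    return conn


def mark_if_new(conn, url):
    """Record url and return True if it was unseen; return False for a duplicate."""
    url_hash = hashlib.sha1(url.encode("utf-8")).hexdigest()
    # INSERT OR IGNORE skips the row when the primary key already exists,
    # so rowcount tells us whether anything was actually inserted.
    cur = conn.execute(
        "INSERT OR IGNORE INTO images (url_hash, url) VALUES (?, ?)",
        (url_hash, url),
    )
    conn.commit()
    return cur.rowcount == 1
```

A crawler built this way would call `mark_if_new` before each download and skip the image when it returns False. Clearing such a table would force every image to be fetched again, which is roughly what one would expect from `imagebot clear --db`.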


crawl commands:

  • Scrape images from

    imagebot crawl
  • Scrape images from while allowing images from a CDN such as (add multiple domains as a comma-separated list):

    imagebot crawl -d
  • Specify image store location:

    imagebot crawl -is /home/images
  • Specify minimum size of image to be downloaded (width x height):

    imagebot crawl -s 300x300
  • Stay under

    imagebot crawl -u
  • Launch the monitor window to view images live as they are scraped:

    imagebot crawl -m
  • Set user-agent:

    imagebot crawl -a "my_imagebot("
  • For more options, get help:

    imagebot crawl -h
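
The checks implied by the `-s`, `-u` and `-d` options above could look roughly like the following. This is a minimal sketch under stated assumptions, not imagebot's implementation: the function names are hypothetical, the exact matching rules imagebot applies (for example whether both width and height must meet the minimum, or whether subdomains count as allowed) are not documented here and are guessed, and the code uses the Python 3 standard library.

```python
from urllib.parse import urlparse


def parse_min_size(text):
    """Parse a 'WIDTHxHEIGHT' string such as '300x300' into (width, height)."""
    width, height = text.lower().split("x")
    return int(width), int(height)


def is_large_enough(size, min_size):
    """Assumption: an image qualifies only if both dimensions meet the minimum."""
    return size[0] >= min_size[0] and size[1] >= min_size[1]


def is_under(url, base_url):
    """True if url is on the same host as base_url and at or below its path."""
    u, b = urlparse(url), urlparse(base_url)
    # Normalise with a trailing slash so '/albums/newest' is not treated
    # as being under '/albums/new'.
    base_path = b.path.rstrip("/") + "/"
    return u.netloc == b.netloc and (u.path.rstrip("/") + "/").startswith(base_path)


def is_allowed_domain(url, allowed):
    """True if the url's host equals, or is a subdomain of, an allowed domain."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in allowed)
```

For example, `is_large_enough((640, 480), parse_min_size("300x300"))` accepts the image, while a 200x600 image would be rejected under the both-dimensions assumption.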

clear commands:

  • Clear cache:

    imagebot clear --cache
  • Remove image metadata from database:

    imagebot clear --db
  • Get help:

    imagebot clear -h


Dependencies:

  1. python-gi (Python GObject Introspection API), needed only if you use the monitor UI

    On Ubuntu:

    apt-get install python-gi
  2. scrapy (a powerful web crawling framework)

    It will be automatically installed by pip.

  3. Pillow (Python Imaging Library)

    It will be automatically installed by pip.

Download files

  • Filename: imagebot-1.0.3.tar.gz (13.0 kB)
  • File type: Source
  • Python version: None
  • Upload date: Feb 3, 2015