A simple image scraper to download all images from a given url

ImageScraper
============
First Python app :D
A simple Python script that downloads all images from a given web page.
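
The README does not show how the script finds images. As a rough sketch of the general idea (not the package's actual code), the snippet below collects the ``src`` of every ``<img>`` tag in an HTML string and resolves it against the page URL. It uses the standard library's ``html.parser`` instead of ``lxml`` so that it has no dependencies; ``ImgCollector`` and ``find_image_urls`` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_image_urls(html, page_url):
    """Return absolute image URLs found in html, in document order."""
    collector = ImgCollector()
    collector.feed(html)
    # urljoin leaves absolute URLs alone and resolves relative ones.
    return [urljoin(page_url, src) for src in collector.sources]


sample = '<html><body><img src="/a.png"><img src="https://cdn.example.com/b.jpg"></body></html>'
print(find_image_urls(sample, "https://example.com/page"))
```

Resolving each ``src`` against the page URL matters because many pages use relative paths, which cannot be downloaded as they stand.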


Download
--------
tar file:
Grab the latest build from https://pypi.python.org/pypi/ImageScraper

pip install:
$ pip install ImageScraper


Usage
-----
Using the tar file:

Extract the contents of the tar file.
Note that ``ImageScraper`` depends on ``lxml`` and ``requests``.
If ``lxml`` fails to compile when installed through ``pip``, install the ``libxml2-dev`` and ``libxslt-dev`` packages on your system.
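
Before running the script from the tar file, it may help to confirm that both dependencies can be found. The following is a small sketch, not part of ImageScraper; ``missing_modules`` is a hypothetical helper. It uses ``importlib.util.find_spec``, which locates a top-level module without importing it.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["lxml", "requests"])
if missing:
    print("Missing dependencies:", ", ".join(missing))
else:
    print("All dependencies found.")
```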


$ cd ImageScraper/image_scraper/
$ python __init__.py
Enter URL to scrap: https://github.com
Found 6 images:
How many images do you want ? : 6
Done.

If installed using pip:

Open Python in a terminal.

$ python
>>> import image_scraper
Enter URL to scrap: https://github.com
Found 6 images:
How many images do you want ? : 6
Done.


NOTE:
A new folder called "images" will be created in the same place, containing all the downloaded images.
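
As an illustration of how a downloaded image could end up in an "images" folder, here is a minimal sketch using only the standard library. It is an assumption about the general approach, not the package's code: ``target_path`` and ``save_image`` are hypothetical names, and naming each file after the last segment of its URL path is one possible choice.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve


def target_path(image_url, folder="images"):
    """Build the local path for an image: <folder>/<last URL path segment>."""
    # The query string is ignored; fall back to a fixed name if the path is empty.
    filename = os.path.basename(urlparse(image_url).path) or "unnamed"
    return os.path.join(folder, filename)


def save_image(image_url, folder="images"):
    """Download image_url into folder, creating the folder if needed."""
    os.makedirs(folder, exist_ok=True)
    path = target_path(image_url, folder)
    urlretrieve(image_url, path)  # performs a network request
    return path


print(target_path("https://example.com/static/logo.png?v=2"))
```

Two images with the same file name would overwrite each other in this sketch; a real tool would need some way to handle such collisions.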

Issues
------

Q.) Not all images were downloaded?
The content may have been injected into the page via JavaScript, which this scraper does not run.
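
To see why such images are missed, consider this small sketch. It uses the standard library's ``html.parser`` as a stand-in for any parser that reads static HTML without running scripts; ``ImgSrcParser`` and the sample page are made up for illustration. The second image exists only after the script runs in a browser, so it never appears as an ``<img>`` tag in the source.

```python
from html.parser import HTMLParser


class ImgSrcParser(HTMLParser):
    """Records the src of each <img> tag present in the raw HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


page = """
<html><body>
<img src="/static.png">
<script>
  var img = document.createElement("img");
  img.src = "/injected.png";
  document.body.appendChild(img);
</script>
</body></html>
"""

parser = ImgSrcParser()
parser.feed(page)
print(parser.sources)  # only the static image is found: ['/static.png']
```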


Todo
----
Scrape sites that inject image tags via JavaScript, using PhantomJS or Selenium.
