Image scraper for DuckDuckGo for creating deep learning datasets

jmd_imagescraper

An image scraping library for creating deep learning datasets.

It uses DuckDuckGo for the image scraping because DuckDuckGo returns nice big images and has some rather nice search parameters to make your life easier. For example, searches can be filtered to return only square images that are photos.

jmd_imagescraper.core contains the main scraping/downloading functionality.
jmd_imagescraper.imagecleaner contains an image cleaner you can use from within your notebook to clean up the results and delete anything unsuitable.

Install

pip install jmd_imagescraper

How to use

from jmd_imagescraper.core import *
from pathlib import Path

# Results are saved under root/<second argument>, here images/Puppies
root = Path.cwd()/"images"
duckduckgo_search(root, "Puppies", "cute puppies", max_results=10)
Duckduckgo search: cute puppies
Downloading results into C:\Users\Joe\Documents\GitHub\jmd_imagescraper\images\Puppies
100.00% [10/10 00:02<00:00 Images downloaded]
[WindowsPath('C:/Users/Joe/Documents/GitHub/jmd_imagescraper/images/Puppies/001.jpg'),
 WindowsPath('C:/Users/Joe/Documents/GitHub/jmd_imagescraper/images/Puppies/002.jpg'),
 WindowsPath('C:/Users/Joe/Documents/GitHub/jmd_imagescraper/images/Puppies/003.jpg'),
 WindowsPath('C:/Users/Joe/Documents/GitHub/jmd_imagescraper/images/Puppies/004.jpg'),
 WindowsPath('C:/Users/Joe/Documents/GitHub/jmd_imagescraper/images/Puppies/005.jpg'),
 WindowsPath('C:/Users/Joe/Documents/GitHub/jmd_imagescraper/images/Puppies/006.jpg'),
 WindowsPath('C:/Users/Joe/Documents/GitHub/jmd_imagescraper/images/Puppies/007.jpg'),
 WindowsPath('C:/Users/Joe/Documents/GitHub/jmd_imagescraper/images/Puppies/008.jpg'),
 WindowsPath('C:/Users/Joe/Documents/GitHub/jmd_imagescraper/images/Puppies/009.jpg'),
 WindowsPath('C:/Users/Joe/Documents/GitHub/jmd_imagescraper/images/Puppies/010.jpg')]
from jmd_imagescraper.imagecleaner import *
display_image_cleaner(root)
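
Before or after cleaning, it can help to check how many images each folder actually holds. The following is a minimal sketch, not part of jmd_imagescraper: it uses only the standard library and assumes the layout shown in the output above (one sub-folder per search under the root, e.g. images/Puppies). The helper name `count_images_per_label` and the set of file extensions are illustrative assumptions.

```python
from pathlib import Path

# Extensions counted as images. This set is an assumption; adjust it to
# whatever file types your downloads actually contain.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_label(root):
    """Return {sub-folder name: number of image files} for each folder under root.

    Hypothetical helper, not part of jmd_imagescraper.
    """
    root = Path(root)
    counts = {}
    if not root.is_dir():
        # Nothing has been downloaded yet (or the path is wrong).
        return counts
    for label_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[label_dir.name] = sum(
            1
            for f in label_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


if __name__ == "__main__":
    root = Path.cwd() / "images"
    for label, n in count_images_per_label(root).items():
        print(f"{label}: {n} images")
```

Running it again after using the image cleaner shows how many images survived in each folder, which is a quick way to spot a class that needs another search.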

Download files

Source Distribution

jmd_imagescraper-0.0.1.tar.gz (15.1 kB)

Uploaded source

Built Distribution

jmd_imagescraper-0.0.1-py3-none-any.whl (13.0 kB)

Uploaded py3
