Image scraper for DuckDuckGo for creating deep learning datasets

jmd_imagescraper

An image scraping library for creating deep learning datasets.

It uses DuckDuckGo for image scraping because DuckDuckGo returns large images and offers useful search parameters. For example, searches can be filtered to return only square images that are photos.

jmd_imagescraper.core contains the main scraping/downloading functionality.

jmd_imagescraper.imagecleaner contains an image cleaner you can use from within your notebook to clean up the results and delete anything unsuitable.
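After cleaning, it can help to check how many images remain for each class. The following is a minimal sketch using only the standard library. It assumes the images sit in one subfolder per label under a root folder; that layout is an assumption, not something stated here. `count_images` and `IMAGE_SUFFIXES` are hypothetical names, not part of jmd_imagescraper.

```python
from collections import Counter
from pathlib import Path

# Assumption: these suffixes cover the downloaded files; adjust as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"}


def count_images(root: Path) -> Counter:
    """Count image files in each immediate subfolder of root.

    Hypothetical helper, assuming a root/<label>/<image file> layout.
    """
    counts = Counter()
    for path in Path(root).glob("*/*"):
        # Only count regular files whose suffix looks like an image.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            counts[path.parent.name] += 1
    return counts


if __name__ == "__main__":
    # Prints one line per label folder found under ./images (if any).
    for label, n in sorted(count_images(Path.cwd() / "images").items()):
        print(f"{label}: {n}")
```

Running this before and after a cleaning pass gives a rough idea of how many images were deleted per class.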

Install

pip install jmd_imagescraper

How to use

from jmd_imagescraper.core import *  # don't worry, it's designed to work with import *
from pathlib import Path

root = Path.cwd()/"images"

# fetch up to 20 results for the search "cute kittens"
duckduckgo_search(root, "Cats", "cute kittens", max_results=20)

from jmd_imagescraper.imagecleaner import *

# open the image cleaner on the downloaded images (run inside a notebook)
display_image_cleaner(root)
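A dataset usually needs more than one class, so the search call above can be repeated once per class. The sketch below is one possible way to do that, not the library's own API. `scrape_dataset` and its `search` parameter are hypothetical. It assumes the second argument of `duckduckgo_search` acts as a class label and the third as the search keywords, which matches the call shape shown above but is not confirmed here.

```python
from pathlib import Path
from typing import Callable, Dict, Optional


def scrape_dataset(
    root: Path,
    queries: Dict[str, str],
    max_results: int = 20,
    search: Optional[Callable] = None,
) -> None:
    """Run one search per entry in queries, e.g. {"Cats": "cute kittens"}.

    Hypothetical helper. `search` defaults to duckduckgo_search and can be
    swapped for a stand-in function when testing without network access.
    """
    if search is None:
        # Imported lazily so the helper can be defined without the package.
        from jmd_imagescraper.core import duckduckgo_search
        search = duckduckgo_search

    for label, keywords in queries.items():
        # Same call shape as the single-search example:
        # (root, label, keywords, max_results=...)
        search(root, label, keywords, max_results=max_results)


# Example call (needs jmd_imagescraper installed and network access):
# scrape_dataset(Path.cwd() / "images",
#                {"Cats": "cute kittens", "Dogs": "cute puppies"})
```

Keeping the search function injectable means the loop can be checked with a fake function before any real downloads are started.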

Docs

If you're reading this on pypi.org you can find the docs at https://joedockrill.github.io/jmd_imagescraper/

History

20/09/2020 add: PR from @butchland, add uuid to filenames, fix for users of fastai.vision.widgets.ImageClassifierCleaner
18/09/2020 rel: version 1 released as pypi package
