Bulk-download all thumbnails from an ImageNet synset, with optional rescaling
Project description
Command-line utility for downloading all thumbnail images from an ImageNet synset, optionally rescaling to a different resolution.
NOTICE: ImageNet downloads are currently offline. This is an upstream issue and out of my control. From the relevant announcement:
While conducting our study, since January 2019 we have disabled downloads of the full ImageNet data, except for the small subset of 1,000 categories used in the ImageNet Challenge. We are in the process of implementing our proposed remedies.
Usage
Usage: imagenetscraper [OPTIONS] SYNSET_ID [OUTPUT_DIR] Options: -c, --concurrency INTEGER Number of concurrent downloads (default: 8). -s, --size WIDTH,HEIGHT If specified, images will be rescaled to the given size. -q, --quiet Suppress progress output. -h, --help Show this message and exit. --version Show the version and exit.
If the URL of a synset page looks like:
http://image-net.org/synset?wnid=n00000000 ^^^^^^^^^ SYNSET_ID
SYNSET_ID is the n00000000 part. For example, for the “person, individual, someone, somebody, mortal, soul” synset at http://image-net.org/synset?wnid=n00007846, the corresponding synset id is n00007846.
The default output directory (OUTPUT_DIR) is the current directory.
Examples
To download all thumbnail imagess from the synset mentioned above to the directory “person_images”, run:
imagenetscraper n00007846 person_images
To do the same, but with each thumbnail image rescaled to a width of 256 and a height of 128, add --size 256,128:
imagenetscraper n00007846 person_images --size 256,128
To run in “quiet mode”, suppressing progress output, add --quiet:
imagenetscraper n00007846 person_images --size 256,128 --quiet
By default, imagenetscraper will download 8 images at once. To change this, use --concurrency:
imagenetscraper n00007846 person_images --size 256,128 --concurrency 4
Install
Install Python 3, pip, and a development version of libjpeg. imagenetscraper is tested with Python 3.4-3.7 and libjpeg-turbo 8.
sudo apt-get install python3 python3-pip libjpeg-turbo8-dev
Download and install imagenetscraper with pip.
sudo -H pip3 install imagenetscraper
Citation
If this tool helped with your research, a citation would be appreciated:
@Misc{imagenetscraper, author = {Michael Smith}, title = {imagenetscraper: Bulk-download thumbnails from ImageNet synsets}, howpublished = {\url{https://github.com/spinda/imagenetscraper}}, year = {2017} }
Testing
To run unit tests, use:
python3 setup.py test
License
Copyright (C) 2017-2018 Michael Smith <michael@spinda.net>
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file imagenetscraper-1.0.2.tar.gz
.
File metadata
- Download URL: imagenetscraper-1.0.2.tar.gz
- Upload date:
- Size: 16.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.20.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.3
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | cd1a411ea0a887d8b8d332f0750f982a995f922c5a0e688948fbec0dedd6f834 |
|
MD5 | 0cdbdd826f101e4f1496fdb85eee33c3 |
|
BLAKE2b-256 | 8f53a88fae64bddcb804898db5dd5c7ddf2b59a21dca0dc6e13009519cba4bdb |
File details
Details for the file imagenetscraper-1.0.2-py3-none-any.whl
.
File metadata
- Download URL: imagenetscraper-1.0.2-py3-none-any.whl
- Upload date:
- Size: 17.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.20.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.3
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | e2cfb4e04d3cd2f21bc3523f041d0f6cc53f8604af4940b8c48c1a6a24578101 |
|
MD5 | 9aad9896524bff06820574d7129cb187 |
|
BLAKE2b-256 | 5236ca8c812a77cd768f0ed31026e3ac452d15cc314f25c5466b75cb3954804a |