Command line tool for downloading images from a web page.

imgetr

A command line tool to help download images from a web page.

Examples

Download all of the NHL logos from the Fox Sports website.

  • Output the images into the directory ./nhl_logos using the -o flag.
  • Select img elements using the -t flag.
  • Select img elements with the class image-logo using the -c flag.
  • The image URLs all share a format like Coyotes.vresize.72.72.medium.0.png, so let's rename them to a format like Coyotes.png using the -r flag, passing a regex with two groups. The first group matches everything up to the first "." and the second group matches the file extension: "([^.]+).+(\\..+$)".
  • Finally, let's pass the -v flag to print each filename to the command line as it downloads.
imgetr https://www.foxsports.com/nhl/teams -o ./nhl_logos -t img -c "image-logo" -r "([^.]+).+(\\..+$)" -v

Running the above produces the following output:

[✓] NHL.png
[✓] Ducks.png
[✓] Coyotes.png
[✓] Bruins.png
[✓] Sabres.png
[✓] Flames.png
[✓] Hurricanes.png
[✓] Blackhawks.png
[✓] Avalanche.png
[✓] BlueJackets.png
[✓] Stars.png
[✓] RedWings.png
[✓] Oilers.png
[✓] Panthers.png
[✓] Kings.png
[✓] Wild.png
[✓] Canadiens.png
[✓] Predators.png
[✓] Devils.png
[✓] Islanders.png
[✓] Rangers.png
[✓] Senators.png
[✓] Flyers.png
[✓] Penguins.png
[✓] Sharks.png
[✓] Kraken.png
[✓] Blues.png
[✓] Lightning.png
[✓] MapleLeafs.png
[✓] Canucks.png
[✓] GoldenKnights.png
[✓] Capitals.png
[✓] Jets.png
[====================] 100% Complete.
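
The renaming in the example above can be reproduced outside the tool. The following is a minimal Python sketch, assuming imgetr joins the captured groups in order (as the help text for -r suggests) and that the shell has reduced "\\." to "\." before the pattern reaches the program. The function name rename_with_groups is hypothetical and is not part of imgetr.

```python
import re

# Pattern from the example above, as the program would see it once the
# shell has turned "\\." into "\.".
PATTERN = r"([^.]+).+(\..+$)"


def rename_with_groups(filename, pattern=PATTERN):
    """Join the regex's captured groups (hypothetical helper, not imgetr's code)."""
    match = re.match(pattern, filename)
    if match is None:
        # Assumption: keep the original name when the pattern does not match.
        return filename
    return "".join(match.groups())


print(rename_with_groups("Coyotes.vresize.72.72.medium.0.png"))  # Coyotes.png
```

The first group stops at the first ".", the unparenthesised ".+" consumes the middle of the name, and the second group keeps the last "." together with the extension.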

Help

usage: imgetr [-h] [-o [OUTPUT_DIR]]
              [-c class [class ...]] [-t [tag [tag ...]]]
              [-q [query_key]] [-u [.ext]] [-r [regex]]
              [-v]
              website

Download images from a website.

positional arguments:
  website               the website to download images from

optional arguments:
  -h, --help            show this help message and exit
  -o [OUTPUT_DIR], --output_dir [OUTPUT_DIR]
                        the directory to download the files into
  -c class [class ...], --class_list class [class ...]
                        list of css classes the images on the page contains
  -t [tag [tag ...]], --tag [tag [tag ...]]
                        list of html tags where images are contained
  -q [query_key], --query_key [query_key]
                        query key in an img src that contains image filename
  -u [.ext], --unknown_img_ext [.ext]
                        name of ext for images with no ext
  -r [regex], --rename [regex]
                        regex pattern selecting groups of the output image
                        filename to be concat together
  -v, --verbose         print out the name of each file downloaded
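
The bracket notation in the usage line maps onto argparse's nargs settings: "[X]" is an optional single value, "X [X ...]" is one or more values, and "[X [X ...]]" is zero or more. As an illustration only, the sketch below declares a parser that accepts the same flags. It is a reconstruction from the help text above, not imgetr's actual source, and it leaves defaults unset because the help text does not state them.

```python
import argparse


def build_parser():
    """Approximate reconstruction of the interface shown in the help text."""
    parser = argparse.ArgumentParser(
        prog="imgetr", description="Download images from a website."
    )
    parser.add_argument("website", help="the website to download images from")
    # nargs="?" -> "[OUTPUT_DIR]": the flag takes zero or one value.
    parser.add_argument("-o", "--output_dir", nargs="?",
                        help="the directory to download the files into")
    # nargs="+" -> "class [class ...]": at least one value is required.
    parser.add_argument("-c", "--class_list", nargs="+", metavar="class",
                        help="list of css classes the images on the page contains")
    # nargs="*" -> "[tag [tag ...]]": zero or more values.
    parser.add_argument("-t", "--tag", nargs="*", metavar="tag",
                        help="list of html tags where images are contained")
    parser.add_argument("-q", "--query_key", nargs="?", metavar="query_key",
                        help="query key in an img src that contains image filename")
    parser.add_argument("-u", "--unknown_img_ext", nargs="?", metavar=".ext",
                        help="name of ext for images with no ext")
    parser.add_argument("-r", "--rename", nargs="?", metavar="regex",
                        help="regex pattern selecting groups of the output image "
                             "filename to be concat together")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print out the name of each file downloaded")
    return parser


# Parse the arguments from the NHL example, as the shell would pass them.
args = build_parser().parse_args([
    "https://www.foxsports.com/nhl/teams",
    "-o", "./nhl_logos", "-t", "img", "-c", "image-logo",
    "-r", r"([^.]+).+(\..+$)", "-v",
])
print(args.output_dir, args.tag, args.class_list, args.verbose)
```

Because -c and -t accept several values, more than one class or tag can be given after a single flag.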

TODO

  • [] add an option to use Selenium (headless) to download images that are loaded onto the page by JavaScript.
  • [] add testing
  • [] add more examples

Working on imgetr

To create conda environment:

conda env create -f environment.yml

To remove conda environment:

conda remove --name imgetr --all

To update requirements.txt:

conda env create -f environment.yml
pip freeze > requirements.txt

Before publishing anything:

pip install --upgrade build
python -m build
