Command line tool for downloading images from a web page.
imgetr
A command line tool to help download images from a web page.
Examples
Download all of the NHL logos from the Fox Sports website:
- Output the images into the directory ./nhl_logos using the -o flag.
- Select img elements using the -t flag.
- Select img elements containing the class .image-logo using the -c flag.
- The image URLs all share a format like Coyotes.vresize.72.72.medium.0.png, so let's rename them to a format like Coyotes.png using the -r flag and passing a regex that selects two groups. The first group matches up to the first "." and the second group selects the file extension: "([^.]+).+(\\..+$)".
- Finally, let's pass the -v flag to print each filename to the command line as it downloads.
imgetr https://www.foxsports.com/nhl/teams -o ./nhl_logos -t img -c "image-logo" -r "([^.]+).+(\\..+$)" -v
Running the above produces the following output:
[✓] NHL.png
[✓] Ducks.png
[✓] Coyotes.png
[✓] Bruins.png
[✓] Sabres.png
[✓] Flames.png
[✓] Hurricanes.png
[✓] Blackhawks.png
[✓] Avalanche.png
[✓] BlueJackets.png
[✓] Stars.png
[✓] RedWings.png
[✓] Oilers.png
[✓] Panthers.png
[✓] Kings.png
[✓] Wild.png
[✓] Canadiens.png
[✓] Predators.png
[✓] Devils.png
[✓] Islanders.png
[✓] Rangers.png
[✓] Senators.png
[✓] Flyers.png
[✓] Penguins.png
[✓] Sharks.png
[✓] Kraken.png
[✓] Blues.png
[✓] Lightning.png
[✓] MapleLeafs.png
[✓] Canucks.png
[✓] GoldenKnights.png
[✓] Capitals.png
[✓] Jets.png
[====================] 100% Complete.
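The selection and rename steps in the example above can be sketched in Python. This is a minimal illustration using only the standard library, not imgetr's actual implementation: the `ImageCollector` class, the `rename` helper, and the example.com URLs are hypothetical names introduced here. It assumes the -r regex is applied to the filename and the captured groups are joined, as the help text describes. Note that the shell turns `\\.` inside double quotes into `\.`, which is what the raw string below contains.

```python
import re
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collect src values of tags matching a tag name and a CSS class (hypothetical helper)."""

    def __init__(self, tag="img", css_class=None):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        attributes = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        classes = (attributes.get("class") or "").split()
        if self.css_class and self.css_class not in classes:
            return
        if attributes.get("src"):
            self.sources.append(attributes["src"])


def rename(filename, pattern):
    """Join the groups captured by the pattern; keep the name if nothing matches."""
    match = re.search(pattern, filename)
    return "".join(match.groups()) if match else filename


# Made-up markup standing in for the real page.
html = """
<img class="image-logo" src="https://example.com/logos/Coyotes.vresize.72.72.medium.0.png">
<img class="banner" src="https://example.com/ads/banner.png">
"""

collector = ImageCollector(tag="img", css_class="image-logo")
collector.feed(html)

pattern = r"([^.]+).+(\..+$)"
names = [rename(basename(urlparse(src).path), pattern) for src in collector.sources]
print(names)  # ['Coyotes.png']
```

Only the element carrying the image-logo class is kept, and the two captured groups ("Coyotes" and ".png") are concatenated into the new filename.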
Help
usage: imgetr [-h] [-o [OUTPUT_DIR]]
              [-c class [class ...]] [-t [tag [tag ...]]]
              [-q [query_key]] [-u [.ext]] [-r [regex]]
              [-v]
              website

Download images from a website.

positional arguments:
  website               the website to download images from

optional arguments:
  -h, --help            show this help message and exit
  -o [OUTPUT_DIR], --output_dir [OUTPUT_DIR]
                        the directory to download the files into
  -c class [class ...], --class_list class [class ...]
                        list of css classes the images on the page contains
  -t [tag [tag ...]], --tag [tag [tag ...]]
                        list of html tags where images are contained
  -q [query_key], --query_key [query_key]
                        query key in an img src that contains image filename
  -u [.ext], --unknown_img_ext [.ext]
                        name of ext for images with no ext
  -r [regex], --rename [regex]
                        regex pattern selecting groups of the output image
                        filename to be concat together
  -v, --verbose         print out the name of each file downloaded
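The -q and -u options are described only briefly, so here is one plausible reading of them as a short Python sketch. It is an assumption about the intended behavior, not imgetr's actual code: the function name `filename_from_src` and the example.com URLs are hypothetical. The sketch takes the filename from the named query parameter when -q is given, and appends the -u extension when the resulting name has none.

```python
from posixpath import basename, splitext
from urllib.parse import parse_qs, urlparse


def filename_from_src(src, query_key=None, unknown_ext=None):
    """Derive an output filename from an image src (hypothetical helper)."""
    parsed = urlparse(src)
    # Default: the last path segment of the URL.
    name = basename(parsed.path)

    # Assumed -q behavior: the filename lives in a query parameter instead.
    if query_key:
        values = parse_qs(parsed.query).get(query_key)
        if values:
            name = basename(values[0])

    # Assumed -u behavior: add an extension when the name has none.
    if unknown_ext and not splitext(name)[1]:
        name += unknown_ext
    return name


print(filename_from_src("https://example.com/image?file=logos/team.png", query_key="file"))
print(filename_from_src("https://example.com/images/12345", unknown_ext=".jpg"))
```

Under these assumptions the first call yields team.png and the second yields 12345.jpg.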
TODO
- [] add an option to use Selenium (headless) to download images that are loaded on the page by JavaScript.
- [] add testing
- [] add more examples
Working on imgetr
To create the conda environment:
conda env create -f environment.yml
To remove the conda environment:
conda remove --name imgetr --all
To update requirements.txt:
conda env create -f environment.yml
pip freeze > requirements.txt
Before publishing anything
pip install --upgrade build
python -m build