python-ecosia-images

Python module for searching and downloading images from Ecosia

Installing

pip install ecosia-images

Setup

The only requirement for working with the library is to have a web browser and its driver installed. Currently the package works with either Google Chrome or Firefox.

If using Google Chrome, ChromeDriver must also be installed and reachable in PATH; see the ChromeDriver documentation for more information.

For Firefox, Geckodriver must be installed and reachable in PATH.
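
To verify that a driver is reachable on PATH before starting a crawler, a quick check with the standard library can help (a minimal sketch; the executable names chromedriver and geckodriver are the usual driver names, not something defined by this package):

>>> import shutil
>>> shutil.which('chromedriver') or shutil.which('geckodriver')
'/usr/local/bin/chromedriver'  # None means neither driver was found on PATH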

Examples

Start a crawler

>>> from ecosia_images import crawler
>>> searcher = crawler()

The browser to use can be passed to the crawler constructor.

>>> from ecosia_images import crawler
>>> searcher = crawler(browser='firefox')

All valid browser options are listed in ecosia_images.browser_options.

>>> from ecosia_images import browser_options
>>> browser_options
['chrome', 'firefox']

Search images and get the links to the pictures

After creating a crawler and using it to search for a keyword, the resulting links are accessible through the links property.

>>> searcher = crawler()
>>> searcher.search('number 9')
>>> searcher.links
{ ... } # urls

Search with options

Searches can also include the options that Ecosia provides for refining results. The available keys and values are stored in ecosia_images.download_options.

>>> from ecosia_images import download_options
>>> download_options
{
    'size': ['small', 'medium', 'large', 'wallpaper'],
    'color': ['colorOnly', 'monochrome', 'red', 'orange', 'yellow', 'green', 'teal', 'blue', 'purple', 'pink', 'brown', 'black', 'gray'],
    'image_type': ['photo', 'clipart', 'line', 'animated'],
    'freshness': ['day', 'week', 'month'],
    'license': ['share', 'shareCommercially', 'modify', 'modifyCommercially', 'public']
}

The selected options can be specified when calling the search method of the crawler.

>>> searcher.search('trees', color='monochrome', size='wallpaper')
>>> searcher.links
{ ... } # links to big pictures of trees in black and white

Gather more links

If more links are needed, the gather_more function can be used.

>>> searcher.search('bees')
>>> len(searcher.links)
50  # Give or take
>>> searcher.gather_more()
>>> len(searcher.links)
100 # Give or take
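
For example, to keep gathering until a target number of links has been collected (a sketch with a hypothetical target and a cap on attempts, since the search may eventually stop returning new results):

>>> target, attempts = 300, 0
>>> while len(searcher.links) < target and attempts < 10:
...     searcher.gather_more()
...     attempts += 1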

Download images

In all of the following cases, the crawler first checks whether an image has already been downloaded so that it is not downloaded again. The functions return the file paths of the downloaded images.

The download function will download a given number of pictures and save them in a folder named after the keyword. This folder is created inside the directory specified when calling the constructor. In the following example, the images would be saved inside /path/to/folder/keyword/.

>>> searcher = crawler(directory='/path/to/folder/')
>>> searcher.search('keyword')
>>> searcher.download(100)
[ ... ] # list with file paths
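
The returned paths can be used directly, for instance to confirm where the images landed (a small sketch built on the folder layout described above):

>>> import os
>>> paths = searcher.download(100)
>>> os.path.dirname(paths[0])  # assuming at least one image was downloaded
'/path/to/folder/keyword'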

If no folder is specified, the images will be saved inside a new folder named downloads located in the current working directory.

There is also the download_all function, which downloads all of the links currently available in the crawler object.

>>> searcher.search('pigeons')
>>> searcher.download_all()
[ ... ]

Stopping the crawler

It is necessary to stop the crawler to avoid leaking resources.

>>> searcher.stop()
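
To make sure the browser is closed even if a search or download raises an exception, the crawler can be wrapped in a try/finally block (a sketch; the library itself is not documented as providing a context manager):

>>> searcher = crawler()
>>> try:
...     searcher.search('bees')
...     searcher.download_all()
... finally:
...     searcher.stop()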

Filenames

The naming convention used for the downloaded files can be passed to the crawler constructor. All valid naming options are listed in ecosia_images.naming_options.

>>> from ecosia_images import crawler, naming_options
>>> searcher = crawler(naming='hash')
>>> naming_options
['trim', 'hash']

Custom naming

For a custom naming convention, a function can be passed to the constructor. If you plan to rename the files, use this functionality rather than renaming them afterwards, since renaming downloaded files interferes with the crawler's ability to avoid downloading duplicates.

The function must take three parameters: url, directory and keyword.

If you do not plan to use the default folders provided by the library, pass makedirs=False so the crawler does not create any directories.

>>> import hashlib
>>> def custom_naming(url, directory, keyword):
...     # Illustrative only: name each file after the SHA-1 hash of its URL
...     # (any scheme that returns a filename works)
...     return hashlib.sha1(url.encode()).hexdigest() + '.jpg'
>>> searcher = crawler(naming_function=custom_naming, makedirs=False)

Disclaimer

The downloaded images come from the Ecosia search engine and may be copyrighted. Do not download or use any image in a way that violates its copyright terms. You can use the license option of the search function to avoid copyrighted material.
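
For example, to restrict results using one of the license values listed in download_options above (a minimal sketch):

>>> searcher.search('trees', license='public')
>>> searcher.download_all()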
