Generate lists of free, reliable http(s) proxies.

Project description

grey_harvest

Scrapes the web for http/https proxies and tests them for speed and reliability. Can be used as both a Python module and a command line utility. When run as a command line utility, proxies are sent to stdout. When used as a module, it returns a generator.

Check out the project on PyPI at https://pypi.python.org/pypi/grey_harvest/0.1.3.5

Key Features

  • Quickly and easily generate a list of reliable http/https proxies
  • Usable as a command line utility or a Python module
  • Can filter for proxies that support SSL
  • Can filter for proxies located within specific countries
  • Can exclude proxies located within specific countries

Installation

First, install the following dependencies:

# On Centos/RHEL/Fedora:
sudo yum install python-devel libxslt-devel libxml2-devel

# On Debian/Ubuntu:
sudo apt-get install python-dev libxml2-dev libxslt1-dev

Then install grey_harvest using pip as follows:

pip install grey_harvest

Usage

We can generate a list of 10 viable proxies with the following command:

# use the -n flag to specify number of proxies to generate
grey_harvest -n 10

To select only proxies with SSL enabled, we do this:

# use the -H flag to select only https proxies
grey_harvest -n 10 -H

We can use the -a flag to filter for proxies located within a list of specific countries. For example, to choose proxies located within Ukraine, Hong Kong, and the United States, we’d use this:

# use the -a flag to filter by country
grey_harvest -a "United States" "Hong Kong" Ukraine -n 10

We can use the -p flag to filter for proxies running on specific ports:

# use the -p flag to only use proxies that run on port 80
grey_harvest -p 80 -n 10

We can deny proxies located within specific countries by using the -d flag. Proxies located within China are denied by default, as they are often located behind the Great Firewall and tend to be unreliable. This default can be changed within grey_harvest.py’s internal configs. For example:

# use the -d flag to deny proxies located within France and
# Germany
grey_harvest -d France Germany -n 10 -H
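
The flags above can also be assembled from a Python script. The following is a minimal sketch, not part of grey_harvest itself: build_command and harvest_lines are hypothetical helper names, only the flags shown above (-n, -H, -p, -a, -d) are used, and harvest_lines assumes the utility prints one proxy per line on stdout.

```python
import subprocess


def build_command(count, https_only=False, port=None, allow=(), deny=()):
    """Build an argument list for the grey_harvest command line utility.

    Hypothetical helper: it only arranges the flags documented above.
    """
    cmd = ["grey_harvest", "-n", str(count)]
    if https_only:
        cmd.append("-H")                # only proxies with SSL enabled
    if port is not None:
        cmd += ["-p", str(port)]        # only proxies on this port
    if allow:
        cmd += ["-a"] + list(allow)     # only proxies in these countries
    if deny:
        cmd += ["-d"] + list(deny)      # exclude proxies in these countries
    return cmd


def harvest_lines(cmd):
    """Run the command and return its non-empty stdout lines.

    Requires grey_harvest to be installed and on PATH. Assumes one
    proxy per output line, which is not confirmed by the documentation.
    """
    out = subprocess.check_output(cmd).decode("utf-8", "replace")
    return [line.strip() for line in out.splitlines() if line.strip()]


# Equivalent to: grey_harvest -n 10 -H -d France Germany
print(build_command(10, https_only=True, deny=["France", "Germany"]))
```

Passing the arguments as a list avoids shell quoting problems with country names that contain spaces, such as "United States".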

grey_harvest library - basic example

Before diving into the documentation for the grey_harvest library, check out how easily we can generate a list of 20 proxies:

import grey_harvest

# spawn a harvester
harvester = grey_harvest.GreyHarvester()

# harvest proxies from the web, stopping after 20
count = 0
for proxy in harvester.run():
    print(proxy)
    count += 1
    if count >= 20:
        break

That’s it. We now have 20 http/https proxies ready to go.
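
Because run() returns a generator, the counting loop can also be written with itertools.islice from the standard library. This is a minimal sketch: take is a hypothetical helper name, and fake_proxies is a made-up stand-in for harvester.run() so that the snippet runs without network access. The addresses it yields are placeholders, not real proxies.

```python
from itertools import islice


def take(iterable, n):
    """Return the first n items of any iterable as a list."""
    return list(islice(iterable, n))


def fake_proxies():
    """Hypothetical stand-in for harvester.run(); yields placeholder endpoints forever."""
    port = 8000
    while True:
        yield "203.0.113.1:%d" % port
        port += 1


# With the real library this would be: take(harvester.run(), 20)
first_three = take(fake_proxies(), 3)
print(first_three)
```

islice stops pulling from the generator once n items have been produced, so no more proxies are tested than requested.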

History

0.1.5 (2016-04-20)

  • Fixed connection errors that occur when specifying a custom test domain

0.1.4 (2016-04-19)

  • Users can now filter for proxies running on specific ports

0.1.3 (2015-05-26)

  • Added documentation

0.1.2 (2015-05-26)

  • Corrected some build issues

0.1.0 (2015-05-26)

  • Initial release

Credits

“grey_harvest” is written and maintained by Gabriel ‘s0lst1c3’ Ryan.

Contributors

Please add yourself here alphabetically when you submit your first pull request.
