Asynchronously finds working public proxies and concurrently checks them (type, anonymity, country). Supports HTTP(S) and SOCKS proxies.

Project description

ProxyBroker is an asynchronous tool that finds public proxies from multiple sources and concurrently checks them (type, anonymity level, country). It supports HTTP(S) and SOCKS proxies.

https://raw.githubusercontent.com/constverum/ProxyBroker/master/proxybroker/data/example.gif

Features

  • Finds proxies on 50+ sources (~7k working proxies)

  • Identifies proxies in raw input data

  • Checks whether proxies work with the protocols HTTP, HTTPS, SOCKS4, SOCKS5

  • Checks the anonymity level of each proxy

  • Removes duplicates

Installation

To install ProxyBroker, simply run:

$ pip install proxybroker

Requirements

Examples

Basic example

import asyncio
from proxybroker import Broker

loop = asyncio.get_event_loop()

proxies = asyncio.Queue(loop=loop)
broker = Broker(proxies, loop=loop)

loop.run_until_complete(broker.find())

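# find() has finished, so the queue already holds every result,
# followed by None as the end marker.
while True:
    proxy = proxies.get_nowait()
    if proxy is None:
        break
    print('Found proxy: %s' % proxy)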

As a result, you get Proxy objects:

Found proxy: <Proxy AU 0.72s [HTTP: Transparent] 1.1.1.1:80>
Found proxy: <Proxy FR 0.33s [HTTP: High, HTTPS] 2.2.2.2:3128>
Found proxy: <Proxy US 1.11s [HTTP: Anonymous, HTTPS] 8.8.8.8:8000>
Found proxy: <Proxy -- 0.45s [SOCKS4, SOCKS5] 192.168.1.2:1080>
...

Advanced example

import asyncio
from proxybroker import Broker

async def use_example(pQueue):
    while True:
        proxy = await pQueue.get()
        if proxy is None:
            break
        print('Received: %s' % proxy)

async def find_advanced_example(pQueue, loop):
    broker = Broker(queue=pQueue,
                    timeout=8,
                    attempts_conn=3,
                    max_concurrent_conn=200,
                    judges=['https://httpheader.net/', 'http://httpheader.net/'],
                    providers=['http://www.proxylists.net/', 'http://fineproxy.org/eng/'],
                    verify_ssl=False,
                    loop=loop)

    # Only the Anonymous and High anonymity levels for HTTP; any level for the other protocols:
    types = [('HTTP', ('Anonymous', 'High')), 'HTTPS', 'SOCKS4', 'SOCKS5']
    countries = ['US', 'GB', 'DE']
    limit = 10

    await broker.find(types=types, countries=countries, limit=limit)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    pQueue = asyncio.Queue(loop=loop)
    # Start searching and checking.
    # Meanwhile, the received proxies are consumed in another part of the program.
    tasks = asyncio.gather(find_advanced_example(pQueue, loop), use_example(pQueue))
    loop.run_until_complete(tasks)
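
The advanced example is a producer/consumer arrangement: one coroutine fills the queue while another drains it, and None marks the end of the stream (both examples above check for it). The sketch below reproduces only that pattern with the standard library. The `produce` coroutine is a hypothetical stand-in for `broker.find()`, and plain strings stand in for Proxy objects.

```python
import asyncio


async def produce(queue, items):
    """Hypothetical stand-in for broker.find(): emit items, then a None sentinel."""
    for item in items:
        await queue.put(item)
        await asyncio.sleep(0)  # yield control so the consumer can run in between
    await queue.put(None)  # sentinel: nothing more will arrive


async def consume(queue):
    """Collect items until the None sentinel is seen."""
    received = []
    while True:
        item = await queue.get()
        if item is None:
            break
        received.append(item)
    return received


async def main():
    queue = asyncio.Queue()
    # Run producer and consumer concurrently; gather keeps the argument order.
    _, received = await asyncio.gather(
        produce(queue, ['10.0.0.1:80', '10.0.0.2:3128']),
        consume(queue),
    )
    return received


if __name__ == '__main__':
    print(asyncio.run(main()))
```

Because the consumer awaits `queue.get()`, it can start handling results before the search has finished, which is the point of passing a queue to the Broker.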

Example with your raw data instead of providers

import asyncio
from proxybroker import Broker

loop = asyncio.get_event_loop()

proxies = asyncio.Queue(loop=loop)
broker = Broker(proxies, loop=loop)

data = '''10.0.0.1:80
          OK 10.0.0.2:   80 HTTP 200 OK 1.214
          10.0.0.3;80;SOCKS5 check date 21-01-02
          >>>10.0.0.4@80 HTTP HTTPS status OK
          ...'''

# Note: At the moment, information about the type of proxies in the raw data is ignored
loop.run_until_complete(broker.find(data=data))

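# Drain the queue; skip the None end marker that the basic example checks for.
found_proxies = [p for p in (proxies.get_nowait() for _ in range(proxies.qsize()))
                 if p is not None]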
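
To give a feel for what identifying proxies in raw data and removing duplicates can involve, here is a minimal sketch using a regular expression. The pattern and the `extract_proxies` helper are assumptions made for illustration and are not ProxyBroker's own implementation. The sample reuses the raw data lines from the example above, plus one repeated entry.

```python
import re

# Assumed pattern (not ProxyBroker's own): an IPv4 address, then up to five
# non-digit separator characters on the same line, then a port of 1-5 digits.
PROXY_RE = re.compile(
    r'(?<![\d.])(\d{1,3}(?:\.\d{1,3}){3})[^\d\n]{1,5}(\d{1,5})(?!\d)'
)


def extract_proxies(raw):
    """Return unique (host, port) pairs in order of first appearance."""
    seen = set()
    result = []
    for host, port in PROXY_RE.findall(raw):
        port = int(port)
        octets_ok = all(int(octet) <= 255 for octet in host.split('.'))
        if not octets_ok or not 0 < port <= 65535:
            continue  # not a valid IPv4 address or port number
        if (host, port) not in seen:
            seen.add((host, port))
            result.append((host, port))
    return result


data = '''10.0.0.1:80
          OK 10.0.0.2:   80 HTTP 200 OK 1.214
          10.0.0.3;80;SOCKS5 check date 21-01-02
          >>>10.0.0.4@80 HTTP HTTPS status OK
          10.0.0.1:80'''

print(extract_proxies(data))
# [('10.0.0.1', 80), ('10.0.0.2', 80), ('10.0.0.3', 80), ('10.0.0.4', 80)]
```

Like the Broker (see the note in the example above), this sketch ignores any protocol names that appear next to an address.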

Example: only collect proxies (without checking)

# ...
await broker.grab(countries=['US'], limit=100)
# ...

API

Proxy properties

| Property | Type | Example | Description |
|---|---|---|---|
| host | str | '8.8.8.8' | The IP address of the proxy |
| port | int | 80 | The port of the proxy |
| types | dict | {'HTTP': 'Anonymous', 'HTTPS': None} | The dict of supported protocols and their levels of anonymity |
| geo | dict | {'code': 'US', 'name': 'United States'} | The dict with the ISO code and the full name of the country where the proxy is located |
| avgRespTime | str | '1.11' | The string with the average response time of the proxy |
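
As an illustration of how the properties listed above might be used, the sketch below filters proxies by country and response time. `FakeProxy` is a hypothetical stand-in that only carries the documented attribute names, and `pick` is a made-up helper. Real Proxy objects come from the Broker's queue.

```python
from collections import namedtuple

# Hypothetical stand-in with the documented attribute names;
# it is not the library's Proxy class.
FakeProxy = namedtuple('FakeProxy', 'host port types geo avgRespTime')


def pick(proxies, country, max_resp_time):
    """Return 'host:port' for proxies in `country` answering within `max_resp_time` seconds."""
    return [
        '%s:%d' % (p.host, p.port)
        for p in proxies
        # avgRespTime is documented as a string, so convert it before comparing.
        if p.geo['code'] == country and float(p.avgRespTime) <= max_resp_time
    ]


sample = [
    FakeProxy('8.8.8.8', 8000, {'HTTP': 'Anonymous', 'HTTPS': None},
              {'code': 'US', 'name': 'United States'}, '1.11'),
    FakeProxy('2.2.2.2', 3128, {'HTTP': 'High', 'HTTPS': None},
              {'code': 'FR', 'name': 'France'}, '0.33'),
]

print(pick(sample, 'US', 2.0))  # ['8.8.8.8:8000']
```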

Broker parameters

| Parameter | Required | Type | Default | Description |
|---|---|---|---|---|
| queue | Yes | asyncio.Queue | | Queue to which found proxies are added. |
| timeout | No | int | 8 | Timeout, in seconds, applied to every network operation. |
| attempts_conn | No | int | 3 | Maximum number of connection attempts. |
| max_concurrent_conn | No | int or asyncio.Semaphore() | 200 | Maximum number of concurrent connections, given as a number or as a semaphore already used in your program. |
| providers | No | list of strings | list of ~50 sites | The list of sites that distribute proxy lists (proxy providers). |
| judges | No | list of strings | list of ~10 sites | The list of sites that show HTTP headers (proxy judges). |
| verify_ssl | No | bool | False | Whether to verify SSL certificates. |
| loop | No | asyncio event loop | None | Event loop. |
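
The max_concurrent_conn parameter accepts either a number or an asyncio.Semaphore. The sketch below shows, with the standard library only, what a semaphore does in general: it caps how many coroutines run a guarded section at once. How the Broker applies the limit internally is not shown here. The `check` and `run_checks` names are hypothetical, and the sleep is a placeholder for network I/O.

```python
import asyncio


async def check(sem, state):
    """Hypothetical stand-in for one proxy check guarded by the semaphore."""
    async with sem:
        state['active'] += 1
        state['peak'] = max(state['peak'], state['active'])
        await asyncio.sleep(0.01)  # placeholder for network I/O
        state['active'] -= 1


async def run_checks(total, max_concurrent):
    """Run `total` checks and return the highest number that ran at once."""
    sem = asyncio.Semaphore(max_concurrent)
    state = {'active': 0, 'peak': 0}
    await asyncio.gather(*(check(sem, state) for _ in range(total)))
    return state['peak']


if __name__ == '__main__':
    print(asyncio.run(run_checks(total=20, max_concurrent=5)))
```

Passing a semaphore that the rest of your program also uses would make one limit cover both the Broker's connections and your own.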

Broker methods

| Method | Description | Optional parameter | Parameter description |
|---|---|---|---|
| find | Searches for and checks proxies with the requested parameters. | data | Raw data to use as the source of proxies. If given, the provider sites are not searched. Empty by default. |
| | | types | The list of types (protocols) to check. Use a tuple to specify the levels of anonymity: (Type, AnonLvl). By default, all types are checked at all levels of anonymity. |
| | | countries | List of ISO country codes in which the proxies must be located. |
| | | limit | Limit the search to a given number of working proxies. |
| grab | Only searches for proxies, without checking whether they work. | countries | List of ISO country codes in which the proxies must be located. |
| | | limit | Limit the search to a given number of proxies. |
| show_stats | Shows statistics. | full | If False (the default), shows a short version of the stats (without the proxies log); if True, shows the full version (with the proxies log). |
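
The types parameter mixes plain strings and (Type, AnonLvl) tuples. The sketch below shows one way such a list could be interpreted. Both helpers are hypothetical and not part of ProxyBroker, and the rule "a proxy matches if any of its protocols satisfies the request" is an assumption made for the example.

```python
def normalize_types(types):
    """Map each requested protocol to its allowed anonymity levels.

    None means "no restriction on the anonymity level".
    Hypothetical helper for illustration; not part of ProxyBroker.
    """
    wanted = {}
    for entry in types:
        if isinstance(entry, tuple):
            proto, levels = entry
            # Accept a single level given as a bare string as well.
            if isinstance(levels, str):
                levels = (levels,)
            wanted[proto] = tuple(levels)
        else:
            wanted[entry] = None
    return wanted


def matches(proxy_types, wanted):
    """True if any protocol in proxy_types satisfies the request (assumed rule)."""
    for proto, level in proxy_types.items():
        if proto in wanted and (wanted[proto] is None or level in wanted[proto]):
            return True
    return False


wanted = normalize_types([('HTTP', ('Anonymous', 'High')), 'HTTPS', 'SOCKS4', 'SOCKS5'])
print(matches({'HTTP': 'Transparent'}, wanted))          # False
print(matches({'HTTP': 'High', 'HTTPS': None}, wanted))  # True
```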

TODO

  • Check the ping, response time and speed of data transfer

  • Check support for cookies, the Referer header, and POST requests

  • Check site access (Google, Twitter, etc)

  • Check proxies for spam: look up the proxy IP in spam databases (DNSBL)

  • Information about uptime

  • Checksum of data returned

  • Support for proxy authentication

  • Finding outgoing IP for cascading proxy

  • Check the ability to send mail (open port 25, SMTP)

  • The ability to specify the address of a proxy without a port (try to connect on the default ports)

  • The ability to save working proxies to a file (text/json/xml)

License

Licensed under the Apache License, Version 2.0

This product includes GeoLite2 data created by MaxMind, available from http://www.maxmind.com.

Download files

Source Distribution

proxybroker-0.1.tar.gz (1.1 MB view details)

Uploaded Source

Built Distribution

proxybroker-0.1-py3-none-any.whl (1.1 MB view details)

Uploaded Python 3

File details

Details for the file proxybroker-0.1.tar.gz.

File metadata

  • Download URL: proxybroker-0.1.tar.gz
  • Upload date:
  • Size: 1.1 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No

File hashes

Hashes for proxybroker-0.1.tar.gz
Algorithm Hash digest
SHA256 520e49028371db7e518f49bcd93e52d3317805b3395aa5f571703619fd524db0
MD5 035d470617e39814236e948a2778d9b5
BLAKE2b-256 af058a5e592e777b82a0d921245a0ec77dc3717f83db1eb6dd8b560c3865bc4c

File details

Details for the file proxybroker-0.1-py3-none-any.whl.

File hashes

Hashes for proxybroker-0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 85ac1c613349149a7b03fdde1b2ce9c62e95e920c1e4a1441796096ddd10eb17
MD5 2b40d2c8c45af103ea227fcf2134f077
BLAKE2b-256 e348d2255df6141fba1b645f5d20ff004641cc6bfc55af1173ca13be41140f67
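
The SHA256 digests listed above can be compared against a downloaded file. A minimal sketch with the standard library follows. The `sha256_of` and `verify` helpers are hypothetical names, and the file is assumed to be in the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True if the file's SHA256 digest equals the published one."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `verify('proxybroker-0.1.tar.gz', '520e49028371db7e518f49bcd93e52d3317805b3395aa5f571703619fd524db0')` should return True for an intact copy of the source distribution.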
