Simple Python multithreaded proxy manager, focused on scrapes

SimpleProxyManager

Python multithreaded proxy manager, focused on scrapes

This project lives at https://github.com/amagnasco/SimpleProxyManager. Feel free to submit an issue or improvements! Releases: https://pypi.org/project/SimpleProxyManager/

Warning: on macOS, do not use this package in a program that also calls os.fork(), because of the urllib.request dependency.

For an example implementation, check out dev/example.py.

Major dependencies: requests, urllib3.

Constructor inputs:

  • threads: number of processing threads to use
  • wait: minimum and maximum time to wait between requests, plus the HTTP timeout, all in seconds
  • headers: HTTP headers to use (user agent, accept, and accept-language)
  • test: URI to use for the health check, plus the minimum and maximum time to wait between proxy health checks, in seconds
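
The exact argument shapes are not spelled out above, so the following is only an illustrative sketch of the kind of values these four inputs describe. The dictionary layout, the key names (`min`, `max`, `timeout`, `uri`) and the `pick_delay` helper are all hypothetical and are not taken from the package; dev/example.py shows the real call.

```python
import random

# Hypothetical values mirroring the four constructor inputs.
# The key names are assumptions made for illustration only.
config = {
    "threads": 4,
    "wait": {"min": 1.0, "max": 5.0, "timeout": 10.0},  # seconds
    "headers": {
        "User-Agent": "Mozilla/5.0 (compatible; example-scraper)",
        "Accept": "text/html",
        "Accept-Language": "en-US,en;q=0.9",
    },
    "test": {"uri": "https://example.com/", "min": 5.0, "max": 15.0},  # seconds
}


def pick_delay(bounds):
    """Return a random pause, in seconds, between bounds['min'] and bounds['max']."""
    return random.uniform(bounds["min"], bounds["max"])


# A randomized pause between requests, drawn from the "wait" range.
delay = pick_delay(config["wait"])
```

Drawing the pause from a range rather than using a fixed interval is one plausible reason for having both a minimum and a maximum wait.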

Public API:

  • Setup:
    • load: takes a path to a list of proxies, then ingests and tests them.
  • Monitoring:
    • healthcheck: processes the "all" queue into "ready" or "broken" queues
    • available: returns the list of available proxies
    • broken: returns the list of broken proxies
  • Use:
    • validate: takes a URI and parses it with urllib's parse
    • req: takes a URI, validates it, assigns a proxy, and calls get. Returns {success: True, data: Response}, or {success: False, error: Exception}.
    • get: takes a proxy and a URI, and retrieves the URI through that proxy. Intended for advanced usage such as externally queued, threaded, or async setups.
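
The flow described by this API can be sketched without the package or any network access. This is a minimal stand-in and not the package's implementation: the function bodies, the `is_alive` and `fetch` callables, the validation rule (http/https scheme plus a host), the string keys of the result dictionary, and the sample proxy addresses are all assumptions. Only the method names and the two result shapes come from the list above.

```python
from queue import Queue
from urllib.parse import urlparse


def validate(uri):
    """Parse the URI; accept it only with an http(s) scheme and a host (assumed rule)."""
    parts = urlparse(uri)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def healthcheck(all_q, is_alive):
    """Drain the 'all' queue into 'ready' and 'broken' lists using a checker callable."""
    ready, broken = [], []
    while not all_q.empty():
        proxy = all_q.get()
        (ready if is_alive(proxy) else broken).append(proxy)
    return ready, broken


def req(uri, ready, fetch):
    """Validate the URI, pick a proxy, fetch; return one of the two documented shapes."""
    try:
        if not validate(uri):
            raise ValueError(f"invalid URI: {uri}")
        if not ready:
            raise RuntimeError("no proxies available")
        proxy = ready[0]  # simplest possible assignment; the real assigner may differ
        return {"success": True, "data": fetch(proxy, uri)}
    except Exception as exc:
        return {"success": False, "error": exc}


# Demo with stand-ins instead of real proxies and HTTP calls.
all_q = Queue()
for p in ("10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"):
    all_q.put(p)

ready, broken = healthcheck(all_q, is_alive=lambda p: not p.startswith("10.0.0.2"))
ok = req("https://example.com/", ready, fetch=lambda proxy, uri: f"{uri} via {proxy}")
bad = req("not a uri", ready, fetch=lambda proxy, uri: "")
```

Because failures come back as a dictionary rather than a raised exception, a caller would typically branch on the success flag before touching data or error.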

Version History

This project uses semantic versioning.

  • to-do:

    • improve usage docs
    • improve error handling
    • add test cases
    • improve HTTP status code handling
    • differentiate input proxy list by http/https
    • add a "reset queues to all" method
    • improve manual exit handling
    • abstract proxy assigner method from req
    • improve input and type checking
  • 0.2.0 (in development):

    • add i18n
    • consolidate load conf into one dict/json
    • improve input and type checking
    • improve exit condition when 0 proxies available
  • 0.1.2:

    • handle HTTPErrors into general exceptions
    • simplify class name and init for cleaner import
    • updated example
  • 0.1.1:

    • added GitHub-to-PyPI publication workflow
  • 0.1.0:

    • First release! Functional enough to share, but some logs might still be in Spanish while I sort out the i18n.
