SimpleProxyManager

A multithreaded proxy manager for Python, focused on web scraping.

This project lives at https://github.com/amagnasco/SimpleProxyManager. Feel free to open an issue or submit improvements!

Warning: on macOS, do not combine this package with os.fork(). It depends on urllib.request, which is unsafe to use in programs that fork on that platform.

For an example implementation, check out dev/example.py.

Major dependencies: requests, urllib3.

Constructor inputs:

  • threads: number of worker threads to use
  • wait: minimum and maximum time to wait between requests, plus the HTTP timeout, all in seconds
  • headers: HTTP headers to send (User-Agent, Accept, and Accept-Language)
  • test: URI to use for the health check, plus the minimum and maximum time to wait between individual proxy health checks, in seconds
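
As a rough illustration of how these four inputs fit together, here is a minimal sketch. The `Settings` class, the `pick_delay` helper, and the key names (`min`, `max`, `timeout`, `uri`) are assumptions made for this example, not the package's actual constructor signature; see dev/example.py for the real shapes.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Settings:
    """Hypothetical container mirroring the four constructor inputs."""

    # Number of worker threads.
    threads: int = 4
    # Seconds: min/max pause between requests, plus the HTTP timeout.
    wait: dict = field(
        default_factory=lambda: {"min": 1.0, "max": 5.0, "timeout": 10.0}
    )
    # The three headers named in the README; values are placeholders.
    headers: dict = field(
        default_factory=lambda: {
            "User-Agent": "Mozilla/5.0 (compatible; example)",
            "Accept": "text/html",
            "Accept-Language": "en-US,en;q=0.9",
        }
    )
    # Health-check URI and min/max pause (seconds) between proxy checks.
    test: dict = field(
        default_factory=lambda: {"uri": "https://example.com/", "min": 1.0, "max": 3.0}
    )


def pick_delay(bounds: dict) -> float:
    """Return a random pause, in seconds, between the min and max bounds."""
    if bounds["min"] > bounds["max"]:
        raise ValueError("min must not exceed max")
    return random.uniform(bounds["min"], bounds["max"])
```

Drawing each pause at random between a minimum and a maximum, as in `pick_delay(Settings().wait)`, is one plausible reason for configuring both bounds: it avoids sending requests at a fixed, easily detected interval.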

Public API:

  • Setup:
    • load: takes the path to a file listing proxies, then ingests and tests them.
  • Monitoring:
    • healthcheck: sorts the proxies in the "all" queue into the "ready" or "broken" queue
    • available: returns the list of available proxies
    • broken: returns the list of broken proxies
  • Use:
    • validate: takes a URI and parses it with urllib's parse module
    • req: takes a URI, validates it, assigns a proxy, and calls get. Returns {success: True, data: Response} on success, or {success: False, error: Exception} on failure.
    • get: takes a proxy and a URI, and retrieves the URI through that proxy. Intended for advanced setups that handle queueing, threading, or async externally.
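
The result shape returned by req lends itself to a small unwrapping helper. The sketch below is standalone and does not import the package: `validate` here is only a stand-in showing one way a URI can be checked with `urllib.parse` (the acceptance rule is an assumption), `unwrap` is a hypothetical helper, and the string keys `"success"`, `"data"`, and `"error"` are assumed from the description above.

```python
from urllib.parse import urlparse


def validate(uri: str) -> bool:
    """Stand-in check: accept only http(s) URIs that include a host."""
    parts = urlparse(uri)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def unwrap(result: dict):
    """Return the response from a req-style result, or re-raise its error."""
    if result["success"]:
        return result["data"]
    raise result["error"]


# Hand-built results in the documented shape; a plain string stands in
# for the Response object a real request would produce.
ok = {"success": True, "data": "<html>...</html>"}
failed = {"success": False, "error": TimeoutError("proxy timed out")}
```

With this helper, `unwrap(ok)` hands back the stored response, while `unwrap(failed)` raises the stored `TimeoutError`, so callers can use ordinary `try`/`except` instead of checking the `success` flag each time.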

Version History

This project uses semantic versioning.

  • to-do:

    • improve usage docs
    • improve error handling
    • add test cases
    • add i18n
    • improve HTTP status code handling
    • differentiate input proxy list by http/https
    • publish to PyPI
    • add a "reset queues to all" method
    • improve manual exit handling
    • publish GitHub package
    • abstract proxy assigner method from req
    • improve input and type checking
  • 0.1.0:

    • First release! Functional enough to share, but some logs might still be in Spanish while I sort out the i18n.
