SimpleProxyManager
A simple multithreaded proxy manager for Python, focused on web scraping.
This project lives at https://github.com/amagnasco/SimpleProxyManager. Feel free to open an issue or submit improvements!
Warning: on macOS, do not use this package in a program that also calls os.fork(). It depends on urllib.request, which is unsafe to combine with os.fork() on that platform.
For an example implementation, check out dev/example.py.
Major dependencies: requests, urllib3.
Constructor inputs:
- threads: number of processing threads to use
- wait: minimum and maximum time to wait between requests, plus the HTTP timeout (all in seconds)
- headers: HTTP headers to use (user agent, accept, and accept-language)
- test: URI to use for the health check, plus the minimum and maximum time to wait between each proxy health check (in seconds)
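The minimum/maximum wait pairs above suggest a randomized pause between requests. Below is a minimal sketch of that idea, assuming a uniform draw between the two bounds. The names `jittered_delay` and `polite_sleep` and their parameters are hypothetical and are not part of SimpleProxyManager's API. How the library actually picks its delay is not documented here.

```python
import random
import time


def jittered_delay(wait_min: float, wait_max: float) -> float:
    """Pick a random delay in seconds between wait_min and wait_max (assumed uniform)."""
    if wait_min < 0 or wait_max < wait_min:
        raise ValueError("expected 0 <= wait_min <= wait_max")
    return random.uniform(wait_min, wait_max)


def polite_sleep(wait_min: float, wait_max: float) -> float:
    """Sleep for a jittered delay and return the delay that was requested."""
    delay = jittered_delay(wait_min, wait_max)
    time.sleep(delay)
    return delay
```

Randomizing the pause, rather than sleeping for a fixed interval, makes the request pattern less regular, which is presumably why both a minimum and a maximum are configurable.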
Public API:
- Setup:
    - load: inputs a path to a list of proxies, ingests and tests them.
- Monitoring:
    - healthcheck: processes the "all" queue into "ready" or "broken" queues
    - available: returns the list of available proxies
    - broken: returns the list of broken proxies
- Use:
    - validate: inputs a URI, and runs it through urllib's parse
    - req: inputs a URI, validates it, assigns a proxy, and runs get. Returns {success: True, data: Response}, or {success: False, error: Exception}.
    - get: inputs a proxy and URI, and retrieves it. For advanced usage like externally queued/threaded/async'd setups.
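The healthcheck, validate, and req behaviour described above can be sketched without the library itself. The following is a minimal illustration, not SimpleProxyManager's implementation. `sort_proxies`, `safe_request`, and the injected `check` and `fetch` callables are hypothetical names, and restricting validation to http/https is an assumption. Only the "all"/"ready"/"broken" queue names and the success/error result shapes come from the list above.

```python
import queue
import threading
from typing import Any, Callable, Dict, Tuple
from urllib.parse import urlparse


def validate(uri: str) -> bool:
    """Parse with urllib; accepting only http(s) URIs with a host is an assumption."""
    parts = urlparse(uri)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def sort_proxies(
    all_q: queue.Queue, check: Callable[[str], bool], threads: int = 4
) -> Tuple[queue.Queue, queue.Queue]:
    """Drain the "all" queue into "ready" and "broken" queues using worker threads."""
    ready_q: queue.Queue = queue.Queue()
    broken_q: queue.Queue = queue.Queue()

    def worker() -> None:
        while True:
            try:
                proxy = all_q.get_nowait()
            except queue.Empty:
                return  # nothing left to test
            try:
                healthy = check(proxy)
            except Exception:
                healthy = False  # a check that raises counts as a broken proxy
            (ready_q if healthy else broken_q).put(proxy)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for t in workers:
        t.start()
    for t in workers:
        t.join()
    return ready_q, broken_q


def safe_request(uri: str, proxy: str, fetch: Callable[[str, str], Any]) -> Dict[str, Any]:
    """Validate the URI, then wrap the fetch in the success/error dictionary shape."""
    if not validate(uri):
        return {"success": False, "error": ValueError(f"invalid URI: {uri}")}
    try:
        return {"success": True, "data": fetch(proxy, uri)}
    except Exception as exc:
        return {"success": False, "error": exc}
```

In a real setup, `check` would presumably issue a request through the proxy to the configured test URI with a timeout, and `fetch` would play the role of get. Passing both in as callables keeps the sketch runnable without network access.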
Version History
This project uses semantic versioning.
- to-do:
    - improve usage docs
    - improve error handling
    - add test cases
    - add i18n
    - improve HTTP status code handling
    - differentiate input proxy list by http/https
    - publish to pypi
    - add a "reset queues to all" method
    - improve manual exit handling
    - publish github package
    - abstract proxy assigner method from req
    - improve input and type checking
- 0.1.0:
    - First release! Functional enough to share, but some logs might still be in Spanish while I sort out the i18n.