SimpleProxyManager
A simple multithreaded proxy manager for Python, focused on web scraping.
This project lives at https://github.com/amagnasco/SimpleProxyManager. Feel free to submit an issue or improvements! Releases: https://pypi.org/project/SimpleProxyManager/
Warning: on macOS, do not use this package in a program that also calls os.fork(). It depends on urllib.request, which is unsafe to combine with os.fork() on that platform.
For an example implementation, check out dev/example.py.
Major dependencies: requests, urllib3.
Constructor inputs:
- threads: number of processing threads to use
- wait: minimum and maximum time to wait between requests, and HTTP timeout. All in seconds.
- headers: HTTP headers to use (user agent, accept, and accept-language)
- test: URI to use for health check, and minimum and maximum time to wait between each proxy healthcheck (in seconds)
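The exact shape of these inputs is not spelled out here, so the following is only a sketch under assumptions: the key names (`min`, `max`, `timeout`, `uri`) and the helper `pick_delay` are hypothetical and do not come from the package. See dev/example.py for the real call. The sketch shows one way a minimum/maximum wait pair can become a randomized pause between requests.

```python
import random

# Hypothetical values mirroring the constructor inputs listed above.
# The key names are assumptions for illustration, not the package's API.
threads = 4
wait = {"min": 1.0, "max": 3.0, "timeout": 10.0}  # all in seconds
headers = {
    "User-Agent": "Mozilla/5.0 (compatible; example-scraper)",
    "Accept": "text/html",
    "Accept-Language": "en-US,en;q=0.8",
}
test = {"uri": "https://example.com/", "min": 0.5, "max": 1.5}


def pick_delay(bounds, rng=random):
    """Return a random pause in seconds between bounds["min"] and bounds["max"]."""
    low, high = bounds["min"], bounds["max"]
    if low < 0 or high < low:
        raise ValueError("expected 0 <= min <= max")
    return rng.uniform(low, high)


if __name__ == "__main__":
    print(f"sleeping {pick_delay(wait):.2f}s before the next request")
```

Randomizing the pause inside a range, rather than sleeping a fixed interval, makes the request pattern less regular. This is presumably why both a minimum and a maximum are asked for.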
Public API:
- Setup:
  - load: takes a path to a list of proxies, then ingests and tests them.
- Monitoring:
  - healthcheck: processes the "all" queue into the "ready" or "broken" queues.
  - available: returns the list of available proxies.
  - broken: returns the list of broken proxies.
- Use:
  - validate: takes a URI and runs it through urllib's parser.
  - req: takes a URI, validates it, assigns a proxy, and runs get. Returns {success: True, data: Response} or {success: False, error: Exception}.
  - get: takes a proxy and a URI, and retrieves the URI through that proxy. Intended for advanced usage such as externally queued, threaded, or async setups.
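The flow described above (validate a URI, sort proxies into ready and broken, then handle the success/error result) can be sketched without importing the package. Everything below is hypothetical illustration: `validate_uri`, `sort_proxies`, and `unwrap` are names invented here, the http/https check is an assumption about what counts as valid, and the result dict is assumed to use plain string keys.

```python
from urllib.parse import urlparse


def validate_uri(uri):
    """Return True if the URI parses with an http(s) scheme and a host."""
    try:
        parts = urlparse(uri)
    except ValueError:  # raised for some malformed inputs, e.g. bad IPv6 hosts
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def sort_proxies(all_proxies, is_healthy):
    """Split proxies into (ready, broken) lists using a caller-supplied check."""
    ready, broken = [], []
    for proxy in all_proxies:
        (ready if is_healthy(proxy) else broken).append(proxy)
    return ready, broken


def unwrap(result):
    """Return the payload of a req-style result dict, or raise its error."""
    if result["success"]:
        return result["data"]
    raise result["error"]


if __name__ == "__main__":
    print(validate_uri("https://example.com/page"))
    # A stand-in health check; a real one would issue a request via the proxy.
    print(sort_proxies(["10.0.0.1:8080", "10.0.0.2:8080"],
                       lambda proxy: proxy.endswith("1:8080")))
```

Returning a dict instead of raising lets a multithreaded caller collect failures without a try/except around every request. A helper like `unwrap` turns the result back into ordinary exception flow where that is more convenient.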
Version History
This project uses semantic versioning.
- to-do:
  - improve usage docs
  - improve error handling
  - add test cases
  - improve HTTP status code handling
  - differentiate input proxy list by http/https
  - add a "reset queues to all" method
  - improve manual exit handling
  - abstract proxy assigner method from req
  - improve input and type checking
- 0.2.0 (in development):
  - add i18n
  - consolidate load conf into one dict/json
  - improve input and type checking
  - improve exit condition when 0 proxies available
- 0.1.2:
  - handle HTTPErrors into general exceptions
  - simplify class name and init for cleaner import
  - updated example
- 0.1.1:
  - added GitHub > PyPI publication workflow
- 0.1.0:
  - First release! Functional enough to share, but some logs might still be in Spanish while I sort out the i18n.
Download files
File details
Details for the file simpleproxymanager-0.1.2.tar.gz.
File metadata
- Download URL: simpleproxymanager-0.1.2.tar.gz
- Upload date:
- Size: 42.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 14a5b54239adf95a00c27153618db886561772de716851cb93c007d6b8f30e5f
MD5 | a5fc09139f082806d0af91ba6f2d6d8d
BLAKE2b-256 | ec88e71bc23e3a12cb40262f5c1dc9c5227639abf8d94d0c070ed767014f3693
File details
Details for the file SimpleProxyManager-0.1.2-py3-none-any.whl.
File metadata
- Download URL: SimpleProxyManager-0.1.2-py3-none-any.whl
- Upload date:
- Size: 29.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.0.0 CPython/3.12.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9aee42e4fc20c58d64543ea7685ba07d0f8b0978b29e8637f712701da87e7a23
MD5 | e1815472b0f1b6b584428fab9b85e426
BLAKE2b-256 | 1c1c3f91fda0dd2295025f96ef327dc72ba90b297c39eee05282e675e36512b8