proxy-fleet 🚢
A high-performance Python library for managing concurrent HTTP requests through multiple proxy servers with intelligent health monitoring and automatic failover.
✨ Features
- 🔄 Automated proxy health checking - Continuously monitor proxy server availability
- ⚡ Concurrent request processing - Execute multiple HTTP requests simultaneously
- 🎯 Intelligent proxy rotation - Automatically distribute load across healthy proxies
- 📊 Failure tracking & recovery - Smart failover with automatic proxy re-enablement
- 💾 Persistent configuration - JSON-based proxy management with state persistence
- 🛠️ Flexible integration - Use as a library or command-line tool
- 📝 Comprehensive logging - Detailed request/response tracking with proxy attribution
- 🔒 Authentication support - Handle username/password proxy authentication
- 🚫 Automatic proxy blacklisting - Remove unreliable proxies after consecutive failures
- 💿 Response data storage - Save successful responses with metadata
- 🧪 SOCKS proxy validation - Fast raw socket validation inspired by TheSpeedX/socker
- 📥 Automatic proxy discovery - Download and validate proxies from TheSpeedX/PROXY-List
🚀 Quick Start
Installation
pip install proxy-fleet
Command Line Usage
proxy-fleet provides seven main usage scenarios:
Scenario 1 - Validate input proxy servers
# From file
proxy-fleet --test-proxy-server proxies.txt
# From stdin (thanks to https://github.com/TheSpeedX/PROXY-List for proxy contributions)
curl -sL 'https://raw.githubusercontent.com/TheSpeedX/SOCKS-List/master/socks5.txt' | proxy-fleet --test-proxy-server - --concurrent 100 --test-proxy-timeout 10 --test-proxy-with-request 'https://ipinfo.io'
Scenario 2 - Validate existing proxy servers in storage
# Test existing proxies
proxy-fleet --test-proxy-storage
Scenario 3 - List current proxy servers in storage
# List all proxy status in JSON format
proxy-fleet --list-proxy
# List only verified/valid proxies
proxy-fleet --list-proxy-verified
# List only failed/invalid proxies
proxy-fleet --list-proxy-failed
Scenario 4 - Remove failed proxy servers from storage
# Clean up failed/invalid proxies from storage
proxy-fleet --remove-proxy-failed
Scenario 5 - Execute HTTP requests through proxy servers
# Execute tasks from file
proxy-fleet --task-input tasks.json
# Execute tasks from stdin
cat tasks.json | proxy-fleet --task-input -
Scenario 6 - Retry failed tasks
# Retry previously failed tasks
proxy-fleet --task-retry
Scenario 7 - List current task results
# Show task execution statistics
proxy-fleet --list-task-result
CLI Options
- --test-proxy-type [socks4|socks5|http] - Proxy type (default: socks5)
- --test-proxy-timeout INTEGER - Proxy connection timeout in seconds
- --test-proxy-with-request TEXT - Additional HTTP request validation
- --proxy-storage TEXT - Proxy state storage directory (default: proxy)
- --list-proxy - List all proxy server status in JSON format
- --list-proxy-verified - List only verified/valid proxy servers in JSON format
- --list-proxy-failed - List only failed/invalid proxy servers in JSON format
- --remove-proxy-failed - Remove all failed/invalid proxy servers from proxy storage
- --task-output-dir TEXT - Task output directory (default: output)
- --concurrent INTEGER - Maximum concurrent connections (default: 10)
- --verbose - Show verbose output
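When the CLI is driven from a script, it can help to assemble the argument list in one place. The sketch below is only an illustration: `build_validate_command` is a hypothetical helper, not part of proxy-fleet, and it uses only flags that appear in the list above or in Scenario 1.

```python
import shlex
from typing import List, Optional


def build_validate_command(
    source: str = "-",
    proxy_type: str = "socks5",
    concurrent: int = 10,
    timeout: Optional[int] = None,
    test_url: Optional[str] = None,
) -> List[str]:
    """Assemble a `proxy-fleet --test-proxy-server` call as an argv list.

    Hypothetical helper for illustration; the defaults mirror the
    documented CLI defaults (socks5, 10 concurrent connections).
    """
    if proxy_type not in ("socks4", "socks5", "http"):
        raise ValueError(f"unsupported proxy type: {proxy_type}")
    argv = [
        "proxy-fleet",
        "--test-proxy-server", source,
        "--test-proxy-type", proxy_type,
        "--concurrent", str(concurrent),
    ]
    # Optional flags are added only when a value was supplied.
    if timeout is not None:
        argv += ["--test-proxy-timeout", str(timeout)]
    if test_url is not None:
        argv += ["--test-proxy-with-request", test_url]
    return argv


if __name__ == "__main__":
    cmd = build_validate_command("proxies.txt", timeout=10, test_url="https://ipinfo.io")
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting problems with URLs.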
Library Usage
import asyncio
from proxy_fleet import ProxyFleet, HttpTask, HttpMethod, FleetConfig

async def main():
    # Create configuration
    config = FleetConfig(
        proxy_file="proxies.json",
        output_dir="output",
        max_concurrent_requests=20
    )

    # Initialize the proxy fleet
    fleet = ProxyFleet(config)

    # Load proxy servers
    proxy_list = [
        {"host": "proxy1.example.com", "port": 8080},
        {"host": "proxy2.example.com", "port": 8080, "username": "user", "password": "pass"},
        {"host": "proxy3.example.com", "port": 3128, "protocol": "https"}
    ]
    await fleet.load_proxies(proxy_list)

    # Create HTTP tasks
    tasks = [
        HttpTask(
            task_id="get_test",
            url="https://httpbin.org/get",
            method=HttpMethod.GET,
            headers={"User-Agent": "ProxyFleet/1.0"}
        ),
        HttpTask(
            task_id="post_test",
            url="https://httpbin.org/post",
            method=HttpMethod.POST,
            data={"key": "value"},
            headers={"Content-Type": "application/json"}
        ),
        HttpTask(
            task_id="ip_check",
            url="https://ipinfo.io/json"
        )
    ]

    # Execute tasks with automatic proxy rotation
    results = await fleet.execute_tasks(tasks, output_dir="./results")
    for result in results:
        print(f"Task {result.task_id}: {result.status}")
        print(f"Used proxy: {result.proxy_used}")
        print(f"Response time: {result.response_time}s")

if __name__ == "__main__":
    asyncio.run(main())
📋 Task Configuration
Create a tasks.json file for HTTP request tasks:
[
  {
    "id": "check_ip",
    "url": "https://ipinfo.io/json",
    "method": "GET",
    "headers": {
      "User-Agent": "proxy-fleet/1.0"
    }
  },
  {
    "id": "post_data",
    "url": "https://httpbin.org/post",
    "method": "POST",
    "headers": {
      "Content-Type": "application/json"
    },
    "data": {
      "test": "data"
    }
  }
]
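For longer URL lists, the task file can be generated instead of written by hand. The following is a minimal sketch that reuses the field names from the example above (`id`, `url`, `method`, `headers`); the helper name `build_tasks` and the `task_N` id scheme are assumptions made for illustration.

```python
import json
from typing import Dict, Iterable, List


def build_tasks(urls: Iterable[str], user_agent: str = "proxy-fleet/1.0") -> List[Dict]:
    """Return one GET task per URL, using the field names from tasks.json."""
    return [
        {
            "id": f"task_{index}",  # hypothetical id scheme; any unique string should do
            "url": url,
            "method": "GET",
            "headers": {"User-Agent": user_agent},
        }
        for index, url in enumerate(urls, start=1)
    ]


if __name__ == "__main__":
    tasks = build_tasks(["https://ipinfo.io/json", "https://httpbin.org/get"])
    # Write the list in the same layout as the hand-written example.
    with open("tasks.json", "w", encoding="utf-8") as handle:
        json.dump(tasks, handle, indent=2)
```

The resulting file can then be passed to `proxy-fleet --task-input tasks.json`.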
📁 Output Structure
proxy-fleet creates organized output directories:
proxy/ # Proxy storage directory (default)
├── proxy.json # Proxy server status and statistics
└── test-proxy-server.log # Proxy validation logs
output/ # Task execution results (default)
├── done.json # Successful task results
└── fail.json # Failed task results
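A quick way to check a run from a script is to count the entries in the two result files. This is a rough sketch under a stated assumption: the internal layout of done.json and fail.json is not documented here, so the code assumes only that each file holds a top-level JSON array or object. `count_results` is a hypothetical helper, and `proxy-fleet --list-task-result` remains the supported way to see statistics.

```python
import json
from pathlib import Path
from typing import Dict


def count_results(output_dir: str = "output") -> Dict[str, int]:
    """Count top-level entries in done.json and fail.json, if they exist."""
    counts: Dict[str, int] = {}
    for name in ("done.json", "fail.json"):
        path = Path(output_dir) / name
        if not path.exists():
            counts[name] = 0  # nothing recorded yet
            continue
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: the top level is a JSON array or object of results.
        counts[name] = len(data) if isinstance(data, (list, dict)) else 1
    return counts


if __name__ == "__main__":
    print(count_results("output"))
```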
🔍 Monitoring & Logging
Built-in Monitoring
- Health Checks: Automatic proxy health monitoring
- Failure Tracking: Recent failure count with time windows
- Performance Metrics: Response time tracking
- Success Rates: Per-proxy success/failure statistics
Logging Configuration
from proxy_fleet.utils import setup_logging
# Configure logging
setup_logging(log_file="proxy_fleet.log", level="INFO")
🚫 Failure Handling
proxy-fleet implements intelligent failure handling:
- Recent Failure Tracking: Count failures in rolling time window
- Automatic Blacklisting: Remove proxies exceeding failure threshold
- Health Recovery: Automatically re-test unhealthy proxies
- Graceful Degradation: Continue with remaining healthy proxies
- Task Retries: Configurable retry logic with different proxies
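To make the rolling-window idea concrete, here is a minimal sketch of how recent failures could be counted and compared against a threshold. It is not proxy-fleet's implementation: the class name `RollingFailureTracker`, the 300-second window, and the threshold of 3 are all assumptions chosen for illustration.

```python
import time
from collections import deque
from typing import Deque, Optional


class RollingFailureTracker:
    """Count failures inside a rolling time window (illustrative only)."""

    def __init__(self, window_seconds: float = 300.0, max_failures: int = 3) -> None:
        self.window_seconds = window_seconds
        self.max_failures = max_failures
        self._failures: Deque[float] = deque()  # timestamps, oldest first

    def _prune(self, now: float) -> None:
        # Drop failures that have fallen out of the rolling window.
        while self._failures and now - self._failures[0] > self.window_seconds:
            self._failures.popleft()

    def record_failure(self, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._failures.append(now)
        self._prune(now)

    def recent_failures(self, now: Optional[float] = None) -> int:
        now = time.monotonic() if now is None else now
        self._prune(now)
        return len(self._failures)

    def is_blacklisted(self, now: Optional[float] = None) -> bool:
        # A proxy counts as blacklisted while it is at or over the threshold.
        return self.recent_failures(now) >= self.max_failures
```

Because old failures expire on their own, a proxy that stops failing drops back under the threshold without any explicit reset, which is one simple way to model automatic recovery.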
🎯 Use Cases
- Web Scraping: Distribute requests across multiple IPs
- API Testing: Test services through different proxy locations
- Load Testing: Generate traffic from multiple sources
- Data Collection: Gather data while respecting rate limits
- Proxy Maintenance: Monitor and manage proxy server fleets
📊 Performance
- Concurrent Execution: Configurable concurrency limits
- Async I/O: Non-blocking request processing
- Memory Efficient: Streaming response handling
- Scalable: Supports hundreds of concurrent requests
- Fast Failover: Quick detection and bypass of failed proxies
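A configurable concurrency limit on top of async I/O is commonly built with a semaphore. The sketch below shows that general pattern with `asyncio.Semaphore`; it does not claim to match proxy-fleet's internals, and `run_bounded` and `fake_request` are hypothetical names used only for this example.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_bounded(
    items: Iterable[T],
    worker: Callable[[T], Awaitable[R]],
    limit: int = 10,
) -> List[R]:
    """Run `worker` over `items` with at most `limit` calls in flight.

    Results are returned in the same order as the input items.
    """
    semaphore = asyncio.Semaphore(limit)

    async def guarded(item: T) -> R:
        # Each call waits here until one of the `limit` slots is free.
        async with semaphore:
            return await worker(item)

    return await asyncio.gather(*(guarded(item) for item in items))


async def fake_request(n: int) -> int:
    # Stand-in for an HTTP request; sleeps briefly instead of doing I/O.
    await asyncio.sleep(0.01)
    return n * 2


if __name__ == "__main__":
    print(asyncio.run(run_bounded(range(5), fake_request, limit=2)))
```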
🔧 Requirements
- Python 3.8+
- aiohttp >= 3.8.0
- aiofiles >= 0.8.0
- pydantic >= 1.10.0
- click >= 8.0.0
- rich >= 12.0.0
Optional:
- aiohttp-socks >= 0.7.0 (for SOCKS proxy support)
📝 Examples
See the examples/ directory for complete usage examples:
- basic_usage.py - Basic library usage
- example_tasks.json - Sample HTTP tasks
- example_proxies.json - Sample proxy configuration
🤝 Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass
- Submit a pull request
📄 License
MIT License - see LICENSE file for details.
🎉 Changelog
v0.1.0
- Initial release
- Basic proxy fleet management
- Health monitoring system
- CLI tool
- Comprehensive documentation
proxy-fleet - Manage your proxy servers like a fleet! 🚢
SOCKS Proxy Validation
proxy-fleet includes fast SOCKS proxy validation inspired by TheSpeedX/socker and uses proxy lists from TheSpeedX/PROXY-List.
Quick Proxy Testing Example
Thanks to TheSpeedX/PROXY-List for providing public proxy lists:
# Test SOCKS5 proxies from TheSpeedX repository
curl -sL 'https://raw.githubusercontent.com/TheSpeedX/SOCKS-List/master/socks5.txt' | proxy-fleet --test-proxy-server - --concurrent 100 --test-proxy-timeout 10 --test-proxy-with-request 'https://ipinfo.io'
# Test HTTP proxies
curl -sL 'https://raw.githubusercontent.com/TheSpeedX/PROXY-List/master/http.txt' | proxy-fleet --test-proxy-server - --test-proxy-type http --concurrent 50
Library Usage for SOCKS Validation
from proxy_fleet.utils.socks_validator import SocksValidator

async def validate_socks_proxies():
    validator = SocksValidator(timeout=5.0, check_ip_info=True)

    # Validate SOCKS5 proxy
    result = await validator.async_validate_socks5('proxy.example.com', 1080)
    if result.is_valid:
        print("✅ Proxy is valid")
        if result.ip_info:
            print(f" IP: {result.ip_info.get('ip')}")
            print(f" Country: {result.ip_info.get('country')}")
    else:
        print(f"❌ Proxy validation failed: {result.error}")
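For readers curious what a raw-socket SOCKS5 check involves, the sketch below performs only the opening greeting defined in RFC 1928: the client sends version 5 with a single "no authentication" method and expects the server to answer with the same version and method. This is a simplified illustration, not the code inside `SocksValidator`; the function name `socks5_greeting_ok` is hypothetical.

```python
import asyncio


async def socks5_greeting_ok(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if the server accepts a no-auth SOCKS5 greeting (RFC 1928)."""
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
    except (OSError, asyncio.TimeoutError):
        return False  # could not connect in time
    try:
        # Version 5, one method offered, method 0x00 (no authentication).
        writer.write(b"\x05\x01\x00")
        await writer.drain()
        reply = await asyncio.wait_for(reader.readexactly(2), timeout)
        # The server echoes the version and the method it selected.
        return reply == b"\x05\x00"
    except (OSError, asyncio.TimeoutError, asyncio.IncompleteReadError):
        return False
    finally:
        writer.close()


if __name__ == "__main__":
    print(asyncio.run(socks5_greeting_ok("proxy.example.com", 1080)))
```

A passing greeting shows only that something speaks SOCKS5 on that port. It does not prove that the proxy will relay traffic, which is why a follow-up HTTP request through the proxy is still useful.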
Two-Stage Proxy Validation
Combine fast SOCKS validation with HTTP testing for optimal proxy discovery:
# Stage 1: Download and validate SOCKS proxies (thanks to TheSpeedX/PROXY-List)
curl -sL 'https://raw.githubusercontent.com/TheSpeedX/SOCKS-List/master/socks5.txt' | proxy-fleet --test-proxy-server - --concurrent 100 --test-proxy-timeout 5
# Stage 2: Test HTTP requests through validated proxies
proxy-fleet --test-proxy-storage --test-proxy-with-request 'https://httpbin.org/ip' --test-proxy-timeout 10
# List final valid proxies
proxy-fleet --list-proxy-verified
Using Library for Two-Stage Validation
import asyncio
from proxy_fleet.utils.socks_validator import SocksValidator

async def two_stage_validation():
    validator = SocksValidator(timeout=3.0, check_ip_info=True)

    # Stage 1: Fast SOCKS handshake validation
    proxy_lines = [
        "proxy1.example.com:1080",
        "proxy2.example.com:1080",
        "proxy3.example.com:1080"
    ]
    quick_valid = []
    for line in proxy_lines:
        host, port = line.split(':')
        result = await validator.async_validate_socks5(host, int(port))
        if result.is_valid:
            quick_valid.append({'host': host, 'port': int(port)})
    print(f"Stage 1: {len(quick_valid)}/{len(proxy_lines)} passed SOCKS validation")

    # Stage 2: Use the CLI for HTTP validation
    # Save the validated proxies to a file and pass it to --test-proxy-server
    with open('validated_proxies.txt', 'w') as f:
        for proxy in quick_valid:
            f.write(f"{proxy['host']}:{proxy['port']}\n")
    print("Run: proxy-fleet --test-proxy-server validated_proxies.txt --test-proxy-with-request 'https://httpbin.org/ip'")

if __name__ == "__main__":
    asyncio.run(two_stage_validation())
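The Stage 1 loop splits each entry with `line.split(':')`, which raises `ValueError` on blank or malformed lines. When the list comes from a downloaded file, a more forgiving parser may be useful. The helper below is a hypothetical sketch, not part of proxy-fleet, and it assumes plain `host:port` lines with `#` marking a comment.

```python
from typing import Optional, Tuple


def parse_proxy_line(line: str) -> Optional[Tuple[str, int]]:
    """Parse `host:port`; return None for blank, commented, or malformed lines."""
    text = line.strip()
    if not text or text.startswith("#"):
        return None
    # Split on the last colon so the port is always the final field.
    host, sep, port_text = text.rpartition(":")
    if not sep or not host or not port_text.isdigit():
        return None
    port = int(port_text)
    if not 1 <= port <= 65535:
        return None
    return host, port


if __name__ == "__main__":
    sample = ["proxy1.example.com:1080", "", "# comment", "broken-line"]
    parsed = [entry for entry in map(parse_proxy_line, sample) if entry is not None]
    print(parsed)
```

Filtering out `None` results before validation keeps one bad line from aborting the whole run.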