Next-gen Anti-Detection Scraper with AI & Browser Fallback
TitanScraper V2
The Ultimate Anti-Bot Scraper for Python.
TitanScraper is a high-performance scraping library designed to bypass the toughest anti-bot protections (Cloudflare, Akamai, Datadome, etc.). It uses a tiered approach, starting with lightweight requests and escalating to full browser automation with AI solvers only when necessary.
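The tiered approach described above can be pictured as a simple escalation loop. The following is a minimal sketch of that control flow only, not the library's actual implementation: `Result`, `fetch_with_escalation`, and the per-tier fetch callables are hypothetical names introduced for illustration, and treating 403/503 as the "blocked" signal is an assumption based on the status codes mentioned in the feature list.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

# Assumption: a response counts as "blocked" when it has one of these statuses.
BLOCKED_STATUSES = {403, 503}


@dataclass
class Result:
    """Hypothetical response container used only in this sketch."""
    status_code: int
    content: bytes
    tier: str = ""


# A tier is a (name, fetch function) pair. The fetch functions are stand-ins
# for the request, solver, and browser layers; they are not TitanScraper APIs.
Tier = Tuple[str, Callable[[str], Result]]


def fetch_with_escalation(url: str, tiers: Sequence[Tier]) -> Optional[Result]:
    """Try each tier in order; return the first result that is not blocked.

    If every tier is blocked, the last result is returned so the caller can
    inspect it. With no tiers at all, None is returned.
    """
    last: Optional[Result] = None
    for name, fetch in tiers:
        result = fetch(url)
        result.tier = name  # record which tier produced this result
        last = result
        if result.status_code not in BLOCKED_STATUSES:
            return result
    return last
```

Ordering the tiers from cheapest to most expensive means the heavier layers only run when the lighter ones were blocked.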
Features
- Tier 1: Intelligent Requests: Handles headers, TLS fingerprinting (simulated), and simple redirects.
- Tier 2: JSD Solver: Native Go-based solver for Cloudflare JavaScript challenges.
- Tier 3: Browser Fallback: Auto-launches a stealth Playwright browser for 403/503 bypass (G2, CoinList, etc.).
- Tier 4: Captcha Solving:
  - Cloudflare Turnstile: Auto-detects and human-clicks or uses external solvers.
  - reCAPTCHA v2/v3: Native audio solving, plus support for 2Captcha, CapMonster, and Anti-Captcha.
  - AI Custom Model: PyTorch-based CNN for text captchas.
- Deep Fingerprint Spoofing: Injects noise into Canvas, WebGL, and AudioContext to defeat device tracking.
- Session Persistence: Save/Load cookies to build "Trust Scores" across sessions.
- Smart Auto-Detection: Automatically identifies protection (Cloudflare, Akamai, AWS WAF) and selects the best bypass strategy (TLS Rotation, Browser, etc.).
- Proxies & Stealth: Built-in support for rotating proxies and fingerprint randomization (User-Agent, Viewport, Locale).
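As an illustration of the Session Persistence feature listed above, here is a minimal sketch of saving and reloading cookies between runs. It uses only the standard library and a plain name-to-value dictionary. The helper names `save_cookies` and `load_cookies` and the JSON file format are assumptions made for this example; the library's own storage format is not documented here.

```python
import json
from pathlib import Path
from typing import Dict


def save_cookies(cookies: Dict[str, str], path: str) -> None:
    """Write a name -> value cookie mapping to a JSON file."""
    Path(path).write_text(json.dumps(cookies, indent=2), encoding="utf-8")


def load_cookies(path: str) -> Dict[str, str]:
    """Read cookies saved by save_cookies; return {} if the file is missing."""
    file = Path(path)
    if not file.exists():
        return {}
    return json.loads(file.read_text(encoding="utf-8"))
```

Returning an empty dictionary for a missing file lets the first run and later runs share the same code path.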
Installation
# 1. Install Python packages
pip install .
# 2. Install Playwright Browsers
playwright install chromium
# 3. Setup JSD Solver (Go required)
python setup_jsd.py
# 4. System Requirements
# Install ffmpeg for Audio Captcha solving
Usage
1. One-Click Bypass (Recommended)
Automatically handles challenges, captchas, and fallbacks.
from titan import TitanScraper
# Optional: Add Proxies
proxies = {
"http": "http://user:pass@host:port",
"https": "http://user:pass@host:port"
}
scraper = TitanScraper(proxies=proxies)
# Just provide the URL
response = scraper.bypass("https://nowsecure.nl")
print(response.status_code)
print(response.content)
2. Advanced / Manual Control
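# Access specific modules directly (reuses `scraper` from the previous example)
url = "https://nowsecure.nl"
scraper.jsd_solver.solve(url)
cookies = scraper.browser_manager.get_cookies(url)
# Activate the Disguise System so every fingerprint value comes from one profile
scraper.set_disguise("modern_mac") # or "modern_windows"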
3. Disguise System (Consistency Engine)
For targets detecting "Mismatched Fingerprints" (Cloudflare V3/Enterprise):
# 1. Masquerade as a Mac user (Safari + MacIntel + Apple GPU)
scraper.set_disguise("modern_mac")
# 2. Masquerade as a Windows user (Chrome + Win32 + NVIDIA)
scraper.set_disguise("modern_windows")
# Subsequent requests use the values of the selected profile.
scraper.bypass("https://strict-site.com")
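Why use this? Advanced anti-bot systems cross-check the User-Agent against the TLS fingerprint and the reported GPU renderer.
- If you change only the User-Agent to "iPhone" while still using Python's default TLS stack, the mismatch is easy to detect and the request is likely to be blocked.
- The Disguise System sets all of these values together so that they match the chosen profile.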
Available Profiles & Parameters
| Parameter | modern_windows | modern_mac |
|---|---|---|
| User-Agent | Chrome 120 (Win) | Safari 15.3 (Mac) |
| TLS Handshake | chrome120 | safari15_3 |
| Navigator Platform | Win32 | MacIntel |
| WebGL Vendor | Google Inc. (NVIDIA) | Apple Inc. |
| WebGL Renderer | NVIDIA GeForce RTX 3060... | Apple GPU |
| Hardware Core Count | 16 | 8 |
| Device Memory (GB) | 8 | 8 |
| Default Viewport | 1920x1080 | 1440x900 |
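The profile table can also be read as a small lookup structure. The sketch below restates the table's values as data and shows a lookup that fails loudly on an unknown profile name. `Profile`, `PROFILES`, and `get_profile` are hypothetical names for this example and are not confirmed to exist in the library. The WebGL renderer is left out because the source truncates the Windows value.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Profile:
    """One column of the profile table (hypothetical structure)."""
    user_agent: str
    tls_handshake: str
    navigator_platform: str
    webgl_vendor: str
    hardware_cores: int
    device_memory_gb: int
    viewport: Tuple[int, int]


# Values copied from the table above; nothing here is added or guessed.
PROFILES: Dict[str, Profile] = {
    "modern_windows": Profile(
        "Chrome 120 (Win)", "chrome120", "Win32",
        "Google Inc. (NVIDIA)", 16, 8, (1920, 1080),
    ),
    "modern_mac": Profile(
        "Safari 15.3 (Mac)", "safari15_3", "MacIntel",
        "Apple Inc.", 8, 8, (1440, 900),
    ),
}


def get_profile(name: str) -> Profile:
    """Return the named profile, or raise ValueError for an unknown name."""
    try:
        return PROFILES[name]
    except KeyError:
        known = ", ".join(sorted(PROFILES))
        raise ValueError(f"unknown profile {name!r}; expected one of: {known}") from None
```

Keeping each profile as one immutable record reflects the point of the table: the values are meant to be chosen together rather than mixed across profiles.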
4. External Captcha Providers
Use a paid solving service for higher reliability on hard targets.
captcha_config = {
"provider": "2captcha", # '2captcha', 'capmonster', 'anticaptcha'
"api_key": "YOUR_API_KEY"
}
scraper = TitanScraper(captcha_config=captcha_config)
scraper.bypass("https://protected-site.com")
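Because a typo in `captcha_config` would otherwise surface only at solve time, it can help to validate the dictionary before passing it in. This is a minimal sketch under stated assumptions: `validate_captcha_config` is a hypothetical helper, not part of the library, and the accepted provider names are taken from the comment in the example above.

```python
from typing import Dict

# Provider names as listed in the configuration comment above.
SUPPORTED_PROVIDERS = {"2captcha", "capmonster", "anticaptcha"}


def validate_captcha_config(config: Dict[str, str]) -> Dict[str, str]:
    """Return a cleaned copy of the config or raise ValueError.

    Rejects unknown providers, empty keys, and the README placeholder key.
    """
    provider = config.get("provider")
    if provider not in SUPPORTED_PROVIDERS:
        allowed = ", ".join(sorted(SUPPORTED_PROVIDERS))
        raise ValueError(f"unsupported provider {provider!r}; expected one of: {allowed}")

    api_key = config.get("api_key")
    if not isinstance(api_key, str) or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    if api_key.strip() == "YOUR_API_KEY":
        raise ValueError("api_key still contains the placeholder value")

    return {"provider": provider, "api_key": api_key.strip()}
```

The cleaned dictionary has the same shape as the original, so it could be passed on as `captcha_config` unchanged.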
5. Training the AI Captcha Solver
- Collect Data:
scraper.browser_manager.save_element_screenshot(url, "#captcha-img", "data/label.png")
- Train:
python train_captcha.py --mode train --data_dir ./data --epochs 20
- Predict:
python train_captcha.py --mode predict --image ./test.png
Roadmap & Suggestions
To defeat even more advanced systems:
- Residential Proxy Rotation: Integrate with providers (BrightData, Smartproxy) to rotate IPs per request.
- Machine Learning Behavior: Train a model on real user mouse movements instead of just Bezier curves.
File details
Details for the file titanscraper_pro-2.0.0.tar.gz.
File metadata
- Download URL: titanscraper_pro-2.0.0.tar.gz
- Upload date:
- Size: 27.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7d5ed117c306b36b858944cd77e604f2cdba18e0140d164adbad71bee2ce2b28 |
| MD5 | d2c4b8f151b40763c44417cf1f83f8f3 |
| BLAKE2b-256 | 528b131e922e1f10997ba31144a0ff38a6eeb3d33d005e3afd60efd474a21861 |
File details
Details for the file titanscraper_pro-2.0.0-py3-none-any.whl.
File metadata
- Download URL: titanscraper_pro-2.0.0-py3-none-any.whl
- Upload date:
- Size: 31.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | aaff6ba4e73e5871d09d4e46a2b18273b2c4acbf888b9243f840fcbd8ad3da49 |
| MD5 | f9972aa59d82ca360b83192d433c1a00 |
| BLAKE2b-256 | 55f24ebd127b8f0fd5eb3c5d2f4e7de5b3e6cbcb76f888edf2196f9dc2f2f15c |