Next-gen Anti-Detection Scraper with AI & Browser Fallback

TitanScraper V2

The Ultimate Anti-Bot Scraper for Python.

TitanScraper is a high-performance scraping library designed to bypass the toughest anti-bot protections (Cloudflare, Akamai, DataDome, etc.). It uses a tiered approach: it starts with lightweight requests and escalates to full browser automation with AI solvers only when necessary.

Features

  • Tier 1: Intelligent Requests: Handles headers, TLS fingerprinting (simulated), and simple redirects.
  • Tier 2: JSD Solver: Native Go-based solver for Cloudflare JavaScript challenges.
  • Tier 3: Browser Fallback: Auto-launches a stealth Playwright browser for 403/503 bypass (G2, CoinList, etc.).
  • Tier 4: Captcha Solving:
    • Cloudflare Turnstile: Auto-detects the widget, then either performs a human-like click or hands off to an external solver.
    • reCAPTCHA v2/v3: Native audio solving, plus support for 2Captcha, CapMonster, and Anti-Captcha.
    • AI Custom Model: PyTorch-based CNN for text captchas.
  • Deep Fingerprint Spoofing: Injects noise into Canvas, WebGL, and AudioContext to defeat device tracking.
  • Session Persistence: Save/Load cookies to build "Trust Scores" across sessions.
  • Smart Auto-Detection: Automatically identifies protection (Cloudflare, Akamai, AWS WAF) and selects the best bypass strategy (TLS Rotation, Browser, etc.).
  • Proxies & Stealth: Built-in support for rotating proxies and fingerprint randomization (User-Agent, Viewport, Locale).
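
The tiered approach described above can be pictured as a simple fallback chain. The following is only a minimal sketch of that control flow, not TitanScraper's internal code: `FetchResult`, `fetch_with_escalation`, and the two stand-in tiers are hypothetical names, and the "blocked" status set assumes the 403/503 codes mentioned under Tier 3.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Status codes treated as "blocked"; the feature list mentions 403/503 for Tier 3.
BLOCKED_STATUSES = {403, 503}


@dataclass
class FetchResult:
    status_code: int
    body: str
    tier: str  # name of the tier that produced this result


# Hypothetical tier signature: takes a URL, returns a FetchResult.
Tier = Callable[[str], FetchResult]


def fetch_with_escalation(url: str, tiers: Sequence[Tier]) -> Optional[FetchResult]:
    """Try each tier in order and stop at the first non-blocked response."""
    last: Optional[FetchResult] = None
    for tier in tiers:
        last = tier(url)
        if last.status_code not in BLOCKED_STATUSES:
            return last
    return last  # every tier was blocked (or no tiers were given)


# Stand-in tiers so the sketch runs without network access.
def plain_request(url: str) -> FetchResult:
    return FetchResult(403, "", "requests")


def browser_fallback(url: str) -> FetchResult:
    return FetchResult(200, "<html>ok</html>", "browser")


result = fetch_with_escalation("https://example.com", [plain_request, browser_fallback])
print(result.tier, result.status_code)
```

Ordering the tiers from cheapest to most expensive means the heavier tiers only run when a lighter one is blocked.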

Installation

# 1. Install Python packages
pip install TitanScraper-Pro

# 2. Install Playwright Browsers
playwright install chromium

# 3. Setup JSD Solver (Go required)
python setup_jsd.py

# 4. System requirement
# Install ffmpeg (needed for audio captcha solving)

Usage

1. One-Click Bypass (Recommended)

Automatically handles challenges, captchas, and fallbacks.

from titan import TitanScraper

# Optional: Add Proxies
proxies = {
    "http": "http://user:pass@host:port",
    "https": "http://user:pass@host:port"
}

scraper = TitanScraper(proxies=proxies)

# Just provide the URL
response = scraper.bypass("https://nowsecure.nl")

print(response.status_code)
print(response.content)
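
Proxy credentials that contain characters such as `@` or `:` will break a hand-written proxy URL. The helper below is a small sketch of one way to build the `proxies` mapping safely. `build_proxies` is a hypothetical function, not part of TitanScraper, and the host name is a placeholder.

```python
from urllib.parse import quote


def build_proxies(user: str, password: str, host: str, port: int) -> dict:
    """Return a proxies mapping with percent-encoded credentials."""
    # Percent-encode credentials so characters like '@' or ':' do not break the URL.
    auth = f"{quote(user, safe='')}:{quote(password, safe='')}"
    proxy_url = f"http://{auth}@{host}:{port}"
    # The same HTTP proxy URL serves both schemes, as in the example above.
    return {"http": proxy_url, "https": proxy_url}


proxies = build_proxies("user", "p@ss:word", "proxy.example.com", 8080)
print(proxies["https"])  # http://user:p%40ss%3Aword@proxy.example.com:8080
```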

2. Advanced / Manual Control

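from titan import TitanScraper

scraper = TitanScraper()
url = "https://nowsecure.nl"

# Access specific modules directly
scraper.jsd_solver.solve(url)
cookies = scraper.browser_manager.get_cookies(url)

# Activate the Disguise System so all fingerprint signals share one profile
scraper.set_disguise("modern_mac")  # or "modern_windows"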

3. Disguise System (Consistency Engine)

For targets that flag mismatched fingerprints (for example, Cloudflare V3/Enterprise):

# Option 1: masquerade as a Mac user (Safari + MacIntel + Apple GPU)
scraper.set_disguise("modern_mac")

# Option 2: masquerade as a Windows user (Chrome + Win32 + NVIDIA)
scraper.set_disguise("modern_windows")

# Subsequent requests use the selected identity.
scraper.bypass("https://strict-site.com")

Why use this? Advanced anti-bot systems check whether your User-Agent is consistent with your TLS fingerprint and your GPU renderer.

  • If you only change the User-Agent to "iPhone" but keep Python's default TLS handshake, the mismatch is detectable and you are likely to be blocked.
  • The Disguise System syncs these signals to match the chosen profile.

Available Profiles & Parameters

Parameter            modern_windows               modern_mac
-------------------  ---------------------------  -----------------
User-Agent           Chrome 120 (Win)             Safari 15.3 (Mac)
TLS Handshake        chrome120                    safari15_3
Navigator Platform   Win32                        MacIntel
WebGL Vendor         Google Inc. (NVIDIA)         Apple Inc.
WebGL Renderer       NVIDIA GeForce RTX 3060...   Apple GPU
Hardware Core Count  16                           8
Device Memory (GB)   8                            8
Default Viewport     1920x1080                    1440x900
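
To make the idea of a consistent profile concrete, here is a rough sketch that stores a few of the table's values in a data structure and reports which observed signals disagree with a chosen profile. The `Profile` class, its field names, and `mismatches` are hypothetical illustrations. They are not TitanScraper's API, and the library's real internals may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    user_agent_family: str
    tls_handshake: str
    platform: str
    webgl_vendor: str
    hardware_cores: int
    device_memory_gb: int
    viewport: tuple


# Values copied from the table above; the field names are hypothetical.
PROFILES = {
    "modern_windows": Profile(
        "Chrome", "chrome120", "Win32", "Google Inc. (NVIDIA)", 16, 8, (1920, 1080)
    ),
    "modern_mac": Profile(
        "Safari", "safari15_3", "MacIntel", "Apple Inc.", 8, 8, (1440, 900)
    ),
}


def mismatches(profile: Profile, observed: dict) -> list:
    """Return the names of observed signals that disagree with the profile."""
    expected = {
        "user_agent_family": profile.user_agent_family,
        "tls_handshake": profile.tls_handshake,
        "platform": profile.platform,
    }
    return [k for k, v in expected.items() if k in observed and observed[k] != v]


# A Safari User-Agent paired with a Chrome TLS handshake is inconsistent.
observed = {
    "user_agent_family": "Safari",
    "tls_handshake": "chrome120",
    "platform": "MacIntel",
}
print(mismatches(PROFILES["modern_mac"], observed))  # ['tls_handshake']
```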

4. External Captcha Providers

Use a paid solving service for higher reliability on hard targets.

captcha_config = {
    "provider": "2captcha", # '2captcha', 'capmonster', 'anticaptcha'
    "api_key": "YOUR_API_KEY"
}

scraper = TitanScraper(captcha_config=captcha_config)
scraper.bypass("https://protected-site.com")
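
Hard-coding an API key in source is easy to leak. The sketch below shows one way to validate the provider name and fall back to an environment variable for the key. `build_captcha_config` and the `CAPTCHA_API_KEY` variable name are assumptions made for illustration. Only the `provider` and `api_key` dictionary keys come from the example above.

```python
import os

# Provider names listed in the example above.
SUPPORTED_PROVIDERS = ("2captcha", "capmonster", "anticaptcha")


def build_captcha_config(provider: str, api_key: str = "") -> dict:
    """Validate the provider name and return a captcha_config dict."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    # CAPTCHA_API_KEY is a hypothetical environment variable name.
    key = api_key or os.environ.get("CAPTCHA_API_KEY", "")
    if not key:
        raise ValueError("no API key given and CAPTCHA_API_KEY is not set")
    return {"provider": provider, "api_key": key}


captcha_config = build_captcha_config("2captcha", api_key="YOUR_API_KEY")
print(captcha_config)
```

The resulting dictionary has the same shape as the one passed to `TitanScraper(captcha_config=...)` in the example above.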

5. Training the AI Captcha Solver

  1. Collect Data:
    scraper.browser_manager.save_element_screenshot(url, "#captcha-img", "data/label.png")
    
  2. Train:
    python train_captcha.py --mode train --data_dir ./data --epochs 20
    
  3. Predict:
    python train_captcha.py --mode predict --image ./test.png
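
Before training, it can help to check what the data directory actually contains. The sketch below lists image/label pairs under the assumption that each screenshot is saved with its label as the file name, which is what the `data/label.png` path in step 1 suggests. How `train_captcha.py` really reads its data is not documented here, and `list_labeled_images` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def list_labeled_images(data_dir: str) -> list:
    """Return sorted (path, label) pairs, assuming the file stem is the label."""
    pairs = []
    for path in sorted(Path(data_dir).glob("*.png")):
        pairs.append((path, path.stem))
    return pairs


# Demo on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a7k2.png", "x9qd.png", "notes.txt"):
        (Path(tmp) / name).touch()
    labels = [label for _, label in list_labeled_images(tmp)]
    print(labels)  # ['a7k2', 'x9qd']
```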
    

Roadmap & Suggestions

To defeat even more advanced systems:

  1. Residential Proxy Rotation: Integrate with providers (BrightData, Smartproxy) to rotate IPs per request.
  2. Machine Learning Behavior: Train a model on real user mouse movements instead of just Bezier curves.
