
A fast, async-first, undetectable web scraping and web automation framework


Chuscraper 🚀

The Undetectable Web Scraping Framework

chuscraper is an async-first web automation library designed to get past demanding anti-bot protections (Akamai, Cloudflare, Datadome, etc.). It is built on top of the Chrome DevTools Protocol (CDP) and ships with stealth techniques enabled out of the box.


🔥 Key Features

  • 🛡️ Undetectable Stealth Mode:
    • Automatically hides navigator.webdriver.
    • Mocks navigator.permissions, navigator.plugins, and navigator.mimeTypes.
    • Pro Features:
      • Canvas & WebGL Noise: Randomizes Canvas and WebGL fingerprints to make tracking harder.
      • Hardware Spoofing: Reports high-end PC specs (8 cores, 8 GB RAM).
      • Smart UA Rotation: Rotates modern desktop User-Agents per session.
  • 🔒 Built-in Proxy Auth:
    • Direct CDP-based proxy authentication (no extensions required).
    • Accepts proxies in http://user:pass@host:port format.
    • Answers proxy authentication prompts automatically, so no popup appears.
  • 🌍 Timezone & Geolocation:
    • Overrides the browser timezone to match your proxy location (e.g., Asia/Kolkata).
  • ⚡ Blazing Fast:
    • Sends only the CDP commands it needs, avoiding unnecessary overhead.
    • Lightweight and optimized for high-concurrency scraping.
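
The proxy and timezone features above rest on a handful of Chrome DevTools Protocol commands. The sketch below is not chuscraper's internal code. It is a minimal, standard-library illustration of how a scheme://user:pass@host:port URL can be split into a credential-free server address plus credentials, and of the CDP messages (Fetch.enable, Fetch.continueWithAuth, Emulation.setTimezoneOverride) that a CDP client could send. All helper names are hypothetical.

```python
from urllib.parse import urlsplit, unquote


def split_proxy(proxy_url):
    """Split scheme://user:pass@host:port into (server, username, password)."""
    parts = urlsplit(proxy_url)
    if not parts.scheme or not parts.hostname or parts.port is None:
        raise ValueError(f"expected scheme://user:pass@host:port, got {proxy_url!r}")
    # The server address carries no credentials; they are supplied separately.
    server = f"{parts.scheme}://{parts.hostname}:{parts.port}"
    # Credentials may be percent-encoded inside the URL, so decode them.
    username = unquote(parts.username) if parts.username else None
    password = unquote(parts.password) if parts.password else None
    return server, username, password


def enable_auth_interception():
    """CDP message asking the browser to emit Fetch.authRequired events.

    With the Fetch domain enabled, requests also pause with
    Fetch.requestPaused and have to be resumed via Fetch.continueRequest.
    """
    return {"method": "Fetch.enable", "params": {"handleAuthRequests": True}}


def answer_auth_challenge(request_id, username, password):
    """CDP message that answers a Fetch.authRequired event with credentials."""
    return {
        "method": "Fetch.continueWithAuth",
        "params": {
            "requestId": request_id,
            "authChallengeResponse": {
                "response": "ProvideCredentials",
                "username": username,
                "password": password,
            },
        },
    }


def timezone_override(timezone_id):
    """CDP message that overrides the timezone the page observes."""
    return {
        "method": "Emulation.setTimezoneOverride",
        "params": {"timezoneId": timezone_id},
    }
```

These helpers only build message payloads. A real client would add a numeric "id" to each message and send it over the browser's CDP WebSocket. When using chuscraper itself, passing proxy and timezone to chuscraper.start is all that is needed.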

📦 Installation

# Clone the repo
git clone https://github.com/ToufiqQureshi/chuscraper.git
cd chuscraper

# Install the package and its dependencies in editable mode
pip install -e .

🚀 Quick Start

1. Basic Usage (Stealth + Proxy)

import asyncio
import chuscraper

async def main():
    # Start browser with Stealth Mode and Proxy
    browser = await chuscraper.start(
        stealth=True,
        proxy="http://user:pass@proxy.example.com:8080",
        timezone="Asia/Kolkata"  # Match your proxy location
    )

    page = await browser.get("https://whoer.net")
    
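    # Print the page title as a basic check that the page loaded
    print(f"Title: {await page.title}")

    # Keep the browser open for 10 seconds to inspect the reported IP and fingerprint
    await asyncio.sleep(10)
    await browser.stop()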

if __name__ == "__main__":
    asyncio.run(main())
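
The quick start drives a single page. For the high-concurrency scraping mentioned in the feature list, a common asyncio pattern is to cap the number of page loads in flight with a semaphore. The sketch below shows only that pattern: gather_bounded and demo_fetch are hypothetical names, and the fetch coroutine is a stand-in. How it should map onto chuscraper browsers or tabs is an assumption left to the reader.

```python
import asyncio


async def gather_bounded(urls, fetch, limit=5):
    """Run fetch(url) for every URL with at most `limit` calls in flight.

    Results come back in the same order as `urls`.
    """
    semaphore = asyncio.Semaphore(limit)

    async def worker(url):
        # Wait for a free slot before starting the fetch.
        async with semaphore:
            return await fetch(url)

    return await asyncio.gather(*(worker(url) for url in urls))


async def demo_fetch(url):
    """Stand-in for a real page load; replace with your own coroutine."""
    await asyncio.sleep(0.1)
    return f"fetched {url}"


if __name__ == "__main__":
    pages = [f"https://example.com/page/{n}" for n in range(10)]
    for line in asyncio.run(gather_bounded(pages, demo_fetch, limit=3)):
        print(line)
```

Bounding concurrency keeps memory use and proxy load predictable. The right limit depends on the machine and on the target site's rate limits.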

🛠️ Configuration

| Argument | Type | Description |
|----------|------|-------------|
| stealth  | bool | Enable advanced anti-detection (Canvas noise, hardware mocks, UA rotation). |
| proxy    | str  | Proxy URL in scheme://user:pass@host:port format. |
| timezone | str  | Override the browser timezone (e.g., "Asia/Kolkata"). |
| headless | bool | Run in headless mode (default: False). |

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

MIT License
