A blazing fast, async-first, undetectable webscraping/web automation framework
🕷️ Chuscraper
LLM + CDP powered stealth-focused web scraping & automation framework
You Only Scrape Once — data extraction made smarter, faster, and more resilient.
🚀 What is Chuscraper?
Chuscraper is a Python web scraping & automation library that uses CDP (Chrome DevTools Protocol) and LLMs to extract structured data, interact with pages, and automate workflows — with a heavy focus on Anti-Detection and Stealth.
With AI-powered extraction, you tell it what to extract — it figures out how.
🌟 Features
🕵️♂️ Stealth & Anti-Detection
- Hides navigator.webdriver, user agent rotation
- Canvas/WebGL noise + hardware spoofing
- Timezone & geolocation spoofing
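For context, timezone and geolocation overrides are standard CDP commands in the Emulation domain. The sketch below only builds the JSON messages such commands carry over the DevTools WebSocket. How Chuscraper issues them internally is not shown in this README, and the helper name cdp_message and the sample coordinates are hypothetical.

```python
import itertools
import json

# CDP commands are JSON objects with a unique integer id, a method name, and params.
_ids = itertools.count(1)

def cdp_message(method: str, **params) -> str:
    """Serialize one CDP command (hypothetical helper, not part of Chuscraper)."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})

# Both methods belong to the CDP Emulation domain.
tz = cdp_message("Emulation.setTimezoneOverride", timezoneId="Asia/Kolkata")
geo = cdp_message(
    "Emulation.setGeolocationOverride",
    latitude=15.2993,
    longitude=74.1240,
    accuracy=50,
)
print(tz)
print(geo)
```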
🤖 AI-Driven Data Extraction
- Semantic extraction using LLMs
- Converts HTML into structured JSON/Pydantic
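As an illustration of the step from raw model output to structured data, the sketch below validates a JSON string (standing in for an LLM response) into typed records. It uses stdlib dataclasses rather than Pydantic so it runs without extra dependencies. The Hotel schema, the sample payload, and parse_hotels are all hypothetical and not part of Chuscraper's API.

```python
import json
from dataclasses import dataclass

@dataclass
class Hotel:
    name: str
    price: float

def parse_hotels(raw: str) -> list[Hotel]:
    """Parse a JSON array of objects into Hotel records.

    Raises KeyError on a missing field and ValueError on a non-numeric price.
    """
    hotels = []
    for item in json.loads(raw):
        hotels.append(Hotel(name=str(item["name"]), price=float(item["price"])))
    return hotels

# Sample payload: prices may arrive as strings or numbers.
sample = '[{"name": "Sea View", "price": "4200"}, {"name": "Palm Stay", "price": 3150.5}]'
hotels = parse_hotels(sample)
print(hotels)
```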
🧠 Autonomous Navigation
- Intelligent pilot (ai_pilot) that clicks/types until the goal is achieved
⚡ Async + Fast
Built on async CDP, low overhead, no heavy browser bundles.
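The practical benefit of an async-first design is that many page loads can be in flight at once. The sketch below shows the general asyncio pattern with a cap on concurrent tasks. Here fetch_title is a stand-in coroutine that sleeps instead of driving a browser, so nothing in it is Chuscraper's API.

```python
import asyncio

async def fetch_title(url: str) -> str:
    """Stand-in for a real page load; sleeps briefly instead of using a browser."""
    await asyncio.sleep(0.01)
    return f"title of {url}"

async def scrape_all(urls: list[str], limit: int = 3) -> list[str]:
    sem = asyncio.Semaphore(limit)  # cap how many loads run at the same time

    async def bounded(url: str) -> str:
        async with sem:
            return await fetch_title(url)

    # gather() returns results in the same order as the inputs
    return await asyncio.gather(*(bounded(u) for u in urls))

urls = [f"https://example.com/page/{i}" for i in range(5)]
titles = asyncio.run(scrape_all(urls))
print(titles)
```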
🔄 Flexible Outputs
Supports JSON, CSV, Markdown, Excel, Pydantic, and more.
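To make the output formats concrete, here is a minimal stdlib sketch that renders the same list of records as JSON, CSV, and a Markdown table. The function names and sample rows are hypothetical; Chuscraper's own export helpers are not documented in this README and may look different.

```python
import csv
import io
import json

def to_json(rows: list[dict]) -> str:
    return json.dumps(rows, indent=2)

def to_csv(rows: list[dict]) -> str:
    buf = io.StringIO()
    # Column order follows the keys of the first record.
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def to_markdown(rows: list[dict]) -> str:
    headers = list(rows[0])
    lines = ["| " + " | ".join(headers) + " |", "|" + "---|" * len(headers)]
    for row in rows:
        lines.append("| " + " | ".join(str(row[h]) for h in headers) + " |")
    return "\n".join(lines)

rows = [{"name": "Sea View", "price": 4200}, {"name": "Palm Stay", "price": 3150}]
print(to_json(rows))
print(to_csv(rows))
print(to_markdown(rows))
```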
🌐 Integrations
- LLM Providers: OpenAI, Gemini, Anthropic, Ollama
- Frameworks: LangChain, LlamaIndex, Agno, Crew.ai
📦 Installation
pip install chuscraper
# For AI Capabilities
pip install "chuscraper[ai]"
[!TIP] Use within a virtual environment to avoid conflicts.
💻 Quick Start (The "Easy" Way)
Chuscraper is designed for Zero Boilerplate. You don't need complex configuration objects just to start a stealthy session.
import asyncio
import chuscraper as zd

async def main():
    # DIRECT START: pass stealth, proxy, or headless options directly to start()
    async with await zd.start(headless=False, stealth=True) as browser:
        # 🟢 BROWSER-LEVEL SHORTCUT
        await browser.goto("https://www.makemytrip.com/")

        # 🟢 INTUITIVE ALIASES (goto, title, select_text)
        page = browser.main_tab
        await page.goto("https://example.com")
        title = await page.title()
        header = await page.select_text("h1")
        print(f"Title: {title}")
        print(f"Header: {header}")

        # 🤖 AI-POWERED PILOT
        print("AI is navigating...")
        await page.ai_pilot("Search hotels in Goa for next weekend")

        # EXTRACT structured data
        result = await page.ai_extract("Get the first 3 hotels with prices")
        print(result)

if __name__ == "__main__":
    asyncio.run(main())
[!NOTE] chuscraper automatically handles Chrome process cleanup and the local proxy lifecycle.
🤖 AI Usage with Providers
Chuscraper supports multiple providers out-of-the-box.
1. Gemini (Native)
from chuscraper.ai.providers import GeminiProvider
provider = GeminiProvider(api_key="YOUR_GEMINI_API_KEY")
# Run inside an async function, with page obtained as in the Quick Start
await page.ai_extract("Extract data", provider=provider)
2. OpenAI
from chuscraper.ai.providers import OpenAIProvider
provider = OpenAIProvider(api_key="YOUR_OPENAI_API_KEY")
# Run inside an async function, with page obtained as in the Quick Start
await page.ai_extract("Extract data", provider=provider)
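To avoid hardcoding API keys as in the snippets above, one option is to read them from the environment. A minimal sketch follows. The environment variable names and the pick_provider helper are assumptions for illustration, and the helper returns a plain (name, key) pair rather than constructing a provider object.

```python
import os

# Assumed variable names; use whatever your deployment defines.
ENV_KEYS = {"gemini": "GEMINI_API_KEY", "openai": "OPENAI_API_KEY"}

def pick_provider(env=None) -> tuple[str, str]:
    """Return (provider_name, api_key) for the first provider with a key set."""
    env = os.environ if env is None else env
    for name, var in ENV_KEYS.items():
        key = env.get(var)
        if key:
            return name, key
    raise RuntimeError("No API key found; set one of: " + ", ".join(ENV_KEYS.values()))

# Demo with an explicit mapping so the example does not depend on your shell.
name, key = pick_provider({"OPENAI_API_KEY": "sk-example"})
print(name)
```

The returned key can then be passed as api_key= to GeminiProvider or OpenAIProvider.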
🛡️ Stealth & Anti-Detection Proof
We don't just claim to be stealthy; we prove it. Below are the results from top anti-bot detection suites, all passed with 100% "Human" status.
👉 View Full Visual Proofs & Screenshots Here
| Detection Suite | Result | Status |
|---|---|---|
| SannySoft | No WebDriver detected | ✅ Pass |
| BrowserScan | 100% Trust Score | ✅ Pass |
| PixelScan | Consistent Fingerprint | ✅ Pass |
| IPHey | Software Clean (Green) | ✅ Pass |
| CreepJS | 0% Stealth / 0% Headless | ✅ Pass |
| Fingerprint.com | No Bot Detected | ✅ Pass |
🌍 Real-World Protection Bypass
We tested chuscraper against live websites protected by major security providers:
| Provider | Target | Result |
|---|---|---|
| Cloudflare | Turnstile Demo | ✅ Solved Automatically |
| DataDome | Antoine Vastel Research | ✅ Accessed |
| Akamai | Nike Product Page | ✅ Bypassed |
📖 Documentation
Full technical guides are available in the docs/ folder.
Translations (Chinese, Japanese, etc.) are coming soon.
💖 Support & Sponsorship
chuscraper is an open-source project maintained by Toufiq Qureshi. If the library has helped you or your business, please consider supporting its development:
- GitHub Sponsors: Sponsor me on GitHub
- Corporate Sponsorship: If you are a Proxy Provider or Data Company, we offer featured placement in our documentation. Contact us for partnership opportunities.
- Custom Scraping Solutions: Need a private, high-performance scraper? We offer professional consulting.
🛠️ Contributing
Want to contribute? Open an issue or send a pull request — all levels welcome! Please follow the CONTRIBUTING.md guidelines.
📜 License
Chuscraper is licensed under the AGPL-3.0 License. In practice, software that incorporates Chuscraper and is distributed, or offered to users over a network, must make its source code available under the same license, which protects the community and your freedom.
Made with ❤️ by Toufiq Qureshi
File details
Details for the file chuscraper-0.19.2.tar.gz.
File metadata
- Download URL: chuscraper-0.19.2.tar.gz
- Upload date:
- Size: 1.8 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6adab30ed99c47e858a57de5598f9793d27f0c8e755d232ea34fef9e6576b351 |
| MD5 | 2889b65f473ecde93051d276c3503c59 |
| BLAKE2b-256 | 0470a596702f5f729d2c2bd1a5f6b84fb0065ef60c045b1167cd36ba390d1871 |
File details
Details for the file chuscraper-0.19.2-py3-none-any.whl.
File metadata
- Download URL: chuscraper-0.19.2-py3-none-any.whl
- Upload date:
- Size: 260.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 16926c4bdf295fcf18c2ef8c9923a6e0ac776315dace034b3b1eb7f86a33b64a |
| MD5 | 17a3cc1e16254103d41df0f891774c62 |
| BLAKE2b-256 | 06a7878870115bc4e2d7569f51f4f11703a3805855088ba2c540ecf0206f2577 |
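The published digests can be used to check a downloaded file before installing it. The sketch below computes a SHA256 digest in chunks and compares it to an expected value. The helper names are hypothetical, and the demo runs against a small temporary file rather than the real archive.

```python
import hashlib
import hmac
import os
import tempfile

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches(path: str, expected_hex: str) -> bool:
    """Compare the file's digest with the published one (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())

# Demo on a temporary file with known contents.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"demo")
ok = matches(tmp.name, hashlib.sha256(b"demo").hexdigest())
bad = matches(tmp.name, "0" * 64)
os.remove(tmp.name)
print(ok, bad)
```

For the real archive, pass the path of the downloaded chuscraper-0.19.2.tar.gz or chuscraper-0.19.2-py3-none-any.whl together with its SHA256 digest from the corresponding table.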