A blazing fast, async-first, undetectable web scraping and web automation framework
🕷️ Chuscraper
LLM + CDP powered stealth-focused web scraping & automation framework
You Only Scrape Once — data extraction made smarter, faster, and more resilient.
🚀 What is Chuscraper?
Chuscraper is a Python web scraping & automation library that uses CDP (Chrome DevTools Protocol) and LLMs to extract structured data, interact with pages, and automate workflows — with a heavy focus on Anti-Detection and Stealth.
With AI-powered extraction, you tell it what to extract — it figures out how.
🌟 Features
🕵️ Stealth & Anti-Detection
- Hides navigator.webdriver, user agent rotation
- Canvas/WebGL noise + hardware spoofing
- Timezone & geolocation spoofing
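The source does not show how Chuscraper applies these measures internally. As a rough illustration of what they usually map to at the protocol level, the sketch below builds the raw CDP commands for hiding navigator.webdriver and overriding timezone and geolocation. The CDP method names are real; the helper names (`cdp_message`, `stealth_messages`) and the sample values are assumptions made for this example only.

```python
import json
from itertools import count

# Script injected before any page script runs, so that
# navigator.webdriver reads as undefined.
HIDE_WEBDRIVER_JS = (
    "Object.defineProperty(navigator, 'webdriver', "
    "{ get: () => undefined });"
)

_ids = count(1)  # CDP commands carry a caller-chosen, increasing id


def cdp_message(method: str, params: dict) -> str:
    """Serialize one CDP command as the JSON text sent over the DevTools WebSocket."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


def stealth_messages(timezone_id: str, latitude: float, longitude: float) -> list[str]:
    """Hypothetical helper: the three commands a stealth setup might send."""
    return [
        cdp_message("Page.addScriptToEvaluateOnNewDocument",
                    {"source": HIDE_WEBDRIVER_JS}),
        cdp_message("Emulation.setTimezoneOverride",
                    {"timezoneId": timezone_id}),
        cdp_message("Emulation.setGeolocationOverride",
                    {"latitude": latitude, "longitude": longitude, "accuracy": 100}),
    ]


if __name__ == "__main__":
    # Sample values only; any IANA timezone and coordinates would do.
    for message in stealth_messages("Asia/Kolkata", 15.2993, 74.1240):
        print(message)
```

Canvas/WebGL noise and hardware spoofing are typically done with further injected scripts and are left out here.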
🤖 AI-Driven Data Extraction
- Semantic extraction using LLMs
- Converts HTML into structured JSON/Pydantic
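The exact shape of the extracted data is not specified here. The Quick Start passes the result to json.dumps, so it is at least JSON-serializable. A minimal sketch of the "structured" step, assuming the result is a list of dicts with `name` and `price` keys (both hypothetical), and using a stdlib dataclass as a stand-in for a Pydantic model:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hotel:
    # Hypothetical schema; field names are assumptions for illustration.
    name: str
    price: float


def parse_hotels(raw: list) -> list[Hotel]:
    """Validate loosely typed records and coerce them into Hotel objects."""
    hotels = []
    for index, item in enumerate(raw):
        try:
            name = str(item["name"]).strip()
            price = float(item["price"])
        except (KeyError, TypeError, ValueError) as exc:
            raise ValueError(f"record {index} is malformed: {item!r}") from exc
        if not name:
            raise ValueError(f"record {index} has an empty name")
        hotels.append(Hotel(name=name, price=price))
    return hotels


if __name__ == "__main__":
    sample = [{"name": "Example Inn", "price": "4200"}]  # made-up data
    print(parse_hotels(sample))
```

Validating at this boundary catches a malformed LLM response early instead of letting it leak into downstream code.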
🧠 Autonomous Navigation
- Intelligent pilot (ai_pilot) that clicks/types until the goal is achieved
⚡ Async + Fast
Built on async CDP, low overhead, no heavy browser bundles.
🔄 Flexible Outputs
Supports JSON, CSV, Markdown, Excel, Pydantic, and more.
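Chuscraper's own export helpers are not shown in this README. As a stdlib-only sketch of what moving one result set between JSON, CSV, and Markdown looks like, the functions below (all hypothetical names) convert a list of flat dicts. They assume every row shares the keys of the first row.

```python
import csv
import io
import json


def to_json(rows: list[dict]) -> str:
    """Render rows as pretty-printed JSON."""
    return json.dumps(rows, indent=2)


def to_csv(rows: list[dict]) -> str:
    """Render rows as CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_markdown(rows: list[dict]) -> str:
    """Render rows as a Markdown pipe table."""
    if not rows:
        return ""
    headers = list(rows[0])
    lines = [
        "| " + " | ".join(headers) + " |",
        "|" + "|".join(["---"] * len(headers)) + "|",
    ]
    for row in rows:
        cells = (str(row.get(header, "")) for header in headers)
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    rows = [{"name": "Example Inn", "price": 4200}]  # made-up data
    print(to_csv(rows))
    print(to_markdown(rows))
```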
🌐 Integrations
- LLM Providers: OpenAI, Gemini, Anthropic, Ollama
- Frameworks: LangChain, LlamaIndex, Agno, Crew.ai
📦 Installation
pip install chuscraper
# For AI Capabilities
pip install "chuscraper[ai]"
Tip: Install inside a virtual environment to avoid dependency conflicts.
💻 Quick Start (Async)
import asyncio
import json

from chuscraper import start


async def main():
    browser = await start(headless=False)
    page = await browser.get("https://www.makemytrip.com/")

    # Tell the AI pilot what goal to reach on the page
    print("AI is navigating...")
    await page.ai_pilot("Search hotels in Goa for next weekend")

    # Extract structured data
    result = await page.ai_extract("Get the first 3 hotels with prices")
    print(json.dumps(result, indent=2))

    await browser.stop()


if __name__ == "__main__":
    asyncio.run(main())
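In the Quick Start, an exception during navigation or extraction would skip browser.stop() and could leave a browser process running. One way to guard against that is a try/finally wrapper. The sketch below uses only the calls shown above (start, browser.get, page.ai_extract, browser.stop) and takes `start` as an argument so it can be exercised without a real browser. The names `scrape_once` and `run_scrape` are hypothetical.

```python
import asyncio


async def scrape_once(start, url: str, prompt: str, headless: bool = False):
    """Open a page, run one extraction, and always stop the browser."""
    browser = await start(headless=headless)
    try:
        page = await browser.get(url)
        return await page.ai_extract(prompt)
    finally:
        # Runs on success and on failure, so the browser is never leaked.
        await browser.stop()


def run_scrape(start, url: str, prompt: str, headless: bool = False):
    """Blocking convenience wrapper for scripts that are not already async."""
    return asyncio.run(scrape_once(start, url, prompt, headless=headless))
```

With Chuscraper installed, this would presumably be called as `run_scrape(start, "https://www.makemytrip.com/", "Get the first 3 hotels with prices")` after `from chuscraper import start`.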
🤖 AI Usage with Providers
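Chuscraper supports multiple providers out of the box. The snippets below run inside an async function and assume page comes from browser.get(...), as in the Quick Start.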
1. Gemini (Native)
from chuscraper.ai.providers import GeminiProvider
provider = GeminiProvider(api_key="YOUR_GEMINI_API_KEY")
await page.ai_extract("Extract data", provider=provider)
2. OpenAI
from chuscraper.ai.providers import OpenAIProvider
provider = OpenAIProvider(api_key="YOUR_OPENAI_API_KEY")
await page.ai_extract("Extract data", provider=provider)
3. Local LLMs (via Ollama)
from chuscraper.ai.providers import OllamaProvider
# Uses Ollama's OpenAI-compatible API (default: localhost:11434)
provider = OllamaProvider(model_name="llama3")
await page.ai_extract("Extract data", provider=provider)
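Because every provider is passed the same way, through the provider keyword of ai_extract, it is straightforward to try several in order, for example a hosted model first and a local Ollama model as backup. This is a sketch, not a built-in feature: `extract_with_fallback` is a hypothetical helper, and it assumes provider failures surface as ordinary exceptions.

```python
async def extract_with_fallback(page, prompt: str, providers: list):
    """Try each provider in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return await page.ai_extract(prompt, provider=provider)
        except Exception as exc:  # assumption: failures are raised, not returned
            errors.append(f"{type(provider).__name__}: {exc}")
    # Every provider failed (or the list was empty); report what went wrong.
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Catching the broad Exception keeps the sketch short; in real use it would be better to catch only the error types the providers are known to raise.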
📖 Documentation
Full technical guides are available in the docs/ folder.
Translations (Chinese, Japanese, etc.) coming soon.
🛠️ Contributing
Want to contribute? Open an issue or send a pull request — all levels welcome! Please follow the CONTRIBUTING.md guidelines.
📜 License
Chuscraper is licensed under the MIT License.
Made with ❤️ by [Toufiq Qureshi]
File details
Details for the file chuscraper-0.16.3.tar.gz.
File metadata
- Download URL: chuscraper-0.16.3.tar.gz
- Upload date:
- Size: 444.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6c231b436bfaa22fdf02e102c93c53a1ff2d73a6d81b93d2090470590278c432 |
| MD5 | bc7e72158912bf86598773983587330e |
| BLAKE2b-256 | 18aeaaa207e446f05f399fb88ec8378bb8c36a4ff56642bb682a350d9bb5ec3c |
File details
Details for the file chuscraper-0.16.3-py3-none-any.whl.
File metadata
- Download URL: chuscraper-0.16.3-py3-none-any.whl
- Upload date:
- Size: 371.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f7ef6d29121025878fa39b19dd6b9c040bf24ed1f83769a103ef2c46d27ad0a5 |
| MD5 | da6d8e55c421a660e4dc44795624d9d5 |
| BLAKE2b-256 | a9b6e6f973bd653332ccdef29626302783831b5e11bfe062a6e778c113f0e7e1 |