A clean, reusable CAPTCHA-bypassing and scraping client
pylcaptcha
pylcaptcha is a high-performance, stealthy Python library designed to bypass modern anti-bot protections (such as Cloudflare Turnstile and Google reCAPTCHA v2/v3) while integrating with high-speed HTTP clients for scraping and API automation.
By combining browser-based kinematic solvers (Playwright/Camoufox) with asynchronous HTTP clients (curl-cffi), pylcaptcha allows you to solve interactive verification challenges natively and reuse those sessions for lightweight API requests.
🚀 Features
- Double-Stage Challenge Solver: Automatically detects and resolves Cloudflare interstitial gateways and embedded form CAPTCHA widgets.
- Human-like Kinematics: Uses physics-driven, organic mouse movements (Bézier curves via ShyMouse) to interact with verification challenges, defeating behavioral analysis.
- Session & State Synchronization: Automatically extracts cookies, user-agents, and dynamically generated CSRF tokens to keep lightweight HTTP sessions authenticated.
- Asynchronous Architecture: Built from the ground up on asyncio for maximum throughput.
- Universal & Extensible: Provides generic token extraction hooks that allow integration with any web framework (e.g., Laravel, Django).
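To illustrate what an asyncio-based design makes possible, here is a minimal sketch of bounded concurrency using only the standard library. It does not use pylcaptcha at all: `fetch_stub` is a hypothetical stand-in that simulates latency, and `fetch_all` and its `limit` parameter are names invented for this example.

```python
import asyncio
from typing import List


async def fetch_stub(url: str) -> str:
    """Hypothetical stand-in for a real request; it only simulates latency."""
    await asyncio.sleep(0.01)
    return f"fetched {url}"


async def fetch_all(urls: List[str], limit: int = 3) -> List[str]:
    """Run fetch_stub over every URL with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url: str) -> str:
        # The semaphore caps how many coroutines run the body at once.
        async with semaphore:
            return await fetch_stub(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))


if __name__ == "__main__":
    pages = [f"https://example.com/page/{i}" for i in range(5)]
    for line in asyncio.run(fetch_all(pages)):
        print(line)
```

Capping concurrency with a semaphore keeps the load on the target server predictable, which matters more than raw throughput in most automation work.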
📦 Installation
You can install pylcaptcha directly from PyPI via pip:

    pip install pylcaptcha

(Make sure your virtual environment and any system-level dependencies are set up first.)
Quick Start Guide
Two modules are exported:

    from pylcaptcha.captcha import Captcha
    from pylcaptcha.http import BrowserHTTP

The Captcha class is the raw CAPTCHA solver. To use it, pass it a Camoufox page and call the method for the challenge you need to solve.

The BrowserHTTP class is an abstraction over Captcha designed specifically for Cloudflare CAPTCHAs. It lets you pass the challenge once and then reuse the existing token.
    import asyncio

    from pylcaptcha.http import BrowserHTTP


    async def main():
        # Open the client with an asynchronous context manager so that the
        # browser and connection pools are torn down safely on exit.
        async with BrowserHTTP() as protocol:
            # Step 1: Open the landing page, solve the Turnstile challenge, and sync cookies.
            # This solves both the challenge shown on page load and any CAPTCHA
            # embedded in the page.
            print("Solving captcha on loading...")
            await protocol.get('https://my_cf_page', browser=True)

            # Step 2: Extract the CSRF token from the DOM and attach it
            # to outgoing request headers.
            # NOTE: the selector and header name differ between backends.
            await protocol.sync_csrf_token(
                selector='meta[name="csrf-token"]',
                header_name='x-csrf-token'
            )

            # Step 3: Send an authenticated request.
            # Attach the single-use token obtained from the browser step.
            result = await protocol.post('https://my_cf_page/api/v1/check', data={
                'query': 'human_only_query',
                'token': await protocol.get_cf_token()
            })

            print("Response Data:")
            print(result.text)


    if __name__ == '__main__':
        asyncio.run(main())
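Step 2 above maps a DOM selector to a request header. The sketch below shows that idea with only the standard library, so it can be tried without a browser. It is not pylcaptcha's implementation: `CsrfTokenParser` and `csrf_header` are hypothetical names, and the defaults (`csrf-token`, `_token`, `x-csrf-token`) are illustrative assumptions that will differ between backends.

```python
from html.parser import HTMLParser
from typing import Dict, Optional


class CsrfTokenParser(HTMLParser):
    """Collect a CSRF token from a meta tag or a hidden form input."""

    def __init__(self, meta_name: str = "csrf-token", input_name: str = "_token"):
        super().__init__()
        self.meta_name = meta_name      # assumed meta tag name
        self.input_name = input_name    # assumed hidden input name
        self.token: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.token is not None:
            return  # keep the first match only
        attr = dict(attrs)
        if tag == "meta" and attr.get("name") == self.meta_name:
            self.token = attr.get("content")
        elif (
            tag == "input"
            and attr.get("type") == "hidden"
            and attr.get("name") == self.input_name
        ):
            self.token = attr.get("value")


def csrf_header(html: str, header_name: str = "x-csrf-token") -> Dict[str, str]:
    """Return a one-entry header dict, or an empty dict if no token is found."""
    parser = CsrfTokenParser()
    parser.feed(html)
    return {header_name: parser.token} if parser.token else {}


if __name__ == "__main__":
    page = '<head><meta name="csrf-token" content="abc123"></head>'
    print(csrf_header(page))
```

Returning an empty dict when nothing matches lets the caller merge the result into existing headers without a special case.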
How It Works
- Passive vs Active Challenges: The library distinguishes Stage 1 (gateway interstitial blocks) from Stage 2 (embedded in-page widgets, such as the IMEI form check) and solves them in sequence when both are present.
- Token Lifecycles:
  - Cloudflare clearance (cf_clearance) cookies persist across requests and can be reused to keep a session alive.
  - CAPTCHA tokens (such as Turnstile challenge signatures) are strictly single-use and are generated just in time through browser DOM interaction.
- CSRF Extraction: Generic evaluators locate CSRF meta tags or hidden inputs and map them to headers or cookies, as the target framework requires.
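The two token lifecycles described above can be modeled as a small data structure. This is only a sketch of the distinction, not how pylcaptcha stores tokens internally; `ReusableCookie`, `SingleUseToken`, and `TokenAlreadyUsed` are hypothetical names introduced for illustration.

```python
import time
from dataclasses import dataclass
from typing import Optional


class TokenAlreadyUsed(RuntimeError):
    """Raised when a single-use token is consumed a second time."""


@dataclass
class ReusableCookie:
    """A value that may be read repeatedly until it expires."""
    value: str
    expires_at: float  # Unix timestamp, in seconds

    def is_valid(self, now: Optional[float] = None) -> bool:
        current = time.time() if now is None else now
        return current < self.expires_at


@dataclass
class SingleUseToken:
    """A value that may be handed out exactly once."""
    value: str
    used: bool = False

    def consume(self) -> str:
        if self.used:
            raise TokenAlreadyUsed("token already submitted; request a fresh one")
        self.used = True
        return self.value


if __name__ == "__main__":
    cookie = ReusableCookie("clearance-value", expires_at=time.time() + 60)
    print(cookie.is_valid(), cookie.is_valid())  # can be checked many times

    token = SingleUseToken("challenge-token")
    print(token.consume())
    try:
        token.consume()
    except TokenAlreadyUsed as exc:
        print("second use rejected:", exc)
```

Making the second `consume()` raise, rather than silently returning the stale value, surfaces the mistake at the call site instead of as a rejected request later.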
Contributing
Contributions are welcome! If you find bugs, encounter new anti-bot defense mechanisms, or want to add framework integrations, please open an issue or submit a pull request. Note that the gcaptcha solver is not yet production-stable, and I would welcome help with its AI development and solving improvements.
File details
Details for the file pylcaptcha-1.0.0.tar.gz.
File metadata
- Download URL: pylcaptcha-1.0.0.tar.gz
- Upload date:
- Size: 69.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 832606b9b8127a65e388c34b8021e2612d6982c283c93cf5657f2fc548f7fd50 |
| MD5 | 315d8e5dffc4b24f3b3812943fd939fe |
| BLAKE2b-256 | 6e9ef6fb86877c4a756ece5f09538d3b27b77a0875145d5cda307e2d939a46b8 |
File details
Details for the file pylcaptcha-1.0.0-py3-none-any.whl.
File metadata
- Download URL: pylcaptcha-1.0.0-py3-none-any.whl
- Upload date:
- Size: 69.6 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f949fa71037014484f8b353c9244964886feaf07709afdafa2ab1c4cdbcf0790 |
| MD5 | 76688337b909c850fe7d2f69c1699fca |
| BLAKE2b-256 | 96e14216dc987ce6e04bf3568ed5a7e31fa6dcb2b5f0383ba025806e661d7e3e |