Official Python client for Scrapely.io API
Scrapely Python Client
The official Python client for Scrapely.io, a powerful API for web scraping and CAPTCHA solving.
Features
- Web Crawling: Scrape websites with advanced options like screenshotting, resource blocking, and custom instructions (click, scroll, type, wait).
- CAPTCHA Solving: Solve Google reCAPTCHA (V2/V3) and Cloudflare Turnstile/Challenge.
- Async Support: Fully asynchronous client (AsyncScrapely) for high-concurrency applications.
- Typed Responses: All API responses are returned as typed Python objects for better IDE support and type safety.
Installation
pip install scrapely-python-client
Usage
Synchronous Client
from scrapely import Scrapely

client = Scrapely(api_key="YOUR_API_KEY")

# 1. Crawl a website
response = client.crawler.crawl(
    website_url="https://example.com",
    return_page_text=True,
)
print(response.result.text)

# 2. Solve reCAPTCHA V3
captcha = client.google.RecaptchaV3(
    website_url="https://example.com",
    website_key="6LdKlZEpAAAAAAOQjzC2v_d36tWxCl6dWsozdSy9",
)
print(captcha.result.solution)
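The examples above hard-code the API key as a placeholder. A minimal sketch of reading it from the environment instead follows; the helper `load_api_key` and the variable name `SCRAPELY_API_KEY` are assumptions made for illustration and are not part of the client.

```python
import os


def load_api_key(env_var: str = "SCRAPELY_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    Both this helper and the default variable name are hypothetical;
    the client itself only needs the key passed as ``api_key``.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Usage (requires the package and a real key):
# from scrapely import Scrapely
# client = Scrapely(api_key=load_api_key())
```

Keeping the key out of source files makes it harder to commit it to version control by accident.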
Asynchronous Client
import asyncio

from scrapely import AsyncScrapely


async def main():
    client = AsyncScrapely(api_key="YOUR_API_KEY")

    # 1. Crawl a website asynchronously
    response = await client.crawler.crawl(
        website_url="https://example.com",
        screenshot=True,
    )
    print(f"Screenshot URL: {response.result.screenshot}")

    # 2. Solve Cloudflare Turnstile asynchronously
    captcha = await client.cloudflare.Turnstile(
        website_url="https://example.com",
        website_key="0x4AAAAAA...",
    )
    print(f"Token: {captcha.result.solution}")


if __name__ == "__main__":
    asyncio.run(main())
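For high-concurrency use, it can help to cap how many requests are in flight at once. The sketch below shows one way to do that with `asyncio.Semaphore` and `asyncio.gather`. The names `crawl_many`, `crawl_one`, and `fake_crawl` are hypothetical, and `fake_crawl` is a stand-in so the example runs without network access or an API key.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


async def crawl_many(
    urls: List[str],
    crawl_one: Callable[[str], Awaitable[T]],
    max_concurrency: int = 5,
) -> List[T]:
    """Run crawl_one for every URL, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(url: str) -> T:
        # Only max_concurrency coroutines may hold the semaphore at once.
        async with semaphore:
            return await crawl_one(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(url) for url in urls))


async def fake_crawl(url: str) -> str:
    """Stand-in for a real request so the sketch runs offline."""
    await asyncio.sleep(0.01)
    return f"fetched {url}"


if __name__ == "__main__":
    demo_urls = ["https://example.com/a", "https://example.com/b"]
    print(asyncio.run(crawl_many(demo_urls, fake_crawl)))
```

With the real client, `crawl_one` could presumably be something like `lambda url: client.crawler.crawl(website_url=url, return_page_text=True)`, reusing the call shown in the example above.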
Advanced Usage
Automation Instructions
You can pass a list of instructions to interact with the page before scraping.
from scrapely import Scrapely
from scrapely.models.types import SendKeys, Click, Wait

client = Scrapely(api_key="YOUR_API_KEY")

instructions = [
    SendKeys(selector="#search", text="scraping"),
    Click(selector="#submit-button"),
    Wait(timeout=2),  # Wait 2 seconds
]

response = client.crawler.crawl(
    website_url="https://example.com",
    instructions=instructions,
    return_page_text=True,
)
Response Objects
Responses are typed dataclasses:
- CrawlerResponse: Contains result (CrawlerResult) with fields like html, text, cookies, screenshot.
- CaptchaResponse: Contains result (CaptchaResult) with the solution string.
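To illustrate the response shape described above, here is a sketch built on stand-in dataclasses. These are not the library's definitions: the field types, the defaults, and the assumption that unrequested fields come back empty are guesses, and `page_text` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class CrawlerResult:
    # Stand-in for illustration; the real field types may differ.
    html: Optional[str] = None
    text: Optional[str] = None
    cookies: Dict[str, str] = field(default_factory=dict)
    screenshot: Optional[str] = None


@dataclass
class CrawlerResponse:
    # Stand-in wrapper mirroring the `result` attribute described above.
    result: CrawlerResult


def page_text(response: CrawlerResponse, default: str = "") -> str:
    """Return the page text, or a default when no text is present."""
    return response.result.text or default
```

Reading fields through a small helper like this keeps the calling code from failing when a field such as `text` was not requested in the crawl.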
License
MIT