Webscrapper Client API

An asynchronous Python client for the Webscrapper API service. The client retrieves web pages through proxies and checks whether URLs are blocked by the Russian internet regulator (RKN).
Features
- Fully asynchronous API built with aiohttp
- Support for both regular HTTP and Selenium-based web scraping
- Cookie management for both HTTP and Selenium requests
- Custom user agent and referer support
- Mobile and country-specific proxy support
- RKN checking functionality
- Context manager support for proper resource management
Installation

```shell
pip install webscrapper-client-api
```

Or install directly from the repository:

```shell
pip install git+https://github.com/yourusername/webscrapper-client-api.git
```
Usage
Basic Example
```python
import asyncio
from webscrapper_client_api import WebscrapperClientAPI

async def main():
    async with WebscrapperClientAPI("https://fetch.webnova.one", "your_api_key") as client:
        # Basic page retrieval
        result = await client.get_page(url="https://example.com")
        print(f"Status: {result['status_code']}")
        print(f"Content length: {len(result['html'])}")

        # RKN check
        rkn_result = await client.check_rkn(url="https://example.com")
        print(f"RKN check result: {rkn_result}")

if __name__ == "__main__":
    asyncio.run(main())
```
Using Cookies with Selenium
```python
async with WebscrapperClientAPI("https://fetch.webnova.one", "your_api_key") as client:
    # Define cookies for Selenium (a list of name/value dicts)
    cookies = [
        {"name": "session_id", "value": "abc123"},
        {"name": "user_preferences", "value": "dark_mode=1"},
    ]

    # Request with Selenium and cookies
    result = await client.get_page(
        url="https://example.com/login",
        use_selenium=True,
        cookies=cookies,
    )
```
Using Cookies with Regular HTTP
```python
async with WebscrapperClientAPI("https://fetch.webnova.one", "your_api_key") as client:
    # Define cookies for an HTTP request (a plain dict)
    cookies = {
        "session_id": "abc123",
        "user_preferences": "dark_mode=1",
    }

    # Request with HTTP and cookies
    result = await client.get_page(
        url="https://example.com/dashboard",
        cookies=cookies,
        user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
        referer="https://example.com/login",
    )
```
Manual Session Management
```python
async def example():
    # Create the client without a context manager
    client = WebscrapperClientAPI("https://fetch.webnova.one", "your_api_key")
    try:
        # Make requests
        result = await client.get_page(url="https://example.com")
    finally:
        # Always close the session when done
        await client.close()
```
API Methods
get_page
Retrieves a web page through a proxy.
Parameters:
- `url` (str): URL to retrieve
- `use_selenium` (bool, optional): Use Selenium for the request. Default: `False`
- `use_mobile` (bool, optional): Use a mobile proxy. Default: `False`
- `user_agent` (str, optional): Custom User-Agent header
- `referer` (str, optional): Custom Referer header (not used for Selenium)
- `method` (str, optional): Request method, `'get'` or `'head'`. Default: `'get'`
- `country` (int, optional): Proxy country ID
- `cookies` (dict or list, optional): Cookies to send with the request
Returns a dictionary with:
- `html`: HTML content of the page
- `status_code`: HTTP status code
- `url`: Final URL (may differ from the requested URL after redirects)
- `error`: Error message, if any
- `selenium`: Boolean indicating whether Selenium was used (only in Selenium responses)
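The proxy options can be combined in a single call. A minimal sketch, assuming the package is installed and using the endpoint and placeholder key from the examples above; the country ID `1` is a made-up placeholder, since real IDs come from the service:

```python
import asyncio

async def fetch_via_mobile_proxy():
    # Imported inside the function so the sketch reads standalone;
    # assumes: pip install webscrapper-client-api
    from webscrapper_client_api import WebscrapperClientAPI

    async with WebscrapperClientAPI("https://fetch.webnova.one", "your_api_key") as client:
        # HEAD request through a mobile proxy pinned to a country.
        # country=1 is a placeholder ID, not a documented value.
        result = await client.get_page(
            url="https://example.com",
            use_mobile=True,
            country=1,
            method="head",
        )
        return result["status_code"]

# asyncio.run(fetch_via_mobile_proxy())
```

A `head` request is useful here because it returns the status code without transferring the page body.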
check_rkn
Checks if a domain is blocked by RKN (Russian internet regulator).
Parameters:
- `url` (str): URL to check
Returns a dictionary with the RKN check results.
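Because the client is fully asynchronous, several RKN checks can share one session and run concurrently. A sketch, assuming the package is installed and reusing the placeholder endpoint and key from the examples above:

```python
import asyncio

async def check_rkn_batch(urls):
    # Imported inside the function so the sketch reads standalone;
    # assumes: pip install webscrapper-client-api
    from webscrapper_client_api import WebscrapperClientAPI

    async with WebscrapperClientAPI("https://fetch.webnova.one", "your_api_key") as client:
        # gather() fires all checks concurrently over the shared session
        # and returns one result dict per URL, in order.
        return await asyncio.gather(*(client.check_rkn(url=u) for u in urls))

# asyncio.run(check_rkn_batch(["https://example.com", "https://example.org"]))
```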
Exception Handling
The client defines a custom exception WebscrapperAPIError for handling API errors:
```python
from webscrapper_client_api import WebscrapperAPIError

try:
    result = await client.get_page(url="https://example.com")
except WebscrapperAPIError as e:
    print(f"API Error: {e.message}, Status code: {e.status_code}")
```
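Transient failures can be retried on top of this exception. A sketch, assuming the package is installed, that `WebscrapperAPIError` is importable from the package top level, and using the placeholder endpoint and key from the examples above:

```python
import asyncio

async def get_page_with_retry(url, attempts=3):
    # Imported inside the function so the sketch reads standalone;
    # assumes: pip install webscrapper-client-api
    from webscrapper_client_api import WebscrapperClientAPI, WebscrapperAPIError

    async with WebscrapperClientAPI("https://fetch.webnova.one", "your_api_key") as client:
        for attempt in range(attempts):
            try:
                return await client.get_page(url=url)
            except WebscrapperAPIError:
                if attempt == attempts - 1:
                    raise  # out of retries: propagate the last error
                await asyncio.sleep(2 ** attempt)  # back off: 1s, 2s, 4s, ...
```

Whether a given error is worth retrying depends on its `status_code`; a real implementation would inspect it before sleeping.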
License
This project is licensed under the WTFPL.
File details
Details for the file webscrapper_client_api-0.1.0.tar.gz.
File metadata
- Download URL: webscrapper_client_api-0.1.0.tar.gz
- Upload date:
- Size: 7.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 36c4c8924de512fe03bf9b28e09e8231355967f71558a247330c42b81d5a006c |
| MD5 | 15cee0d0565332107cacafcb3abd22cf |
| BLAKE2b-256 | 40ff131b58f13a3de14e7fcb57245be2705d6fbf2964f02bb7b7f5a857e53fd7 |
File details
Details for the file webscrapper_client_api-0.1.0-py3-none-any.whl.
File metadata
- Download URL: webscrapper_client_api-0.1.0-py3-none-any.whl
- Upload date:
- Size: 5.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 61706f6a37c7bba160154558b6fec76f59dc2693394b1dafa9cda49b0e1fa5b2 |
| MD5 | c0f9b201b8ac00b5bb44c6235227714f |
| BLAKE2b-256 | 9b8106bf27412f9e5ab96484f9391f57fc530bc965b911a07399e3d0e037da2b |