The Official Python SDK for Thordata - AI Data Infrastructure & Proxy Network.
Thordata Python SDK
The Official Python Client for Thordata APIs
Proxy Network • SERP API • Web Unlocker • Web Scraper API
📖 Introduction
This SDK provides a robust, high-performance interface to Thordata's AI data infrastructure. It is designed for high-concurrency scraping, reliable proxy tunneling, and seamless data extraction.
Key Features:
- 🚀 Production Ready: Built on `urllib3` connection pooling for low-latency proxy requests.
- ⚡ Async Support: Native `aiohttp` client for high-concurrency SERP/Universal scraping.
- 🛡️ Robust: Handles TLS-in-TLS tunneling, retries, and error parsing automatically.
- ✨ Developer Experience: Fully typed (`mypy` compatible) with intuitive IDE autocomplete.
- 🧩 Lazy Validation: Only validate credentials for the features you actually use.
📦 Installation
pip install thordata-sdk
🔐 Configuration
Set environment variables to avoid hardcoding credentials. You only need to set the variables for the features you use.
# [Required for SERP & Web Unlocker]
export THORDATA_SCRAPER_TOKEN="your_token_here"
# [Required for Proxy Network]
export THORDATA_RESIDENTIAL_USERNAME="your_username"
export THORDATA_RESIDENTIAL_PASSWORD="your_password"
export THORDATA_PROXY_HOST="vpnXXXX.pr.thordata.net"
# [Required for Task Management]
export THORDATA_PUBLIC_TOKEN="public_token"
export THORDATA_PUBLIC_KEY="public_key"
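Because only the variables for the features in use are required, it can be handy to check up front which feature groups are fully configured. The following is a minimal sketch using only the standard library. The variable names come from the configuration above, but the grouping dictionary and the `missing_variables` helper are illustrative assumptions and not part of the SDK, which performs its own validation.

```python
import os
from typing import Mapping, Optional

# Variable names are taken from the configuration section; the grouping
# into features and this helper are hypothetical, not part of the SDK.
FEATURE_ENV_VARS = {
    "serp_and_unlocker": ["THORDATA_SCRAPER_TOKEN"],
    "proxy_network": [
        "THORDATA_RESIDENTIAL_USERNAME",
        "THORDATA_RESIDENTIAL_PASSWORD",
        "THORDATA_PROXY_HOST",
    ],
    "task_management": ["THORDATA_PUBLIC_TOKEN", "THORDATA_PUBLIC_KEY"],
}


def missing_variables(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return, per feature, the variables that are unset or empty."""
    env = os.environ if env is None else env
    return {
        feature: [name for name in names if not env.get(name)]
        for feature, names in FEATURE_ENV_VARS.items()
    }


if __name__ == "__main__":
    for feature, missing in missing_variables().items():
        status = "ready" if not missing else "missing " + ", ".join(missing)
        print(f"{feature}: {status}")
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise in tests without touching the real environment.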
🚀 Quick Start
1. SERP Search (Google/Bing/Yandex)
from thordata import ThordataClient, Engine
client = ThordataClient() # Loads THORDATA_SCRAPER_TOKEN from env
# Simple Search
print("Searching...")
results = client.serp_search("latest AI trends", engine=Engine.GOOGLE_NEWS)
for news in results.get("news_results", [])[:3]:
    print(f"- {news['title']} ({news['source']})")
2. Universal Scrape (Web Unlocker)
Bypass Cloudflare/Akamai and render JavaScript automatically.
html = client.universal_scrape(
    url="https://example.com/protected-page",
    js_render=True,
    wait_for=".content-loaded",
    country="us"
)
print(f"Scraped {len(html)} bytes")
3. High-Performance Proxy
Use Thordata's residential IPs with automatic connection pooling.
from thordata import ProxyConfig, ProxyProduct
# Config is optional if env vars are set, but allows granular control
proxy = ProxyConfig(
    product=ProxyProduct.RESIDENTIAL,
    country="jp",
    city="tokyo",
    session_id="session-001",
    session_duration=10  # Sticky IP for 10 mins
)
# Use the client to make requests (Reuses TCP connections)
response = client.get("https://httpbin.org/ip", proxy_config=proxy)
print(response.json())
⚙️ Advanced Usage
Async Client (High Concurrency)
For building AI agents or high-throughput spiders.
import asyncio
from thordata import AsyncThordataClient
async def main():
    async with AsyncThordataClient() as client:
        # Fire off multiple requests in parallel
        tasks = [
            client.serp_search(f"query {i}")
            for i in range(5)
        ]
        results = await asyncio.gather(*tasks)
        print(f"Completed {len(results)} searches")

asyncio.run(main())
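`asyncio.gather` starts every coroutine at once, which may be more parallelism than a plan or target site tolerates. One common way to cap the number of in-flight requests is an `asyncio.Semaphore`. The sketch below is self-contained: `fake_search` is a hypothetical stand-in for an awaitable such as `client.serp_search(...)`, and `gather_limited` is an illustrative helper, not an SDK function.

```python
import asyncio


async def fake_search(query: str) -> dict:
    """Hypothetical stand-in for an awaitable such as client.serp_search(query)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"query": query}


async def gather_limited(coros, limit: int):
    """Await all coroutines, letting at most `limit` run at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def run(coro):
        async with semaphore:
            return await coro

    # return_exceptions=True returns failures as values instead of
    # raising the first one, so the caller can inspect each result.
    return await asyncio.gather(*(run(c) for c in coros), return_exceptions=True)


async def main():
    queries = [f"query {i}" for i in range(5)]
    results = await gather_limited((fake_search(q) for q in queries), limit=2)
    ok = [r for r in results if not isinstance(r, Exception)]
    print(f"Completed {len(ok)} of {len(results)} searches")
    return results


if __name__ == "__main__":
    asyncio.run(main())
```

Results come back in the same order as the input coroutines, so they can be paired with their queries by position.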
Web Scraper API (Task Management)
Create and manage large-scale scraping tasks asynchronously.
# 1. Create a task
task_id = client.create_scraper_task(
    file_name="daily_scrape",
    spider_id="universal",
    spider_name="universal",
    parameters={"url": "https://example.com"}
)
# 2. Wait for completion (Polling)
status = client.wait_for_task(task_id)
# 3. Get results
if status == "ready":
    url = client.get_task_result(task_id)
    print(f"Download Data: {url}")
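For readers who want to see what a polling step of this kind generally involves, here is a generic sketch of polling with an interval and a timeout. It does not reproduce how `wait_for_task` is implemented. `poll_until_done` and `get_status` are hypothetical names, and `"failed"` is an assumed terminal status; only `"ready"` appears in the example above.

```python
import time
from typing import Callable


def poll_until_done(
    get_status: Callable[[], str],
    done_states=("ready", "failed"),  # "failed" is an assumed status name
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status() until it returns a terminal state or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in done_states:
            return status
        # Stop if the next check would land past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"task still '{status}' after {timeout} seconds")
        sleep(interval)


# Simulated status sequence standing in for a real status lookup.
statuses = iter(["pending", "running", "ready"])
final = poll_until_done(lambda: next(statuses), interval=0.01, timeout=1.0)
print(final)
```

Injecting the `sleep` function keeps the loop testable without real delays.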
📄 License
MIT License. See LICENSE for details.