The Official Python SDK for Thordata - AI Data Infrastructure & Proxy Network.

Thordata Python SDK

The Official Python Client for Thordata APIs

Proxy Network • SERP API • Web Unlocker • Web Scraper API


📖 Introduction

This SDK provides a robust, high-performance interface to Thordata's AI data infrastructure. It is designed for high-concurrency scraping, reliable proxy tunneling, and seamless data extraction.

Key Features:

  • 🚀 Production Ready: Built on urllib3 connection pooling for low-latency proxy requests.
  • ⚡ Async Support: Native aiohttp client for high-concurrency SERP/Universal scraping.
  • 🛡️ Robust: Handles TLS-in-TLS tunneling, retries, and error parsing automatically.
  • ✨ Developer Experience: Fully typed (mypy compatible) with intuitive IDE autocomplete.
  • 🧩 Lazy Validation: Validates credentials only for the features you actually use.
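
The lazy-validation behavior can be illustrated with a small stand-in. The class below mimics the idea (the names are illustrative, not the SDK's internals): instantiation succeeds with no credentials set, and a missing variable only raises when the corresponding feature is actually touched.

```python
import os

class LazyCredentials:
    """Reads a credential from the environment only when it is accessed."""

    def _require(self, var: str) -> str:
        value = os.environ.get(var)
        if value is None:
            raise RuntimeError(f"{var} is not set")
        return value

    @property
    def scraper_token(self) -> str:
        # Only SERP / Web Unlocker calls would touch this.
        return self._require("THORDATA_SCRAPER_TOKEN")

creds = LazyCredentials()  # constructing never raises, even with nothing set
os.environ["THORDATA_SCRAPER_TOKEN"] = "demo-token"
print(creds.scraper_token)
```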

📦 Installation

pip install thordata-sdk

🔐 Configuration

Set environment variables to avoid hardcoding credentials. You only need to set the variables for the features you use.

# [Required for SERP & Web Unlocker]
export THORDATA_SCRAPER_TOKEN="your_token_here"

# [Required for Proxy Network]
export THORDATA_RESIDENTIAL_USERNAME="your_username"
export THORDATA_RESIDENTIAL_PASSWORD="your_password"
export THORDATA_PROXY_HOST="vpnXXXX.pr.thordata.net"

# [Required for Task Management]
export THORDATA_PUBLIC_TOKEN="public_token"
export THORDATA_PUBLIC_KEY="public_key"
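
For local experiments or test harnesses, the same variables can be staged on `os.environ` in Python before any client is constructed; a minimal sketch with placeholder values (real credentials should come from your shell profile or a secrets manager):

```python
import os

# Placeholder values for local experimentation only.
os.environ.setdefault("THORDATA_SCRAPER_TOKEN", "your_token_here")
os.environ.setdefault("THORDATA_RESIDENTIAL_USERNAME", "your_username")
os.environ.setdefault("THORDATA_RESIDENTIAL_PASSWORD", "your_password")
os.environ.setdefault("THORDATA_PROXY_HOST", "vpnXXXX.pr.thordata.net")

print(f"Scraper token set: {'THORDATA_SCRAPER_TOKEN' in os.environ}")
```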

🚀 Quick Start

1. SERP Search (Google/Bing/Yandex)

from thordata import ThordataClient, Engine

client = ThordataClient()  # Loads THORDATA_SCRAPER_TOKEN from env

# Simple Search
print("Searching...")
results = client.serp_search("latest AI trends", engine=Engine.GOOGLE_NEWS)

for news in results.get("news_results", [])[:3]:
    print(f"- {news['title']} ({news['source']})")

2. Universal Scrape (Web Unlocker)

Bypass Cloudflare/Akamai and render JavaScript automatically.

html = client.universal_scrape(
    url="https://example.com/protected-page",
    js_render=True,
    wait_for=".content-loaded",
    country="us"
)
print(f"Scraped {len(html)} bytes")

3. High-Performance Proxy

Use Thordata's residential IPs with automatic connection pooling.

from thordata import ProxyConfig, ProxyProduct

# Config is optional if env vars are set, but allows granular control
proxy = ProxyConfig(
    product=ProxyProduct.RESIDENTIAL,
    country="jp",
    city="tokyo",
    session_id="session-001",
    session_duration=10  # Sticky IP for 10 mins
)

# Use the client to make requests (Reuses TCP connections)
response = client.get("https://httpbin.org/ip", proxy_config=proxy)
print(response.json())

⚙️ Advanced Usage

Async Client (High Concurrency)

For building AI agents or high-throughput spiders.

import asyncio
from thordata import AsyncThordataClient

async def main():
    async with AsyncThordataClient() as client:
        # Fire off multiple requests in parallel
        tasks = [
            client.serp_search(f"query {i}") 
            for i in range(5)
        ]
        results = await asyncio.gather(*tasks)
        print(f"Completed {len(results)} searches")

asyncio.run(main())
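
The `gather` pattern above launches every request at once; for larger batches you may want to cap in-flight requests. A sketch of that bounding pattern using `asyncio.Semaphore` — here `fake_search` stands in for `client.serp_search`, which is an assumption for illustration, not part of the SDK:

```python
import asyncio

async def fake_search(query: str) -> str:
    # Stand-in for client.serp_search(); simulates network latency.
    await asyncio.sleep(0.01)
    return f"results for {query}"

async def bounded_search(sem: asyncio.Semaphore, query: str) -> str:
    async with sem:  # at most N searches hold the semaphore at once
        return await fake_search(query)

async def main() -> list:
    sem = asyncio.Semaphore(2)  # cap concurrency at 2
    tasks = [bounded_search(sem, f"query {i}") for i in range(5)]
    return await asyncio.gather(*tasks)

results = asyncio.run(main())
print(f"Completed {len(results)} searches")
```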

Web Scraper API (Task Management)

Create and manage large-scale scraping tasks asynchronously.

# 1. Create a task
task_id = client.create_scraper_task(
    file_name="daily_scrape",
    spider_id="universal",
    spider_name="universal",
    parameters={"url": "https://example.com"}
)

# 2. Wait for completion (Polling)
status = client.wait_for_task(task_id)

# 3. Get results
if status == "ready":
    url = client.get_task_result(task_id)
    print(f"Download Data: {url}")

📄 License

MIT License. See LICENSE for details.

Project details

Download files

Source distribution: thordata_sdk-1.4.0.tar.gz (68.6 kB)

  • Tags: Source
  • Uploaded using Trusted Publishing via twine/6.1.0 CPython/3.13.7
  • SHA256: 2665fd1b141c855aad476c40e798737f180ea621245ecf2520c11f882560594b
  • MD5: 7facd9069e6278a764f6c9558290123a
  • BLAKE2b-256: 329aee67512339c3e1492b12546eef79e44043bd75bc767848235d488135b883
  • Provenance: published by pypi-publish.yml on Thordata/thordata-python-sdk

Built distribution: thordata_sdk-1.4.0-py3-none-any.whl (60.4 kB)

  • Tags: Python 3
  • Uploaded using Trusted Publishing via twine/6.1.0 CPython/3.13.7
  • SHA256: 1a682084410c5a4da7e0df1b67c0fcb52fe6f2245357203c0f6bdc6e85424c72
  • MD5: e6a16aceb2e69c2c9eaee82bbcaec169
  • BLAKE2b-256: 1c425fadf6eec536f90cbe54a9351f62e5dcb1c4a1e3ed1f1ddbcaf7982d2358
  • Provenance: published by pypi-publish.yml on Thordata/thordata-python-sdk
