
The Official Python SDK for Thordata - AI Data Infrastructure & Proxy Network.


Thordata Python SDK


The Official Python Client for Thordata APIs

Proxy Network • SERP API • Web Unlocker • Web Scraper API



📖 Introduction

This SDK provides a robust, high-performance interface to Thordata's AI data infrastructure. It is designed for high-concurrency scraping, reliable proxy tunneling, and seamless data extraction.

Key Features:

  • 🚀 Production Ready: Built on urllib3 connection pooling for low-latency proxy requests.
  • ⚡ Async Support: Native aiohttp client for high-concurrency SERP/Universal scraping.
  • 🛡️ Robust: Handles TLS-in-TLS tunneling, retries, and error parsing automatically.
  • ✨ Developer Experience: Fully typed (mypy compatible) with intuitive IDE autocomplete.
  • 🧩 Lazy Validation: Credentials are validated only for the features you actually use.

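The lazy-validation idea above can be sketched as follows. This is an illustrative stand-in, not the SDK's actual internals — the class and error message here are hypothetical: construction never fails, and a credential is checked only when a feature that needs it is invoked.

```python
import os

class LazyClient:
    """Illustrative sketch of lazy credential validation (not the real SDK class)."""

    def __init__(self, scraper_token=None):
        # No validation at construction time -- a missing token is fine
        # as long as SERP features are never used.
        self._scraper_token = scraper_token or os.environ.get("THORDATA_SCRAPER_TOKEN")

    def serp_search(self, query):
        # The token is required only at the moment a SERP feature is used.
        if not self._scraper_token:
            raise RuntimeError("THORDATA_SCRAPER_TOKEN is required for SERP search")
        return f"searching {query!r}"

client = LazyClient()  # succeeds even with no credentials configured
```
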
📦 Installation

pip install thordata-sdk

🔐 Configuration

Set environment variables to avoid hardcoding credentials. You only need to set the variables for the features you use.

# [Required for SERP & Web Unlocker]
export THORDATA_SCRAPER_TOKEN="your_token_here"

# [Required for Proxy Network]
export THORDATA_RESIDENTIAL_USERNAME="your_username"
export THORDATA_RESIDENTIAL_PASSWORD="your_password"
export THORDATA_PROXY_HOST="vpnXXXX.pr.thordata.net"

# [Required for Task Management]
export THORDATA_PUBLIC_TOKEN="public_token"
export THORDATA_PUBLIC_KEY="public_key"
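<br>
A quick stdlib-only way to check which feature groups are fully configured before constructing a client (this helper is our own sketch, not part of the SDK):

```python
import os

# Variable groups from the configuration section above.
FEATURE_VARS = {
    "SERP / Web Unlocker": ["THORDATA_SCRAPER_TOKEN"],
    "Proxy Network": [
        "THORDATA_RESIDENTIAL_USERNAME",
        "THORDATA_RESIDENTIAL_PASSWORD",
        "THORDATA_PROXY_HOST",
    ],
    "Task Management": ["THORDATA_PUBLIC_TOKEN", "THORDATA_PUBLIC_KEY"],
}

def configured_features(env=os.environ):
    # A feature group counts as configured only if every variable is set.
    return [name for name, keys in FEATURE_VARS.items()
            if all(env.get(k) for k in keys)]

print(configured_features())
```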

🚀 Quick Start

1. SERP Search (Google/Bing/Yandex)

from thordata import ThordataClient, Engine

client = ThordataClient()  # Loads THORDATA_SCRAPER_TOKEN from env

# Simple Search
print("Searching...")
results = client.serp_search("latest AI trends", engine=Engine.GOOGLE_NEWS)

for news in results.get("news_results", [])[:3]:
    print(f"- {news['title']} ({news['source']})")

2. Universal Scrape (Web Unlocker)

Bypass Cloudflare/Akamai and render JavaScript automatically.

html = client.universal_scrape(
    url="https://example.com/protected-page",
    js_render=True,
    wait_for=".content-loaded",
    country="us"
)
print(f"Scraped {len(html)} bytes")

3. High-Performance Proxy

Use Thordata's residential IPs with automatic connection pooling.

from thordata import ProxyConfig, ProxyProduct

# Config is optional if env vars are set, but allows granular control
proxy = ProxyConfig(
    product=ProxyProduct.RESIDENTIAL,
    country="jp",
    city="tokyo",
    session_id="session-001",
    session_duration=10  # Sticky IP for 10 mins
)

# Use the client to make requests (Reuses TCP connections)
response = client.get("https://httpbin.org/ip", proxy_config=proxy)
print(response.json())

⚙️ Advanced Usage

Async Client (High Concurrency)

For building AI agents or high-throughput spiders.

import asyncio
from thordata import AsyncThordataClient

async def main():
    async with AsyncThordataClient() as client:
        # Fire off multiple requests in parallel
        tasks = [
            client.serp_search(f"query {i}") 
            for i in range(5)
        ]
        results = await asyncio.gather(*tasks)
        print(f"Completed {len(results)} searches")

asyncio.run(main())
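<br>
Firing off unbounded tasks with gather works for five queries, but for thousands you usually want a concurrency cap. This is a generic asyncio pattern, not an SDK feature — `fake_search` below is a placeholder where a real `client.serp_search(...)` call would go:

```python
import asyncio

async def bounded_gather(coros, limit=10):
    # Cap in-flight coroutines so thousands of requests don't run at once.
    sem = asyncio.Semaphore(limit)

    async def run(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))

async def fake_search(i):
    await asyncio.sleep(0)  # stand-in for an await client.serp_search(...)
    return i

results = asyncio.run(bounded_gather([fake_search(i) for i in range(25)], limit=5))
print(f"Completed {len(results)} searches")
```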

Web Scraper API (Task Management)

Create and manage large-scale scraping tasks asynchronously.

# 1. Create a task
task_id = client.create_scraper_task(
    file_name="daily_scrape",
    spider_id="universal",
    spider_name="universal",
    parameters={"url": "https://example.com"}
)

# 2. Wait for completion (Polling)
status = client.wait_for_task(task_id)

# 3. Get results
if status == "ready":
    url = client.get_task_result(task_id)
    print(f"Download Data: {url}")

📄 License

MIT License. See LICENSE for details.
