
Thordata Python SDK


The Official Python Client for Thordata APIs

Proxy Network • SERP API • Web Unlocker • Web Scraper API



📖 Introduction

This SDK provides a robust, high-performance interface to Thordata's AI data infrastructure. It is designed for high-concurrency scraping, reliable proxy tunneling, and seamless data extraction.

Key Features:

  • 🚀 Production Ready: Built on urllib3 connection pooling for low-latency proxy requests.
  • ⚡ Async Support: Native aiohttp client for high-concurrency SERP/Universal scraping.
  • 🛡️ Robust: Handles TLS-in-TLS tunneling, retries, and error parsing automatically.
  • ✨ Developer Experience: Fully typed (mypy compatible) with intuitive IDE autocomplete.
  • 🧩 Lazy Validation: Only validate credentials for the features you actually use.
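The lazy-validation point is worth a concrete sketch. The class below is illustrative plain Python showing the pattern, not the SDK's actual internals: nothing is checked at construction time, and a credential is only required the moment the feature that needs it is called.

```python
import os

class LazyClient:
    """Illustrative sketch of lazy credential validation (not the SDK's real code)."""

    def __init__(self):
        # Construction never fails, even if no credentials are set.
        self._scraper_token = os.environ.get("THORDATA_SCRAPER_TOKEN")

    def serp_search(self, query):
        # The token is only validated when a SERP feature is actually used.
        if not self._scraper_token:
            raise RuntimeError("THORDATA_SCRAPER_TOKEN is not set")
        return f"searching: {query}"
```

This is why you can instantiate a client for proxy-only work without ever setting the scraper token, and vice versa.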

📦 Installation

pip install thordata-sdk

🔐 Configuration

Set environment variables to avoid hardcoding credentials. You only need to set the variables for the features you use.

# [Required for SERP & Web Unlocker]
export THORDATA_SCRAPER_TOKEN="your_token_here"

# [Required for Proxy Network]
export THORDATA_RESIDENTIAL_USERNAME="your_username"
export THORDATA_RESIDENTIAL_PASSWORD="your_password"
export THORDATA_PROXY_HOST="vpnXXXX.pr.thordata.net"

# [Required for Task Management]
export THORDATA_PUBLIC_TOKEN="public_token"
export THORDATA_PUBLIC_KEY="public_key"
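A quick way to check which feature groups are fully configured before running anything. The variable names are the ones listed above; the helper itself is just a local convenience, not part of the SDK:

```python
import os

# Env vars required per feature group, as documented above.
FEATURE_VARS = {
    "serp_and_unlocker": ["THORDATA_SCRAPER_TOKEN"],
    "proxy_network": [
        "THORDATA_RESIDENTIAL_USERNAME",
        "THORDATA_RESIDENTIAL_PASSWORD",
        "THORDATA_PROXY_HOST",
    ],
    "task_management": ["THORDATA_PUBLIC_TOKEN", "THORDATA_PUBLIC_KEY"],
}

def configured_features():
    """Return the feature groups whose required env vars are all set."""
    return [
        name for name, varnames in FEATURE_VARS.items()
        if all(os.environ.get(v) for v in varnames)
    ]
```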

🚀 Quick Start

1. SERP Search (Google/Bing/Yandex)

from thordata import ThordataClient, Engine

client = ThordataClient()  # Loads THORDATA_SCRAPER_TOKEN from env

# Simple Search
print("Searching...")
results = client.serp_search("latest AI trends", engine=Engine.GOOGLE_NEWS)

for news in results.get("news_results", [])[:3]:
    print(f"- {news['title']} ({news['source']})")

2. Universal Scrape (Web Unlocker)

Bypass Cloudflare/Akamai and render JavaScript automatically.

html = client.universal_scrape(
    url="https://example.com/protected-page",
    js_render=True,
    wait_for=".content-loaded",
    country="us"
)
print(f"Scraped {len(html)} bytes")

3. High-Performance Proxy

Use Thordata's residential IPs with automatic connection pooling.

from thordata import ProxyConfig, ProxyProduct

# Config is optional if env vars are set, but allows granular control
proxy = ProxyConfig(
    product=ProxyProduct.RESIDENTIAL,
    country="jp",
    city="tokyo",
    session_id="session-001",
    session_duration=10  # Sticky IP for 10 mins
)

# Use the client to make requests (Reuses TCP connections)
response = client.get("https://httpbin.org/ip", proxy_config=proxy)
print(response.json())
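If you ever need to route traffic through the same proxy outside the SDK, the standard library can use the same environment variables. The country/session routing that ProxyConfig encodes into the proxy username is handled for you by the client; the snippet below only wires up a plain authenticated HTTP proxy, and the port number is illustrative, not Thordata documentation:

```python
import os
import urllib.request

def build_proxy_opener():
    """Build a urllib opener that sends requests through an authenticated proxy."""
    user = os.environ["THORDATA_RESIDENTIAL_USERNAME"]
    password = os.environ["THORDATA_RESIDENTIAL_PASSWORD"]
    host = os.environ["THORDATA_PROXY_HOST"]
    proxy_url = f"http://{user}:{password}@{host}:9999"  # port is illustrative
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)
```

Note that the SDK client above reuses TCP connections across requests, which a fresh urllib opener per request does not; prefer the client for anything high-volume.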

⚙️ Advanced Usage

Async Client (High Concurrency)

The async client is suited to building AI agents or high-throughput spiders.

import asyncio
from thordata import AsyncThordataClient

async def main():
    async with AsyncThordataClient() as client:
        # Fire off multiple requests in parallel
        tasks = [client.serp_search(f"query {i}") for i in range(5)]
        results = await asyncio.gather(*tasks)
        print(f"Completed {len(results)} searches")

asyncio.run(main())

Web Scraper API (Task Management)

Create and manage large-scale scraping tasks asynchronously.

# 1. Create a task
task_id = client.create_scraper_task(
    file_name="daily_scrape",
    spider_id="universal",
    spider_name="universal",
    parameters={"url": "https://example.com"}
)

# 2. Wait for completion (Polling)
status = client.wait_for_task(task_id)

# 3. Get results
if status == "ready":
    url = client.get_task_result(task_id)
    print(f"Download Data: {url}")
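wait_for_task blocks until the task settles. If you want explicit control over the polling interval and timeout, a generic helper like the one below works with any status function; the helper and the "failed" state are illustrative, while "ready" matches the check in the example above:

```python
import time

def poll_until(get_status, done_states=("ready", "failed"),
               interval=2.0, timeout=300.0):
    """Poll get_status() until it returns a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status in done_states:
            return status
        time.sleep(interval)
    raise TimeoutError("task did not reach a terminal state in time")
```

You would pass in a closure over your task, e.g. `poll_until(lambda: client.get_task_status(task_id))`, assuming a status accessor like the hypothetical get_task_status; for the common case, wait_for_task above is all you need.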

📄 License

MIT License. See LICENSE for details.
