Official Python SDK for ScrapeBadger - Async web scraping APIs for Twitter and more

ScrapeBadger Python SDK

The official Python SDK for ScrapeBadger - async web scraping APIs for Twitter, Google, Vinted, Reddit, and more.

Features

  • Async-first - Built with asyncio for high-performance concurrent scraping
  • Type-safe - Full type hints and Pydantic models for all responses
  • Automatic pagination - Iterator methods with smart rate limit handling
  • Resilient retries - Exponential backoff on transient errors
  • 37+ Twitter endpoints - Tweets, users, lists, communities, trends, geo, real-time streams
  • 19 Google product APIs - Search (with optional deferred AI Overview follow-up), Maps, News, Hotels, Trends (incl. topic autocomplete), Jobs, Shopping (+ merchant URL enrichment), Patents, Scholar (search + profiles + author + author citation + cite formats), Images, Videos, Finance, AI Mode, Lens, Autocomplete, Local Pack, Shorts, Flights, Products
  • Vinted scraping - Search items, item details, user profiles, brands, colors, markets
  • Reddit scraping - Search posts/subreddits/users/domains, subreddit posts, post comments, user profiles, trophies, wiki pages, moderators
  • Web scraping - Anti-bot bypass, JS rendering, and AI data extraction

Installation

pip install scrapebadger

Or with uv:

uv add scrapebadger

Quick Start

import asyncio
from scrapebadger import ScrapeBadger

async def main():
    async with ScrapeBadger(api_key="your-api-key") as client:
        # Get a user profile
        user = await client.twitter.users.get_by_username("elonmusk")
        print(f"{user.name} has {user.followers_count:,} followers")

        # Scrape a website
        result = await client.web.scrape("https://scrapebadger.com", format="markdown")
        print(result.content)

        # Search tweets
        tweets = await client.twitter.tweets.search("python programming")
        for tweet in tweets.data:
            print(f"@{tweet.username}: {tweet.text[:100]}...")

asyncio.run(main())
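
Because the client is async, several lookups can run concurrently. The following is a minimal sketch of that pattern using only the standard library: `fetch_profile` is a hypothetical stand-in for a real SDK call such as `client.twitter.users.get_by_username`, and `max_concurrency` is an illustrative parameter, not an SDK setting.

```python
import asyncio


async def fetch_profile(username: str) -> dict:
    # Hypothetical stand-in for:
    #   await client.twitter.users.get_by_username(username)
    await asyncio.sleep(0.01)  # simulate network latency
    return {"username": username}


async def fetch_many(usernames: list[str], max_concurrency: int = 5) -> list[dict]:
    # The semaphore caps how many requests are in flight at once.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(username: str) -> dict:
        async with semaphore:
            return await fetch_profile(username)

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(u) for u in usernames))


profiles = asyncio.run(fetch_many(["alice", "bob", "carol"]))
print([p["username"] for p in profiles])
```

Bounding concurrency with a semaphore keeps a large batch from sending every request at once, which matters when the account has a per-minute rate limit.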

Authentication

Get your API key from scrapebadger.com and pass it to the client:

from scrapebadger import ScrapeBadger

client = ScrapeBadger(api_key="sb_live_xxxxxxxxxxxxx")

You can also set the SCRAPEBADGER_API_KEY environment variable:

export SCRAPEBADGER_API_KEY="sb_live_xxxxxxxxxxxxx"
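
If you prefer to read the variable yourself and fail early when it is missing, something like the sketch below may help. `load_api_key` is a hypothetical helper, not part of the SDK; it only uses the standard library and returns a string you can pass as `api_key`.

```python
import os


def load_api_key(env_var: str = "SCRAPEBADGER_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set; export it before running")
    return key


# Usage (assumes the variable has been exported):
#   client = ScrapeBadger(api_key=load_api_key())
```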

Available APIs

API | Description | Documentation
Web Scraping | Scrape any website with JS rendering, anti-bot bypass, and AI extraction | Web Scraping Guide
Twitter | 37+ endpoints for tweets, users, lists, communities, trends, and real-time streams | Twitter Guide
Google | 19 products: Search, Maps, News, Hotels, Trends, Jobs, Shopping, Patents, Scholar, Images, Videos, Finance, AI Mode, Lens, Autocomplete, Local, Shorts, Flights, Products | Google Guide
Vinted | Search items, item details, user profiles, brands, colors, statuses, and markets | Vinted Guide
Reddit | Search posts, subreddits, users, and domains; fetch post comments, subreddit rules, moderators, wiki pages, user trophies | Reddit Guide
Amazon | 14 endpoints: search, autocomplete, product detail, offers, reviews, bestsellers, new releases, deals, category browse, seller profile/products/feedback, markets, categories | Amazon Guide
eBay | 12 endpoints across 18 markets: search, completed/sold search, item detail, item reviews, seller profile/items/feedback, category browse, categories, autocomplete, markets | eBay Guide
YouTube | 39 endpoints: search, autocomplete, video detail/related/comments/replies/transcript/captions/streams/live-chat/batch, channel detail + videos/shorts/streams/playlists/community/about/subscriber-count/in-channel-search/resolve, playlists/items/mixes, trending/hashtag/home, shorts, community post/comments, music search, oembed, categories/languages/regions | YouTube Guide
TikTok | 25 endpoints: user profile/videos/followers/following/liked/reposts, video detail/comments/replies/related/transcript/oEmbed, search (general/videos/hashtags/users), music detail/videos, hashtag detail/videos, trending videos/hashtags/songs, ad library, regions | TikTok Guide

Error Handling

import asyncio

from scrapebadger import (
    ScrapeBadger,
    ScrapeBadgerError,
    AuthenticationError,
    RateLimitError,
    InsufficientCreditsError,
    NotFoundError,
    ValidationError,
    ServerError,
)

async def main():
    # "async with" is only valid inside a coroutine
    async with ScrapeBadger(api_key="your-key") as client:
        try:
            user = await client.twitter.users.get_by_username("elonmusk")
        except AuthenticationError:
            print("Invalid API key")
        except RateLimitError as e:
            print(f"Rate limited. Retry after {e.retry_after} seconds")
            print(f"Limit: {e.limit}, Remaining: {e.remaining}")
        except InsufficientCreditsError:
            print("Out of credits! Purchase more at scrapebadger.com")
        except NotFoundError:
            print("User not found")
        except ValidationError as e:
            print(f"Invalid parameters: {e}")
        except ServerError:
            print("Server error, try again later")
        except ScrapeBadgerError as e:
            # Catch-all, listed last so the specific handlers above run first
            print(f"API error: {e}")

asyncio.run(main())
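
Rather than only printing the `retry_after` value, an application may want to wait and try again. The sketch below shows one way to do that. To stay self-contained it defines its own stand-in `RateLimitError` that models only the `retry_after` attribute; `call_with_rate_limit_retry` and `max_attempts` are hypothetical names, not SDK features.

```python
import asyncio


class RateLimitError(Exception):
    """Stand-in for scrapebadger.RateLimitError (only retry_after is modeled)."""

    def __init__(self, retry_after: float):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


async def call_with_rate_limit_retry(make_call, max_attempts: int = 3):
    """Await make_call(), sleeping for retry_after between rate-limited attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_call()
        except RateLimitError as exc:
            if attempt == max_attempts:
                raise  # give up and let the caller decide
            await asyncio.sleep(exc.retry_after)


# Demo: a fake call that is rate limited once, then succeeds.
state = {"calls": 0}


async def flaky_call() -> str:
    state["calls"] += 1
    if state["calls"] == 1:
        raise RateLimitError(retry_after=0.01)
    return "ok"


print(asyncio.run(call_with_rate_limit_retry(flaky_call)))
```

In real use, `make_call` would be something like `lambda: client.twitter.users.get_by_username("elonmusk")`, and the `except` clause would catch the SDK's own `RateLimitError`.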

Configuration

Custom Timeout and Retries

from scrapebadger import ScrapeBadger

client = ScrapeBadger(
    api_key="your-key",
    timeout=120.0,      # Request timeout in seconds (default: 300)
    max_retries=5,      # Retry attempts (default: 10)
)

Advanced Configuration

from scrapebadger import ScrapeBadger
from scrapebadger._internal import ClientConfig

config = ClientConfig(
    api_key="your-key",
    base_url="https://scrapebadger.com",
    timeout=300.0,
    connect_timeout=10.0,
    max_retries=10,
    retry_on_status=(502, 503, 504),
    headers={"X-Custom-Header": "value"},
)

client = ScrapeBadger(config=config)

Retry Behavior

The SDK automatically retries requests that fail with 502, 503, or 504 status codes using exponential backoff (1s, 2s, 4s, 8s, ...). Each retry logs a warning:

⚠ 503 Service Unavailable — retrying in 4s (attempt 3/10)

To see these warnings, configure Python logging:

import logging
logging.basicConfig(level=logging.WARNING)
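
To get a feel for how long retries can take, the backoff schedule described above can be sketched as a doubling sequence. This is an illustration of the 1s, 2s, 4s, 8s pattern only; `backoff_delays` is a hypothetical helper, and whether the SDK caps the delay or adds jitter is not stated here.

```python
def backoff_delays(max_retries: int, base: float = 1.0, factor: float = 2.0) -> list[float]:
    """Delay in seconds before each retry: base, base*factor, base*factor**2, ..."""
    return [base * factor**attempt for attempt in range(max_retries)]


print(backoff_delays(4))        # [1.0, 2.0, 4.0, 8.0]
print(backoff_delays(10)[2])    # 4.0, the wait before attempt 3
print(sum(backoff_delays(10)))  # 1023.0 seconds if the schedule is never capped
```

If that worst case is too long for your workload, lowering `max_retries` on the client shortens it.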

Rate Limit Aware Pagination

When using *_all pagination methods, the SDK reads the X-RateLimit-Remaining and X-RateLimit-Reset headers from each response. Once the remaining requests drop below 20% of your tier's limit, pagination slows down automatically and spreads the remaining requests across the rest of the window, which helps avoid 429 errors. A warning is logged when throttling activates:

⚠ Rate limit: 25/300 remaining (resets in 42s), throttling pagination to ~0.6 req/s

This works transparently with all tier levels (Free: 60/min, Basic: 300/min, Pro: 1000/min, Enterprise: 5000/min).
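
The arithmetic behind the example warning can be reproduced with a small sketch. This is an assumption about how such throttling could be computed, not the SDK's actual code: `pagination_delay` is a hypothetical function that spreads the remaining requests evenly over the time left in the window once the 20% threshold is crossed.

```python
def pagination_delay(remaining: int, limit: int, reset_in: float, threshold: float = 0.2) -> float:
    """Seconds to wait before the next page request (0.0 means no throttling)."""
    reset_in = max(reset_in, 0.0)
    if remaining <= 0:
        return reset_in  # nothing left: wait for the window to reset
    if remaining >= threshold * limit:
        return 0.0       # still above the threshold: full speed
    return reset_in / remaining  # spread what is left across the window


delay = pagination_delay(remaining=25, limit=300, reset_in=42)
print(f"wait {delay:.2f}s between requests (~{1 / delay:.1f} req/s)")
```

With 25 of 300 requests left and 42 seconds until reset, this gives a 1.68 s gap, or roughly 0.6 requests per second, which matches the figure in the warning above.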

Development

Setup

# Clone the repository
git clone https://github.com/scrape-badger/scrapebadger-python.git
cd scrapebadger-python

# Install dependencies with uv
uv sync --dev

# Install pre-commit hooks
uv run pre-commit install

Running Tests

# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov=src/scrapebadger --cov-report=html

# Run specific tests
uv run pytest tests/test_client.py -v

Code Quality

# Lint
uv run ruff check src/ tests/

# Format
uv run ruff format src/ tests/

# Type check
uv run mypy src/

# All checks
uv run ruff check src/ tests/ && uv run ruff format --check src/ tests/ && uv run mypy src/

Contributing

Contributions are welcome! Please read our Contributing Guide for details.

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run tests and linting (uv run pytest && uv run ruff check)
  5. Commit your changes (git commit -m 'Add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Made with ❤️ by ScrapeBadger
