Async web search client using Brave Search API with built-in caching, rate limiting, and batch processing

Searcherator

Searcherator is a Python package for running web searches against the Brave Search API, with built-in caching, automatic rate limiting, and concurrent batch processing.

Features

  • Async/await support for modern Python applications
  • Automatic caching with configurable TTL
  • Built-in rate limiting to respect API quotas
  • Efficient batch processing for multiple concurrent searches
  • Support for multiple languages and countries
  • Comprehensive exception hierarchy for robust error handling
  • Real-time rate limit tracking and monitoring

Installation

pip install searcherator

Requirements

Quick Start

from searcherator import Searcherator
import asyncio

async def main():
    # Basic search
    search = Searcherator("Python programming")
    
    # Get URLs from search results
    urls = await search.urls()
    print(urls)
    
    # Get detailed results
    results = await search.detailed_search_result()
    for result in results:
        print(f"{result['title']}: {result['url']}")
    
    # Clean up
    await Searcherator.close_session()

if __name__ == "__main__":
    asyncio.run(main())

Usage Examples

Basic Search

from searcherator import Searcherator
import asyncio

async def main():
    search = Searcherator("Python tutorials", num_results=10)
    results = await search.search_result()
    print(results)
    await Searcherator.close_session()

asyncio.run(main())

Localized Search

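from searcherator import Searcherator
import asyncio

async def main():
    # German search
    german_search = Searcherator(
        "Zusammenfassung Buch 'Demian' von 'Hermann Hesse'",
        language="de",
        country="de",
        num_results=10
    )
    results = await german_search.search_result()
    print(results)
    await Searcherator.close_session()

asyncio.run(main())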

Batch Processing

import asyncio
from searcherator import Searcherator

async def batch_search():
    queries = ["Python", "JavaScript", "Rust", "Go", "TypeScript"]
    
    try:
        # Create search instances
        searches = [Searcherator(q, num_results=5) for q in queries]
        
        # Run all searches concurrently (rate limiting handled automatically)
        results = await asyncio.gather(
            *[s.search_result() for s in searches],
            return_exceptions=True
        )
        
        # Process results
        for query, result in zip(queries, results):
            if isinstance(result, dict):
                print(f"{query}: {len(result.get('web', {}).get('results', []))} results")
    finally:
        await Searcherator.close_session()

asyncio.run(batch_search())
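
With return_exceptions=True, failed searches come back as exception objects mixed in with the successful results, and the loop above silently skips them. The following is a minimal sketch of splitting the two so failures can be reported or retried. It uses a hypothetical fake_search coroutine as a stand-in for Searcherator(query).search_result() so that it runs without network access or an API key; the function names are illustrative and not part of the package.

```python
import asyncio


async def fake_search(query: str) -> dict:
    """Hypothetical stand-in for Searcherator(query).search_result()."""
    if query == "Rust":
        raise TimeoutError(f"simulated timeout for {query!r}")
    return {"web": {"results": [{"url": f"https://example.com/{query.lower()}"}]}}


async def gather_partitioned(queries: list[str]):
    """Run all queries concurrently and split successes from failures."""
    outcomes = await asyncio.gather(
        *(fake_search(q) for q in queries),
        return_exceptions=True,
    )
    succeeded: dict[str, dict] = {}
    failed: dict[str, BaseException] = {}
    for query, outcome in zip(queries, outcomes):
        if isinstance(outcome, BaseException):
            failed[query] = outcome  # keep the exception for logging or retry
        else:
            succeeded[query] = outcome
    return succeeded, failed


succeeded, failed = asyncio.run(gather_partitioned(["Python", "Rust", "Go"]))
print("ok:", sorted(succeeded))
print("failed:", {q: type(e).__name__ for q, e in failed.items()})
```

Keeping the exception objects, keyed by query, makes it straightforward to retry only the queries that failed.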

Error Handling

import asyncio

from searcherator import (
    Searcherator,
    SearcheratorAuthError,
    SearcheratorRateLimitError,
    SearcheratorTimeoutError,
    SearcheratorAPIError
)

async def safe_search():
    try:
        search = Searcherator("Python", timeout=10)
        results = await search.search_result()
    except SearcheratorAuthError:
        print("Invalid API key")
    except SearcheratorRateLimitError as e:
        print(f"Rate limited. Resets in {e.reset_per_second}s")
    except SearcheratorTimeoutError:
        print("Request timed out")
    except SearcheratorAPIError as e:
        print(f"API error: {e.status_code} - {e.message}")
    finally:
        await Searcherator.close_session()

asyncio.run(safe_search())

Monitoring Rate Limits

from searcherator import Searcherator
import asyncio

async def main():
    search = Searcherator("Python")
    results = await search.search_result()

    print(f"Rate limit (per second): {search.rate_limit_per_second}")
    print(f"Remaining (per second): {search.rate_remaining_per_second}")
    print(f"Rate limit (per month): {search.rate_limit_per_month}")
    print(f"Remaining (per month): {search.rate_remaining_per_month}")
    await Searcherator.close_session()

asyncio.run(main())

API Reference

Searcherator

Searcherator(
    search_term: str = "",
    num_results: int = 5,
    country: str | None = "us",
    language: str | None = "en",
    api_key: str | None = None,
    spellcheck: bool = False,
    timeout: int = 30,
    clear_cache: bool = False,
    ttl: int = 7,
    logging: bool = False
)

Parameters

  • search_term (str): The query string to search for
  • num_results (int): Maximum number of results to return (default: 5)
  • country (str | None): Country code for search results (default: "us")
  • language (str | None): Language code for search results (default: "en")
  • api_key (str | None): Brave Search API key (default: None, uses BRAVE_API_KEY environment variable)
  • spellcheck (bool): Enable spell checking on queries (default: False)
  • timeout (int): Request timeout in seconds (default: 30)
  • clear_cache (bool): Clear existing cached results (default: False)
  • ttl (int): Time-to-live for cached results in days (default: 7)
  • logging (bool): Enable cache operation logging (default: False)

Methods

async search_result() -> dict

Returns the full search results as a dictionary from the Brave Search API.

async urls() -> list[str]

Returns a list of URLs from the search results.

async detailed_search_result() -> list[dict]

Returns detailed information for each search result including title, URL, description, and metadata.

async print() -> None

Pretty prints the full search results.

@classmethod async close_session()

Closes the shared aiohttp session. Call this when done with all searches.

Authentication

Set your Brave Search API key as an environment variable:

# Linux/macOS
export BRAVE_API_KEY="your-api-key-here"

# Windows (Command Prompt)
set BRAVE_API_KEY=your-api-key-here

# Windows (PowerShell)
$env:BRAVE_API_KEY = "your-api-key-here"

Or provide it directly:

search = Searcherator("My search term", api_key="your-api-key-here")

Exception Hierarchy

SearcheratorError (base exception)
├── SearcheratorAuthError (authentication failures)
├── SearcheratorRateLimitError (rate limit exceeded)
├── SearcheratorTimeoutError (request timeout)
└── SearcheratorAPIError (other API errors)
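
Because every exception derives from SearcheratorError, a handler for the base class also catches all the specific ones, so specific handlers have to come first. The sketch below illustrates that ordering. The classes are local stand-ins that mirror the tree above so the example runs without the package installed; in real code they would be imported from searcherator, and describe is a hypothetical helper.

```python
# Local stand-ins mirroring the documented hierarchy (illustration only).
class SearcheratorError(Exception): ...
class SearcheratorAuthError(SearcheratorError): ...
class SearcheratorRateLimitError(SearcheratorError): ...
class SearcheratorTimeoutError(SearcheratorError): ...
class SearcheratorAPIError(SearcheratorError): ...


def describe(exc: Exception) -> str:
    """Map an exception to a handling decision, most specific handler first."""
    try:
        raise exc
    except SearcheratorRateLimitError:
        return "rate limited: back off and retry"
    except SearcheratorError:
        # Catch-all for every other exception in the hierarchy.
        return "other Searcherator failure"


print(describe(SearcheratorRateLimitError()))
print(describe(SearcheratorTimeoutError()))
```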

Rate Limiting

Searcherator automatically handles rate limiting to respect Brave Search API quotas:

  • Automatic throttling - Requests are automatically spaced to stay within limits
  • Concurrent control - Built-in semaphore limits concurrent requests
  • Rate limit tracking - Monitor your usage via instance attributes

The default configuration safely handles up to ~13 requests per second, well under typical API limits.
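
The combination of throttling and concurrency control described above can be pictured with a small asyncio sketch. It is an illustration of the general technique (a semaphore for concurrency plus a minimum interval between request starts), not the package's actual implementation; the Throttle class, its parameters, and the values used are all assumptions.

```python
import asyncio
import time


class Throttle:
    """Cap concurrent requests and enforce a minimum gap between request starts."""

    def __init__(self, max_concurrent: int, min_interval: float) -> None:
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._lock = asyncio.Lock()
        self._min_interval = min_interval
        self._next_start = 0.0  # earliest monotonic time the next request may start

    async def run(self, make_request):
        async with self._semaphore:
            async with self._lock:
                now = time.monotonic()
                wait = self._next_start - now
                if wait > 0:
                    await asyncio.sleep(wait)
                self._next_start = max(now, self._next_start) + self._min_interval
            return await make_request()


async def demo() -> list[float]:
    throttle = Throttle(max_concurrent=2, min_interval=0.05)
    starts: list[float] = []

    async def fake_request() -> None:
        # Stand-in for a real API call; records when it started.
        starts.append(time.monotonic())
        await asyncio.sleep(0.01)

    await asyncio.gather(*(throttle.run(fake_request) for _ in range(4)))
    return starts


starts = asyncio.run(demo())
gaps = [later - earlier for earlier, later in zip(starts, starts[1:])]
print([round(gap, 3) for gap in gaps])
```

Holding the lock while sleeping serializes request starts, so the spacing holds even when many tasks are launched at once with asyncio.gather.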

Caching

Results are automatically cached to disk:

  • Location: data/search/ directory
  • Format: JSON files
  • TTL: Configurable (default: 7 days)
  • Cache key: Based on search term, language, country, and num_results

To disable caching for a specific search:

search = Searcherator("Python", clear_cache=True, ttl=0)

Best Practices

  1. Always close the session when done:

    try:
        results = await search.search_result()  # Your searches
    finally:
        await Searcherator.close_session()
    
  2. Use batch processing for multiple searches:

    results = await asyncio.gather(*[s.search_result() for s in searches])
    
  3. Handle exceptions appropriately:

    try:
        results = await search.search_result()
    except SearcheratorRateLimitError:
        await asyncio.sleep(1)  # Wait, then retry
    
  4. Monitor rate limits for high-volume applications:

    if search.rate_remaining_per_month < 1000:
        print("Monthly quota is running low")  # Alert or throttle
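
The "wait and retry" step in item 3 could look something like the sketch below, which retries with exponential backoff. It is one possible approach, not behavior built into the package. RateLimited is a local stand-in for SearcheratorRateLimitError so the example runs on its own, and retry_on_rate_limit, its parameters, and the delays are hypothetical.

```python
import asyncio


class RateLimited(Exception):
    """Local stand-in for SearcheratorRateLimitError (illustration only)."""


async def retry_on_rate_limit(make_call, attempts: int = 3, base_delay: float = 0.01):
    """Await make_call(), retrying with exponential backoff when rate limited."""
    for attempt in range(attempts):
        try:
            return await make_call()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller handle it
            await asyncio.sleep(base_delay * 2 ** attempt)


calls = {"count": 0}


async def flaky_search() -> str:
    # Fails twice, then succeeds, to exercise the retry path.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimited("simulated rate limit")
    return "ok"


result = asyncio.run(retry_on_rate_limit(flaky_search))
print(result, calls["count"])
```

With the real package, make_call would be something like `search.search_result` and the except clause would name SearcheratorRateLimitError.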
    

Testing

Run the test suite:

# Install test dependencies
pip install pytest pytest-asyncio

# Run all tests
pytest test_searcherator.py -v

# Run with coverage
pip install pytest-cov
pytest test_searcherator.py --cov=searcherator --cov-report=html

License

MIT License

Links

Author

Arved Klöhn - GitHub
