Multi-Search-API

Intelligent multi-provider search API with automatic fallback and caching

Features

  • 🔄 Automatic Fallback: Seamlessly switches between multiple search providers
  • 💾 Smart Caching: 24-hour result caching to reduce API calls
  • 🚦 Rate Limit Handling: Automatic detection and provider rotation on HTTP 402/429
  • 🔌 Multiple Providers: Support for Serper, SearXNG, Brave, DuckDuckGo, and Google scraping
  • 🎯 Zero Configuration: Works out of the box with sensible defaults
  • 📊 Provider Management: Track status, cache stats, and rate limits

Supported Search Providers

Provider       | Type        | Quality          | Rate Limits      | API Key Required
Serper         | API         | ⭐⭐⭐⭐⭐ Excellent | 2,500 free/month | Yes
SearXNG        | Meta-search | ⭐⭐⭐⭐ Good       | Unlimited        | No
Brave          | API         | ⭐⭐⭐⭐⭐ Excellent | 1 req/sec free   | Yes
DuckDuckGo     | Scraping    | ⭐⭐⭐⭐ Good       | ~20 req/min      | No
Google Scraper | Scraping    | ⭐⭐⭐ Fair        | Use sparingly    | No

Installation

pip install multi-search-api

Quick Start

Basic Usage

from multi_search_api import SmartSearchTool

# Initialize (uses environment variables for API keys)
search = SmartSearchTool()

# Perform a search
result = search.search("Python programming tutorials")

print(f"Provider used: {result['provider']}")
print(f"Results found: {len(result['results'])}")

for item in result['results'][:3]:
    print(f"\n{item['title']}")
    print(f"{item['snippet']}")
    print(f"{item['link']}")

With API Keys

from multi_search_api import SmartSearchTool

# Initialize with explicit API keys
search = SmartSearchTool(
    serper_api_key="your-serper-key",
    brave_api_key="your-brave-key"
)

result = search.search("AI news 2025", num_results=10)

Environment Variables

Create a .env file:

SERPER_API_KEY=your_serper_api_key_here
BRAVE_API_KEY=your_brave_api_key_here

The tool will automatically load these keys.
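
If you would rather pass the keys explicitly than rely on automatic loading, one option is to read them from the environment yourself. The sketch below is illustrative only: the helper name `keys_from_env` is hypothetical and not part of the package. Only the `serper_api_key` and `brave_api_key` parameter names come from the constructor documented here.

```python
import os


def keys_from_env(env=None):
    """Collect SmartSearchTool keyword arguments from environment variables.

    Hypothetical helper, not part of multi-search-api. Unset or empty
    variables are skipped, so the matching parameters keep their defaults.
    """
    env = os.environ if env is None else env
    mapping = {
        "SERPER_API_KEY": "serper_api_key",
        "BRAVE_API_KEY": "brave_api_key",
    }
    return {kwarg: env[name] for name, kwarg in mapping.items() if env.get(name)}


if __name__ == "__main__":
    # With the package installed: SmartSearchTool(**keys_from_env())
    print(sorted(keys_from_env()))
```

Passing a plain dict as `env` makes the helper easy to exercise without touching the real environment.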

Advanced Usage

Recent Content Search

import asyncio
from multi_search_api import SmartSearchTool

async def search_recent():
    search = SmartSearchTool()

    # Search for content from last 14 days
    results = await search.search_recent_content(
        query="AI breakthroughs",
        max_results=10,
        days_back=14,
        language="en"
    )

    return results

results = asyncio.run(search_recent())

Cache Management

from multi_search_api import SmartSearchTool

search = SmartSearchTool()

# Get cache statistics
stats = search.get_status()
print(f"Cache entries: {stats['cache']['total_entries']}")

# Clear expired cache entries
search.clear_cache()

# Disable caching
search.disable_cache()

# Re-enable caching
search.enable_cache()

Rate Limit Management

from multi_search_api import SmartSearchTool

search = SmartSearchTool()

# Check provider status
status = search.get_status()
print(f"Active providers: {status['providers']}")
print(f"Rate limited: {status['rate_limited_providers']}")

# Reset rate limit tracking (e.g., new day)
search.reset_rate_limits()

CrewAI Integration

from crewai import Agent, Task
from multi_search_api import SmartSearchTool

search_tool = SmartSearchTool()

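researcher = Agent(
    role='Research Analyst',
    goal='Find relevant information on the web',
    # backstory is a required Agent field in CrewAI
    backstory='An analyst who verifies claims against web sources',
    tools=[search_tool],
    verbose=True
)

task = Task(
    description="Research the latest AI developments",
    # expected_output is a required Task field in CrewAI
    expected_output="A short summary of recent AI developments with source links",
    agent=researcher
)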

How It Works

Provider Priority

  1. Serper - Best quality results, 2,500 free searches/month
  2. SearXNG - Free unlimited searches, variable quality
  3. Brave - Excellent quality, 1 req/sec limit on free tier
  4. DuckDuckGo - Free, no API key, ~20 req/min with exponential backoff
  5. Google Scraper - Last resort fallback

Automatic Fallback

When a provider fails or hits rate limits (HTTP 402/429), the tool automatically:

  1. Detects the failure
  2. Marks the provider as rate-limited for the session
  3. Tries the next available provider
  4. Caches successful results to minimize future API calls
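
The four steps above can be sketched as a small loop. This is a simplified illustration and not the package's actual implementation: the `search_with_fallback` function, the `(name, fetch)` provider pairs, and the local `RateLimitError` class are all stand-ins invented for the example.

```python
class RateLimitError(Exception):
    """Stand-in for the error a provider raises on HTTP 402/429."""


def search_with_fallback(query, providers, rate_limited, cache):
    """Try providers in priority order, skipping those marked rate-limited."""
    if query in cache:
        return {**cache[query], "cache_hit": True}
    for name, fetch in providers:
        if name in rate_limited:
            continue  # already marked for this session
        try:
            results = fetch(query)
        except RateLimitError:
            rate_limited.add(name)  # steps 1-2: detect the failure and mark it
            continue                # step 3: try the next provider
        except Exception:
            continue                # other failures also fall through
        result = {"query": query, "provider": name,
                  "results": results, "cache_hit": False}
        cache[query] = result       # step 4: cache the successful result
        return result
    raise RuntimeError("all providers failed or are rate-limited")


def limited(query):
    raise RateLimitError("429")


def working(query):
    return [{"title": "Example", "snippet": "...", "link": "https://example.com"}]


providers = [("primary", limited), ("secondary", working)]
rate_limited, cache = set(), {}
first = search_with_fallback("python", providers, rate_limited, cache)
second = search_with_fallback("python", providers, rate_limited, cache)
print(first["provider"], second["cache_hit"])
```

In the demo, the first call marks `primary` as rate-limited and is served by `secondary`. The second call is answered from the cache without contacting any provider.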

Caching Strategy

  • Results are cached for 24 hours
  • Cache keys are based on query, num_results, and language
  • Automatic cleanup of expired entries
  • Optional cache disable for real-time needs
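
As a rough illustration of the strategy above, the sketch below builds a key from the three listed fields and checks a 24-hour expiry. Everything here is an assumption made for the example: the function names, the SHA-256 key, the query normalisation, and the entry layout are not taken from the package, whose internal cache format may differ.

```python
import hashlib
import json
import time

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24 hours


def make_cache_key(query, num_results, language):
    """Build a stable key from query, num_results, and language."""
    payload = json.dumps(
        {
            # Normalising case and whitespace is an assumption of this sketch.
            "query": query.strip().lower(),
            "num_results": num_results,
            "language": language,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_expired(stored_at, now=None, ttl=CACHE_TTL_SECONDS):
    """Return True when an entry stored at `stored_at` (epoch seconds) is older than ttl."""
    now = time.time() if now is None else now
    return now - stored_at > ttl


def purge_expired(entries, now=None):
    """Return only live entries; `entries` maps key -> {"stored_at": ..., "result": ...}."""
    return {k: v for k, v in entries.items() if not is_expired(v["stored_at"], now)}
```

Keying on all three fields means that the same query with a different `num_results` or `language` is treated as a separate entry.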

API Reference

SmartSearchTool

SmartSearchTool(
    ollama_api_key: str | None = None,
    serper_api_key: str | None = None,
    brave_api_key: str | None = None,
    searxng_instance: str | None = None,
    enable_cache: bool = True
)

Methods

  • search(query: str, **kwargs) -> dict: Perform a search
  • search_recent_content(query: str, max_results: int, days_back: int, language: str) -> list: Search recent content (a coroutine; await it as in the Recent Content Search example)
  • get_status() -> dict: Get provider and cache status
  • clear_cache(): Clear expired cache entries
  • reset_rate_limits(): Reset rate limit tracking
  • disable_cache(): Disable caching
  • enable_cache(): Enable caching
  • run(query: str) -> str: CrewAI-compatible search method

Search Result Format

{
    "query": "search query",
    "provider": "SerperProvider",
    "cache_hit": False,
    "timestamp": "2025-10-26T10:30:00",
    "results": [
        {
            "title": "Result Title",
            "snippet": "Result description or snippet",
            "link": "https://example.com",
            "source": "serper"
        },
        # ... more results
    ]
}
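
A result in this format can be consumed with ordinary dictionary access. The helper below is a hypothetical convenience function, not part of the package. It relies only on the keys shown above and uses `.get()` so that a missing field does not raise.

```python
def format_results(result, limit=3):
    """Render a search result dict (in the format shown above) as plain text."""
    origin = "cache" if result.get("cache_hit") else result.get("provider", "unknown")
    lines = [f"{result.get('query', '')} (via {origin})"]
    for item in result.get("results", [])[:limit]:
        lines.append(f"- {item.get('title', '')}: {item.get('link', '')}")
    return "\n".join(lines)


sample = {
    "query": "search query",
    "provider": "SerperProvider",
    "cache_hit": False,
    "timestamp": "2025-10-26T10:30:00",
    "results": [
        {
            "title": "Result Title",
            "snippet": "Result description or snippet",
            "link": "https://example.com",
            "source": "serper",
        },
    ],
}

print(format_results(sample))
```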

Getting API Keys

Serper (Recommended)

  1. Visit serper.dev
  2. Sign up for a free account
  3. Get 2,500 free searches per month
  4. Copy your API key

Brave Search

  1. Visit brave.com/search/api
  2. Sign up for API access
  3. Free tier: 1 request/second
  4. Copy your subscription token

SearXNG (No Key Needed)

SearXNG is automatically configured with public instances. No setup required!

DuckDuckGo (No Key Needed)

DuckDuckGo is included by default. No setup required!

Features:

  • No API key required
  • Automatic rate limiting (~20 requests/minute)
  • Exponential backoff on rate limit errors
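
For readers unfamiliar with exponential backoff, a generic sketch follows. It does not reproduce the package's DuckDuckGo provider: the function names, the default delays, and the jitter are assumptions chosen for illustration.

```python
import random
import time


def backoff_delays(base=1.0, factor=2.0, max_delay=60.0, retries=5):
    """Yield the wait before each retry: base, base*factor, ... capped at max_delay."""
    delay = base
    for _ in range(retries):
        yield min(delay, max_delay)
        delay *= factor


def call_with_backoff(func, is_rate_limit, sleep=time.sleep, **backoff):
    """Call func(); on a rate-limit error, wait and retry with growing delays."""
    last_error = None
    # The leading 0.0 is the initial attempt, made without waiting.
    for delay in [0.0, *backoff_delays(**backoff)]:
        if delay:
            sleep(delay + random.uniform(0, delay * 0.1))  # add up to 10% jitter
        try:
            return func()
        except Exception as exc:
            if not is_rate_limit(exc):
                raise  # unrelated errors are not retried
            last_error = exc
    raise last_error
```

The `sleep` parameter is injectable so the retry logic can be tested without real waiting, and `is_rate_limit` lets the caller decide which exceptions count as a rate limit.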

Configuration

Custom Cache Directory

from multi_search_api.cache import SearchResultCache

cache = SearchResultCache(cache_file="custom/path/cache.json")

Custom SearXNG Instance

search = SmartSearchTool(searxng_instance="https://your-searxng.com")

Development

# Clone repository
git clone https://github.com/joop/multi-search-api.git
cd multi-search-api

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=multi_search_api --cov-report=html

# Format code
ruff format .

# Lint code
ruff check .

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Author

Joop Snijder

Changelog

0.1.5 (2025-12-03)

  • Improved SearXNG rate limit handling with instance cooldown (5 min)
  • Rate-limited SearXNG instances are now tracked and skipped
  • Raises RateLimitError when all SearXNG instances are rate-limited

0.1.4 (2025-12-03)

  • Updated DuckDuckGo dependency from duckduckgo-search to ddgs (package renamed)

0.1.3 (2025-12-03)

  • DuckDuckGo is now a standard dependency (no longer optional)

0.1.2 (2025-12-03)

  • Added DuckDuckGo search provider (free, no API key)
  • Exponential backoff rate limiting for DuckDuckGo

0.1.1 (2025-11-03)

  • Fixed thread-safety issues in SearchResultCache
  • Added threading.Lock for concurrent cache operations
  • Comprehensive thread-safety tests

0.1.0 (2025-10-26)

  • Initial release
  • Support for Serper, SearXNG, Brave, and Google scraping
  • Automatic fallback and rate limit handling
  • 24-hour result caching
  • CrewAI integration support
