Haystack integration for Serpex web search - supporting Google, Bing, DuckDuckGo, Brave, Yahoo, and Yandex

Serpex Haystack Integration

Serpex integration for Haystack - bringing powerful multi-engine web search capabilities to your Haystack pipelines.

Overview

Serpex is a unified web search API that provides access to multiple search engines including Google, Bing, DuckDuckGo, Brave, Yahoo, and Yandex. This integration allows you to seamlessly incorporate web search results into your Haystack RAG (Retrieval-Augmented Generation) pipelines and AI applications.

Key Features

  • 🔍 Multi-Engine Support: Switch between Google, Bing, DuckDuckGo, Brave, Yahoo, and Yandex
  • ⚡ High Performance: Fast and reliable API with automatic retries
  • 🎯 Rich Results: Get organic search results with titles, snippets, and URLs
  • 🕒 Time Filters: Filter results by day, week, month, or year
  • 🔒 Type-Safe: Fully typed with comprehensive type hints
  • 📝 Haystack Native: Seamless integration with Haystack 2.0+ components

Installation

pip install serpex-haystack

Quick Start

Get Your API Key

Sign up at Serpex.dev to get your free API key.
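
The examples in this README read the key from the SERPEX_API_KEY environment variable. As a minimal sketch (the helper name require_env is hypothetical and not part of this package), you can fail early with a clear message when the variable is missing:

```python
import os


def require_env(name: str = "SERPEX_API_KEY") -> str:
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; not part of serpex-haystack.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set or is empty")
    return value
```

Calling require_env() once at startup surfaces a missing key before any search request is made.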

Basic Usage

from haystack.utils import Secret
from haystack_integrations.components.websearch.serpex import SerpexWebSearch

# Create a web search component
web_search = SerpexWebSearch(
    api_key=Secret.from_env_var("SERPEX_API_KEY"),
    engine="google",  # or "bing", "duckduckgo", "brave", "yahoo", "yandex"
)

# Use it standalone
results = web_search.run(query="What is Haystack AI?")
for doc in results["documents"]:
    print(f"Title: {doc.meta['title']}")
    print(f"URL: {doc.meta['url']}")
    print(f"Snippet: {doc.content}\n")

RAG Pipeline Example

from haystack import Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret
from haystack_integrations.components.websearch.serpex import SerpexWebSearch

# Create a simple RAG pipeline with web search
prompt_template = """
Based on the following search results, answer the question.

Search Results:
{% for doc in documents %}
- {{ doc.meta.title }}: {{ doc.content }}
  Source: {{ doc.meta.url }}
{% endfor %}

Question: {{ query }}

Answer:
"""

pipe = Pipeline()
pipe.add_component("search", SerpexWebSearch(api_key=Secret.from_env_var("SERPEX_API_KEY")))
pipe.add_component("prompt", PromptBuilder(template=prompt_template))
pipe.add_component("llm", OpenAIGenerator(api_key=Secret.from_env_var("OPENAI_API_KEY")))

pipe.connect("search.documents", "prompt.documents")
pipe.connect("prompt", "llm")

# Run the pipeline
result = pipe.run({
    "search": {"query": "Latest developments in AI agents"},
    "prompt": {"query": "Latest developments in AI agents"}
})

print(result["llm"]["replies"][0])
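
The pipeline above passes the same query string to both the "search" and "prompt" components. A small helper can build that input dictionary from a single string so the two values cannot drift apart. This is only a sketch; build_rag_inputs is a hypothetical name, and the keys assume the component names used in the example above.

```python
from typing import Any, Dict


def build_rag_inputs(query: str) -> Dict[str, Dict[str, Any]]:
    """Build pipeline inputs that send one query to both components.

    The keys "search" and "prompt" match the names passed to
    pipe.add_component(...) in the RAG example.
    """
    query = query.strip()
    if not query:
        raise ValueError("query must be a non-empty string")
    return {"search": {"query": query}, "prompt": {"query": query}}


inputs = build_rag_inputs("Latest developments in AI agents")
# Pass the dictionary to the pipeline: result = pipe.run(inputs)
```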

Advanced Features

Multiple Search Engines

# Use different engines for different queries
# (api_key is omitted here, so it falls back to the SERPEX_API_KEY environment variable)
google_search = SerpexWebSearch(engine="google")
bing_search = SerpexWebSearch(engine="bing")
duckduckgo_search = SerpexWebSearch(engine="duckduckgo")
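
When you query several engines for the same topic, the same page often appears more than once. The sketch below merges result lists and keeps the first document seen for each URL. It relies only on the documented meta["url"] field; StubDocument is a stand-in for Haystack's Document so the example runs without network access, and merge_by_url is a hypothetical helper name.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class StubDocument:
    """Stand-in for a Haystack Document with only the fields used here."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def merge_by_url(*result_lists: Iterable[Any]) -> List[Any]:
    """Merge documents from several searches, keeping the first hit per URL."""
    seen = set()
    merged: List[Any] = []
    for docs in result_lists:
        for doc in docs:
            url = doc.meta.get("url")
            if url is None:
                # Nothing to compare on, so keep the document as-is.
                merged.append(doc)
                continue
            key = url.rstrip("/")  # treat a trailing slash as insignificant
            if key not in seen:
                seen.add(key)
                merged.append(doc)
    return merged


google_docs = [
    StubDocument("a", {"url": "https://example.com/"}),
    StubDocument("b", {"url": "https://example.org/x"}),
]
bing_docs = [
    StubDocument("c", {"url": "https://example.com"}),
    StubDocument("d", {"url": "https://example.net"}),
]
merged = merge_by_url(google_docs, bing_docs)
print([doc.content for doc in merged])
```

With the real component, the arguments would be the "documents" lists returned by each search's run() call.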

Time Range Filtering

# Get only recent results
recent_results = web_search.run(
    query="AI news",
    time_range="week"  # Options: "day", "week", "month", "year", "all"
)
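
If time_range comes from user input, it can help to validate it before calling run(). The following sketch checks a value against the options listed above. The helper name normalize_time_range is hypothetical, and treating None as "all" is a choice made for this sketch rather than documented component behavior.

```python
from typing import Optional

# Values taken from the time_range options listed in this README.
VALID_TIME_RANGES = ("all", "day", "week", "month", "year")


def normalize_time_range(value: Optional[str]) -> str:
    """Return a lowercase time_range value, mapping None to "all"."""
    if value is None:
        return "all"  # assumption for this sketch
    normalized = value.strip().lower()
    if normalized not in VALID_TIME_RANGES:
        allowed = ", ".join(VALID_TIME_RANGES)
        raise ValueError(f"time_range must be one of: {allowed}; got {value!r}")
    return normalized
```

For example, web_search.run(query="AI news", time_range=normalize_time_range(user_value)) rejects typos such as "weekly" before a request is sent.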

Runtime Configuration Override

# Override settings at runtime
results = web_search.run(
    query="Python tutorials",
    engine="duckduckgo",  # Override default engine
)
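
The runtime engine override makes it straightforward to fall back to another engine when one returns nothing. Below is a minimal sketch that tries engines in order and returns the first non-empty result. search_with_fallback is a hypothetical helper, StubSearch stands in for SerpexWebSearch so the example runs offline, and exceptions raised by run() are deliberately not handled here.

```python
from typing import Any, Dict, List, Sequence


def search_with_fallback(search: Any, query: str, engines: Sequence[str]) -> Dict[str, Any]:
    """Try each engine in order and return the first non-empty result."""
    for engine in engines:
        result = search.run(query=query, engine=engine)
        documents = result.get("documents") or []
        if documents:
            return {"engine": engine, "documents": documents}
    # No engine produced results.
    return {"engine": None, "documents": []}


class StubSearch:
    """Offline stand-in: only "bing" returns a result."""

    def run(self, query: str, engine: str) -> Dict[str, List[str]]:
        return {"documents": [f"{engine}: {query}"] if engine == "bing" else []}


outcome = search_with_fallback(StubSearch(), "Python tutorials", ["google", "bing", "duckduckgo"])
print(outcome["engine"])
```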

Error Handling with Retries

The component includes built-in retry logic with exponential backoff:

web_search = SerpexWebSearch(
    api_key=Secret.from_env_var("SERPEX_API_KEY"),
    timeout=10.0,  # Request timeout in seconds
    retry_attempts=3  # Number of retry attempts
)
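
To make the idea of exponential backoff concrete, here is a generic sketch of the technique. It is not the component's actual implementation: the base delay, the set of retried exceptions, and the reading of retry_attempts as "retries after the first call" are all assumptions made for illustration.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retry_attempts: int = 2,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying up to retry_attempts times with doubling delays."""
    if retry_attempts < 0:
        raise ValueError("retry_attempts must be >= 0")
    for attempt in range(retry_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == retry_attempts:
                raise  # out of retries: surface the last error
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

The sleep parameter is injectable so the delay schedule can be tested without actually waiting.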

Component Reference

SerpexWebSearch

A Haystack component for fetching web search results via the Serpex API.

Parameters

  • api_key (Secret, optional): Serpex API key. Defaults to the SERPEX_API_KEY environment variable.
  • engine (str, optional): Search engine to use. Options: "auto", "google", "bing", "duckduckgo", "brave", "yahoo", "yandex". Defaults to "google".
  • timeout (float, optional): Request timeout in seconds. Defaults to 10.0.
  • retry_attempts (int, optional): Number of retry attempts. Defaults to 2.

Inputs

  • query (str): The search query string.
  • engine (str, optional): Override the default search engine.
  • time_range (str, optional): Filter by time range ("all", "day", "week", "month", "year").

Outputs

  • documents (List[Document]): List of Haystack Document objects containing search results.

Each document includes:

  • content: The search result snippet
  • meta:
    • title: Result title
    • url: Result URL
    • position: Position in search results
    • query: Original search query
    • engine: Search engine used
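
The metadata fields above are enough to build a numbered source list for display or citation. The sketch below orders documents by meta["position"] and formats the title and URL. StubDocument is a stand-in for Haystack's Document, format_sources is a hypothetical helper, and it assumes position is numeric.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List


@dataclass
class StubDocument:
    """Stand-in for a Haystack Document with only the fields used here."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def format_sources(documents: Iterable[Any]) -> List[str]:
    """Render documents as numbered source lines ordered by meta["position"]."""
    # Documents without a position sort last.
    ordered = sorted(documents, key=lambda d: d.meta.get("position", float("inf")))
    return [
        f"[{i}] {doc.meta.get('title', '(untitled)')} - {doc.meta.get('url', '')}"
        for i, doc in enumerate(ordered, start=1)
    ]


docs = [
    StubDocument("second snippet", {"title": "Second", "url": "https://example.com/2", "position": 2}),
    StubDocument("first snippet", {"title": "First", "url": "https://example.com/1", "position": 1}),
]
sources = format_sources(docs)
print("\n".join(sources))
```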

Examples

Check out the examples directory for more use cases.

Why Serpex?

  • 🌐 Multi-Engine Access: One API for all major search engines
  • ⚡ Fast & Reliable: Optimized infrastructure with 99.9% uptime
  • 💰 Cost-Effective: Competitive pricing with generous free tier
  • 📊 Rich Metadata: Comprehensive result data including positions, timestamps, and more
  • 🔒 Secure: Enterprise-grade security and data privacy
  • 🚀 Scalable: Handle thousands of requests per second

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development Setup

# Clone the repository
git clone https://github.com/divyeshradadiya/serpex-haystack.git
cd serpex-haystack

# Install with development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run linting
ruff check .
black --check .

# Run type checking
mypy src/

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Acknowledgments

Built with ❤️ for the Haystack community by Divyesh Radadiya


Note: This is a community-maintained integration. For Serpex API support, visit serpex.dev.
