
Web Search Agent

A modular Python package for intelligent web search that processes search results into structured, section-based information.

Overview

Web Search Agent is an infrastructure tool that performs web searches on a given topic, processes the results with AI, and returns structured information organized into relevant sections. It helps users and developers quickly gather and organize comprehensive information on specific topics.

Features

  • Topic-based intelligent web searching
  • AI-powered query generation for comprehensive coverage
  • Automatic organization of search results into logical sections
  • Section-specific sub-queries for in-depth analysis
  • Structured JSON output for easy processing
  • Configurable search parameters and model settings
  • Support for multiple search APIs (Tavily, Exa)

Installation

Prerequisites

  • Python 3.11 or higher
  • OpenAI API key
  • Search API key (Tavily, Exa, etc.)

Installation Steps

# Create and activate a virtual environment (optional but recommended)
conda create -n web-search-agent python=3.11.11
conda activate web-search-agent

# Install from PyPI
pip install web-search-agent

# Or install from source
git clone https://github.com/leepokai/web-search-agent.git
cd web-search-agent
pip install -e .

# Set up environment variables
cp .env.example .env
# Edit .env file to add your API keys
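
The README does not spell out which variables `.env` must contain; the names below are common conventions for these providers (OpenAI, Tavily, Exa) and are an assumption, not confirmed by this package. A typical file might look like:

```shell
# .env — API credentials loaded at startup
# Variable names are assumed conventions; check .env.example for the authoritative list
OPENAI_API_KEY=sk-...        # used by the planner model
TAVILY_API_KEY=tvly-...      # needed if search_api="tavily"
EXA_API_KEY=...              # needed if search_api="exa"
```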

Usage

Basic Usage

import asyncio
from web_search_agent import WebSearchAgent, WebSearchConfig

async def main():
    # Create a default agent
    agent = WebSearchAgent()
    
    # Search for a topic
    result = await agent.search("Artificial Intelligence ethics")
    
    # Process the results
    print(f"Topic: {result.title}")
    print(f"Found {len(result.sections)} sections")
    
    for section in result.sections:
        print(f"- {section.title}: {section.description[:100]}...")

# Run the async function
asyncio.run(main())

Advanced Usage

import asyncio
from web_search_agent import WebSearchAgent, WebSearchConfig, search_multiple_topics

async def research_project():
    # Custom configuration
    config = WebSearchConfig(
        planner_model="o1",
        initial_queries_count=3,
        max_sections=5,
        search_api="tavily",
        search_api_config={"include_raw_content": True, "max_results": 3}
    )
    
    # Search multiple topics
    topics = [
        "Renewable energy advancements",
        "Future of remote work"
    ]
    
    results = await search_multiple_topics(
        topics,
        config=config,
        verbose=True,
        save_output=True,
        output_dir="research_output"
    )
    
    # Process results
    for result in results:
        if hasattr(result, 'title'):  # Check if valid result
            print(f"\nTopic: {result.title}")
            for section in result.sections:
                print(f"- {section.title}")

# Run the async function
if __name__ == "__main__":
    asyncio.run(research_project())
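
The `hasattr(result, 'title')` guard above filters out failed searches, since `search_multiple_topics` may return error placeholders alongside valid results. That check can be factored into a small helper; the sketch below uses stand-in objects to illustrate the pattern without needing the real `WebSearchResult` class:

```python
from types import SimpleNamespace

def valid_results(results):
    """Keep only items that look like search results (have both title and sections)."""
    return [r for r in results if hasattr(r, "title") and hasattr(r, "sections")]

# Stand-in objects for illustration; real code would receive WebSearchResult instances
ok = SimpleNamespace(title="Future of remote work", sections=[])
failed = Exception("search API timeout")  # a failed topic may surface as a non-result object

print([r.title for r in valid_results([ok, failed])])  # ['Future of remote work']
```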

Command Line Usage

# Basic usage
web-search-agent

API Reference

Main Classes

  • WebSearchAgent: Core class for performing searches and processing results
  • WebSearchConfig: Configuration class for customizing search behavior
  • WebSearchResult: Result container with all search data and sections

Key Functions

  • search_topic(topic, config=None, verbose=False): Search for a single topic
  • search_multiple_topics(topics, config=None, verbose=False, save_output=False): Search for multiple topics; pass save_output=True (and optionally output_dir) to write results to disk

Configuration Options

The WebSearchConfig class provides the following configuration options:

WebSearchConfig(
    # LLM Configuration
    llm_provider="openai",         # LLM provider (currently supports "openai")
    planner_model="o1",            # Model for planning and query generation
    
    # Search Configuration
    search_api="tavily",           # Search API (tavily, exa, etc.)
    search_api_config={            # Additional search API configuration
        "include_raw_content": True,
        "max_results": 3
    },
    
    # Query Generation
    initial_queries_count=3,       # Number of initial search queries
    section_queries_count=2,       # Number of search queries per section
    
    # Control Parameters
    max_sections=5,                # Maximum number of sections to generate
    must_cover_section_title="..." # Prompt for must-cover sections
)

Output Format

The tool outputs structured data with the following main fields:

  • title: Main search topic
  • initial_queries: List of initial search queries
  • initial_responses: Search results for each initial query
  • sections: List of generated sections
    • Each section contains: title, description, search_queries, and search_responses

Example Output

{
  "title": "Artificial Intelligence Ethics",
  "initial_queries": [...],
  "initial_responses": [...],
  "sections": [
    {
      "title": "Privacy concerns in AI systems",
      "description": "An analysis of how AI technologies impact personal privacy...",
      "search_queries": [...],
      "search_responses": [...]
    },
    ...
  ]
}
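
Results written with save_output=True can be post-processed with the standard library alone. The sketch below assumes the JSON shape shown above (field names are taken from the example output; the exact on-disk filename is package-defined and not stated in this README):

```python
import json

# Inline sample matching the documented shape; in practice you would
# json.load() the file that search_multiple_topics wrote to output_dir.
raw = """
{
  "title": "Artificial Intelligence Ethics",
  "initial_queries": ["AI ethics overview"],
  "initial_responses": [],
  "sections": [
    {"title": "Privacy concerns in AI systems",
     "description": "An analysis of how AI technologies impact personal privacy...",
     "search_queries": [],
     "search_responses": []}
  ]
}
"""

result = json.loads(raw)
outline = [section["title"] for section in result["sections"]]
print(result["title"], "->", outline)
```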

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
