An intelligent web search agent that provides structured results
Web Search Agent
A modular Python package for intelligent web search that processes search results into structured, section-based information.
Overview
Web Search Agent is an infrastructure tool that performs web searches on a given topic, processes the results using AI, and returns structured information organized into relevant sections. It helps developers and other users quickly gather and organize comprehensive information on a specific topic.
Features
- Topic-based intelligent web searching
- AI-powered query generation for comprehensive coverage
- Automatic organization of search results into logical sections
- Section-specific sub-queries for in-depth analysis
- Structured JSON output for easy processing
- Configurable search parameters and model settings
- Support for multiple search APIs (Tavily, Exa)
Installation
Prerequisites
- Python 3.11 or higher
- OpenAI API key
- Search API key (Tavily, Exa, etc.)
Installation Steps
# Create and activate a virtual environment (optional but recommended)
conda create -n web-search-agent python=3.11.11
conda activate web-search-agent
# Install from PyPI
pip install web-search-agent
# Or install from source
git clone https://github.com/leepokai/web-search-agent.git
cd web-search-agent
pip install -e .
# Set up environment variables
cp .env.example .env
# Edit .env file to add your API keys
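Before running the agent, it can help to confirm that the keys are actually visible to Python. The sketch below is a minimal check and not part of the package. The variable names `OPENAI_API_KEY` and `TAVILY_API_KEY` are assumptions based on the conventions of those services; check `.env.example` for the names the package really reads. The helper `missing_keys` is hypothetical.

```python
import os
from typing import Mapping, Sequence

# Assumed variable names; .env.example lists the names the package actually uses.
ASSUMED_KEYS = ("OPENAI_API_KEY", "TAVILY_API_KEY")


def missing_keys(env: Mapping[str, str], names: Sequence[str] = ASSUMED_KEYS) -> list[str]:
    """Return the names that are absent or blank in the given environment mapping."""
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys(os.environ)
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All assumed API keys are set.")
```

Passing the environment in as a mapping keeps the check easy to test with a plain dictionary.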
Usage
Basic Usage
import asyncio
from web_search_agent import WebSearchAgent, WebSearchConfig

async def main():
    # Create a default agent
    agent = WebSearchAgent()

    # Search for a topic
    result = await agent.search("Artificial Intelligence ethics")

    # Process the results
    print(f"Topic: {result.title}")
    print(f"Found {len(result.sections)} sections")
    for section in result.sections:
        print(f"- {section.title}: {section.description[:100]}...")

# Run the async function
asyncio.run(main())
Advanced Usage
import asyncio
from web_search_agent import WebSearchAgent, WebSearchConfig, search_multiple_topics

async def research_project():
    # Custom configuration
    config = WebSearchConfig(
        planner_model="o1",
        initial_queries_count=3,
        max_sections=5,
        search_api="tavily",
        search_api_config={"include_raw_content": True, "max_results": 3}
    )

    # Search multiple topics
    topics = [
        "Renewable energy advancements",
        "Future of remote work"
    ]
    results = await search_multiple_topics(
        topics,
        config=config,
        verbose=True,
        save_output=True,
        output_dir="research_output"
    )

    # Process results
    for result in results:
        if hasattr(result, 'title'):  # Check if valid result
            print(f"\nTopic: {result.title}")
            for section in result.sections:
                print(f"- {section.title}")

# Run the async function
if __name__ == "__main__":
    asyncio.run(research_project())
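The `hasattr(result, 'title')` check above suggests that the returned list may contain entries that are not valid results. How `search_multiple_topics` reports failures is not documented here, so the following is only a sketch of one common pattern that produces such mixed lists: `asyncio.gather` with `return_exceptions=True`. The `fake_search` coroutine is a hypothetical stand-in, not the package's search call.

```python
import asyncio


async def fake_search(topic: str) -> dict:
    """Hypothetical stand-in for a real search call; fails on an empty topic."""
    if not topic:
        raise ValueError("empty topic")
    return {"title": topic, "sections": []}


async def gather_results(topics: list[str]) -> list:
    # return_exceptions=True puts raised exceptions into the result list
    # instead of propagating the first failure to the caller.
    return await asyncio.gather(
        *(fake_search(t) for t in topics), return_exceptions=True
    )


def split_results(results: list) -> tuple[list, list]:
    """Separate successful results from exceptions."""
    ok = [r for r in results if not isinstance(r, BaseException)]
    failed = [r for r in results if isinstance(r, BaseException)]
    return ok, failed


results = asyncio.run(gather_results(["Renewable energy advancements", ""]))
ok, failed = split_results(results)
print(len(ok), "succeeded;", len(failed), "failed")
```

Under that assumption, an explicit `isinstance` check against `BaseException` states the intent more directly than probing for a `title` attribute.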
Command Line Usage
# Basic usage
web-search-agent
API Reference
Main Classes
- WebSearchAgent: Core class for performing searches and processing results
- WebSearchConfig: Configuration class for customizing search behavior
- WebSearchResult: Result container with all search data and sections
Key Functions
- search_topic(topic, config=None, verbose=False): Search for a single topic
- search_multiple_topics(topics, config=None, verbose=False, save_output=False): Search for multiple topics
Configuration Options
The WebSearchConfig class provides the following configuration options:
WebSearchConfig(
    # LLM Configuration
    llm_provider="openai",           # LLM provider (currently supports "openai")
    planner_model="o1",              # Model for planning and query generation

    # Search Configuration
    search_api="tavily",             # Search API (tavily, exa, etc.)
    search_api_config={              # Additional search API configuration
        "include_raw_content": True,
        "max_results": 3
    },

    # Query Generation
    initial_queries_count=3,         # Number of initial search queries
    section_queries_count=2,         # Number of search queries per section

    # Control Parameters
    max_sections=5,                  # Maximum number of sections to generate
    must_cover_section_title="..."   # Prompt for must-cover sections
)
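These settings together bound how many search API calls one topic can trigger, which matters when the search API bills per query. The estimate below is a rough sketch. It assumes that each generated section issues exactly `section_queries_count` queries and that no other queries are made, which this README does not confirm. The helper `estimate_query_count` is hypothetical.

```python
def estimate_query_count(
    initial_queries_count: int,
    section_queries_count: int,
    max_sections: int,
) -> int:
    """Upper bound on search queries, assuming every section uses its full quota."""
    return initial_queries_count + max_sections * section_queries_count


# Values from the configuration shown above: 3 + 5 * 2
print(estimate_query_count(3, 2, 5))  # 13
```

With `max_results` set to 3, that bound corresponds to at most 39 individual search results per topic under the same assumptions.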
Output Format
The tool outputs structured data with the following main fields:
- title: Main search topic
- initial_queries: List of initial search queries
- initial_responses: Search results for each initial query
- sections: List of generated sections
  - Each section contains: title, description, search queries and responses
Example Output
{
  "title": "Artificial Intelligence Ethics",
  "initial_queries": [...],
  "initial_responses": [...],
  "sections": [
    {
      "title": "Privacy concerns in AI systems",
      "description": "An analysis of how AI technologies impact personal privacy...",
      "search_queries": [...],
      "search_responses": [...]
    },
    ...
  ]
}
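Because the output is plain JSON, saved results can be post-processed with the standard library alone. The sketch below parses a document shaped like the example above and prints a short summary. The elided lists are replaced with empty placeholders so the sample is valid JSON, and the `summarize` helper is hypothetical rather than part of the package.

```python
import json

# Sample shaped like the example output; empty lists stand in for the elided data.
sample = """
{
  "title": "Artificial Intelligence Ethics",
  "initial_queries": [],
  "initial_responses": [],
  "sections": [
    {
      "title": "Privacy concerns in AI systems",
      "description": "An analysis of how AI technologies impact personal privacy...",
      "search_queries": [],
      "search_responses": []
    }
  ]
}
"""


def summarize(raw: str) -> list[str]:
    """Return one line for the topic plus one line per section."""
    data = json.loads(raw)
    lines = [f"Topic: {data['title']}"]
    for section in data.get("sections", []):
        query_count = len(section.get("search_queries", []))
        lines.append(f"- {section['title']} ({query_count} queries)")
    return lines


for line in summarize(sample):
    print(line)
```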
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.