Unique Web Search
A powerful, configurable web search tool for retrieving and processing the latest information from the internet. This package provides intelligent search capabilities with support for multiple search engines, web crawlers, and content processing strategies.
Architecture
The following diagram illustrates the complete architecture and workflow of the unique_web_search package:
Key Features
- Dual Execution Modes:
  - V1 (Traditional): Query refinement with single or multiple search strategies
  - V2 (Step-based Planning): Advanced research planning with parallel execution
- Multiple Search Engines:
  - Google Search
  - Bing Search
  - Brave Search
  - Jina Search
  - Tavily Search
  - Firecrawl Search
  - VertexAI (Gemini with Grounding)
  - Custom API (integrate any compatible web search API)
- Multiple Web Crawlers:
  - Basic HTTP Crawler
  - Crawl4AI
  - Jina Reader
  - Tavily Crawler
  - Firecrawl Crawler
- Intelligent Content Processing:
  - LLM-based summarization
  - Token-based truncation
  - Relevancy scoring and sorting
  - Content chunking and optimization
- Query Refinement:
  - BASIC Mode: Single optimized search query
  - ADVANCED Mode: Multiple targeted search queries for complex research
- Performance Optimized:
  - Parallel execution of search and crawl operations
  - Token limit management
  - Configurable timeouts and error handling
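To make the content-processing features above more concrete, here is a minimal sketch of relevancy sorting combined with token-based truncation. It is not the package's implementation: the `Chunk` class and the `fit_to_budget` helper are hypothetical names, and tokens are approximated by whitespace-separated words, whereas a real setup would likely use a model-specific tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical container for one piece of crawled content."""
    url: str
    text: str
    relevancy: float  # higher means more relevant to the query


def count_tokens(text: str) -> int:
    # Assumption: whitespace-separated words stand in for tokenizer tokens.
    return len(text.split())


def truncate(text: str, max_tokens: int) -> str:
    # Keep only the first max_tokens words of the text.
    return " ".join(text.split()[:max_tokens])


def fit_to_budget(chunks: list[Chunk], max_tokens: int) -> list[Chunk]:
    """Sort by relevancy (descending) and keep content until the budget is spent."""
    kept: list[Chunk] = []
    remaining = max_tokens
    for chunk in sorted(chunks, key=lambda c: c.relevancy, reverse=True):
        if remaining <= 0:
            break
        text = truncate(chunk.text, remaining)
        kept.append(Chunk(chunk.url, text, chunk.relevancy))
        remaining -= count_tokens(text)
    return kept
```

Processing the most relevant chunks first means that, when the budget runs out, it is the least relevant content that gets cut or dropped.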
Detailed Subsystem Docs
For deeper dives into each subsystem, see the dedicated READMEs:
- Search Engines — full catalogue of supported engines, configuration, and usage examples.
- Crawlers — comparison of crawling strategies (Basic, Crawl4AI, Tavily, Firecrawl, Jina) with setup guides.
- Executors — orchestration layer (V1 & V2) covering query refinement, planning, logging, and best practices.
Configuration
The tool uses environment variables and configuration files to manage API keys and settings. Key configuration areas include:
- Search engine selection and API keys
- Crawler selection and configuration
- Content processing strategies (SUMMARIZE, TRUNCATE, NONE)
- Token limits and relevancy thresholds
- Proxy configuration
- Debug and monitoring options
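As an illustration of environment-driven configuration, the sketch below reads a few settings from environment variables. The variable names (`ACTIVE_SEARCH_ENGINES`, `CONTENT_PROCESSING_STRATEGY`, `MAX_TOKENS`), the defaults, and the `Settings` class are assumptions made for this example; the changelog mentions an `active_search_engines` env variable, but its exact spelling and value format are not specified here, so check the subsystem READMEs for the real names.

```python
import os
from dataclasses import dataclass
from enum import Enum


class ContentProcessingStrategy(str, Enum):
    # The three strategies named in this README.
    SUMMARIZE = "SUMMARIZE"
    TRUNCATE = "TRUNCATE"
    NONE = "NONE"


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; not the package's actual config class."""
    active_search_engines: list[str]
    strategy: ContentProcessingStrategy
    max_tokens: int


def load_settings(env=None) -> Settings:
    # Accept a plain mapping so the function can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    raw_engines = env.get("ACTIVE_SEARCH_ENGINES", "google")  # assumed name and default
    engines = [e.strip().lower() for e in raw_engines.split(",") if e.strip()]
    strategy = ContentProcessingStrategy(
        env.get("CONTENT_PROCESSING_STRATEGY", "TRUNCATE").upper()  # assumed name
    )
    max_tokens = int(env.get("MAX_TOKENS", "4000"))  # assumed name and default
    return Settings(engines, strategy, max_tokens)
```

Using an enum for the strategy makes an unsupported value fail at load time with a `ValueError` rather than later during content processing.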
Workflow
- Input: User query or structured search plan
- Configuration: Load settings and initialize services
- Execution:
  - V1: Query refinement → Search → Crawl → Process
  - V2: Execute planned steps in parallel → Process
- Content Processing: Clean, summarize/truncate, and chunk content
- Optimization: Reduce to token limits and sort by relevance
- Output: Return structured content chunks optimized for LLM consumption
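The V2 branch of the workflow above, where planned steps run in parallel, can be sketched with `asyncio.gather`. This is an illustrative outline only: `Step`, `run_search`, `read_url`, and `execute_plan` are hypothetical names, and the two handlers are stubs standing in for real search-engine and crawler calls. The 10-second timeout mirrors the default crawler timeout mentioned in the changelog.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class Step:
    """Hypothetical plan step: either a search query or a URL to read."""
    kind: str    # "search" or "read_url"
    target: str  # the query string or the URL


async def run_search(query: str) -> str:
    # Stub: a real handler would call the configured search engine.
    await asyncio.sleep(0)
    return f"results for {query!r}"


async def read_url(url: str) -> str:
    # Stub: a real handler would call the configured crawler.
    await asyncio.sleep(0)
    return f"content of {url}"


HANDLERS = {"search": run_search, "read_url": read_url}


async def execute_step(step: Step, timeout: float = 10.0) -> str:
    handler = HANDLERS[step.kind]
    try:
        # Bound each step so one slow request cannot stall the whole plan.
        return await asyncio.wait_for(handler(step.target), timeout)
    except asyncio.TimeoutError:
        return f"timed out: {step.target}"


async def execute_plan(steps: list[Step]) -> list[str]:
    # gather runs all steps concurrently and returns results in input order.
    return list(await asyncio.gather(*(execute_step(s) for s in steps)))
```

A plan mixing both step kinds could then be run with `asyncio.run(execute_plan([Step("search", "python"), Step("read_url", "https://example.com")]))`.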
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[1.7.0] - 2025-12-01
- Added full VertexAI search engine integration (Gemini + Google grounding) with service-account authentication and redirect resolution.
- Introduced the pluggable Custom API search engine so customers can register any compliant web-search backend via simple GET/POST specs.
[1.6.1] - 2025-11-20
- Cleaner call of tool name with display name in logger tool.
[1.6.0] - 2025-11-20
- Include message log messages
[1.5.4] - 2025-11-12
- Move pytest and pytest-asyncio to dev dependencies
[1.5.3] - 2025-11-10
- Use SkipJsonSchema for mode under WebSearchMode config to prevent displaying an editable field
[1.5.2] - 2025-11-10
- Separate the configuration of modes to prevent breaking the frontend
[1.5.1] - 2025-11-10
- Flag V2 mode and Advanced Query Refinement as Beta
[1.5.0] - 2025-11-10
- Add support for private endpoint transport (for Workload identity authentication)
[1.4.0] - 2025-11-10
- Expose Search Mode Configuration
[1.3.6] - 2025-10-29
- Fix minor notification display issue and remove unnecessary log
[1.3.5] - 2025-10-29
- Upgrading azure-ai-projects to 1.0.0 version (relevant for bing search)
[1.3.4] - 2025-10-28
- Removing unused tool-specific `get_tool_call_result_for_loop_history` function
[1.3.3] - 2025-10-14
- Fix bug in selecting the refine query mode
[1.3.2] - 2025-10-10
- Add possibility to switch proxy auth protocol (http or https)
[1.3.1] - 2025-10-09
- Update loading path of `DEFAULT_GPT_4o` from `unique_toolkit`
[1.3.0] - 2025-10-06
- Proxy Authentication Support: Route search engine and crawler requests through proxies with multiple authentication methods:
  - Username/Password authentication
  - Client Certificate authentication
- Active Crawlers: Dynamic crawler activation system allowing selective enablement of crawling services:
  - In-house crawlers: Control activation via environment variables for internal crawlers (Basic, Crawl4AI)
  - External crawlers: Auto-activate when API keys are configured (Firecrawl, Jina, Tavily)
- Test Coverage: Added comprehensive tests to ensure web search tool stability and reliability
[1.2.0] - 2025-09-29
- Mark new crawlers as experimental
[1.1.0] - 2025-09-24
- Set active search engine through `active_search_engines` env variable
[1.0.3] - 2025-09-23
- Add field to track execution time of the executors
[1.0.2] - 2025-09-23
- Parallelize step execution for V2 mode.
[1.0.1] - 2025-09-23
- Add octet-stream to blacklisted content types and allow changing the unwanted types from config
[1.0.0] - 2025-09-18
- Bump toolkit version to allow for both patch and minor updates
[0.2.0] - 2025-09-17
- Add support for Brave and Grounding by Bing through Azure
[0.1.4] - 2025-09-17
- Updated to latest toolkit
[0.1.3] - 2025-09-17
- Add content utf8 cleanup logic when processing content
[0.1.2] - 2025-09-15
- Fix minor bug in transforming toolResponse to toolCallResult
[0.1.1] - 2025-09-15
Added
- WebSearchV2Executor: New step-based execution model supporting both search and direct URL reading operations
- BaseWebSearchExecutor: Abstract base class providing common functionality between executor versions
- Enhanced Schema: New model `WebSearchPlan` for structured web search planning
- Flexible Step Execution: Support for mixed search and URL reading operations in a single plan
Changed
- Architecture Refactor: Improved executor structure with better separation of concerns
- Configuration Enhancement: Added experimental features flag to switch between V1 and V2 modes
- Progress Reporting: Enhanced with step-specific notifications and better user feedback
Maintained
- Backward Compatibility: Existing V1 executor functionality preserved
- API Consistency: No breaking changes to existing tool interfaces
[0.1.0] - 2025-09-12
- Code simplification
- Enable new crawlers
- Default cleaning of search results
- Refactor of code structure and crawler location
[0.0.6] - 2025-09-05
- Updated unique_web_search README.
[0.0.5] - 2025-09-04
- Path change of loading local .env.
[0.0.4] - 2025-09-01
- Reduce default crawler timeout to 10s.
[0.0.3] - 2025-08-18
- Auto-register Tool in Factory.
[0.0.2] - 2025-08-18
- Moved out of private repo to public repo.
[0.0.1] - 2025-08-18
- Initial release of `web_search`.