Unique Web Search

A powerful, configurable web search tool for retrieving and processing the latest information from the internet. This package provides intelligent search capabilities with support for multiple search engines, web crawlers, and content processing strategies.

Architecture

The following diagram illustrates the complete architecture and workflow of the unique_web_search package:

[Figure: Web Search Tool Architecture]

Key Features

  • Dual Execution Modes:

    • V1 (Traditional): Query refinement with single or multiple search strategies
    • V2 (Step-based Planning): Advanced research planning with parallel execution
  • Multiple Search Engines:

    • Google Search
    • Bing Search
    • Brave Search
    • Jina Search
    • Tavily Search
    • Firecrawl Search
    • VertexAI (Gemini with Grounding)
    • Custom API (integrate any compatible web search API)
  • Multiple Web Crawlers:

    • Basic HTTP Crawler
    • Crawl4AI
    • Jina Reader
    • Tavily Crawler
    • Firecrawl Crawler
  • Intelligent Content Processing:

    • LLM-based summarization
    • Token-based truncation
    • Relevancy scoring and sorting
    • Content chunking and optimization
  • Query Refinement:

    • BASIC Mode: Single optimized search query
    • ADVANCED Mode: Multiple targeted search queries for complex research
  • Performance Optimized:

    • Parallel execution of search and crawl operations
    • Token limit management
    • Configurable timeouts and error handling
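To make the content-processing features above concrete, here is a minimal sketch of token-based truncation combined with relevancy filtering and sorting. It is not the package's implementation: the `Page` class, the `select_pages` function, and the whitespace "tokenizer" are all assumptions chosen for illustration, and a real setup would count tokens with the model's own tokenizer.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical container for one crawled page (not a package type)."""
    url: str
    text: str
    relevancy: float  # assumed score in [0, 1], higher is better


def truncate_tokens(text: str, max_tokens: int) -> str:
    """Keep at most max_tokens whitespace-separated tokens.

    Whitespace splitting is a crude stand-in for a real tokenizer.
    """
    return " ".join(text.split()[:max_tokens])


def select_pages(pages: list[Page], min_relevancy: float, token_budget: int) -> list[Page]:
    """Drop low-relevancy pages, sort best-first, and fit into a token budget."""
    kept = sorted(
        (p for p in pages if p.relevancy >= min_relevancy),
        key=lambda p: p.relevancy,
        reverse=True,
    )
    result: list[Page] = []
    remaining = token_budget
    for page in kept:
        if remaining <= 0:
            break
        # The most relevant pages consume the budget first.
        text = truncate_tokens(page.text, remaining)
        remaining -= len(text.split())
        result.append(Page(page.url, text, page.relevancy))
    return result
```

Spending the budget in relevancy order means that when the limit is tight, the least relevant content is what gets cut.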

Detailed Subsystem Docs

For deeper dives into each subsystem, see the dedicated READMEs:

  • Search Engines — full catalogue of supported engines, configuration, and usage examples.
  • Crawlers — comparison of crawling strategies (Basic, Crawl4AI, Tavily, Firecrawl, Jina) with setup guides.
  • Executors — orchestration layer (V1 & V2) covering query refinement, planning, logging, and best practices.

Configuration

The tool uses environment variables and configuration files to manage API keys and settings. Key configuration areas include:

  • Search engine selection and API keys
  • Crawler selection and configuration
  • Content processing strategies (SUMMARIZE, TRUNCATE, NONE)
  • Token limits and relevancy thresholds
  • Proxy configuration
  • Debug and monitoring options
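As a rough illustration of environment-driven configuration, the sketch below reads a few settings into a typed object. The variable names (`EXAMPLE_SEARCH_ENGINE`, `EXAMPLE_CRAWLER`, and so on), the defaults, and the `WebSearchSettings` class are hypothetical and are not the package's actual settings; only the three strategy names come from the list above.

```python
import os
from dataclasses import dataclass
from enum import Enum
from typing import Mapping, Optional


class ContentStrategy(str, Enum):
    SUMMARIZE = "SUMMARIZE"
    TRUNCATE = "TRUNCATE"
    NONE = "NONE"


@dataclass(frozen=True)
class WebSearchSettings:
    """Hypothetical settings object; field names are illustrative only."""
    engine: str
    crawler: str
    strategy: ContentStrategy
    max_tokens: int
    proxy_url: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> WebSearchSettings:
    """Build settings from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env
    return WebSearchSettings(
        engine=env.get("EXAMPLE_SEARCH_ENGINE", "google"),
        crawler=env.get("EXAMPLE_CRAWLER", "basic"),
        # Raises ValueError for an unknown strategy, so typos fail at startup.
        strategy=ContentStrategy(env.get("EXAMPLE_CONTENT_STRATEGY", "TRUNCATE").upper()),
        max_tokens=int(env.get("EXAMPLE_MAX_TOKENS", "4000")),
        proxy_url=env.get("EXAMPLE_PROXY_URL"),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test without touching the real environment.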

Dependency management (uv.lock + min/latest testing)

This package is a library and uses uv for dependency management.

We additionally run the tests against minimal dependency versions to ensure that the listed version ranges are valid. NOTE: We use lowest-direct, not lowest. The lowest strategy attempts to use the lowest possible dependency versions transitively, which causes issues if a dependency has incorrect metadata. Example:

  • google-cloud-aiplatform says it works with shapely<3.0.0.
  • The lowest resolver picks 1.0, which needs Python 2 -> breaks.

Therefore we use lowest-direct, which sets only our direct dependencies to their lowest versions. However, this correctly verifies our minimum dependencies only if our code lists all the required dependencies and never imports a transitive dependency. We therefore use deptry to ensure that we don't use transitive dependencies and that we have no unused dependencies.

Test locally

  • Latest deps and deptry:

    cd tool_packages/unique_web_search
    uv sync
    uv run pytest
    uv run deptry

  • Min deps:

    cd tool_packages/unique_web_search
    uv venv
    # Install runtime deps at minimum versions
    uv pip install -e . --resolution=lowest-direct
    # Install dev deps from [dependency-groups] (we only care about runtime dep minimums)
    uv export --only-group dev --no-hashes | uv pip install -r -
    # Use --no-sync to prevent uv from "fixing" the versions
    uv run --no-sync pytest

Workflow

  1. Input: User query or structured search plan
  2. Configuration: Load settings and initialize services
  3. Execution:
    • V1: Query refinement → Search → Crawl → Process
    • V2: Execute planned steps in parallel → Process
  4. Content Processing: Clean, summarize/truncate, and chunk content
  5. Optimization: Reduce to token limits and sort by relevance
  6. Output: Return structured content chunks optimized for LLM consumption
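The parallel part of the workflow above can be sketched with asyncio. This is an assumption-laden illustration, not the package's executor: `search`, `crawl`, `run_step`, and `run_plan` are hypothetical names, and the two stubs return placeholder data instead of calling a real search engine or crawler.

```python
import asyncio


async def search(query: str) -> list[str]:
    """Hypothetical search stub: returns two placeholder URLs per query."""
    await asyncio.sleep(0)  # stands in for network I/O
    slug = query.replace(" ", "-")
    return [f"https://example.com/{slug}/{i}" for i in range(2)]


async def crawl(url: str) -> str:
    """Hypothetical crawl stub: returns placeholder page content."""
    await asyncio.sleep(0)  # stands in for network I/O
    return f"content of {url}"


async def run_step(query: str) -> list[str]:
    """One step: search, then crawl all hits concurrently."""
    urls = await search(query)
    return list(await asyncio.gather(*(crawl(u) for u in urls)))


async def run_plan(queries: list[str]) -> list[str]:
    """Run all planned steps concurrently and flatten their results."""
    per_step = await asyncio.gather(*(run_step(q) for q in queries))
    return [page for pages in per_step for page in pages]


def run(queries: list[str]) -> list[str]:
    """Synchronous entry point for the sketch."""
    return asyncio.run(run_plan(queries))
```

`asyncio.gather` returns results in the order the awaitables were passed, so the output order follows the plan even though the steps overlap in time. Content processing and token-limit reduction would then run on the flattened list.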
