Agent-friendly local web search using SearXNG for snippets and Crawl4AI for full-text retrieval.

Local Web Search

Local Web Search is an agent-friendly local web search backend. It gives agents two simple tools:

  • web_search asks a local SearXNG instance for search results and returns compact snippets.
  • web_fetch uses Crawl4AI to fetch full page text only when the agent asks for a specific result.

SQLite caching keeps result IDs and fetched page text stable across tool calls. The tool names intentionally look like normal web tools, but all execution happens in your local environment.

Install

python -m pip install local-web-search

Install optional integrations as needed:

python -m pip install "local-web-search[server]"
python -m pip install "local-web-search[agents]"
python -m pip install "local-web-search[server,agents]"

The PyPI distribution is named local-web-search, but the Python import package is local_agentic_search:

from local_agentic_search import LocalSearchService

Quick Start

Clone the repository if you want the bundled Docker Compose SearXNG setup:

git clone https://github.com/maestromaximo/local-web-search.git
cd local-web-search
docker compose up -d searxng
python -m pip install -e ".[server,agents,dev]"
crawl4ai-setup

Check that SearXNG is reachable:

local-web-search doctor

Search from the CLI:

local-web-search search "OpenAI Agents SDK function tools" --max-results 5

Fetch full text for a result returned by search:

local-web-search fetch res_... --max-chars 4000

Fetch a URL directly:

local-web-search fetch https://example.com --max-chars 4000

Run the HTTP API:

local-web-search serve --host 127.0.0.1 --port 8099

CLI

local-web-search doctor
local-web-search search "query" [--max-results 5] [--language en]
local-web-search fetch res_... [--start 0] [--max-chars 4000]
local-web-search fetch https://example.com [--max-chars 4000]
local-web-search serve [--host 127.0.0.1] [--port 8099]

Run local-web-search --help or local-web-search <command> --help for the full command reference.

OpenAI Agents SDK Usage

from agents import Agent, Runner
from local_agentic_search.agent_tools import build_agent_tools

web_search, web_fetch = build_agent_tools(build_container_if_missing=True)

agent = Agent(
    name="Research assistant",
    instructions=(
        "Use web_search when current web information is useful. Search results "
        "are snippets. Call web_fetch with a result_id before relying on page "
        "details not present in a snippet."
    ),
    model="gpt-4.1-mini",
    tools=[web_search, web_fetch],
)

result = Runner.run_sync(agent, "Find recent information about Crawl4AI.")
print(result.final_output)

By default, build_agent_tools() assumes Docker/SearXNG is already running and prints a yellow console warning once per process. To let the tool factory start SearXNG when the container is missing or stopped:

web_search, web_fetch = build_agent_tools(build_container_if_missing=True)

To silence the warning while keeping startup manual:

web_search, web_fetch = build_agent_tools(suppress_docker_warning=True)

Responses API Tool Schemas

For direct OpenAI API tool loops, use:

from local_agentic_search.tool_schemas import responses_tool_schemas

tools = responses_tool_schemas()

Your application still executes web_search and web_fetch locally and returns their JSON outputs as function call outputs.
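
The local execution step can be sketched as a small dispatcher that maps a tool name and a JSON argument string to a local handler and returns a JSON string. The dispatch_tool_call helper and both stub handlers are hypothetical, and the web_search argument names (query, max_results) are assumed; only the web_fetch arguments (result_id, start, max_chars) appear in this README.

```python
import json
from typing import Any, Callable, Dict


def fake_web_search(query: str, max_results: int = 5) -> Dict[str, Any]:
    """Stand-in handler; a real one would call the local search backend."""
    return {"query": query, "max_results": max_results, "results": []}


def fake_web_fetch(result_id: str, start: int = 0, max_chars: int = 4000) -> Dict[str, Any]:
    """Stand-in handler; a real one would return a slice of the page text."""
    return {"result_id": result_id, "start": start, "max_chars": max_chars, "text": ""}


HANDLERS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "web_search": fake_web_search,
    "web_fetch": fake_web_fetch,
}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the named tool locally and serialize its result as JSON."""
    handler = HANDLERS.get(name)
    if handler is None:
        # Report unknown tools as data so the model can recover.
        return json.dumps({"error": f"unknown tool: {name}"})
    arguments = json.loads(arguments_json or "{}")
    return json.dumps(handler(**arguments))
```

In a tool loop, the returned string would be sent back to the model as the output of the corresponding function call.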

HTTP API

Start the server with:

local-web-search serve

Routes:

  • GET /health
  • GET /search?q=...&max_results=5
  • POST /search
  • POST /fetch
  • GET /openai/tools

See examples/http_client_example.py for a minimal async HTTP client.
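
For a quick check without extra dependencies, the request URL for GET /search can be built with the standard library. This sketch assumes the host and port shown in the Quick Start; search_url is a hypothetical helper, not part of the package.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8099"  # host and port used in the Quick Start


def search_url(query: str, max_results: int = 5, base_url: str = BASE_URL) -> str:
    """Build the GET /search URL with a properly escaped query string."""
    params = urlencode({"q": query, "max_results": max_results})
    return f"{base_url}/search?{params}"


print(search_url("OpenAI Agents SDK function tools"))
# prints http://127.0.0.1:8099/search?q=OpenAI+Agents+SDK+function+tools&max_results=5
```

With the server running, the resulting URL can be opened with urllib.request.urlopen or any HTTP client.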

Examples

  • examples/agents_sdk_example.py: build OpenAI Agents SDK tools named web_search and web_fetch.
  • examples/http_client_example.py: call the local HTTP API with httpx.
  • examples/responses_schema_example.py: print the Responses API tool schemas.

Search Result Shape

Each web_search result includes:

{
  "result_id": "res_...",
  "search_id": "search_...",
  "position": 1,
  "title": "Page title",
  "url": "https://example.com",
  "snippet": "Compact SearXNG snippet",
  "site_links": [],
  "full_text_available": true,
  "full_text_command": {
    "tool": "web_fetch",
    "arguments": {
      "result_id": "res_...",
      "start": 0,
      "max_chars": 4000
    }
  },
  "full_text_command_text": "web_fetch(result_id='res_...', start=0, max_chars=4000)"
}

web_fetch returns a page slice with start, end, total_chars, has_more, and a next_fetch_command when more text is available.
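
A paging loop over these fields might look like the sketch below. It runs against a hypothetical in-memory stand-in for web_fetch; the text field name and the assumption that end is an exclusive offset usable as the next start are not confirmed by this README.

```python
from typing import Any, Callable, Dict


def read_all(fetch: Callable[..., Dict[str, Any]], result_id: str, max_chars: int = 4000) -> str:
    """Concatenate page slices until the fetcher reports has_more == False."""
    parts = []
    start = 0
    while True:
        page = fetch(result_id=result_id, start=start, max_chars=max_chars)
        parts.append(page["text"])
        if not page["has_more"]:
            break
        start = page["end"]  # assumed: the next slice begins where this one ended
    return "".join(parts)


def make_fake_fetch(document: str) -> Callable[..., Dict[str, Any]]:
    """Hypothetical stand-in for web_fetch that slices an in-memory string."""

    def fetch(result_id: str, start: int = 0, max_chars: int = 4000) -> Dict[str, Any]:
        end = min(start + max_chars, len(document))
        return {
            "result_id": result_id,
            "text": document[start:end],
            "start": start,
            "end": end,
            "total_chars": len(document),
            "has_more": end < len(document),
        }

    return fetch
```

An agent would normally stop as soon as it has what it needs rather than reading every slice; the loop only shows how the offsets chain together.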

Configuration

Environment variables:

  • SEARXNG_BASE_URL: defaults to http://127.0.0.1:8888
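  • LOCAL_WEB_SEARCH_CACHE: SQLite cache path, defaults to .cache/local_agentic_search.sqlite3
  • LOCAL_WEB_SEARCH_RESULTS_TTL_SECONDS: cache lifetime for search results, defaults to 86400 (24 hours)
  • LOCAL_WEB_SEARCH_PAGES_TTL_SECONDS: cache lifetime for fetched page text, defaults to 604800 (7 days)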
  • LOCAL_WEB_SEARCH_FETCH_CHARS: default fetch slice size, defaults to 4000
  • LOCAL_WEB_SEARCH_MAX_FETCH_CHARS: maximum fetch slice size, defaults to 20000
  • LOCAL_WEB_SEARCH_CRAWL_TIMEOUT_MS: Crawl4AI timeout, defaults to 45000
  • LOCAL_WEB_SEARCH_DOCKER_COMPOSE_FILE: optional compose file path
  • LOCAL_WEB_SEARCH_DOCKER_CONTAINER: optional SearXNG container name
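
The variables with documented defaults could be read as in the following sketch. The Settings dataclass and load_settings function are hypothetical names for illustration; the package's own configuration loader may be structured differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    searxng_base_url: str
    cache_path: str
    results_ttl_seconds: int
    pages_ttl_seconds: int
    fetch_chars: int
    max_fetch_chars: int
    crawl_timeout_ms: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""

    def integer(name: str, default: int) -> int:
        return int(env.get(name, default))

    return Settings(
        searxng_base_url=env.get("SEARXNG_BASE_URL", "http://127.0.0.1:8888"),
        cache_path=env.get("LOCAL_WEB_SEARCH_CACHE", ".cache/local_agentic_search.sqlite3"),
        results_ttl_seconds=integer("LOCAL_WEB_SEARCH_RESULTS_TTL_SECONDS", 86400),
        pages_ttl_seconds=integer("LOCAL_WEB_SEARCH_PAGES_TTL_SECONDS", 604800),
        fetch_chars=integer("LOCAL_WEB_SEARCH_FETCH_CHARS", 4000),
        max_fetch_chars=integer("LOCAL_WEB_SEARCH_MAX_FETCH_CHARS", 20000),
        crawl_timeout_ms=integer("LOCAL_WEB_SEARCH_CRAWL_TIMEOUT_MS", 45000),
    )
```

Passing a plain dict instead of os.environ makes the loader easy to exercise in tests.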

The automatic Docker path (enabled with build_container_if_missing=True) first runs a fast docker container inspect check. If the configured container is paused, it runs docker container unpause; if the container is missing or stopped, it runs docker compose up -d --build searxng.
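
The decision described here can be written as a pure function from container state to command. This is a sketch under stated assumptions: docker_action is a hypothetical helper, the state strings follow Docker's State.Status values, and treating every state other than running or paused as "needs compose up" is a simplification of "missing or stopped".

```python
from typing import List, Optional


def docker_action(state: Optional[str], container_name: str) -> Optional[List[str]]:
    """Return the command to run for a container state, or None if nothing is needed.

    `state` is the container's status string, or None when the container
    does not exist.
    """
    if state == "running":
        return None  # already up, nothing to do
    if state == "paused":
        return ["docker", "container", "unpause", container_name]
    # Missing, exited, or any other state: (re)build and start the service.
    return ["docker", "compose", "up", "-d", "--build", "searxng"]
```

Keeping the decision separate from the subprocess call means it can be checked without Docker installed.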

Publishing

The repository includes GitHub Actions for CI and PyPI publishing. The publish workflow runs on every push to main and on manual dispatch. It checks PyPI for existing local-web-search releases, bumps the patch version if necessary, commits that version bump back to main, then builds and publishes with PyPI Trusted Publishing.
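
The "bump the patch version if necessary" step can be pictured as follows. This is an illustrative sketch only: next_patch_version is a hypothetical helper, it assumes plain MAJOR.MINOR.PATCH version strings, and the real workflow may be implemented differently.

```python
from typing import Iterable


def next_patch_version(current: str, published: Iterable[str]) -> str:
    """Return `current` if it is unpublished, else the next free patch version."""
    taken = set(published)
    major, minor, patch = (int(part) for part in current.split("."))
    while f"{major}.{minor}.{patch}" in taken:
        patch += 1
    return f"{major}.{minor}.{patch}"
```

For example, with 0.1.0 already on PyPI, next_patch_version("0.1.0", ["0.1.0"]) yields "0.1.1", while an unpublished version is returned unchanged.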

To enable publishing, add a PyPI Trusted Publisher for:

  • PyPI project: local-web-search
  • Owner: maestromaximo
  • Repository: local-web-search
  • Workflow: publish.yml
  • Environment: pypi

License and Notices

Local Web Search is licensed under the Apache License 2.0.

SearXNG is a separate service licensed under the GNU Affero General Public License v3.0 or later. Crawl4AI is licensed under the Apache License 2.0. See THIRD_PARTY_NOTICES.md for details.

You are responsible for respecting robots.txt, website terms, copyright, authentication boundaries, privacy rules, and rate limits when searching and fetching public web content.
