
A search-and-fetch toolkit for AI agents — MCP server and standalone Agent Skills powered by DuckDuckGo and Jina Reader


Web Forager



The thing about information on the web is that it doesn't want to be found. It wants to hide behind cookie banners, keep itself to itself, and generally behave like a cat that knows it's time for the vet. Web Forager is the sort of dogged, slightly grubby assistant who goes out there anyway — accompanied by a duck of questionable temperament — rummages through DuckDuckGo, tries Exa when DuckDuckGo pretends not to be home, fetches pages via Jina Reader, and when Jina is having one of its days, simply grabs them by hand. The results come back neatly converted for LLM consumption, which is to say, in a format that would make a librarian weep with either joy or despair, depending on the librarian.

A search-and-fetch toolkit for AI agents, available as an MCP server and as standalone Agent Skills:

  1. Search the web via DuckDuckGo
  2. Search news via DuckDuckGo News
  3. Fetch and convert web pages via Jina Reader

Also ships five Agent Skills that work independently — no MCP required — for research, fact-checking, news monitoring, competitive analysis, and technology evaluation.

Features

  • DuckDuckGo web search with safe search controls
  • DuckDuckGo news search with date-sorted results and source attribution
  • Fetch and convert URLs to markdown or JSON using Jina Reader
  • LLM-friendly output format option for search results
  • CLI for search, news, fetch, serve, and version commands
  • MCP tools for LLM integration
  • Five standalone Agent Skills for specialized research workflows
  • Docker support for containerized deployment

Installation

Prerequisites

  • Python 3.10 through 3.13 (3.14 is not yet supported)
  • uv (recommended) or pip

Install from PyPI (recommended)

# Using uv (recommended)
uv pip install web-forager

# Or using pip
pip install web-forager

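Install with uvx (for Claude Desktop)

# Install uv if you haven't already (the uvx command is included with uv)
pip install uv

# Optional: install Web Forager as a persistent tool.
# uvx can also fetch the package on demand, as in the MCP client configs below.
uv tool install web-forager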

Install from source

For development or to get the latest changes:

# Clone the repository
git clone https://github.com/CyranoB/web-forager.git
cd web-forager

# Install with uv (recommended)
uv pip install -e .

# Or with pip
pip install -e .

Docker

Build and run with Docker:

# Build the image (uses version from latest git tag)
docker build --build-arg VERSION=$(git describe --tags --abbrev=0 | sed 's/^v//') -t web-forager .

# Or specify a version manually
docker build --build-arg VERSION=2.0.2 -t web-forager .

# Run the server (MCP servers use STDIO, so typically run within an MCP client)
docker run -i web-forager

Usage

Starting the Server (STDIO Mode)

# Start the server in STDIO mode (for use with MCP clients like Claude)
web-forager serve

# Enable debug logging
web-forager serve --debug

Testing the Search Tool

# Search DuckDuckGo (JSON output, default)
web-forager search "your search query" --max-results 5 --safesearch moderate

# Search with LLM-friendly text output
web-forager search "your search query" --output-format text

Testing the News Search Tool

# Search DuckDuckGo news (JSON output, default)
web-forager news "your search query" --max-results 10 --safesearch moderate

# Search with LLM-friendly text output
web-forager news "your search query" --output-format text

Testing the Fetch Tool

# Fetch a URL and return markdown
web-forager fetch "https://example.com" --format markdown

# Fetch a URL and return JSON
web-forager fetch "https://example.com" --format json

# Limit output length
web-forager fetch "https://example.com" --max-length 2000

# Include generated image alt text
web-forager fetch "https://example.com" --with-images

Version Information

# Show version
web-forager version

# Show detailed version info
web-forager version --debug

MCP Client Setup

This MCP server works with any MCP-compatible client. Use one of the setups below.

Python 3.10 through 3.13 is supported; 3.14 is not supported yet. To enforce this range, pass --python ">=3.10,<3.14" to uvx. Verified with Python 3.12 and 3.13.

Claude Desktop

  1. Open Claude Desktop > Settings > Developer > Edit Config.
  2. Edit the config file:
    • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
    • Windows: %APPDATA%\Claude\claude_desktop_config.json
  3. Add the server config under mcpServers:
     {
       "mcpServers": {
         "web-forager": {
           "command": "uvx",
           "args": ["--python", ">=3.10,<3.14", "web-forager", "serve"]
         }
       }
     }
    
  4. Restart Claude Desktop.
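If you would rather not edit the file by hand, the merge in step 3 can be scripted. The following is a minimal sketch, not part of Web Forager: `add_web_forager` is a hypothetical helper that assumes the config file is plain JSON and preserves any servers or keys already present.

```python
import json
from pathlib import Path

# Server entry copied from the config shown in step 3.
WEB_FORAGER_ENTRY = {
    "command": "uvx",
    "args": ["--python", ">=3.10,<3.14", "web-forager", "serve"],
}


def add_web_forager(config_path: Path) -> dict:
    """Merge the web-forager entry into an MCP client config file.

    Hypothetical helper: existing servers and unrelated keys are kept.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        # Treat an empty file the same as a missing one.
        config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})["web-forager"] = WEB_FORAGER_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS, for example, this could be called as `add_web_forager(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")`. Back up the file first, since the sketch rewrites it in place.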

Claude Code

Add a local stdio server:

claude mcp add --transport stdio web-forager -- uvx --python ">=3.10,<3.14" web-forager serve

Optionally, run claude mcp list to verify that the server is registered, or claude mcp add-from-claude-desktop to import servers already configured in Claude Desktop.

Codex (CLI + IDE)

Add via CLI:

codex mcp add web-forager -- uvx --python ">=3.10,<3.14" web-forager serve

Or configure ~/.codex/config.toml:

[mcp_servers.web-forager]
command = "uvx"
args = ["--python", ">=3.10,<3.14", "web-forager", "serve"]

OpenCode

Add to your OpenCode config (~/.config/opencode/opencode.json or project opencode.json):

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "web-forager": {
      "type": "local",
      "command": ["uvx", "--python", ">=3.10,<3.14", "web-forager", "serve"],
      "enabled": true
    }
  }
}

Or run opencode mcp add and follow the prompts.

Cursor

Add to ~/.cursor/mcp.json (global) or .cursor/mcp.json (project):

{
  "mcpServers": {
    "web-forager": {
      "command": "uvx",
      "args": ["--python", ">=3.10,<3.14", "web-forager", "serve"]
    }
  }
}

Verify with:

cursor-agent mcp list

MCP Tools

The server exposes these tools to MCP clients:

@mcp.tool()
def duckduckgo_search(
    query: str,
    max_results: int = 5,
    safesearch: str = "moderate",
    output_format: str = "json",
) -> list | str:
    """Search DuckDuckGo for the given query."""


@mcp.tool()
def duckduckgo_news_search(
    query: str,
    max_results: int = 10,
    safesearch: str = "moderate",
    output_format: str = "json",
) -> list | str:
    """Search DuckDuckGo for recent news articles."""


@mcp.tool()
def jina_fetch(
    url: str,
    format: str = "markdown",
    max_length: int | None = None,
    with_images: bool = False,
) -> str | dict:
    """Fetch a URL and convert it using Jina Reader."""

Example usage in an MCP client:

# This is handled automatically by the MCP client
results = duckduckgo_search("Python programming", max_results=3)
news = duckduckgo_news_search("AI regulation 2026", max_results=5)
content = jina_fetch("https://example.com", format="markdown")

# Get LLM-friendly text output
text_results = duckduckgo_search("Python programming", output_format="text")
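For readers curious about what "handled automatically" means: MCP clients talk to the server with JSON-RPC 2.0 messages, and a tool invocation is a `tools/call` request. The sketch below only builds such a message for illustration. `build_tool_call` is a hypothetical helper, and a real client also performs an initialization handshake before calling tools.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids.
_request_ids = count(1)


def build_tool_call(name: str, **arguments) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# Roughly what a client sends for duckduckgo_search("Python programming", max_results=3).
request = build_tool_call("duckduckgo_search", query="Python programming", max_results=3)
print(request)
```

Over the STDIO transport each message is written as a single line, which is why the request is serialized without embedded newlines.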

API

Tool 1: Search

  • Tool Name: duckduckgo_search
  • Description: Search the web using DuckDuckGo (powered by the ddgs library)

Parameters

  • query (string, required): The search query
  • max_results (integer, optional, default: 5): Maximum number of search results to return
  • safesearch (string, optional, default: "moderate"): Safe search setting ("on", "moderate", or "off")
  • output_format (string, optional, default: "json"): Output format - "json" for structured data, "text" for LLM-friendly formatted string

Response

JSON format (default): A list of dictionaries:

[
  {
    "title": "Result title",
    "url": "https://example.com",
    "snippet": "Text snippet from the search result"
  }
]

Text format: An LLM-friendly formatted string:

Found 3 search results:

1. Result title
   URL: https://example.com
   Summary: Text snippet from the search result

2. Another result
   URL: https://example2.com
   Summary: Another snippet
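The two formats carry the same data, so the text layout can be derived from the JSON list. Here is a minimal sketch of that conversion, assuming the title/url/snippet keys shown above. `format_results_as_text` and the empty-result message are hypothetical and may differ from what the server actually emits.

```python
def format_results_as_text(results: list[dict]) -> str:
    """Render title/url/snippet dictionaries in the numbered text layout."""
    if not results:
        # Assumed wording for the empty case; not taken from the project.
        return "No search results found."
    lines = [f"Found {len(results)} search results:", ""]
    for index, item in enumerate(results, start=1):
        lines.append(f"{index}. {item.get('title', '')}")
        lines.append(f"   URL: {item.get('url', '')}")
        lines.append(f"   Summary: {item.get('snippet', '')}")
        lines.append("")  # blank line between entries
    return "\n".join(lines).rstrip()


sample = [
    {"title": "Result title", "url": "https://example.com", "snippet": "Text snippet from the search result"},
    {"title": "Another result", "url": "https://example2.com", "snippet": "Another snippet"},
]
print(format_results_as_text(sample))
```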

Tool 2: News Search

  • Tool Name: duckduckgo_news_search
  • Description: Search for recent news articles using DuckDuckGo (powered by the ddgs library)

Parameters

  • query (string, required): The news search query
  • max_results (integer, optional, default: 10): Maximum number of news results to return
  • safesearch (string, optional, default: "moderate"): Safe search setting ("on", "moderate", or "off")
  • output_format (string, optional, default: "json"): Output format - "json" for structured data, "text" for LLM-friendly formatted string

Response

JSON format (default): A list of dictionaries:

[
  {
    "title": "News headline",
    "url": "https://example.com/article",
    "snippet": "Article summary text",
    "date": "2026-03-01T12:00:00+00:00",
    "source": "News Outlet"
  }
]

Text format: An LLM-friendly formatted string:

Found 3 news results:

1. News headline
   URL: https://example.com/article
   Date: 2026-03-01T12:00:00+00:00
   Source: News Outlet
   Summary: Article summary text
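Because the `date` field in the JSON format is an ISO 8601 timestamp, results can be re-sorted or filtered on the client side with the standard library. This is a sketch under the assumption that every item carries a timezone-aware `date` like the sample above. `newest_first` and `published_since` are hypothetical helpers.

```python
from datetime import datetime, timezone


def newest_first(news: list[dict]) -> list[dict]:
    """Return news items sorted by their ISO 8601 `date`, newest first."""
    return sorted(news, key=lambda item: datetime.fromisoformat(item["date"]), reverse=True)


def published_since(news: list[dict], cutoff: datetime) -> list[dict]:
    """Keep items published at or after `cutoff`, newest first."""
    recent = [item for item in news if datetime.fromisoformat(item["date"]) >= cutoff]
    return newest_first(recent)


articles = [
    {"title": "Older story", "date": "2026-02-27T08:30:00+00:00"},
    {"title": "News headline", "date": "2026-03-01T12:00:00+00:00"},
    {"title": "Other time zone", "date": "2026-02-28T23:00:00+01:00"},
]
for article in published_since(articles, datetime(2026, 2, 28, tzinfo=timezone.utc)):
    print(article["date"], article["title"])
```

Offsets are normalized during comparison, so items reported in different time zones still sort correctly. Timestamps with a trailing `Z` instead of `+00:00` are only accepted by `datetime.fromisoformat` on Python 3.11 and later.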

Tool 3: Fetch

  • Tool Name: jina_fetch
  • Description: Fetch a URL and convert it to markdown or JSON using Jina Reader

Parameters

  • url (string, required): The URL to fetch and convert
  • format (string, optional, default: "markdown"): Output format ("markdown" or "json")
  • max_length (integer, optional): Maximum content length to return (None for no limit)
  • with_images (boolean, optional, default: false): Whether to generate alt text for images in the page

Response

For markdown format: a string containing markdown content

For JSON format: a dictionary with the structure:

{
  "url": "https://example.com",
  "title": "Page title",
  "content": "Markdown content"
}

Agent Skills

This repo includes five Agent Skills that orchestrate the MCP's search and fetch tools into specialized workflows. Each skill follows the open Agent Skills specification and works with Claude Code, Codex CLI, and other compatible agents.

All skills work without the MCP configured — they use the ddgs Python library and the Jina Reader HTTP API directly. If MCP tools are available in the session, they prefer those automatically.
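As a rough illustration of the direct path, the sketch below calls the ddgs library and maps its results onto the title/url/snippet shape used by the search tool. It assumes ddgs text results use the keys `title`, `href`, and `body`. The helper names are hypothetical and this is not the skills' actual code.

```python
def normalize_result(raw: dict) -> dict:
    """Map a raw ddgs text result onto the title/url/snippet shape."""
    return {
        "title": raw.get("title", ""),
        "url": raw.get("href", ""),      # assumed ddgs key for the link
        "snippet": raw.get("body", ""),  # assumed ddgs key for the summary
    }


def search(query: str, max_results: int = 5, safesearch: str = "moderate") -> list[dict]:
    """Search via ddgs directly, without going through the MCP server."""
    # Imported lazily so normalize_result works even when ddgs is not installed.
    from ddgs import DDGS

    raw_results = DDGS().text(query, max_results=max_results, safesearch=safesearch)
    return [normalize_result(item) for item in raw_results]
```

Calling `search("Python programming", max_results=3)` requires network access and `pip install ddgs`.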

Install via Plugin Marketplace (recommended)

Register this repo as a plugin marketplace, then install all five skills at once:

# Add the marketplace
/plugin marketplace add CyranoB/web-forager

# Install all 5 skills
/plugin install forager-skills@web-forager

Install individual skills

Claude Code:

# Install a specific skill
claude install-skill ./skills/web-research

# Or from GitHub
claude install-skill github:CyranoB/web-forager/skills/web-research

Manual (any agent): Copy a skill folder from skills/ into your agent's skills directory (e.g., ~/.claude/skills/ or .claude/skills/).

Available skills

| Skill | Triggers on | Output |
| --- | --- | --- |
| web-research | "research X", "look up X", "deep dive into X" | Adaptive report (quick answer / standard / deep dive) with citations |
| fact-check | "is it true that X", "verify this claim", "fact check this" | Verdict (Confirmed -> False) with evidence for and against |
| news-monitor | "what's new with X", "recent news about X", "catch me up on X" | Chronological news briefing with headlines and details |
| competitive-intel | "compare X vs Y", "which is better", "help me choose between" | Comparison matrix with pricing, pros/cons, and recommendation |
| tech-advisor | "should we adopt X", "is X production ready", "X vs Y for my needs" | Maturity scorecard (Adopt/Trial/Assess/Hold) or product comparison with evidence |

Notes

  • Search and news search use the ddgs package (renamed from duckduckgo-search).
  • Fetch uses the Jina Reader API at https://r.jina.ai/.

Contributing

Contributions are welcome! Here's how you can contribute:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Support

If you encounter any issues or have questions, please open an issue.

License

This project is licensed under the MIT License - see the LICENSE file for details.
