Skip to main content

One endpoint, every free search API. Search broker for AI agents with automatic fallback, RRF ranking, and budget enforcement. The LiteLLM of web search.

Project description

Argus

Python 3.11+ License: MIT PyPI CI MCP Server

Multi-provider web search broker for AI agents. Routes across SearXNG, DuckDuckGo, Brave, Serper, Tavily, Exa, and more — using RRF fusion, content extraction, and budget-aware routing so you don't waste your free search credits.

Features at a glance:

  • Multi-provider search — 11 providers, one API, free-first tier routing
  • 5,000+ free queries/month — automatic budget tracking, exhausted providers skipped
  • Content extraction — 9-step fallback chain with quality gates (local + external)
  • Multi-turn sessions — pass session_id for conversational search refinement
  • 4 search modes — discovery, research, recovery, grounding
  • Dead URL recovery — first-class /recover-url endpoint with archive fallbacks
  • 4 integration paths — HTTP API, CLI, MCP server, Python SDK

Built for AI agent builders, RAG infra, and ops teams who don't want to hand-wire search APIs.

Quickstart

Mode 1: Local CLI (zero config)

pip install argus-search && argus search -q "python web frameworks"

That's it. DuckDuckGo handles the search — no accounts, no keys, no containers. You get unlimited free search from your laptop right now. Add API keys whenever you want more providers, or don't.

argus extract -u "https://example.com/article"       # extract clean text from any URL

Works on any machine with Python 3.11+ — laptop, Mac Mini, Raspberry Pi, cloud VM. Nothing to host.

For MCP (Claude Code, Cursor, VS Code):

pipx install argus-search[mcp] && argus mcp serve

Then add to your MCP config:

{"mcpServers": {"argus": {"command": "argus", "args": ["mcp", "serve"]}}}

One command to install, one JSON block to connect. No server to run, no keys to configure.

Mode 2: Full Stack Server

Got a Raspberry Pi running Pi-hole? A Mac Mini on your desk? An old laptop? That's enough to run the full stack — SearXNG (your own private search engine) plus local JS-rendering content extraction.

docker compose up -d    # SearXNG + Argus
What you have What you get
Any machine with Python 3.11+ DuckDuckGo + API providers (no server)
Raspberry Pi 4 / old laptop (4GB+) Everything — SearXNG, all providers, Crawl4AI
Mac Mini M1+ (8GB+) Full stack with headroom
Free cloud VM (1GB) SearXNG + search providers (skip Crawl4AI)

SearXNG takes 512MB of RAM and gives you a private Google-style search engine that nobody can rate-limit, block, or charge for. It runs alongside Pi-hole on hardware millions of people already own.

Providers

Provider Credit type Free capacity Setup
DuckDuckGo Free (scraped) Unlimited None
SearXNG Free (self-hosted) Unlimited Docker
GitHub Free (API) Unlimited None (token for higher rate limit)
Brave Search Monthly recurring 2,000 queries/month dashboard
Tavily Monthly recurring 1,000 queries/month signup
Exa Monthly recurring 1,000 queries/month signup
Linkup Monthly recurring 1,000 queries/month signup
Serper One-time signup 2,500 credits signup
Parallel AI One-time signup 4,000 credits signup
You.com One-time signup $20 credit platform
Valyu One-time signup $10 credit platform

5,000 free queries/month from the four recurring providers. Three providers need no API key at all. Routing priority: Tier 0 (free: SearXNG, DuckDuckGo, GitHub) → Tier 1 (monthly: Brave, Tavily, Exa, Linkup) → Tier 2 (one-time: Serper, Parallel, You.com, Valyu, SearchAPI). Budget-exhausted providers are skipped automatically.

HTTP API

All endpoints prefixed with /api. OpenAPI docs at http://localhost:8000/docs.

# Search
curl -X POST http://localhost:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "python web frameworks", "mode": "discovery", "max_results": 5}'

# Multi-turn search (conversational refinement)
curl -X POST http://localhost:8000/api/search \
  -H "Content-Type: application/json" \
  -d '{"query": "what about async?", "session_id": "my-session"}'

# Extract content from a working URL
curl -X POST http://localhost:8000/api/extract \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/article"}'

# Recover a dead or moved URL
curl -X POST http://localhost:8000/api/recover-url \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/old-page", "title": "Example Article"}'

# Health & budgets
curl http://localhost:8000/api/health/detail
curl http://localhost:8000/api/budgets

Search modes

Mode Use for Example
discovery Related pages, canonical sources "Find the official docs for X"
research Broad exploratory retrieval "Latest approaches to Y?"
recovery Finding moved/dead content "This URL is 404"
grounding Fact-checking with live sources "Verify this claim about Z"

Tier-based routing always applies first. Within each tier, the mode selects provider order.

Response format

{
  "query": "python web frameworks",
  "mode": "discovery",
  "results": [
    {"url": "https://fastapi.tiangolo.com", "title": "FastAPI", "snippet": "Modern Python web framework", "score": 0.942}
  ],
  "total_results": 1,
  "cached": false,
  "traces": [
    {"provider": "duckduckgo", "status": "success", "results_count": 5, "latency_ms": 312}
  ]
}

Each result includes url, title, snippet, domain, provider, and score. The traces array shows which providers were called and their outcomes.

Budgets

{
  "budgets": {
    "brave": {"remaining": 1847, "monthly_usage": 153, "usage_count": 153, "exhausted": false},
    "duckduckgo": {"remaining": 0, "monthly_usage": 0, "usage_count": 42, "exhausted": false}
  },
  "token_balances": {"jina": 9833638}
}

Each provider tracks usage per calendar month. When a provider hits its budget, Argus skips it and moves to the next tier. Free providers (SearXNG, DuckDuckGo, GitHub) have no limit. Set ARGUS_*_MONTHLY_BUDGET_USD to enforce custom limits per provider.

Integration

CLI

argus search -q "python web framework"              # zero-config, uses DuckDuckGo
argus search -q "python web framework" --mode research -n 20
argus search -q "fastapi" --session my-session       # multi-turn context
argus extract -u "https://example.com/article"       # extract clean text
argus extract -u "https://example.com/article" -d nytimes.com  # auth extraction
argus recover-url -u "https://dead.link" -t "Title"
argus health                                         # provider status
argus budgets                                        # budget + token balances
argus set-balance -s jina -b 9833638                 # track token balance
argus test-provider -p brave                         # smoke-test a provider
argus serve                                          # start API server
argus mcp serve                                      # start MCP server

All commands support --json for structured output.

How sessions work

Pass session_id to any search call. Argus stores each query and extracted URL in a SQLite-backed session. Reusing the same session_id gives the broker context from prior queries — follow-up searches are automatically refined using earlier conversation context. Sessions persist across restarts. Omit session_id for stateless, one-shot searches.

MCP

Add to your MCP client config:

{
  "mcpServers": {
    "argus": {
      "command": "argus",
      "args": ["mcp", "serve"]
    }
  }
}

Works with Claude Code, Cursor, VS Code, and any MCP-compatible client. For remote access via SSE:

{
  "mcpServers": {
    "argus": {
      "command": "argus",
      "args": ["mcp", "serve", "--transport", "sse", "--host", "127.0.0.1", "--port", "8001"]
    }
  }
}

Available tools: search_web, extract_content, recover_url, expand_links, search_health, search_budgets, test_provider, cookie_health, valyu_answer

Python

from argus.broker.router import create_broker
from argus.models import SearchQuery, SearchMode
from argus.extraction import extract_url

broker = create_broker()

response = await broker.search(
    SearchQuery(query="python web frameworks", mode=SearchMode.DISCOVERY, max_results=10)
)
for r in response.results:
    print(f"{r.title}: {r.url} (score: {r.score:.3f})")

content = await extract_url(response.results[0].url)
print(content.title)
print(content.text)

Content Extraction

Argus tries up to nine methods to extract content from any URL: first local (trafilatura, Crawl4AI, Playwright), then external APIs (Jina, Valyu Contents, Firecrawl, You.com, Wayback, archive.is). Each attempt is quality-checked for garbage output. See docs/providers.md for the full extractor comparison.

Extract gets the full text of a working URL. Recover-URL finds alternatives when a URL is dead, paywalled, or radically changed by querying archival sources (Wayback, archive.is) and running a question-guided extraction loop.

Architecture

Caller (CLI/HTTP/MCP/Python) → SearchBroker → tier-sorted providers → RRF ranking → response
                                     ↕ SessionStore (optional)
                            Extractor (on demand) → 9-step fallback chain with quality gates
Module Responsibility
argus/broker/ Tier-based routing, ranking, dedup, caching, health, budgets
argus/providers/ Provider adapters (one per search API)
argus/extraction/ 9-step URL extraction fallback chain with quality gates
argus/sessions/ Multi-turn session store and query refinement
argus/api/ FastAPI HTTP endpoints
argus/cli/ Click CLI commands
argus/mcp/ MCP server for LLM integration
argus/persistence/ PostgreSQL query/result storage

Add new providers or extractors with a single adapter file. See CONTRIBUTING.md for the interface.

Configuration

All config via environment variables. See .env.example for the full list. Missing keys degrade gracefully — providers are skipped, not errors.

Variable Default Description
ARGUS_SEARXNG_BASE_URL http://127.0.0.1:8080 SearXNG endpoint
ARGUS_BRAVE_API_KEY Brave Search API key
ARGUS_SERPER_API_KEY Serper API key
ARGUS_TAVILY_API_KEY Tavily API key
ARGUS_EXA_API_KEY Exa API key
ARGUS_LINKUP_API_KEY Linkup API key
ARGUS_PARALLEL_API_KEY Parallel AI API key
ARGUS_YOU_API_KEY You.com API key
ARGUS_VALYU_API_KEY Valyu API key (search, contents, answer)
ARGUS_FIRECRAWL_API_KEY Firecrawl API key (content extraction)
ARGUS_GITHUB_API_KEY GitHub token (higher rate limit)
ARGUS_*_MONTHLY_BUDGET_USD 0 (unlimited) Query-count budget per provider
ARGUS_CRAWL4AI_ENABLED false Enable Crawl4AI extraction step
ARGUS_YOU_CONTENTS_ENABLED false Enable You.com Contents API extraction
ARGUS_CACHE_TTL_HOURS 168 Result cache TTL

FAQ

How is this different from calling Tavily/Serper directly? Argus calls them for you — plus 9 other providers. You get one ranked, deduplicated result set instead of managing multiple API keys and stitching results together. Free providers are tried first, so you only burn credits when needed.

Can I run only one provider? Yes. Set only the API key for the provider you want. All others are silently skipped. For zero-config, just install and go — DuckDuckGo handles everything with no keys.

Do I need Docker? No. pip install argus-search works immediately on any machine with Python 3.11+. Docker is only needed for SearXNG (self-hosted search) or Crawl4AI (local JS rendering).

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

argus_search-1.3.2.tar.gz (86.2 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

argus_search-1.3.2-py3-none-any.whl (97.8 kB view details)

Uploaded Python 3

File details

Details for the file argus_search-1.3.2.tar.gz.

File metadata

  • Download URL: argus_search-1.3.2.tar.gz
  • Upload date:
  • Size: 86.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for argus_search-1.3.2.tar.gz
Algorithm Hash digest
SHA256 764cb923425126804bf61ace798ddf800c946c5fa313cd192ea3f8df85b798eb
MD5 36424b243f4b6de224968c2b76efa009
BLAKE2b-256 ac6893566c072c69c73e2f93831e3e220b596c9deef239b6b960dd7c1244cb07

See more details on using hashes here.

File details

Details for the file argus_search-1.3.2-py3-none-any.whl.

File metadata

  • Download URL: argus_search-1.3.2-py3-none-any.whl
  • Upload date:
  • Size: 97.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for argus_search-1.3.2-py3-none-any.whl
Algorithm Hash digest
SHA256 3494e7808584020171c95afe45f19f4fab8f7cae62b4101444fb56101e23bd0e
MD5 cdce655cd6b9fc577f1a1422453b19f3
BLAKE2b-256 01b34744f905b2042fd9b2d4d084ee46d2bc639506cfb0eb136f0e4ffa712cd5

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page