Unified CLI over Tavily, Firecrawl, and SerpAPI web-search/content SDKs

web-search-tool

A unified Typer CLI over the Tavily, Firecrawl, and SerpAPI web-search/content SDKs — one command surface and one normalized output schema for humans, scripts, and agents.

One schema, every provider. search and every vertical return the same envelope shape (title, url, snippet, score) plus documented vertical-specific fields, regardless of which provider served the request.
A resilient fetch cascade. fetch walks markdownify → tavily → firecrawl, falling through on thin content, hard HTTP errors, and unsupported sites.
Agent-friendly. Deterministic exit codes, JSON when piped, an explicit escalation signal on total fetch failure, and byte-bounded/chunked output.

Install

cd web-search-tool
uv sync --extra dev
uv run web-search-tool --version

Configuration

Settings resolve with the precedence flag > env > config file > default.

Tool settings use the SEARCH_TOOL_ env prefix (e.g. SEARCH_TOOL_DEFAULT_PROVIDER=tavily, SEARCH_TOOL_TIMEOUT=30).
Provider keys use their conventional names: TAVILY_API_KEY, FIRECRAWL_API_KEY, SERPAPI_API_KEY.
Set a key to SKIP to opt a provider out of launch validation. Requesting a vertical that only that provider serves (e.g. scholar/jobs with SERPAPI_API_KEY=SKIP) fails with a specific SKIP-conflict error.
An optional TOML config lives at ~/.config/web-search-tool/config.toml (override with WEB_SEARCH_TOOL_CONFIG or the --config flag). See config.example.toml for every tunable.

# Minimal: keys in the environment
export SERPAPI_API_KEY=...        # required for the SerpAPI verticals
export TAVILY_API_KEY=...         # search, research, one cascade converter
export FIRECRAWL_API_KEY=SKIP     # opt out if you don't have a Firecrawl key

| Setting | Env var | Default | Purpose |
| --- | --- | --- | --- |
| Default provider | `SEARCH_TOOL_DEFAULT_PROVIDER` | `serpapi` | Provider for generic `search`/`resolve` |
| Timeout | `SEARCH_TOOL_TIMEOUT` | `20` | Per-request timeout (seconds) |
| Default format | `SEARCH_TOOL_DEFAULT_FORMAT` | (auto) | `table` or `json`; auto-detects TTY when unset |
| Tavily key | `TAVILY_API_KEY` | (none) | Tavily search + deep research |
| Firecrawl key | `FIRECRAWL_API_KEY` | (none) | Firecrawl search + scrape converter |
| SerpAPI key | `SERPAPI_API_KEY` | (none) | All seven SerpAPI verticals |
| Sentry DSN | `SENTRY_DSN` | (off) | Enables error tracking + tracing when set |
| Sentry environment | `SENTRY_ENVIRONMENT` | `local` | Deployment tag on Sentry events |
| Trace sample rate | `SENTRY_TRACES_SAMPLE_RATE` | `1.0` | Transaction sampling, 0.0–1.0 |

The Sentry settings also live under a [monitoring] table in the config file and follow the same precedence. See Monitoring.
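The precedence rule (flag > env > config file > default) can be pictured with a small lookup function. This is only an illustrative sketch, not the tool's actual implementation: the function name `resolve_setting` and the config key names (`timeout`, `default_provider`) are assumptions; only the `SEARCH_TOOL_` prefix and the defaults come from the table above.

```python
from typing import Mapping, Optional

ENV_PREFIX = "SEARCH_TOOL_"  # prefix documented in this README


def resolve_setting(
    name: str,
    flag_value: Optional[str],
    env: Mapping[str, str],
    config: Mapping[str, str],
    default: Optional[str],
) -> Optional[str]:
    """Return the first value found, checking flag > env > config file > default.

    Hypothetical helper: the real tool may structure this differently.
    """
    if flag_value is not None:
        return flag_value
    env_key = ENV_PREFIX + name.upper()
    if env_key in env:
        return env[env_key]
    if name in config:
        return config[name]
    return default


# Stand-ins for os.environ and a parsed config.toml (key names are assumed).
env = {"SEARCH_TOOL_TIMEOUT": "30"}
config = {"timeout": "45", "default_provider": "tavily"}

timeout = resolve_setting("timeout", None, env, config, "20")
provider = resolve_setting("default_provider", None, env, config, "serpapi")
flagged = resolve_setting("timeout", "5", env, config, "20")
```

Here the env var beats the config file for `timeout`, the config file beats the built-in default for `default_provider`, and an explicit flag value beats everything.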

Output contract

stdout carries only the payload. All logs go to stderr; API keys never appear in logs, output, or written files.
Format: a TTY renders a rich table; a pipe emits JSON. --format/-f table|json overrides both; --compact removes JSON indentation.
Every payload is wrapped in an envelope — {command, ok, data} — so callers can branch on ok without parsing the body.
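A rough sketch of how a caller might consume this contract. The envelope keys (`command`, `ok`, `data`) and the result fields (`title`, `url`, `snippet`, `score`) come from this README; the helper names and the assumption that `data` is a list of results for `search` are illustrative only.

```python
import json
from typing import Any, Dict, Optional


def make_envelope(command: str, ok: bool, data: Any) -> Dict[str, Any]:
    """Wrap a payload in the documented {command, ok, data} envelope."""
    return {"command": command, "ok": ok, "data": data}


def render_json(envelope: Dict[str, Any], compact: bool = False) -> str:
    """Indented JSON by default; no indentation when compact is requested."""
    if compact:
        return json.dumps(envelope, separators=(",", ":"))
    return json.dumps(envelope, indent=2)


def pick_format(explicit: Optional[str], stdout_is_tty: bool) -> str:
    """An explicit format wins; otherwise a TTY gets a table and a pipe gets JSON."""
    if explicit is not None:
        return explicit
    return "table" if stdout_is_tty else "json"


# Hypothetical search result, shaped after the normalized schema.
result = {"title": "Example", "url": "https://example.com", "snippet": "An example page.", "score": 0.9}
payload = render_json(make_envelope("search", True, [result]), compact=True)

# Consumer side: branch on `ok` before touching the body.
parsed = json.loads(payload)
first_url = parsed["data"][0]["url"] if parsed["ok"] else None
```

In a real script, `stdout_is_tty` would typically come from `sys.stdout.isatty()`.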

Exit codes are deterministic and category-specific:

| Code | Name | Meaning |
| --- | --- | --- |
| 0 | SUCCESS | Completed |
| 1 | GENERAL | Unclassified error |
| 2 | USAGE | Bad CLI usage |
| 3 | INPUT | Bad input / no URL resolved |
| 4 | NOT_FOUND | No results |
| 5 | NETWORK | Network/provider failure |
| 6 | TIMEOUT | Bounded wait exceeded |
| 7 | CONFIG | Missing/SKIP key, bad config |
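For scripts that wrap the CLI, the table above maps naturally onto an enum. The sketch below mirrors the documented codes; the `ExitCode` class, the `describe` helper, and the choice of which codes count as retryable are assumptions made for illustration, not part of the tool.

```python
from enum import IntEnum


class ExitCode(IntEnum):
    """Exit codes as documented in the table above."""

    SUCCESS = 0
    GENERAL = 1
    USAGE = 2
    INPUT = 3
    NOT_FOUND = 4
    NETWORK = 5
    TIMEOUT = 6
    CONFIG = 7


# Assumption: a caller may reasonably retry transient categories only.
RETRYABLE = {ExitCode.NETWORK, ExitCode.TIMEOUT}


def describe(returncode: int) -> str:
    """Turn a process return code into a short label for logs."""
    try:
        code = ExitCode(returncode)
    except ValueError:
        return f"unknown exit code {returncode}"
    suffix = " (retry may help)" if code in RETRYABLE else ""
    return f"{code.name}{suffix}"


labels = [describe(0), describe(5), describe(7), describe(42)]
```

A wrapper could pass `subprocess.run(...).returncode` to `describe` and decide whether to retry, fix configuration, or give up.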

Commands

Search & verticals

web-search-tool search "python typer cli" --limit 10 [--provider tavily|firecrawl|serpapi]
web-search-tool news "us treasury yields" -n 5
web-search-tool jobs "platform engineer" --location "Boston, MA"
web-search-tool images "golden gate bridge"
web-search-tool videos "rust async tutorial"
web-search-tool reverse-image "https://example.com/photo.jpg"
web-search-tool scholar "continual learning llm" \
    --author "Bengio" --min-cites 50 --since 2020 --until 2024 --sort cites

All search commands accept --limit/-n, --format/-f, and --compact. --limit is enforced uniformly — SerpAPI verticals are capped client-side.

Scholar filters (scholar only):

| Flag | Effect |
| --- | --- |
| `--author/-a NAME` | Native `author:"NAME"` operator |
| `--min-cites N` | Keep only results cited ≥ N times (client-side over a pool) |
| `--since YEAR` / `--until YEAR` | Native publication-year range |
| `--sort relevance\|date\|cites` | `relevance` (default), most-recent, or most-cited |
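To make "client-side over a pool" concrete, here is a minimal sketch of filtering a fetched pool by citation count and then applying the `--limit` cap. The field name `cited_by` and the function `apply_min_cites` are hypothetical; the README does not document the vertical-specific field names or the pool size.

```python
from typing import Dict, List


def apply_min_cites(pool: List[Dict], min_cites: int, limit: int) -> List[Dict]:
    """Keep results cited at least `min_cites` times, then cap at `limit`.

    `cited_by` is an assumed field name; results without it count as 0 citations.
    """
    kept = [r for r in pool if r.get("cited_by", 0) >= min_cites]
    return kept[:limit]


# A made-up pool, already in provider order.
pool = [
    {"title": "A", "cited_by": 120},
    {"title": "B", "cited_by": 12},
    {"title": "C", "cited_by": 50},
    {"title": "D"},
]

top_two = apply_min_cites(pool, min_cites=50, limit=10)
top_one = apply_min_cites(pool, min_cites=50, limit=1)
```

Because the filter runs after the provider responds, a strict threshold can return fewer results than `--limit` asks for.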

Fetch — URL → markdown via the cascade

web-search-tool fetch "https://example.com/article"        # probe: first success wins
web-search-tool fetch URL --compare                        # all converters, labeled
web-search-tool fetch URL --raw-html                        # local markdownify only
web-search-tool fetch URL --order markdownify,firecrawl     # custom chain
web-search-tool fetch URL --max-bytes 20000 --offset 0      # bounded / chunked window
web-search-tool fetch URL --stdout                          # stream instead of writing a file

Default mode is a probe: walk the chain and return the first success. If every converter fails, the command raises a chain-exhaustion error carrying the exact signal "Fetch chain completely failed, try using agent-browser".
--max-bytes overflow truncates and appends the marker *OUTPUT TRUNCATED PLS INCREASE THE CAP OR DO CHUNKED REQUEST*; combine with --offset for deterministic chunked continuation.
Without --stdout, output is written to a collision-safe slug file (<UTC-timestamp>-<slug>.md) in --out-dir (default: current directory).
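The `--max-bytes`/`--offset` behaviour can be sketched as a byte window over the UTF-8 encoded markdown. Only the truncation marker is taken from this README; the function names are hypothetical, and dropping a multi-byte character that straddles a window boundary is a simplification of this sketch, not a documented behaviour of the tool.

```python
from typing import Tuple

TRUNCATION_MARKER = "*OUTPUT TRUNCATED PLS INCREASE THE CAP OR DO CHUNKED REQUEST*"


def slice_window(markdown: str, max_bytes: int, offset: int = 0) -> Tuple[str, bool]:
    """Return the bytes in [offset, offset + max_bytes) and whether more remain."""
    raw = markdown.encode("utf-8")
    chunk = raw[offset : offset + max_bytes]
    truncated = offset + max_bytes < len(raw)
    # Simplification: silently drop a character split by the window edge.
    return chunk.decode("utf-8", errors="ignore"), truncated


def render_window(markdown: str, max_bytes: int, offset: int = 0) -> str:
    """Append the documented marker when the window overflowed."""
    body, truncated = slice_window(markdown, max_bytes, offset)
    return body + "\n" + TRUNCATION_MARKER if truncated else body


# Chunked continuation: advance the offset by the cap until nothing remains.
doc = "abcdefghij" * 3  # 30 bytes of ASCII
parts, offset, cap = [], 0, 12
while True:
    body, truncated = slice_window(doc, cap, offset)
    parts.append(body)
    if not truncated:
        break
    offset += cap
reassembled = "".join(parts)
```

With the CLI, the same loop would amount to repeated `fetch URL --max-bytes 12 --offset N` calls, stopping at the first response that lacks the marker.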

Resolve — search, then fetch the top-N URLs

web-search-tool resolve "best rust web framework" --limit 5
web-search-tool resolve "fed rate decision" --vertical news -n 3
web-search-tool resolve "q" --stdout

Runs a search over --vertical (default search), then fetches each result URL through the cascade in auto-resolve mode: a URL that exhausts its chain records an escalation but never aborts the batch. Exits 0 if any URL resolved, 3 (INPUT) only if all failed.
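The batch semantics described above (record an escalation per failed URL, never abort, exit 0 if anything resolved) might look roughly like this. Everything here is a stand-in: the converters are stubs rather than real provider calls, `MIN_CHARS` is an invented "thin content" threshold, and the result dictionary keys are assumptions.

```python
from typing import Callable, Dict, List, Optional, Tuple

ESCALATION = "Fetch chain completely failed, try using agent-browser"
MIN_CHARS = 40  # hypothetical threshold for "thin" content

Converter = Callable[[str], Optional[str]]


class ChainExhausted(Exception):
    """Raised when every converter in the chain failed for a URL."""


def fetch_probe(url: str, chain: List[Tuple[str, Converter]]) -> Tuple[str, str]:
    """Walk the chain in order; the first converter with enough content wins."""
    for name, convert in chain:
        try:
            markdown = convert(url)
        except Exception:
            continue  # hard error: fall through to the next tier
        if markdown and len(markdown.strip()) >= MIN_CHARS:
            return name, markdown
    raise ChainExhausted(ESCALATION)


def resolve_batch(urls: List[str], chain: List[Tuple[str, Converter]]) -> Tuple[List[Dict], int]:
    """Fetch every URL; a failure is recorded but does not stop the batch."""
    results: List[Dict] = []
    for url in urls:
        try:
            tier, _markdown = fetch_probe(url, chain)
            results.append({"url": url, "ok": True, "tier": tier})
        except ChainExhausted as exc:
            results.append({"url": url, "ok": False, "escalation": str(exc)})
    exit_code = 0 if any(r["ok"] for r in results) else 3  # 3 = INPUT
    return results, exit_code


LONG_BODY = "# Title\n\n" + "body " * 20


def local_stub(url: str) -> Optional[str]:
    if "dead" in url:
        raise OSError("simulated HTTP error")
    return "short" if "thin" in url else LONG_BODY


def remote_stub(url: str) -> Optional[str]:
    return None if "dead" in url else LONG_BODY


chain = [("markdownify", local_stub), ("tavily", remote_stub)]
urls = ["https://example.com/ok", "https://example.com/thin", "https://example.com/dead"]
results, exit_code = resolve_batch(urls, chain)
_, all_failed_code = resolve_batch(["https://example.com/dead"], chain)
```

The second URL shows the fall-through on thin content, and the third shows an escalation being recorded while the batch still exits 0.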

Research — deep cited research via Tavily

web-search-tool research run "history of CRISPR patents"            # returns request_id
web-search-tool research run "q" --file notes.md --file data.json   # attach local context
web-search-tool research run "q" --wait --max-wait 300              # poll to completion
web-search-tool research status <request_id>                        # look up once

research run returns a request_id immediately; --wait polls until a terminal status or --max-wait (then exits TIMEOUT with a resume hint). research status reports status and, on completion, the cited markdown report with numbered sources. --file attaches up to 5 local .txt/.md/.json files as research context.
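The `--wait`/`--max-wait` loop is a bounded poll. The sketch below assumes status names (`pending`, `completed`, `failed`), a fixed polling interval, and a `get_status` callable standing in for the provider lookup; none of these are documented here. The clock and sleep functions are injectable so the example runs instantly.

```python
import time
from typing import Callable, Dict, Tuple

TERMINAL_STATUSES = {"completed", "failed"}  # assumed status names


def wait_for_report(
    request_id: str,
    get_status: Callable[[str], Dict],
    max_wait: float,
    interval: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[Dict, bool]:
    """Poll until a terminal status or until max_wait would be exceeded.

    Returns (latest_status, timed_out). A caller would map timed_out to the
    TIMEOUT exit code (6) and print the request_id as a resume hint.
    """
    deadline = clock() + max_wait
    while True:
        latest = get_status(request_id)
        if latest["status"] in TERMINAL_STATUSES:
            return latest, False
        if clock() + interval > deadline:
            return latest, True
        sleep(interval)


class FakeClock:
    """Deterministic clock so the example does not actually sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


clock = FakeClock()
replies = iter([{"status": "pending"}, {"status": "pending"}, {"status": "completed"}])
done, timed_out = wait_for_report(
    "req-123", lambda _id: next(replies), max_wait=300, clock=clock, sleep=clock.sleep
)

slow_clock = FakeClock()
stuck, stuck_timed_out = wait_for_report(
    "req-123", lambda _id: {"status": "pending"}, max_wait=5, clock=slow_clock, sleep=slow_clock.sleep
)
```

After a timeout, `research status <request_id>` is the documented way to look the request up again.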

Monitoring

Sentry is off by default and fully optional — set a DSN to turn it on. When enabled, the tool reports errors and wraps each invocation in a transaction with child spans around every outbound call (provider search, each fetch-cascade tier, research launch/poll), tagged with non-secret data (provider, tier, status, result counts). API key values are never attached to spans, tags, or events.

Configure it via env or the [monitoring] table in the config file (same precedence as everything else):

export SENTRY_DSN="https://<key>@<org>.ingest.sentry.io/<project>"
export SENTRY_ENVIRONMENT=prod          # optional, defaults to "local"
export SENTRY_TRACES_SAMPLE_RATE=0.2    # optional, defaults to 1.0

Verify the setup end to end — emits a span tree plus one synthetic, harmless exception, then prints the captured event id:

web-search-tool test sentry

It exits with the CONFIG code (7) when no DSN is configured, since there is nothing to test.
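As a rough illustration of the "off unless a DSN is set" rule, the following sketch derives Sentry options from environment-style variables. The helper `sentry_options` is hypothetical and handles env vars only; how the tool merges the `[monitoring]` table or initializes the SDK is not shown in this README.

```python
from typing import Any, Dict, Mapping, Optional


def sentry_options(env: Mapping[str, str]) -> Optional[Dict[str, Any]]:
    """Return Sentry options, or None when monitoring is off (no DSN)."""
    dsn = env.get("SENTRY_DSN", "").strip()
    if not dsn:
        return None
    rate = float(env.get("SENTRY_TRACES_SAMPLE_RATE", "1.0"))
    if not 0.0 <= rate <= 1.0:
        raise ValueError("SENTRY_TRACES_SAMPLE_RATE must be between 0.0 and 1.0")
    return {
        "dsn": dsn,
        "environment": env.get("SENTRY_ENVIRONMENT", "local"),
        "traces_sample_rate": rate,
    }


disabled = sentry_options({})
defaults = sentry_options({"SENTRY_DSN": "https://key@example.ingest.sentry.io/1"})
tuned = sentry_options(
    {
        "SENTRY_DSN": "https://key@example.ingest.sentry.io/1",
        "SENTRY_ENVIRONMENT": "prod",
        "SENTRY_TRACES_SAMPLE_RATE": "0.2",
    }
)
```

The dictionary keys match keyword arguments that `sentry_sdk.init` accepts (`dsn`, `environment`, `traces_sample_rate`), so a caller could pass them through with `sentry_sdk.init(**options)` when the result is not `None`.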

Development

uv run pytest                          # unit tests (live tests skipped without keys)
uv run pytest -m live                  # opt-in tests against real provider APIs
uv run ruff check . && uv run ruff format --check .
uv run mypy --strict src/web_search_tool
uv run pytest --cov=web_search_tool --cov-report=term-missing

See CONTRIBUTING.md for the full workflow.

License

MIT © Vlad Korolev

This version

0.1.0

Jun 5, 2026

Download files


Source Distribution

web_search_tool-0.1.0.tar.gz (172.8 kB view details)

Uploaded Jun 5, 2026 Source

Built Distribution


web_search_tool-0.1.0-py3-none-any.whl (51.8 kB view details)

Uploaded Jun 5, 2026 Python 3

File details

Details for the file web_search_tool-0.1.0.tar.gz.

File metadata

Download URL: web_search_tool-0.1.0.tar.gz
Upload date: Jun 5, 2026
Size: 172.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for web_search_tool-0.1.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `a7c19dd8eff9f9c4432f17d5a368ddcb07b2c216458c1d231e8a82ccbc4890f4` |
| MD5 | `b263b85d316b7810609a3ee1b33575c0` |
| BLAKE2b-256 | `55bf967bc8441698c4aa28f8799e87543115b6d825985e56806a111acacbea9c` |


File details

Details for the file web_search_tool-0.1.0-py3-none-any.whl.

File metadata

Download URL: web_search_tool-0.1.0-py3-none-any.whl
Upload date: Jun 5, 2026
Size: 51.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.13.12

File hashes

Hashes for web_search_tool-0.1.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `4436066999ad8faa64b045c42f3b82c75b9ce275ac95d92fa435990be8c73e96` |
| MD5 | `9659debdaa7eb218e0d0e85c8bc6f7d3` |
| BLAKE2b-256 | `3ec8678f1cf748305637b67e08525b03a4182573c08495e23b5e5cce0deb563c` |
