OmniScout CLI: local-first browser automation, semantic search, and research for AI agents

OmniScout

Local-first browser automation, semantic search, and research for AI agents.

No cloud APIs. No hosted browser sessions. No MCP yet. No SDK.

The CLI is the interface. Install the omniscout command and drive everything from the terminal or via JSON (--json / OMNISCOUT_JSON=1).

scout is a short alias. harness is a legacy dev alias kept for compatibility.


Install

Requires Python 3.11+ and Google Chrome (macOS default: /Applications/Google Chrome.app).

pip install omniscout
omniscout install                    # verify Chrome + prefetch embedding model (~80MB)
omniscout install --skill            # optional: drop agent skill into Claude/Cursor/Codex/Antigravity

If Chrome is not installed, add --bundled to download Playwright Chromium (~190MB).

Search commands auto-start the local daemon and keep the embedding model loaded in RAM across invocations — no manual warm-up step required.


Features

Browser automation (daemon-backed)

Long-lived daemon at 127.0.0.1:7720 for sub-second per-action latency.

  • Playwright backend (default) — local Chrome with persistent profiles
  • Chrome extension backend (opt-in) — drives your real running Chrome via chrome.debugger; same JSON vocabulary, real cookies and logins
  • Atomic actions: navigate, snapshot, click, fill, scroll, key, hover, screenshot, pdf, eval, wait, tabs, network, upload, login, captcha, close
  • Stable @eN refs from the accessibility tree (preferred over CSS selectors)
  • Persistent profiles — log in once, stay logged in
  • CAPTCHA: local-first manual handoff; optional 2captcha / capsolver solvers
  • Network capture (CDP) with URL filters and per-request inspection
  • Session restore across daemon restarts

Semantic search

  • DuckDuckGo HTML search with optional local embedding rerank
  • Sources: ddg, index (local crawl corpus), memory (remembered visits), hybrid (memory + DDG)
  • omniscout answer: grounded one-sentence answers. It tries direct DDG answers first (snippets, Search Assist), then extractive parsing, a local LLM, and a limited crawl. Depth modes are auto, fast, balanced, and deep, with an extractive fallback.

Warm embedding model

Search, research, and memory commands route embeddings through the daemon. The sentence-transformers model (all-MiniLM-L6-v2) loads once (~2s) and stays hot. omniscout daemon status reports embed_model_loaded.

Content extraction

Fetch URLs to clean Markdown, plain text, or structured JSON via trafilatura + markdownify. On-disk HTML cache.

Research pipeline

Multi-step: search → crawl → extract → embed → rerank → summarize.
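
The rerank step in this pipeline can be illustrated with a small cosine-similarity sketch. This is not OmniScout's own code: the vectors are toy values standing in for sentence-transformers embeddings, and cosine and rerank are hypothetical helper names chosen for the example.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rerank(query_vec: list[float], docs: list[tuple[str, list[float]]]):
    """Sort (title, vector) pairs by similarity to the query, best match first."""
    return sorted(docs, key=lambda d: cosine(query_vec, d[1]), reverse=True)


# Toy 3-dimensional "embeddings"; all-MiniLM-L6-v2 produces 384-dimensional vectors.
query = [1.0, 0.0, 0.0]
docs = [
    ("unrelated page", [0.0, 1.0, 0.0]),
    ("close match", [0.9, 0.1, 0.0]),
    ("partial match", [0.5, 0.5, 0.0]),
]
ranked = [title for title, _ in rerank(query, docs)]
print(ranked)
```

In a real run the vectors would come from the warm embedding model rather than being written by hand.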

Browser memory

Remember visits and notes; semantic search over your browsing history.

  • omniscout remember <url> — visit, extract, index
  • omniscout memory list|show|note|delete|stats|clear

Workflow shortcuts

Top-level commands for agent ergonomics:

  • omniscout open <url|index> — open URL or latest search result
  • omniscout snapshot, omniscout context, omniscout reset
  • omniscout workflow export — JSON steps from workflow state + action history

Replay & observability

Every daemon action is logged to $OMNISCOUT_DATA_DIR/daemon/actions.jsonl:

  • omniscout daemon trace — recent activity table or JSON
  • omniscout daemon replay <action_id> — re-run a single action
  • omniscout daemon watch — live SSE event stream
  • Top-level omniscout replay action-<id> and omniscout replay session-<name>
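
Because the action log is a JSON Lines file, it can also be read without the CLI. The sketch below assumes one JSON object per line; the record fields are not documented here, so it makes no assumptions about them. parse_actions and tail_actions are hypothetical helper names.

```python
import json
from pathlib import Path
from typing import Iterable


def parse_actions(lines: Iterable[str]) -> list[dict]:
    """Decode JSON Lines: one JSON object per non-blank line."""
    return [json.loads(line) for line in lines if line.strip()]


def tail_actions(log_path: Path, last: int = 5) -> list[dict]:
    """Return the last `last` records from a JSON Lines action log."""
    with log_path.open(encoding="utf-8") as fh:
        return parse_actions(fh)[-last:]
```

For example, tail_actions(Path(data_dir) / "daemon" / "actions.jsonl") would return the five most recent records, where data_dir is the value of OMNISCOUT_DATA_DIR.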

Benchmarks

  • omniscout benchmark answers — latency + correctness matrix over answer modes
  • omniscout benchmark startup — CLI process launch overhead

Quickstart

# Search
omniscout search "local-first browser agents"
omniscout answer "who is the president" --depth balanced

# Extract
omniscout extract https://example.com

# Browser (daemon auto-starts)
omniscout browser navigate https://example.com
omniscout browser snapshot --refs-only
omniscout browser click '@e1'
omniscout browser screenshot --out /tmp/state.png
omniscout browser close --all

# Research
omniscout research "state of local AI agents in 2026"

# Profiles & sessions
omniscout profile create work
omniscout browser open https://news.ycombinator.com --profile work --headful
omniscout session start --headful

Optional warm-up before a batch of searches:

omniscout warmup

JSON output (for agents)

Every command supports --json. Set OMNISCOUT_JSON=1 to make JSON the default for an entire shell session. Logs go to stderr; stdout is the structured result.

export OMNISCOUT_JSON=1
omniscout search "robotics simulators" --limit 5
omniscout browser navigate https://example.com --session demo

Direct HTTP (no CLI wrapper):

curl -s -X POST http://127.0.0.1:7720/command \
  -H 'Content-Type: application/json' \
  -d '{"action":"navigate","args":{"url":"https://example.com"},"session":"demo"}'
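
The same request can be built from Python with only the standard library. The payload keys (action, args, session) and the URL come from the curl example; treating the response body as JSON is an assumption, and build_request and send are hypothetical helper names.

```python
import json
import urllib.request

DAEMON_URL = "http://127.0.0.1:7720/command"


def build_request(action: str, args: dict, session: str,
                  url: str = DAEMON_URL) -> urllib.request.Request:
    """Build the POST /command request without sending it."""
    body = json.dumps({"action": action, "args": args, "session": session})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 30.0):
    """Send the request to a running daemon and decode the reply (assumed to be JSON)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_request("navigate", {"url": "https://example.com"}, "demo")
print(req.get_method(), req.full_url)
```

Calling send(req) requires the daemon to be running; building the request separately keeps the payload easy to inspect or log first.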

Architecture

omniscout CLI ──HTTP POST /command──▶ omniscout daemon (127.0.0.1:7720)
     │                                      ├─ Playwright backend
     │                                      ├─ Extension backend (opt-in)
     │                                      └─ Embed service (warm model)
     └── Search / Extract / Research engines (local Qdrant + DDG)

Python package layout (for contributors):

cli/omniscout/
  app.py              # Typer root (binary: omniscout)
  commands/           # CLI sub-commands
  daemon/             # HTTP server, backends, replay, events
  engines/            # browser, search, research, extractor, crawler
  store/              # SQLite cache, sessions, workflow, memory
  models.py           # pydantic JSON contract

On-disk state

Path                            Purpose
profiles/                       Persistent Chrome user-data-dirs
qdrant/                         Embedded vector index
models/sentence-transformers/   Prefetched embedding model
memory.sqlite                   Browser memory (visits + notes)
sessions.sqlite                 Long-lived browser session registry
cache/pages/                    Content-hashed HTML cache
daemon/                         PID, port, logs, action history, session restore

Default locations:

  • macOS — ~/Library/Application Support/omniscout/
  • Linux — ~/.local/share/omniscout/

Override with OMNISCOUT_DATA_DIR, OMNISCOUT_CONFIG_DIR, OMNISCOUT_CACHE_DIR. Legacy HARNESS_* names are still accepted.


Configuration

config.toml (in config dir):

default_source = "ddg"
search_limit = 10
research_results = 8
request_throttle_seconds = 1.0
embedding_model = "sentence-transformers/all-MiniLM-L6-v2"
embedding_local_only = true
browser_channel = "chrome"
summary_sentences = 6

Environment variables

Variable                        Purpose
OMNISCOUT_JSON=1                Force JSON output on every command
OMNISCOUT_EMBED_DAEMON=1        Route embeds through daemon (default on)
OMNISCOUT_DAEMON_AUTO_START=0   Don't auto-start daemon
OMNISCOUT_DAEMON_PORT           Daemon port (default 7720)
OMNISCOUT_DATA_DIR              Override data directory
OMNISCOUT_EMBED_LOCAL_ONLY=0    Allow runtime Hugging Face fetches
TWOCAPTCHA_API_KEY              CAPTCHA solver API key

Legacy HARNESS_* equivalents work for all of the above.


Why local Chrome?

Using system Chrome gives you real cookies, login state, extensions, and the same fingerprint as daily browsing — without a separate ~190MB Chromium download. Falls back to Playwright Chromium automatically when Chrome is unavailable.


License

Modified MIT — see LICENSE. Products built on OmniScout must prominently display Powered by OmniScout on the user interface.

Download files

Source Distribution

omniscout-0.2.6.tar.gz (130.1 kB)

Uploaded Source

Built Distribution

omniscout-0.2.6-py3-none-any.whl (164.1 kB)

Uploaded Python 3

File details

Details for the file omniscout-0.2.6.tar.gz.

File metadata

  • Download URL: omniscout-0.2.6.tar.gz
  • Upload date:
  • Size: 130.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omniscout-0.2.6.tar.gz
Algorithm     Hash digest
SHA256        423f0b31c033e3945114975f1dc1f0b9597f5ec74fea9dab9c48bd3cdfe44327
MD5           dc81ea24cc8e838a4bd7474ae2ed4d06
BLAKE2b-256   6f8f8fbf922126f7fa4a14919a018aca2dbb96821d0472e3836e619ad8099902

Provenance

The following attestation bundles were made for omniscout-0.2.6.tar.gz:

Publisher: pypi-publish.yml on sriramramnath/omniscout

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file omniscout-0.2.6-py3-none-any.whl.

File metadata

  • Download URL: omniscout-0.2.6-py3-none-any.whl
  • Upload date:
  • Size: 164.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for omniscout-0.2.6-py3-none-any.whl
Algorithm     Hash digest
SHA256        ba18dce1dfb70b35cb3ad7f19362dae4395edc76daccde437d79fa57ba61b7f5
MD5           cbb5cf7deeffa8e9b33c5dc6cd9620b2
BLAKE2b-256   803f8d3d60993dd0fc90ce16938fb9e78c6891aaa7f6d68ee47a3e6f7e625562

Provenance

The following attestation bundles were made for omniscout-0.2.6-py3-none-any.whl:

Publisher: pypi-publish.yml on sriramramnath/omniscout

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
