OmniScout CLI: local-first multi-browser automation, semantic search, and research for AI agents
OmniScout
Local-first browser automation, semantic search, and research for AI agents.
Website: omniscout.xyz · Docs: docs.omniscout.xyz
No cloud APIs. No hosted browser sessions. No MCP yet. No SDK.
The CLI is the interface. Install the omniscout command and drive everything from the terminal or via JSON (--json / OMNISCOUT_JSON=1).
scout is a short alias. harness is a legacy dev alias kept for compatibility.
Install
Requires Python 3.11+ and a Chromium-based browser (Chrome by default; Edge, Brave, Vivaldi, and others supported — see Settings below).
# One-liner (pip + browser + models + agent skill)
curl -fsSL https://omniscout.xyz/install.sh | bash
# Or step by step
pip install omniscout
omniscout install --skill # browser + models + agent skill files
omniscout install --browser brave # non-interactive browser choice
omniscout settings browsers # list supported / installed browsers
If no Chromium browser is installed, add --bundled to download Playwright Chromium (~190MB).
Search commands auto-start the local daemon and keep the embedding model loaded in RAM across invocations — no manual warm-up step required.
Features
Browser automation (daemon-backed)
Long-lived daemon at 127.0.0.1:7720 for sub-second per-action latency.
- Playwright backend (default): local Chrome with persistent profiles
- Chrome extension backend (opt-in): drives your real running Chrome via chrome.debugger; same JSON vocabulary, real cookies and logins
- Atomic actions: navigate, snapshot, click, fill, type, paste, select, scroll, key, hover, back, forward, reload, get, is, wait, mouse, screenshot, pdf, eval, tabs, network, console, upload, login, captcha, close
- Stable @eN refs from the accessibility tree (preferred over CSS selectors); use snapshot --refs-only (there is no separate refs alias)
- Persistent profiles: log in once, stay logged in
- CAPTCHA: local-first manual handoff; optional 2captcha / capsolver solvers
- Network + console capture with list and incremental tail for agent debugging
- Session restore across daemon restarts
Semantic search
- DuckDuckGo HTML search with optional local embedding rerank
- Sources: ddg, index (local crawl corpus), memory (remembered visits), hybrid (memory + DDG)
- omniscout answer: grounded one-sentence answers. Direct DDG answers come first (snippets, Search Assist), then extractive parsing, local LLM, and limited crawl (modes: auto, fast, balanced, deep; extractive fallback)
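The optional embedding rerank can be pictured as a cosine-similarity sort over search hits. The sketch below is an illustration only, not OmniScout's implementation: it assumes the embeddings already exist as plain float lists (in practice they would come from the embedding model), and the names cosine and rerank are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(query_vec, results):
    """Sort (title, vector) pairs by similarity to the query, best first."""
    return sorted(results, key=lambda r: cosine(query_vec, r[1]), reverse=True)


# Toy 3-dimensional vectors stand in for real sentence embeddings.
hits = [
    ("unrelated page", [0.0, 1.0, 0.0]),
    ("close match", [0.9, 0.1, 0.0]),
    ("partial match", [0.5, 0.5, 0.0]),
]
ranked = rerank([1.0, 0.0, 0.0], hits)
print([title for title, _ in ranked])
```

The search engine's original ordering is ignored here; a real reranker might blend both signals.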
Warm embedding model
Search, research, and memory commands route embeddings through the daemon.
The sentence-transformers model (all-MiniLM-L6-v2) loads once (~2s) and stays
hot. omniscout daemon status reports embed_model_loaded.
Content extraction
Fetch URLs to clean Markdown, plain text, structured JSON fields, or full JSON metadata via trafilatura + markdownify.
- --format structured: auto-extract everything found (company, pricing, socials, docs/blog/careers URLs, contact info, labeled fields). NLP only, no LLM. Empty fields omitted. Quiet stdout (fields JSON only).
- --query / -q: search DuckDuckGo, crawl top hits, follow same-host links (--depth, default 3), merge pages, extract structured fields (no URL required).
- --fields company,pricing,...: limit structured output to specific keys
- --data: include full ExtractResult plus stderr diagnostics
omniscout extract https://example.com --format structured
omniscout extract https://example.com --format structured --fields twitter,pricing
omniscout extract -q "SpaceX founder" --format structured --fields founder
Research pipeline
Multi-step: search → crawl → extract → embed → rerank → summarize.
Browser memory
Remember visits and notes; semantic search over your browsing history.
- omniscout remember <url>: visit, extract, index
- omniscout memory list|show|note|delete|stats|clear
Workflow shortcuts
Top-level commands for agent ergonomics:
- omniscout open <url|index>: open URL or latest search result
- omniscout snapshot, omniscout context, omniscout reset
- omniscout workflow export: JSON steps from workflow state + action history
Replay & observability
Every daemon action is logged to $OMNISCOUT_DATA_DIR/daemon/actions.jsonl:
- omniscout daemon trace: recent activity table or JSON
- omniscout daemon replay <action_id>: re-run a single action
- omniscout daemon watch: live SSE event stream
- Top-level omniscout replay action-<id> and omniscout replay session-<name>
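Because the action log is a JSON Lines file, it can also be inspected without the CLI. This is a minimal sketch under assumptions: the record fields used in the demo (id, action) are hypothetical, since the log schema is not described here, and tail_actions is an illustrative helper, not part of OmniScout.

```python
import json
import tempfile
from pathlib import Path


def tail_actions(log_path, limit=5):
    """Return the last `limit` JSON records, skipping blank or corrupt lines."""
    records = []
    with open(log_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # e.g. a partially written final line
    return records[-limit:] if limit > 0 else []


# Demo against a throwaway file; the field names are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "actions.jsonl"
    demo.write_text(
        '{"id": 1, "action": "navigate"}\n'
        '{"id": 2, "action": "click"}\n'
        '{"id": 3, "acti',  # truncated, as if the daemon were mid-write
        encoding="utf-8",
    )
    recent = tail_actions(demo, limit=2)
print(recent)
```

To read the real log, point tail_actions at daemon/actions.jsonl inside the data directory.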
Benchmarks
- omniscout benchmark answers: latency + correctness matrix over answer modes
- omniscout benchmark startup: CLI process launch overhead
Quickstart
# Search
omniscout search "local-first browser agents"
omniscout answer "who is the president" --depth balanced
# Extract
omniscout extract https://example.com
omniscout extract https://example.com --format structured
# Browser (daemon auto-starts)
omniscout browser navigate https://example.com
omniscout browser snapshot --refs-only
omniscout browser click '@e1'
omniscout browser screenshot --out /tmp/state.png
omniscout browser close --all
# Research
omniscout research "state of local AI agents in 2026"
# Profiles & sessions
omniscout profile create work
omniscout browser open https://news.ycombinator.com --profile work --headful
omniscout session start --headful
Optional warm-up before a batch of searches:
omniscout warmup
JSON output (for agents)
Every command supports --json. Set OMNISCOUT_JSON=1 to make JSON the default
for an entire shell session. Logs go to stderr; stdout is the structured result.
export OMNISCOUT_JSON=1
omniscout search "robotics simulators" --limit 5
omniscout browser navigate https://example.com --session demo
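An agent written in Python could wrap the same convention with subprocess. The sketch below assumes only what is stated above (JSON on stdout, logs on stderr); run_json is a hypothetical helper, and a small stand-in command replaces the real CLI so the example runs without OmniScout installed.

```python
import json
import os
import subprocess
import sys


def run_json(argv, extra_env=None):
    """Run a command with OMNISCOUT_JSON=1 and parse its stdout as JSON."""
    env = dict(os.environ, OMNISCOUT_JSON="1")
    env.update(extra_env or {})
    proc = subprocess.run(argv, env=env, capture_output=True, text=True, check=True)
    # Logs go to stderr, so stdout should hold only the structured result.
    return json.loads(proc.stdout)


# Stand-in for the real CLI: logs to stderr, prints one JSON document to stdout.
fake_cli = [
    sys.executable,
    "-c",
    "import os, sys, json; "
    "sys.stderr.write('log line\\n'); "
    "print(json.dumps({'json_mode': os.environ.get('OMNISCOUT_JSON')}))",
]
result = run_json(fake_cli)
print(result)
```

With OmniScout installed, the call would look like run_json(["omniscout", "search", "robotics simulators", "--limit", "5"]). The shape of the returned object is not specified here, so inspect it before relying on particular keys.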
Direct HTTP (no CLI wrapper):
curl -s -X POST http://127.0.0.1:7720/command \
-H 'Content-Type: application/json' \
-d '{"action":"navigate","args":{"url":"https://example.com"},"session":"demo"}'
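The same request can be built from Python with the standard library. This sketch mirrors the curl call above; the helper names are hypothetical, and treating the daemon's reply as JSON is an assumption rather than a documented guarantee.

```python
import json
import os
import urllib.request


def daemon_url(env=None):
    """Build the /command URL, honoring OMNISCOUT_DAEMON_PORT (default 7720)."""
    env = os.environ if env is None else env
    port = env.get("OMNISCOUT_DAEMON_PORT", "7720")
    return f"http://127.0.0.1:{port}/command"


def build_request(action, args, session, env=None):
    """Create the POST request without sending it."""
    body = json.dumps({"action": action, "args": args, "session": session})
    return urllib.request.Request(
        daemon_url(env),
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request; needs a running daemon. Assumes a JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_request("navigate", {"url": "https://example.com"}, "demo", env={})
print(req.get_method(), req.full_url)
```

Call send(req) only when the daemon is up; building the request separately keeps the payload easy to log and test.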
Architecture
omniscout CLI ──HTTP POST /command──▶ omniscout daemon (127.0.0.1:7720)
      │                                   ├─ Playwright backend
      │                                   ├─ Extension backend (opt-in)
      │                                   └─ Embed service (warm model)
      └── Search / Extract / Research engines (local Qdrant + DDG)
Python package layout (for contributors):
cli/omniscout/
  app.py       # Typer root (binary: omniscout)
  commands/    # CLI sub-commands
  daemon/      # HTTP server, backends, replay, events
  engines/     # browser, search, research, extractor, crawler
  store/       # SQLite cache, sessions, workflow, memory
  models.py    # pydantic JSON contract
On-disk state
| Path | Purpose |
|---|---|
| profiles/ | Persistent Chrome user-data-dirs |
| qdrant/ | Embedded vector index |
| models/sentence-transformers/ | Prefetched embedding model |
| memory.sqlite | Browser memory (visits + notes) |
| sessions.sqlite | Long-lived browser session registry |
| cache/pages/ | Content-hashed HTML cache |
| daemon/ | PID, port, logs, action history, session restore |
Default locations:
- macOS: ~/Library/Application Support/omniscout/
- Linux: ~/.local/share/omniscout/
Override with OMNISCOUT_DATA_DIR, OMNISCOUT_CONFIG_DIR, OMNISCOUT_CACHE_DIR.
Legacy HARNESS_* names are still accepted.
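A script that needs to find the on-disk state can follow the same lookup. This is a sketch, not OmniScout's own resolver: the legacy name HARNESS_DATA_DIR is inferred from the HARNESS_* pattern, the precedence of the new name over the legacy one is an assumption, and any non-macOS platform is treated like Linux.

```python
import os
import sys
from pathlib import Path


def data_dir(env=None, platform=None, home=None):
    """Resolve the data directory: env override first, then platform default."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    home = Path.home() if home is None else Path(home)

    # Assumed order: current variable name, then the inferred legacy name.
    for name in ("OMNISCOUT_DATA_DIR", "HARNESS_DATA_DIR"):
        value = env.get(name)
        if value:
            return Path(value)

    if platform == "darwin":
        return home / "Library" / "Application Support" / "omniscout"
    # Only macOS and Linux defaults are documented; fall back to the Linux path.
    return home / ".local" / "share" / "omniscout"


print(data_dir(env={}, platform="linux", home="/home/u"))
```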
Configuration
config.toml (in config dir):
default_source = "ddg"
search_limit = 10
research_results = 8
request_throttle_seconds = 1.0
embedding_model = "sentence-transformers/all-MiniLM-L6-v2"
embedding_local_only = true
browser = "chrome" # chrome | edge | brave | vivaldi | opera | arc | dia | thorium | chromium | custom
# browser_executable = "/path/to/binary" # optional override or required for custom
summary_sentences = 6
Or use the settings command:
omniscout settings browsers
omniscout settings set browser brave
omniscout settings set browser custom --executable /path/to/chromium
omniscout settings show
Supported browser ids: chrome, edge, brave, vivaldi, opera, arc,
dia, thorium, chromium, custom. Legacy browser_channel in
config.toml is still honored.
Environment variables
| Variable | Purpose |
|---|---|
| OMNISCOUT_JSON=1 | Force JSON output on every command |
| OMNISCOUT_EMBED_DAEMON=1 | Route embeds through daemon (default on) |
| OMNISCOUT_DAEMON_AUTO_START=0 | Don't auto-start daemon |
| OMNISCOUT_DAEMON_PORT | Daemon port (default 7720) |
| OMNISCOUT_DATA_DIR | Override data directory |
| OMNISCOUT_BROWSER | Browser id (same as browser in config.toml) |
| OMNISCOUT_EMBED_LOCAL_ONLY=0 | Allow runtime Hugging Face fetches |
| TWOCAPTCHA_API_KEY | CAPTCHA solver API key |
Legacy HARNESS_* equivalents work for all of the above.
Why your own browser?
Using your installed Chromium browser (Chrome, Edge, Brave, etc.) gives you real cookies, login state, extensions, and the same fingerprint as daily browsing — without a separate ~190MB Chromium download. OmniScout falls back to other installed Chromium builds automatically, then to Playwright's bundled Chromium when nothing else is available.
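The fallback order described above can be written as a small selection function. This is a hedged sketch only: pick_browser and the "bundled-chromium" label are hypothetical, and the order in which other installed browsers are tried is an assumption (here, whatever order the caller passes in).

```python
def pick_browser(preferred, installed):
    """Choose a browser id: preferred if installed, else another, else bundled.

    `installed` is an ordered list of browser ids detected on the machine.
    """
    if preferred in installed:
        return preferred
    if installed:
        return installed[0]  # assumed: first other installed Chromium build
    return "bundled-chromium"  # stands for Playwright's bundled Chromium


print(pick_browser("chrome", ["brave", "edge"]))
```

How OmniScout actually detects installed browsers is not covered here; omniscout settings browsers lists what it finds.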
License
Modified MIT — see LICENSE. Products built on OmniScout must prominently display Powered by OmniScout on the user interface.