OmniScout CLI (harness): local-first browser automation, semantic search, and research for AI agents
OmniScout CLI
Local-first browser automation, semantic search, and research for AI agents. No cloud APIs, no hosted browser sessions, no MCP yet, no SDK.
The CLI is the interface.
Install
Requires Python 3.11+ and Google Chrome (already installed on most macOS
machines at /Applications/Google Chrome.app).
Recommended: install as a global tool
pip install omniscout
scout install # verifies Chrome + prefetches embedding model
After this, scout works from any directory. With an editable install from a
source checkout, edits to source files are picked up live.
The omniscout command remains available as a compatibility alias.
If you don't have Chrome installed, run scout install --bundled to also download
Playwright's bundled Chromium (~190MB).
scout install also prefetches the local sentence-transformers model into
OmniScout's app data directory so later commands do not need to fetch it again.
Use --no-model to skip model prefetch.
Quickstart
# Search the web (DuckDuckGo HTML + local embedding rerank)
scout search "local-first browser agents"
# Same command via the omniscout alias:
omniscout search "local-first browser agents"
# Extract a URL to clean Markdown
scout extract https://example.com
# Capture a screenshot of a real page using your installed Chrome
scout browser screenshot https://example.com --out page.png
# Run a multi-step research pipeline (search -> crawl -> extract -> rerank -> summarize)
scout research "state of local AI agents in 2026"
# Manage persistent browser profiles (cookies, logins persist across runs)
scout profile create work
scout browser open https://news.ycombinator.com --profile work --headful
# Long-lived browser sessions (other tools can attach via CDP)
scout session start --headful
scout session list
scout session kill --all
JSON output (for agents)
Every command emits structured JSON when invoked with --json (or with
OMNISCOUT_JSON=1 in the environment). Logs always go to stderr; stdout is
reserved for the structured result.
OMNISCOUT_JSON=1 scout search "robotics simulators" --limit 5
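For agents that call the CLI from Python, the contract above suggests a thin wrapper: set OMNISCOUT_JSON=1, decode stdout as JSON, and leave stderr for logs. The following is a minimal sketch under those assumptions. The helper names (`build_command`, `parse_result`, `run_scout`) are hypothetical and not part of OmniScout, and nothing is assumed about the fields of the result beyond it being valid JSON.

```python
import json
import os
import subprocess
from typing import Any


def build_command(subcommand: str, *args: str) -> list[str]:
    """Build an argv list, e.g. ['scout', 'search', 'robotics simulators']."""
    return ["scout", subcommand, *args]


def parse_result(stdout: str) -> Any:
    """Decode the structured result; stdout is reserved for it, so no log filtering."""
    return json.loads(stdout)


def run_scout(subcommand: str, *args: str) -> Any:
    """Run scout with OMNISCOUT_JSON=1 and return the decoded result."""
    env = {**os.environ, "OMNISCOUT_JSON": "1"}
    proc = subprocess.run(
        build_command(subcommand, *args),
        env=env,
        capture_output=True,  # keeps stdout (result) and stderr (logs) separate
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return parse_result(proc.stdout)
```

With scout on the PATH, `run_scout("search", "robotics simulators", "--limit", "5")` would mirror the shell command above.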
Architecture
omniscout/
app.py # Typer root
commands/ # CLI sub-commands (thin)
engines/
browser.py # Playwright + system Chrome
extractor.py # trafilatura + markdownify
crawler.py # async httpx + Chrome fallback
search/
ddg.py # DuckDuckGo HTML
embed.py # sentence-transformers (all-MiniLM-L6-v2)
index.py # embedded Qdrant on-disk
rerank.py # cosine rerank
pipeline.py # ddg | index | hybrid
research.py # full pipeline (search -> crawl -> extract -> rerank -> summarize)
store/
cache.py # SQLite + content-hashed HTML cache
sessions.py # SQLite registry of browser sessions
models.py # pydantic result types (the JSON contract)
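The tree lists rerank.py as a cosine rerank over embeddings. As an illustration of that idea only (not the project's actual code), the sketch below scores candidate vectors against a query vector and sorts them by cosine similarity. The function names and the toy 3-dimensional vectors are assumptions; real embeddings from all-MiniLM-L6-v2 have 384 dimensions.

```python
import math
from typing import Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(
    query_vec: Sequence[float],
    docs: Sequence[tuple[str, Sequence[float]]],
) -> list[tuple[str, float]]:
    """Return (doc_id, score) pairs sorted by descending similarity to the query."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors stand in for real sentence embeddings.
query = [1.0, 0.0, 0.0]
candidates = [
    ("a", [0.0, 1.0, 0.0]),
    ("b", [1.0, 0.1, 0.0]),
    ("c", [0.5, 0.5, 0.0]),
]
ranked = rerank(query, candidates)
print([doc_id for doc_id, _ in ranked])  # → ['b', 'c', 'a']
```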
On-disk state lives under ~/Library/Application Support/omniscout/ (macOS) or
$XDG_DATA_HOME/omniscout/ (Linux):
| Path | Purpose |
|---|---|
| profiles/ | Persistent Chrome user-data-dirs |
| qdrant/ | Embedded vector index |
| sessions.sqlite | Registry of long-lived browser sessions |
| cache/pages/ | Content-hashed HTML cache used by extract+crawler |
Override via OMNISCOUT_DATA_DIR, OMNISCOUT_CONFIG_DIR, OMNISCOUT_CACHE_DIR,
or settings in ~/Library/Application Support/omniscout/config.toml.
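The lookup described above can be sketched as a small resolver. This is an assumption-laden illustration, not OmniScout's code: the function name `resolve_data_dir` is hypothetical, the override is assumed to take precedence over the platform default, and the fallback to ~/.local/share when XDG_DATA_HOME is unset comes from the XDG Base Directory specification rather than from this README.

```python
import os
import sys
from pathlib import Path
from typing import Mapping


def resolve_data_dir(env: Mapping[str, str], platform: str, home: Path) -> Path:
    """Pick the data directory from an env override or the platform default."""
    override = env.get("OMNISCOUT_DATA_DIR")
    if override:
        # Assumption: an explicit override wins over every default.
        return Path(override)
    if platform == "darwin":
        return home / "Library" / "Application Support" / "omniscout"
    # XDG spec default when XDG_DATA_HOME is unset or empty: ~/.local/share
    xdg = env.get("XDG_DATA_HOME")
    base = Path(xdg) if xdg else home / ".local" / "share"
    return base / "omniscout"


print(resolve_data_dir(os.environ, sys.platform, Path.home()))
```

Passing the environment, platform, and home directory as arguments keeps the function easy to test without touching the real machine state.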
Configuration
config.toml example:
default_source = "ddg" # search source default
search_limit = 10
research_results = 8
request_throttle_seconds = 1.0 # per-host throttle in the crawler
embedding_model = "sentence-transformers/all-MiniLM-L6-v2"
embedding_local_only = true # default; never fetch model files at query time
browser_channel = "chrome" # uses installed Google Chrome
# browser_executable = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
summary_sentences = 6
Set OMNISCOUT_EMBED_LOCAL_ONLY=0 to allow runtime Hugging Face fetches.
Why local Chrome?
Using your system Chrome (channel = "chrome") gives you:
- Real cookies, login state, extensions, and font rendering
- No extra ~190MB Chromium download
- The same user-agent fingerprint as your daily browsing
- Cleaner integration with omniscout session start for long-lived sessions that other tools can attach to over CDP
If Chrome isn't available, the engine transparently falls back to Playwright's bundled Chromium.
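The fallback described above can be sketched as a try-then-retry around a launch callable. This is a hedged illustration, not the engine's implementation: `launch_with_fallback` is a hypothetical name, and treating any launch exception as "channel unavailable" is a simplifying assumption.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def launch_with_fallback(launch: Callable[..., T], channel: str = "chrome") -> T:
    """Try the system browser channel first; fall back to the bundled build."""
    try:
        return launch(channel=channel)
    except Exception:
        # Assumption: any failure here means the channel is not installed.
        # Calling launch() without a channel selects the default bundled browser.
        return launch()
```

With Playwright's sync API, `launch` could be `p.chromium.launch` inside a `with sync_playwright() as p:` block, since that method accepts a `channel` keyword. A production version would likely catch a narrower exception type and log which browser was chosen.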
File details
Details for the file omniscout-0.2.1.tar.gz.
File metadata
- Download URL: omniscout-0.2.1.tar.gz
- Upload date:
- Size: 104.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0f96a43e46c193325d54eba012d16bff97aa8784dbe6d4f12b59e9c82772d5e5 |
| MD5 | 0c3094b9477afec3eff64582b5e3da13 |
| BLAKE2b-256 | 1f050f827b2d5cf095c4a79d9176162760c2af01cd4d0055edd1a8ec79e07ece |
Provenance
The following attestation bundles were made for omniscout-0.2.1.tar.gz:
Publisher: pypi-publish.yml on sriramramnath/omniscout
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: omniscout-0.2.1.tar.gz
- Subject digest: 0f96a43e46c193325d54eba012d16bff97aa8784dbe6d4f12b59e9c82772d5e5
- Sigstore transparency entry: 1674895326
- Sigstore integration time:
- Permalink: sriramramnath/omniscout@a7fc244b6cc6a629341a01ac386513c90c806390
- Branch / Tag: refs/heads/main
- Owner: https://github.com/sriramramnath
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@a7fc244b6cc6a629341a01ac386513c90c806390
- Trigger Event: workflow_dispatch
File details
Details for the file omniscout-0.2.1-py3-none-any.whl.
File metadata
- Download URL: omniscout-0.2.1-py3-none-any.whl
- Upload date:
- Size: 134.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b759a7ccd05535cfa9cd371bb5acef8dbe0329e516e42ac9c86402114a2bbf4b |
| MD5 | 51144715249a1e2a5e868fddac04cd15 |
| BLAKE2b-256 | 7690d7aaa51a7140c0e4d27e77cf5473b1294039cc5c75b0a503b08118f50ec2 |
Provenance
The following attestation bundles were made for omniscout-0.2.1-py3-none-any.whl:
Publisher: pypi-publish.yml on sriramramnath/omniscout
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: omniscout-0.2.1-py3-none-any.whl
- Subject digest: b759a7ccd05535cfa9cd371bb5acef8dbe0329e516e42ac9c86402114a2bbf4b
- Sigstore transparency entry: 1674895333
- Sigstore integration time:
- Permalink: sriramramnath/omniscout@a7fc244b6cc6a629341a01ac386513c90c806390
- Branch / Tag: refs/heads/main
- Owner: https://github.com/sriramramnath
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@a7fc244b6cc6a629341a01ac386513c90c806390
- Trigger Event: workflow_dispatch