Local-only role discovery for finding high-fit startup opportunities.
PathScout
PathScout is a local-only role discovery CLI for finding high-fit startup opportunities before they become obvious job posts.
It fetches broad signals, scores them against a personal fit profile, stores deduped observations in SQLite, and emits a canonical JSON artifact plus a readable Markdown digest.
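The dedupe step can be pictured with a small sketch. This is not PathScout's actual schema: the `observations` table, its columns, and the `content_hash` and `record_observation` helpers are assumptions made for illustration. The idea is to hash a canonical encoding of each observation and let a primary-key constraint reject repeats.

```python
import hashlib
import json
import sqlite3


def content_hash(observation: dict) -> str:
    """Hash a canonical JSON encoding so key order does not affect identity."""
    canonical = json.dumps(observation, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def record_observation(conn: sqlite3.Connection, observation: dict) -> bool:
    """Insert the observation; return True only if it was not stored before."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO observations (hash, payload) VALUES (?, ?)",
        (content_hash(observation), json.dumps(observation, sort_keys=True)),
    )
    conn.commit()
    return cur.rowcount == 1


# Hypothetical table layout; the real database may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE observations (hash TEXT PRIMARY KEY, payload TEXT NOT NULL)"
)
```

Calling `record_observation` twice with the same content, even with keys in a different order, stores a single row, which is the property a dedupe history needs.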
What PathScout Is
- A local-only CLI for monitoring companies, careers pages, RSS feeds, portfolio lists, and manual notes.
- A fit-profile engine for surfacing target roles, hidden-search hypotheses, and weaker watch signals.
- An explainable findings scanner: every surfaced item includes score, tier, reasons, flags, source metadata, and suppression state.
What PathScout Is Not
- It is not a hosted marketplace.
- It is not a recruiting CRM.
- It is not a general-purpose job board scraper.
- It does not provide hosted storage, sync, or remote persistence.
Install
From GitHub:
pipx install git+https://github.com/ckoglmeier/pathscout.git
From a local checkout:
pipx install .
For development:
python3 -m pathscout doctor
python3 -m pathscout run --dry-run --format both
Quick Start
pathscout start
pathscout next
pathscout init
pathscout setup
pathscout doctor
pathscout run --format both
pathscout start is a read-only startup checklist. It shows what exists, what is missing, and the next recommended command without creating or editing files.
pathscout next prints only the next recommended action. /next is also accepted as an alias.
pathscout setup is an interactive guided setup flow. It walks through environment, role/function, locations, avoid terms, background, proof points, constraints, and network context in order, saving answers into local JSON files as it goes.
During init, PathScout asks two onboarding questions in this order:
- What is the right environment for you?
- What is the right role for you?
For scripted setup, pass answers directly:
pathscout init \
--environment "Remote AI startups" \
--role "Founding Product Lead"
Use --no-input to create a default sample config without prompts.
Outputs:
- data/pathscout.sqlite: local state and dedupe history.
- outputs/latest.json: canonical machine-readable findings artifact.
- outputs/latest.md: human-readable digest rendered from the JSON findings.
- outputs/packages/: optional portable opportunity packages created from findings.
- config/profile.json: personal fit profile.
- config/background.sample.json: tracked example candidate context.
- config/background.local.json: private candidate context and proof points.
- config/sources.json: source adapter configuration.
- config/watchlist.json: curated company list.
- config/suppressions.json: structured ignored findings.
Configuration
PathScout uses schema-versioned JSON files.
config/profile.json is the personal fit model. It contains target roles, stages, domains, excluded domains, location preferences, travel constraints, authority terms, and scoring thresholds.
config/sources.json describes inputs. Each source uses this adapter contract:
{
"id": "watchlist_careers",
"type": "watchlist_careers",
"name": "Watchlist careers pages",
"enabled": true,
"config": {
"path": "config/watchlist.json"
}
}
id is stable and scriptable. name is display-only. type selects the adapter. config is adapter-specific.
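As a rough illustration of the adapter contract, the sketch below checks one source entry for the five fields shown above. The `validate_source` helper and the assumption that all five fields are required are hypothetical; PathScout's own validation may be stricter or looser.

```python
import json

# Field names come from the example entry; treating all as required is an assumption.
REQUIRED_FIELDS = {"id": str, "type": str, "name": str, "enabled": bool, "config": dict}


def validate_source(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry looks well formed."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    return problems


source = json.loads("""
{
  "id": "watchlist_careers",
  "type": "watchlist_careers",
  "name": "Watchlist careers pages",
  "enabled": true,
  "config": {"path": "config/watchlist.json"}
}
""")
print(validate_source(source))  # []
```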
Network resilience
The watchlist_careers, web_page, and rss adapters share a single network chokepoint (pathscout.fetchers.http_get) that:
- Retries transient network failures (timeouts, connection errors) with jittered exponential backoff before giving up.
- Honors ETag/Last-Modified response headers via an injectable ResponseCache, reusing the cached body on a 304 Not Modified response instead of re-parsing a fresh one.
- Logs fetch failures through the standard logging module (logging.getLogger("pathscout.fetchers")) instead of swallowing them silently. Attach a handler to observe what failed and why.
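The retry behavior in the list above can be sketched as follows. This is not the code behind pathscout.fetchers.http_get: the `retry_with_backoff` name, its parameters, the exception types it treats as transient, and the choice of the "full jitter" variant are all assumptions for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fetch: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), retrying transient errors with jittered exponential backoff."""
    for attempt in range(attempts):
        try:
            return fetch()
        except (TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise  # retry budget exhausted: surface the last failure
            # The ceiling doubles each attempt; the actual delay is a
            # uniform random value below it, so retries do not synchronize.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
    raise ValueError("attempts must be at least 1")
```

The `sleep` argument is injectable so the backoff schedule can be tested without real waiting. A real fetcher built on urllib would likely also need to treat `urllib.error.URLError` as retryable.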
watchlist_careers additionally supports a per-host rate_limit_seconds config field, enforcing a minimum delay between requests to the same host (independent of the source's overall max_elapsed_seconds run budget):
{
"id": "watchlist_careers",
"type": "watchlist_careers",
"name": "Watchlist careers pages",
"enabled": true,
"config": {
"path": "config/watchlist.json",
"timeout_seconds": 3,
"candidate_paths": ["careers", "jobs"],
"max_elapsed_seconds": 300,
"rate_limit_seconds": 1
}
}
ResponseCache and the per-host rate limiter are constructor-injectable (not global state), so a long-lived caller — e.g. a scheduled worker running fetches for many users — can supply persistent implementations instead of the default in-memory, one-per-run behavior the CLI uses.
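To make the injection point concrete, here is a minimal sketch of what in-memory defaults could look like. The class names, method names, and constructor signatures below are hypothetical and are not PathScout's actual interfaces; a long-lived caller would supply objects with the same shape backed by persistent storage.

```python
import time
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class CachedResponse:
    etag: Optional[str]
    last_modified: Optional[str]
    body: bytes


class InMemoryResponseCache:
    """Per-run cache keyed by URL; a worker could swap in a persistent store."""

    def __init__(self) -> None:
        self._entries: dict[str, CachedResponse] = {}

    def get(self, url: str) -> Optional[CachedResponse]:
        return self._entries.get(url)

    def put(self, url: str, response: CachedResponse) -> None:
        self._entries[url] = response

    def conditional_headers(self, url: str) -> dict[str, str]:
        """Build request validators so the server can answer 304 Not Modified."""
        cached = self.get(url)
        headers: dict[str, str] = {}
        if cached and cached.etag:
            headers["If-None-Match"] = cached.etag
        if cached and cached.last_modified:
            headers["If-Modified-Since"] = cached.last_modified
        return headers


class HostRateLimiter:
    """Enforce a minimum delay between requests to the same host."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_request: dict[str, float] = {}

    def wait(self, url: str) -> float:
        """Sleep if needed; return how long the caller was delayed."""
        host = urlsplit(url).netloc
        now = self._clock()
        last = self._last_request.get(host)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self._min_interval - (now - last))
        if delay:
            self._sleep(delay)
        # Record the moment the request is actually allowed to proceed.
        self._last_request[host] = now + delay
        return delay
```

Because both objects are passed in instead of living at module level, two callers in the same process can use different caches and limits without interfering with each other.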
config/suppressions.json stores structured ignores:
{
"schema_version": 1,
"suppressions": [
{
"id": "finding-content-hash",
"scope": "finding",
"reason": "Not a fit",
"expires_at": "2026-12-31",
"created_at": "2026-06-29"
}
]
}
Suppressions affect output visibility. They do not delete observations from SQLite.
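The visibility rule can be sketched as a filter over findings. The helper names are hypothetical, and two details are assumptions: that a suppression's id matches the finding's id, and that a suppression still applies on its expires_at date.

```python
from datetime import date
from typing import Optional


def is_active(suppression: dict, today: Optional[date] = None) -> bool:
    """A suppression applies until its optional expires_at date has passed."""
    today = today or date.today()
    expires_at = suppression.get("expires_at")
    if not expires_at:
        return True  # no expiry recorded: treated here as permanent
    return today <= date.fromisoformat(expires_at)


def visible_findings(
    findings: list[dict], suppressions: list[dict], today: Optional[date] = None
) -> list[dict]:
    """Drop findings whose id matches an active finding-scoped suppression."""
    hidden = {
        s["id"]
        for s in suppressions
        if s.get("scope") == "finding" and is_active(s, today)
    }
    return [f for f in findings if f["id"] not in hidden]
```

Nothing is deleted here: the filter only decides what is shown, which matches the rule that observations stay in SQLite.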
Source Types
The v0.2 runner supports standard-library fetches for:
- manual: config-entered notes for companies or opportunities you want tracked.
- watchlist: turns every active watchlist company into a hidden-search observation.
- watchlist_careers: probes active watchlist companies' careers pages for posted role evidence.
- portfolio: turns companies from config/portfolio.json into relationship-context observations.
- web_page: fetches a single web page.
- rss: fetches an RSS or Atom feed.
radar_portfolio remains as a deprecated alias for one release. Use portfolio for new config.
Commands
pathscout start
pathscout next
pathscout init
pathscout setup
pathscout doctor
pathscout watchlist
pathscout portfolio
pathscout review
pathscout explain <finding-id>
pathscout notes <finding-id> --add "Question to verify before outreach"
pathscout thesis <finding-id>
pathscout package <finding-id>
pathscout suppress <finding-id> --reason "Not a fit"
pathscout run --format json
pathscout run --format markdown
pathscout run --format both
Useful paths can be overridden:
pathscout run \
--profile config/profile.json \
--sources config/sources.json \
--watchlist config/watchlist.json \
--suppressions config/suppressions.json \
--db data/pathscout.sqlite \
--json-out outputs/latest.json \
--out outputs/latest.md
Digest Tiers
- Act Now: explicit target role or recruiter-visible mandate with strong fit signals.
- Hidden Search Hypothesis: no role posted, but company signals suggest a likely hiring need.
- Watch Signal: weaker signal, lower-level posting, or incomplete evidence.
- Filtered: captured for history but excluded from the main digest.
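For scripts that consume the JSON artifact, grouping findings by tier might look like the sketch below. The `tier` and `score` key names are assumptions about the finding objects, and `group_by_tier` is a hypothetical helper, not part of PathScout.

```python
from collections import defaultdict

# Tier names as listed above, from most to least actionable.
TIER_ORDER = ["Act Now", "Hidden Search Hypothesis", "Watch Signal", "Filtered"]


def group_by_tier(findings: list[dict]) -> dict[str, list[dict]]:
    """Bucket findings by tier, highest score first within each tier."""
    buckets = defaultdict(list)
    for finding in findings:
        buckets[finding["tier"]].append(finding)
    # Emit tiers in digest order; tiers outside TIER_ORDER are skipped.
    return {
        tier: sorted(buckets[tier], key=lambda f: f["score"], reverse=True)
        for tier in TIER_ORDER
        if tier in buckets
    }
```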
Review And Suppress
Use review to scan findings from the latest JSON artifact without opening the file:
pathscout review --limit 10
pathscout review --tier "Act Now"
Use explain to inspect why a finding surfaced:
pathscout explain <finding-id>
Use notes to keep local judgment attached to a finding or company:
pathscout notes <finding-id> --add "Ask a former employee whether this team is still founder-led"
pathscout notes --company "Northstar Robotics"
Use thesis to generate a local role-thesis package from a finding. Copy config/background.sample.json to config/background.local.json first if you want the thesis to include private candidate context:
pathscout thesis <finding-id>
Thesis packages are written to outputs/theses/ and are generated from the same JSON finding objects used by review and Markdown digests. They include the company moment, problem map, proposed function, fit argument, 90-180 day wedge, notes, and evidence gaps. They are thinking artifacts, not generated job descriptions or send-ready outreach.
Use suppress to hide a finding from later Markdown digests while keeping the raw observation in SQLite and the finding marked in JSON:
pathscout suppress <finding-id> --reason "Not a fit" --expires 2026-12-31
Careers pages are parsed into separate role findings when PathScout can identify role-title rows. If a page does not expose clear role titles, PathScout falls back to one page-level finding.
Package Exports
Use package to create a portable, human-readable and agent-readable opportunity package from a finding in outputs/latest.json:
pathscout package <finding-id>
Each package includes a manifest, a human Markdown brief, agent instructions, and canonical JSON data under outputs/packages/. See docs/artifacts.md for the artifact contract.
config/background.local.json, legacy config/background.json, data/notes.json, outputs/theses/, and outputs/packages/ are ignored by default because they may contain private candidate context.
See DATA_CONTRACT.md and docs/source_of_truth.md for the local-only storage boundary and agent-readable artifact contract. Network source fetches collect evidence for local runs; they are not hosted storage or sync.
Design Borrowed From
PathScout follows scanner-style findings: stable IDs, evidence, severity-like tiers, reasons, flags, and suppressions.
The config split borrows from dbt-style separation of personal profile from project config. Source IDs follow the pre-commit convention: stable machine IDs plus human names. Suppressions borrow from security scanners: structured ignores with reasons and optional expiration dates.