Reconnaissance tool for production scrapers — captures real browser traffic, validates through real proxies, returns a verified scraping plan with runnable starter code.

Project description

browser-recon

Reconnaissance for production scrapers. browser-recon launches Chrome on your machine, captures what the browser actually sends, and returns a verified scraping plan: which HTTP library to use, which headers are required, how to handle cookies, which proxy tier to use, and a safe request rate, plus a runnable Python starter script.

Capture happens locally; all analysis runs on the browser-recon server. The CLI is a thin client.

Install

pipx install browser-recon

Use

recon login                          # one-time — paste API key from your dashboard
recon scan https://walmart.com       # interactive: launches Chrome, captures, returns report URL

The CLI launches Chrome. Browse the target site for a couple of minutes and exercise whatever data you want to scrape: open product pages, run a search, view reviews. Press Ctrl+C to end the capture. The CLI then uploads the captured session to the server, shows live progress while the server processes it, and prints the report URL.

What you get back

A single HTML report containing:

Detection — which anti-bot vendors protect the target (Cloudflare, Akamai Bot Manager, PerimeterX, DataDome, Imperva, …), with severity tier.
Scraping plan — which captured endpoints carry the data you want vs which are session prerequisites vs which are noise.
Validation — which HTTP library × proxy tier combination actually works against the live site, measured through real test requests (not inferred from priors).
Headers + cookies — the minimum required set, plus cookie warmup instructions if the anti-bot needs a real-browser session first.
Rate-limit — measured safe delay between requests.
Starter code — a runnable Python script using the recommended library, headers, cookies, and delay.
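
To make the last three items concrete, here is a minimal sketch of how the headers, cookies, and delay from a report might be wired into a script. It is not the starter code browser-recon generates: the ScrapePlan class, the function names, and the default delay are all hypothetical, and it uses only the standard library so that it runs without extra installs.

```python
import time
import urllib.request
from dataclasses import dataclass, field


@dataclass
class ScrapePlan:
    """Hypothetical container for values copied out of a report."""
    headers: dict[str, str]
    cookies: dict[str, str] = field(default_factory=dict)
    delay_seconds: float = 2.0  # placeholder, not a measured value


def build_request(plan: ScrapePlan, url: str) -> urllib.request.Request:
    """Attach the plan's headers and cookies to a GET request."""
    request = urllib.request.Request(url, headers=dict(plan.headers))
    if plan.cookies:
        cookie_header = "; ".join(f"{k}={v}" for k, v in plan.cookies.items())
        request.add_header("Cookie", cookie_header)
    return request


def fetch_all(plan, urls, opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch each URL in order, pausing delay_seconds between requests."""
    bodies = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(plan.delay_seconds)  # pace requests; no pause before the first
        with opener(build_request(plan, url), timeout=30) as response:
            bodies.append(response.read())
    return bodies
```

The opener and sleep parameters exist only so the pacing logic can be exercised without touching the network. In a real script you would replace the standard-library client with whichever HTTP library the report recommends.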

What this is not

Not a scraper. browser-recon produces the plan for a scraper. You (or your AI assistant) write the scraper using that plan.

Why measurement beats guessing

Most scrapers fail in production because the developer guessed wrong about three things:

Which anti-bot system is in front of the target
Which headers the request actually needs
Whether their IP needs to look residential

browser-recon measures all three by firing real HTTP test requests through real proxies and reporting which combination succeeded. The final recommendation is grounded in what worked, not in what the LLM expected to work.
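
The idea of trying combinations and keeping the one that worked can be sketched in a few lines. This is only an illustration of the principle, not browser-recon's validation logic, which is proprietary and server-side. The Attempt class, the probe callable, and the library and tier labels are all invented for the example.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable, Optional


@dataclass(frozen=True)
class Attempt:
    library: str
    proxy_tier: str
    succeeded: bool


def run_matrix(libraries, proxy_tiers, probe: Callable[[str, str], bool]):
    """Try every library x proxy-tier pair and record what happened."""
    return [
        Attempt(library, tier, probe(library, tier))
        for library, tier in product(libraries, proxy_tiers)
    ]


def first_success(attempts) -> Optional[Attempt]:
    """Return the first pair that worked, in the order the pairs were tried."""
    return next((a for a in attempts if a.succeeded), None)


if __name__ == "__main__":
    # Stand-in probe: pretend only one combination gets through.
    def fake_probe(library: str, tier: str) -> bool:
        return library == "tls-impersonating" and tier == "residential"

    attempts = run_matrix(
        ["plain-http", "tls-impersonating"],  # hypothetical labels, cheapest first
        ["datacenter", "residential"],
        fake_probe,
    )
    print(first_success(attempts))
```

Listing the cheapest options first means the first success is also the cheapest combination that worked, under the assumption that cost follows list order.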

Architecture

The CLI ships only the non-proprietary glue: Chrome launching, network capture, authentication, upload, and live-progress polling. Roughly 130 KB installed. No detection rules, no validation logic, no LLM prompts, no scoring heuristics live on your machine — everything proprietary runs on the browser-recon server.
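
As a rough picture of the live-progress polling step, a thin client can repeatedly ask the server for status until the report is ready. The sketch below is not the CLI's actual code: fetch_status stands in for whatever request the real client makes, and the dictionary keys ("state", "stage", "report_url", "error") are invented for illustration.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_polls: int = 300,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a status callable until it reports completion; return the report URL."""
    for _ in range(max_polls):
        status = fetch_status()
        if status["state"] == "done":
            return status["report_url"]
        if status["state"] == "failed":
            raise RuntimeError(status.get("error", "processing failed"))
        # Still running: show the current stage, then wait before asking again.
        print(f"processing: {status.get('stage', 'working')}")
        sleep(interval_seconds)
    raise TimeoutError(f"no result after {max_polls} polls")
```

Passing the status function in as a parameter keeps the loop independent of any particular HTTP client or endpoint.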

The server handles: anti-bot fingerprinting, endpoint inventory analysis, intent-based endpoint classification, proxy-based active validation, secret scrubbing, recommendation synthesis, auxiliary notes and difficulty drivers, and report rendering.

You never need proxy credentials in your shell. The operator's proxy provider account is server-side only.

Status

Active development. v0.3.x is the thin-client architecture (server-side pipeline, animated CLI progress, OIDC-trusted PyPI publishing).

Contributing

Development setup, conventions, and test-suite instructions live in CONTRIBUTING.md.

License

See LICENSE.

Project details

Release history

Version	Released
0.3.9	May 20, 2026
0.3.8	May 18, 2026
0.3.7	May 18, 2026
0.3.6	May 14, 2026
0.3.5	May 13, 2026
0.3.4	May 13, 2026
0.3.3 (this version)	May 13, 2026
0.3.2	May 13, 2026
0.3.1	May 13, 2026
0.3.0	May 13, 2026

Download files

Source Distribution

browser_recon-0.3.3.tar.gz (19.3 MB)

Uploaded May 13, 2026 Source

Built Distribution

browser_recon-0.3.3-py3-none-any.whl (131.1 kB)

Uploaded May 13, 2026 Python 3

File details

Details for the file browser_recon-0.3.3.tar.gz.

File metadata

Download URL: browser_recon-0.3.3.tar.gz
Upload date: May 13, 2026
Size: 19.3 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for browser_recon-0.3.3.tar.gz
Algorithm	Hash digest
SHA256	`d510d6cfe4fdcfd885f407fb5f93f8a839128630be33826c8b70d0eedbbc4080`
MD5	`b4a8da38191def730ff74cbb6a0b7ed1`
BLAKE2b-256	`5926f0469ef7f3b877491cf2a7f2cb7ed5298c100503e5981f6026f02826c9ca`
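
If you download a file manually, you can compare its SHA256 digest against the value published above. The following is a small sketch using Python's standard hashlib module; the function names are made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, the call would be matches_published("browser_recon-0.3.3.tar.gz", "d510d6cfe4fdcfd885f407fb5f93f8a839128630be33826c8b70d0eedbbc4080"), assuming the file sits in the current directory.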

Provenance

The following attestation bundles were made for browser_recon-0.3.3.tar.gz:

Publisher: release.yml on lazy-coder-codes/browser-recon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: browser_recon-0.3.3.tar.gz
- Subject digest: d510d6cfe4fdcfd885f407fb5f93f8a839128630be33826c8b70d0eedbbc4080
- Sigstore transparency entry: 1526457358
- Sigstore integration time: May 13, 2026
Source repository:
- Permalink: lazy-coder-codes/browser-recon@6d3021cf04d060b91fe11dcfbe6cc14830dbab79
- Branch / Tag: refs/tags/v0.3.3
- Owner: https://github.com/lazy-coder-codes
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@6d3021cf04d060b91fe11dcfbe6cc14830dbab79
- Trigger Event: push

File details

Details for the file browser_recon-0.3.3-py3-none-any.whl.

File metadata

Download URL: browser_recon-0.3.3-py3-none-any.whl
Upload date: May 13, 2026
Size: 131.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for browser_recon-0.3.3-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b7e1b988d8cd93c76ee56a1ded2eb2365c32284c12fbb96755f232d1ac94180f`
MD5	`4c437229e2493935b71ceaa4bd2364e4`
BLAKE2b-256	`cb16997aed505bfa3872a85431f4695aa2d3f652477c5584331168471c5db8d9`

Provenance

The following attestation bundles were made for browser_recon-0.3.3-py3-none-any.whl:

Publisher: release.yml on lazy-coder-codes/browser-recon

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: browser_recon-0.3.3-py3-none-any.whl
- Subject digest: b7e1b988d8cd93c76ee56a1ded2eb2365c32284c12fbb96755f232d1ac94180f
- Sigstore transparency entry: 1526457415
- Sigstore integration time: May 13, 2026
Source repository:
- Permalink: lazy-coder-codes/browser-recon@6d3021cf04d060b91fe11dcfbe6cc14830dbab79
- Branch / Tag: refs/tags/v0.3.3
- Owner: https://github.com/lazy-coder-codes
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@6d3021cf04d060b91fe11dcfbe6cc14830dbab79
- Trigger Event: push
