
Turn any company or organization URL into a strategic intelligence brief. Adaptive scraping + AI-powered research and synthesis.


Primr

CI · License: MIT · Python 3.11+

Turn any company or organization URL into deep strategic analysis that gets a consultant maximally up to speed.

Primr extracts primary-source data from company and organization websites using adaptive, org-aware scraping that handles modern site architectures, then synthesizes external research into long-form strategic analysis using AI-powered research and synthesis (Grok 4.3 hybrid by default, or Gemini Deep Research via --premium).

Runs as a CLI, an MCP server, an OpenClaw integration, and a Claude Skill.

primr "ExampleCo" https://example.co

About 35-50 minutes later: a deep strategic analysis covering competitive positioning, technology stack, strategic initiatives, likely constraints, and consultant-grade hypotheses, with dense references consolidated at the end. ~$0.60 in API costs.

Why This Exists

Company research is tedious. You visit the website, click around, search the company, read articles, synthesize it all, write it up. That process easily takes 1-2 hours per company and the output is usually unstructured notes. Primr replaces that entire workflow with a single command.

What Makes It Different

  • DNS intelligence pre-flight: Automatic domain reconnaissance detects cloud platforms, SaaS services, email security, and identity providers from DNS records — zero API keys, 2-3 seconds. Strategies are grounded in real tech stack data.
  • Hiring-signal gathering: After the main scrape, Primr discovers open job postings (Greenhouse, Lever, Ashby, SmartRecruiters board APIs first; HTML careers-page fallback if every ATS misses), LLM-triages the most signal-rich roles, and extracts tech-stack frequency, strategic initiatives, culture cues, and notable absences. Job posts are often the most honest statement of what a company is actually building right now — they feed every downstream phase from gap analysis to final strategy. Skip with PRIMR_SKIP_HIRING_SIGNALS=1.
  • Adaptive scraping: 8 retrieval methods from browser rendering to TLS fingerprinting to screenshot+vision extraction, with per-host optimization. Starts with full browser rendering, which succeeds on 95%+ of modern sites, and falls back through increasingly specialized methods.
  • Org-aware site selection: Link discovery and prioritization adapt to commercial companies, government sites, nonprofits, education, and healthcare organizations instead of assuming every site looks like a SaaS company.
  • Fail-fast scrape quality gate: Full/scrape modes abort when site extraction is too thin, while still preserving short structured pages like contact, leadership, and org-chart references when they carry useful signal (override with --skip-scrape-validation).
  • Autonomous external research: Gemini Deep Research for comprehensive analysis, Grok 4.3 for fast turnaround — both plan queries, follow leads, cross-validate sources, and synthesize findings.
  • Cost controls built in: --dry-run estimates (including recovery table and stage classifications), usage tracking, and governance hooks for budget limits.
  • Agent-native interfaces: CLI, MCP server, OpenClaw integration, and Claude Skills, all first-class.
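
The DNS pre-flight idea above can be sketched as a lookup from raw DNS record values to provider fingerprints. This is a minimal illustration, not Primr's actual detection tables: the provider names, fingerprint strings, and `detect_services` function are all assumptions for the example.

```python
# Hypothetical fingerprint table: substrings in DNS record values that
# suggest a given platform or SaaS service. Illustrative only.
PROVIDER_FINGERPRINTS = {
    "azure": ("azurewebsites.net", "azure-dns.com", "azureedge.net"),
    "aws": ("awsdns", "cloudfront.net", "amazonaws.com"),
    "gcp": ("googledomains.com", "googlehosted.com"),
    "microsoft-365": ("mail.protection.outlook.com",),
}

def detect_services(records: dict[str, list[str]]) -> set[str]:
    """Return provider labels whose fingerprints appear in any record value."""
    found = set()
    for values in records.values():
        for value in values:
            for provider, needles in PROVIDER_FINGERPRINTS.items():
                if any(n in value.lower() for n in needles):
                    found.add(provider)
    return found

records = {
    "CNAME": ["myapp.azurewebsites.net"],
    "MX": ["0 example-com.mail.protection.outlook.com"],
    "NS": ["ns1-01.azure-dns.com"],
}
print(sorted(detect_services(records)))  # -> ['azure', 'microsoft-365']
```

Because it reads only pre-fetched record values, a check like this costs nothing beyond the DNS queries themselves, which is how a recon pass can finish in seconds with zero API keys.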

Artifact Model

Primr treats research artifacts and shipping artifacts as different classes of output. Intermediate research steps such as scrape summaries, gap-analysis notes, source inventories, contradiction findings, and section briefs optimize for consistency, provenance, and parseability. Their formatting matters far less than whether they are complete and structured enough to feed later stages reliably.

Final reports and strategy documents are different. Those artifacts must ship cleanly as Markdown, TXT, DOCX, and eventually PDF, so Primr treats them as a stricter output contract with deterministic cleanup, citation normalization, validation gates, and renderer hardening.

What is already in place:

  • Final-document canonicalization before shipping so report/strategy artifacts are normalized into a stable shape before MD/TXT/DOCX rendering
  • Typed generated-section normalization at the section-writing seam, including validation-line cleanup, embedded reference stripping, and citation extraction
  • Mixed-format parsing resilience so section batches can recover cleanly even if the model blends XML-style section envelopes with legacy ## headings
  • Cleaner artifact validation for rendered DOCX outputs, including reduced false positives from literal # content inside tables

Near-term work remains focused on pushing more structure upstream into the long-form writing steps, reducing arbitrary markdown repair before shipping, and strengthening artifact gates against real-world failed artifacts.
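
The section-normalization seam described above can be pictured as a small pure function: strip validation chatter, pull embedded references out of the body, and return a clean (text, citations) pair. The `VALIDATION:` prefix and `[ref: ...]` syntax here are invented for illustration and are not Primr's actual formats.

```python
import re

# Hypothetical embedded-reference syntax for the sketch: [ref: <url>]
REF_PATTERN = re.compile(r"\[ref:\s*(https?://\S+?)\s*\]")

def normalize_section(raw: str) -> tuple[str, list[str]]:
    """Extract citations, strip them from the body, and drop validation lines."""
    citations = REF_PATTERN.findall(raw)
    body = REF_PATTERN.sub("", raw)
    lines = [
        line.rstrip()
        for line in body.splitlines()
        if not line.strip().startswith("VALIDATION:")  # drop validation chatter
    ]
    return "\n".join(lines).strip(), citations

raw = """## Market Position
VALIDATION: section complete
Northwind leads the niche.[ref: https://example.com/report]"""
text, cites = normalize_section(raw)
```

Keeping this step typed and deterministic is what lets later stages (DOCX rendering, citation consolidation) trust their input instead of re-repairing markdown.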

Modes

| Mode | What it does | Time | Cost |
| --- | --- | --- | --- |
| Default | Grok 4.3 hybrid + AI Strategy (recon auto-detects platform) | ~35-50 min | ~$0.60 |
| --platform ms | Microsoft Azure + NVIDIA private cloud strategy | ~45-60 min | ~$0.65 |
| Default + multi-platform | Add --platform aws azure | ~45-60 min | ~$0.65 |
| Default + strategy type | Add --strategy-type customer_experience | ~35-50 min | ~$0.60 |
| --grok-tier fast | Grok 4.1 everywhere (cheaper, slightly lower quality) | ~30-45 min | ~$0.47 |
| --grok-tier max | Grok 4.3 everywhere (deeper reasoning across writing too) | ~35-50 min | ~$2.50 |
| --premium | Gemini + Deep Research + AI Strategy | 50-75 min | ~$5 |
| --premium --platform ms | Premium + Microsoft/NVIDIA | 75-120 min | $6-9 |
| --premium --lite | Pro model instead of DR for AI Strategy | 50-80 min | ~$4 |
| --mode scrape | Crawl site + extract insights only | 5-10 min | $0.10 |
| --mode deep | Gemini Deep Research on external sources only | 10-15 min | $2.50 |
| primr recon | DNS intelligence only (no API keys needed) | 2-3 sec | $0.00 |

The default primr command auto-detects: when XAI_API_KEY is set, it uses the Grok 4.3 hybrid pipeline (4.3 for reasoning-heavy stages, 4.1-fast for bulk writing) at ~$0.60/run. The standard pipeline includes research deepening, cross-validation, trust-polish, citation normalization, and constrained-evidence reasoning. Strategy types (ai, customer_experience, modern_security_compliance, data_fabric_strategy) are YAML-defined and auto-discovered — run primr --list-strategies for details. DDG searches are free. Use --dry-run for accurate cost estimates.
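
The auto-detect and hybrid-tier split described above amounts to a small routing decision per pipeline stage. The stage names and model labels below are illustrative assumptions, not Primr's internal identifiers:

```python
import os

# Hypothetical split: which stages get the heavier reasoning model.
REASONING_STAGES = {"gap_analysis", "cross_validation", "strategy"}

def pick_model(stage: str) -> str:
    """Route a stage to a model: Grok hybrid when an xAI key exists, else Gemini."""
    if not os.environ.get("XAI_API_KEY"):
        return "gemini"  # no xAI key configured: fall back to the Gemini path
    return "grok-4.3" if stage in REASONING_STAGES else "grok-4.1-fast"
```

Routing bulk writing to the cheaper tier is what keeps the default run near $0.60 while the reasoning-heavy stages still get the stronger model.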

For model evaluation and quality comparison, see Evaluation Guide.

Quick Start

pip install primr
primr init                      # Guided keys + browser setup
primr doctor                    # Verify everything works
primr "ExampleCo" https://example.co

From a source checkout:

git clone https://github.com/blisspixel/primr.git
cd primr
py -3.13 setup_env.py           # Windows
# or: python3.13 setup_env.py   # macOS/Linux
primr init
primr doctor                     # Verify everything works
primr "ExampleCo" https://example.co  # Run your first research

Requires Python 3.11+. On Windows, prefer py -3.13 instead of bare python if your default interpreter is older. setup_env.py installs or upgrades the local editable package to the current repo version, installs dependencies, and creates .env. primr init walks through user-level API keys, browser dependencies, and verification. Local .env files and shell environment variables still work and can override user-level keys. Set XAI_API_KEY for the standard Grok pipeline (it covers analysis, writing, and utility-tier calls like scraping summaries and link selection). Set GEMINI_API_KEY only if you also want --premium mode or you do not have an xAI key. Web search uses DuckDuckGo (no key needed).

Platform Support

  • Windows
  • macOS
  • Linux

# Standard run (auto-detects platform from DNS)
primr "Company" https://company.com

# Microsoft Azure + NVIDIA private cloud strategy
primr "Company" https://company.com --platform ms

# Research modes
primr "Company" https://company.com --mode scrape              # Site corpus only
primr "Company" https://company.com --mode deep                # External research only
primr "Company" https://company.com --dry-run                  # Cost estimate first

# Multi-platform and strategy types
primr "Company" https://company.com --platform aws azure       # Multi-platform AI strategy
primr "Company" https://company.com --strategy-type customer_experience  # CX strategy
primr --list-strategies                                        # See all strategy types

# Premium (Gemini + Deep Research)
primr "Company" https://company.com --premium                  # ~$5, maximum depth
primr "Company" https://company.com --premium --lite           # Cheaper premium strategy

# DNS intelligence (standalone, no API keys needed)
primr recon acme.com                                           # DNS intelligence lookup
primr recon acme.com --json                                    # Structured JSON output

When --platform is omitted, Primr runs recon first and uses strong infrastructure signals (for example Azure DNS/App Service/CDN, AWS Route53/CloudFront, or GCP DNS) to choose the AI strategy platform. If multiple strong platforms are detected, it generates one strategy per platform. Productivity, certificate, and email-only signals do not count as primary-cloud proof. If recon is unclear or skipped, the default strategy posture is Microsoft Azure plus private cloud/NVIDIA (azure private).
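
The selection rule above can be expressed as a filter over recon signals: only strong infrastructure signals vote, productivity/email signals are ignored, and no strong signal falls back to the default posture. Signal names here are invented for the sketch:

```python
# Hypothetical signal -> platform mapping; only "strong" signals count.
STRONG = {"azure-dns", "azure-app-service", "route53", "cloudfront", "gcp-dns"}
SIGNAL_PLATFORM = {
    "azure-dns": "azure", "azure-app-service": "azure",
    "route53": "aws", "cloudfront": "aws",
    "gcp-dns": "gcp",
}

def choose_platforms(signals: set[str]) -> list[str]:
    """One strategy per strongly-detected platform; Azure+private default otherwise."""
    platforms = sorted({SIGNAL_PLATFORM[s] for s in signals if s in STRONG})
    return platforms or ["azure private"]  # default posture when recon is unclear
```

Note how a weak signal like a Microsoft 365 MX record would simply be dropped by the `STRONG` filter rather than counted as primary-cloud proof.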

Use --output-dir to send customer-facing deliverables to a specific client folder:

primr "Company" https://company.com --output-dir "C:\Clients\Company"

With a custom output directory, Primr keeps that folder clean: Markdown and DOCX deliverables are written there, while TXT mirrors and validation diagnostics stay in the run's working/<company>/<timestamp>/_diagnostics/ folder. The default output/ folder still includes TXT mirrors for backward compatibility.

For batch processing, see Batch Guide. For crash recovery and resume, see Recovery Guide. For post-generation quality improvement, see Improve Guide.

What a run looks like

Grok 4.3 hybrid · recon auto-detected Azure

▸ PHASE 0/6 · Recon
✓ 14 services, 8 insights, platform: azure (2s)

▸ PHASE 1/6 · Data Collection
✓ 251 links → 50 selected
✓ 48/50 pages scraped (6m 10s)
✓ 31 external sources (8m 22s)

▸ PHASE 2/6 · Research Deepening
✓ 8 gaps identified, 12 additional sources

▸ PHASE 3/6 · Analysis
✓ Structured workbook built

▸ PHASE 4/6 · Report Writing
  Part 1/5: 7 sections in parallel
  Part 2/5: 3 sections in parallel
  Part 4/5: 7 sections in parallel
✓ 23 sections, 21,500 words

▸ PHASE 5/6 · Cross-Validation
✓ 3 contradictions resolved
  Trust: PASS · cites 12/12 · appendix clean

▸ PHASE 6/6 · AI Strategy (Azure)
✓ Strategy generated

✓ Complete in 38m
  output/ExampleCo_Strategic_Overview_04-10-2026.docx

PASS | 23 chapters | 48 citations | ~$0.74

What the output looks like

From the executive summary of a sample report:

Northwind Haulage Corp is a mid-market logistics optimization vendor ($180-220M ARR, estimated) that sells route planning and fleet analytics software to regional shipping companies. The company occupies a defensible but narrowing niche: optimizing last-mile delivery for carriers still running legacy dispatch systems.

Key insights:

  • Northwind's customer concentration is high. Cross-referencing case studies, press releases, and conference presentations, roughly 40% of referenced deployments involve just 3 carrier networks. Loss of any one would be material. [Confidence: Inferred]
  • The company has no disclosed AI strategy, but 4 of their last 7 engineering hires have ML/optimization backgrounds. Combined with a patent filing for "autonomous route replanning under disruption," this suggests an unannounced product line. [Confidence: Inferred]
  • Pricing has shifted from perpetual licenses to consumption-based billing (per-shipment), visible in public procurement portal RFP responses. [Confidence: Reported]

Reports include 23 structured sections, SWOT analysis, competitive landscape, discovery questions, and inline confidence levels on every non-obvious claim.

Under the Hood

Primr uses an 8-tier browser-first retrieval engine with sticky tier memory, circuit breakers, and cookie handoff. Models range from Grok 4.1 ($0.20/$0.50 per 1M tokens) through Grok 4.3 ($1.25/$2.50 with $0.20 cached input) to Gemini Deep Research (~$2.50/task). The agentic architecture includes hypothesis tracking, subagents for each pipeline stage, governance hooks, and persistent research memory.
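
The sticky-tier-plus-circuit-breaker pattern can be sketched as follows. Tier names, the failure threshold, and the data structures are illustrative assumptions, not Primr's actual retrieval engine:

```python
from collections import defaultdict

TIERS = ["browser", "tls_fingerprint", "plain_http", "screenshot_vision"]
BREAK_AFTER = 3  # open the circuit after this many consecutive failures

sticky: dict[str, str] = {}                  # host -> last successful tier
failures: dict[str, int] = defaultdict(int)  # tier -> consecutive failures

def fetch(host: str, attempt):
    """Try tiers in order, preferring the tier that last worked for this host."""
    order = TIERS
    if host in sticky:  # sticky memory: retry the remembered tier first
        order = [sticky[host]] + [t for t in TIERS if t != sticky[host]]
    for tier in order:
        if failures[tier] >= BREAK_AFTER:
            continue  # circuit open: skip a tier that keeps failing
        result = attempt(tier, host)
        if result is not None:
            sticky[host] = tier
            failures[tier] = 0
            return result
        failures[tier] += 1
    return None
```

The payoff is per-host amortization: the first page on a host may walk several tiers, but subsequent pages usually hit the remembered tier immediately.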

For full architecture details, model pricing, and the retrieval tier breakdown, see System Design.

Configuration

# Recommended first-run setup
primr init

# Writes to the per-user Primr config file
primr keys set gemini           # https://aistudio.google.com/apikey
primr keys set xai              # https://console.x.ai/
primr keys list
primr keys path

# Diagnose, then launch guided fixes if needed
primr doctor --fix

# Local .env files and shell env vars are also supported:
XAI_API_KEY=          # Grok standard pipeline (analysis + writing + utility tier)
GEMINI_API_KEY=       # Required only for --premium mode (and for utility tier when no XAI_API_KEY)

Web search uses DuckDuckGo by default, no key needed.

Full config reference | API key setup

Use primr from your AI tool

primr ships with an AGENTS.md (auto-loaded by Kiro, Codex, Aider, Jules), a Claude Code plugin under claude-code/, and per-host MCP snippets under clients/ for Cursor, Windsurf, VS Code + Copilot, and Claude Desktop.

Claude Code (one-command install):

/plugin marketplace add blisspixel/primr
/plugin install primr@blisspixel-primr

That registers both the MCP server (primr mcp, exposed as mcp__primr__* tools) and the skill (cost gate, async lifecycle, mode selection — loaded on-demand based on its description).

Skill-only install (no plugin): paste this to Claude Code or any agent that can fetch and write files:

Fetch https://raw.githubusercontent.com/blisspixel/primr/main/claude-code/skills/primr/SKILL.md and save it to ~/.claude/skills/primr/SKILL.md. Fetch the four files under https://raw.githubusercontent.com/blisspixel/primr/main/claude-code/skills/primr/references/ and save them under ~/.claude/skills/primr/references/. Then run pip install primr && primr init.

Other hosts (Cursor / Windsurf / Kiro / VS Code): see clients/README.md — copy-pasteable MCP config plus instructions for placing the skill or referencing AGENTS.md from the host's rules system.

Agent Integration (advanced)

MCP server — Claude Code, Cursor, Windsurf, Claude Desktop, and any MCP-compatible client:

primr mcp                      # stdio transport (default — what hosts launch)
primr mcp --http --port 8000   # HTTP with JWT auth
primr-mcp --stdio              # legacy entry point, still supported

A2A Protocol — Agent-to-Agent communication with any A2A-compatible agent:

pip install primr[a2a]                     # install optional A2A support
primr-a2a --no-auth                        # standalone A2A server on port 9000
primr-mcp --http --a2a                     # co-hosted with MCP server
curl localhost:9000/.well-known/agent.json  # discover agent capabilities

OpenClaw - Packaged skills, governed workflows, and sandbox config

# openclaw/openclaw.json wires Primr MCP into OpenClaw
# Skills: primr-research, primr-strategy, primr-qa
# Workflows: research-pipeline, strategy-pipeline

The packaged workflows estimate cost, require approval, and propagate approved cost caps into spend calls. See docs/OPENCLAW.md for setup and troubleshooting.

Claude Skills - MCP-first skill packages

skills/
├── company-research/SKILL.md
├── hypothesis-tracking/SKILL.md
├── qa-iteration/SKILL.md
└── scrape-strategy/SKILL.md

These skills are thin intent routers over Primr MCP rather than separate product definitions. Generic MCP clients can also use primr://agent/governance, primr://research/next-actions, and the governed_execution prompt to follow the same estimate/approval/monitor pattern.
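
The estimate/approval/monitor pattern those workflows follow can be sketched as a gate around execution. The function names and the approval callback are hypothetical; real integrations go through the Primr MCP tools and prompts named above:

```python
def governed_run(estimate_usd: float, cap_usd: float, approve, execute):
    """Estimate first, require approval, then propagate the approved cap."""
    if estimate_usd > cap_usd:
        raise RuntimeError(
            f"estimate ${estimate_usd:.2f} exceeds cap ${cap_usd:.2f}"
        )
    if not approve(estimate_usd):
        return None  # caller declined: nothing is spent
    return execute(budget_usd=cap_usd)  # downstream spend is capped
```

The key design choice is that the cap travels with the execution call, so a runaway stage cannot exceed what was approved up front.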

MCP docs | A2A protocol | OpenClaw config | OpenClaw guide

Cloud Deployment

Primr is CLI-first, local-first. Cloud deployment is optional for teams needing shared access or always-on availability.

| Tier | What it is | Idle cost |
| --- | --- | --- |
| Solo (default) | CLI on your machine | $0 |
| Team | Azure Container Apps, scale-to-zero | < $5/month |
| Organization | Entra ID, budget tracking, observability, M365 Agent Store | < $15/month |

See the Deployment Guide or Azure Quickstart.

Development

python -m pytest tests/ -x --tb=short       # Run tests
ruff check .                                 # Lint
mypy src/primr --ignore-missing-imports     # Type check

5,700+ tests including property-based testing (Hypothesis), full ruff and mypy compliance, and OpenTelemetry tracing. CI runs lint, type check, and tests on every push.

Learn More

| Topic | Guide |
| --- | --- |
| Batch processing | Batch Guide |
| Model evaluation | Evaluation Guide |
| Crash recovery | Recovery Guide |
| Output improvement | Improve Guide |
| Configuration | Full Config Reference |
| Architecture | System Design |
| Adding a new model | Model Onboarding Playbook |
| Cloud deployment | Deployment Guide |
| Agent integration | MCP & A2A API |
| API key setup | API Keys |
| Azure quickstart | Azure Quickstart |
| OpenClaw | Setup & Troubleshooting |
| Security ops | Security Operations |
| Contributing | Contribution Guidelines |
| Vulnerability reporting | Security |
| Roadmap | What's Planned |

About This Project

Primr is a nights-and-weekends project by a solo developer. The time-to-insight ratio for company research was terrible, and most of the work was mechanical. That's exactly what AI should be doing. So I built the tool I wanted.

It's not backed by a company or a team. It's an independent project built for personal use.

Disclaimer

Primr is a research tool. You are responsible for:

  • Web content: Primr retrieves publicly available web content, similar to a browser or search engine crawler. It does not bypass authentication, access paywalled content, or exploit vulnerabilities. However, some websites restrict automated access in their terms of service; it is your responsibility to check before running Primr against any site.
  • Accuracy: AI-generated content may contain errors, hallucinations, or outdated information. Verify findings before acting on them.
  • Costs: API calls to AI services (Gemini, Grok) incur real charges. Use --dry-run to estimate costs before running.
  • Use case: This tool is intended for legitimate research purposes. Do not use it to violate any website's terms of service or any applicable law.

This software is provided as-is by a solo developer. The author is not liable for how you use this software, the accuracy of its outputs, or any consequences of its use.

License

MIT

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

primr-1.22.0.tar.gz (1.5 MB)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

primr-1.22.0-py3-none-any.whl (1.1 MB)

Uploaded Python 3

File details

Details for the file primr-1.22.0.tar.gz.

File metadata

  • Download URL: primr-1.22.0.tar.gz
  • Upload date:
  • Size: 1.5 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for primr-1.22.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | c7ab77a35a4d67ce044981a67a7fd22a6d053758b5917b52a1229666d3dd5127 |
| MD5 | ec94e521656dc0f0ae6ebaeae25ae54f |
| BLAKE2b-256 | bba26657e63d947606280c77adbc47972c6504230afd0c75c590b49ae5a28922 |

See more details on using hashes here.

Provenance

The following attestation bundles were made for primr-1.22.0.tar.gz:

Publisher: release.yml on blisspixel/primr

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file primr-1.22.0-py3-none-any.whl.

File metadata

  • Download URL: primr-1.22.0-py3-none-any.whl
  • Upload date:
  • Size: 1.1 MB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for primr-1.22.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 0b68b3eaee452aaecb9de95e6a7da6b4326c59732abefd5168fdfe487bbf58f6 |
| MD5 | d1e99935c00ab83bfadbdcf3d1549c7f |
| BLAKE2b-256 | da7a3f34ce5c373529062a9f0c3c4d8f5094f27eae370ef9c23023021187b85f |

See more details on using hashes here.

Provenance

The following attestation bundles were made for primr-1.22.0-py3-none-any.whl:

Publisher: release.yml on blisspixel/primr

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
