Synthetic user research and LLM eval harness — research-grade rigor for developers who can't afford a research team.

Voice of Agents

Beta users lie. Simulate them honestly, then validate the risky findings with the users you still need to talk to.


Why this exists

  • The problem: Pre-PMF user research is too slow and expensive, so founders ship by vibe and learn why users churn only after they've churned.
  • The approach: A 4-stage synthetic research pipeline (subjects → personas → workflows → journey) with mandatory epistemic framing — synthetic data is marked as hypothesis, not finding.
  • The twist: Synthetic personas feed directly into a live browser-based eval harness that navigates your actual product. The loop goes from "I don't know my users" to "here are typed personas failing to adopt my product" in under 15 minutes.
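
The staged pipeline and its epistemic framing can be pictured with a small sketch. Only the four stage names come from the description above; the class names, fields, and the stand-in stage function are hypothetical and are not the library's API. The point is that every output carries a status that is always "hypothesis".

```python
from dataclasses import dataclass, field

# Stage names mirror the pipeline described above. Everything else in this
# sketch (classes, fields, the stand-in stage function) is hypothetical.
STAGES = ["subjects", "personas", "workflows", "journey"]


@dataclass(frozen=True)
class Hypothesis:
    """A synthetic statement that still needs validation with real users."""
    stage: str
    text: str
    status: str = "hypothesis"  # synthetic output is never labeled a finding


@dataclass
class Session:
    question: str
    outputs: dict = field(default_factory=dict)


def run_stage(stage: str, session: Session) -> list[Hypothesis]:
    # Stand-in for a model call: a real stage would prompt an LLM with the
    # outputs of the earlier stages. Here we only count them.
    prior = sum(len(items) for items in session.outputs.values())
    return [Hypothesis(stage, f"{stage} output built on {prior} prior item(s)")]


def run_pipeline(question: str) -> Session:
    session = Session(question)
    for stage in STAGES:  # each stage sees what the earlier stages produced
        session.outputs[stage] = run_stage(stage, session)
    return session


session = run_pipeline("why do developers abandon AI coding tools?")
for stage, items in session.outputs.items():
    for item in items:
        print(f"[{item.status}] {stage}: {item.text}")
```

Because the status lives on the record itself, any report built from the session can show the label without extra bookkeeping.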

Quick start

pip install voice-of-agents
cp .env.example .env          # then paste your key into .env
voa doctor                    # pre-flight check — recommended first command

60-second demo — zero API key required

voa research demo --offline

Uses a bundled cassette of a real pipeline run. Prints findings to your terminal in under 10 seconds. No Anthropic account needed.

Live demo (requires ANTHROPIC_API_KEY)

voa research demo

Runs a preset research question with 10 subjects. ~$0.30 with Opus, ~$0.02 with Haiku.

Plain-English setup

voa research quickstart        # 3 plain-English questions; no methodology vocabulary
voa research run --model-haiku # low-cost exploration run

One-liner Python API

from voice_of_agents.research import quick_research_sync

result = quick_research_sync(
    what="a coding assistant that helps developers write tests",
    who="senior developers at startups",
    understand="why developers abandon AI coding tools after the first week",
)

print(result.build_this_first)            # "Ship a 'first win in 5 minutes' onboarding flow..."
print(result.validate_with)               # ["Walk me through the last time...", ...]
print(result.personas[0].would_pay_if)    # "It catches one bug I would have shipped..."

Who is this for?

  • Pre-PMF solo founders — you need decisions, not just findings. Start with examples/solo-founder/.
  • Product engineers owning a roadmap — you want a one-liner Python API to seed user-archetype hypotheses. Start with examples/product-engineer/.
  • DX practitioners / platform leads — you want the full research → eval bridge: synthetic personas navigating your actual product. Start with examples/dx-practitioner/.

Proof, not claims

demo/multi-agent-adoption/ is a complete end-to-end run of the pipeline against a real product: research config → decision report → 4 seeded eval personas → browser exploration logs → per-persona evaluations → focus-group analysis → prioritized backlog.

Read it top-to-bottom before running your own. It shows what "good" output looks like, so you can calibrate your expectations before spending API budget.


What this is (and is not)

Voice of Agents is a Python library that simulates user research using the Claude API. It runs a synthetic sampling frame — including adopters, abandoners, skeptics, and critics — and returns decision-oriented output: what to build first, what to validate with real users, and what would make each user type leave.

It is not a replacement for real research. It is a forcing function for better questions.

Every session emits a SYNTHETIC-DATA-NOTICE.md that tells you exactly what to ask real users to validate the highest-risk findings. The output is a map of the hypothesis space, not a conclusions report.

Read the Manifesto for the full worldview.


The research → eval bridge

The workflow that differentiates this library: use synthetic research personas to seed your LLM evaluation pipeline.

# Run research at least through the personas stage (Stage 2)
voa research run research-config.yaml

# Convert research personas to eval-ready Persona objects
voa research seed-eval research-sessions/my-research.yaml --output data/personas/

# Run eval with research-grounded personas
voa eval run --all

Research personas have constraint profiles, failure modes, and anti-models of success — exactly the signal you need to write eval rubrics that catch what your real users will complain about.

See docs/BRIDGE-WORKFLOW.md for the full workflow.
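
As a rough illustration of how constraint profiles, failure modes, and anti-models could become rubric items, here is a minimal sketch. The `ResearchPersona` fields, the `rubric_from_persona` helper, and the sample persona are all invented for this example; the library's real Persona schema may differ.

```python
from dataclasses import dataclass


@dataclass
class ResearchPersona:
    # Field names are illustrative, not the library's actual schema.
    name: str
    constraints: list[str]
    failure_modes: list[str]
    anti_models: list[str]


def rubric_from_persona(persona: ResearchPersona) -> list[dict]:
    """Turn each persona attribute into one pass/fail rubric question."""
    rubric = []
    for item in persona.constraints:
        rubric.append({"kind": "constraint", "check": f"Does the product work within: {item}?"})
    for item in persona.failure_modes:
        rubric.append({"kind": "failure_mode", "check": f"Does the product avoid: {item}?"})
    for item in persona.anti_models:
        rubric.append({"kind": "anti_model", "check": f"Does the product steer clear of: {item}?"})
    return rubric


# Made-up persona, for illustration only.
persona = ResearchPersona(
    name="Skeptical senior developer",
    constraints=["no more than 5 minutes of setup"],
    failure_modes=["generated tests that pass but assert nothing"],
    anti_models=["a dashboard nobody opens after week one"],
)

rubric = rubric_from_persona(persona)
for entry in rubric:
    print(f"{entry['kind']}: {entry['check']}")
```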


Cost transparency

Before any API calls:

voa research run research-config.yaml --dry-run
# Estimated cost: $1.20–$2.10 | Estimated time: 8–15 minutes

voa research run research-config.yaml --dry-run --model-haiku
# Estimated cost: $0.06–$0.11 | Estimated time: 5–8 minutes
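
A quick arithmetic check on the two dry-run estimates above shows how they relate. The sketch uses only the dollar figures quoted in this section; the helper name is hypothetical.

```python
# Dry-run estimates quoted above, as (low, high) in US dollars.
default_estimate = (1.20, 2.10)
haiku_estimate = (0.06, 0.11)


def cost_ratio(expensive, cheap):
    """Ratio of the expensive estimate to the cheap one at each end of the range."""
    return tuple(round(e / c, 1) for e, c in zip(expensive, cheap))


ratio_at_low, ratio_at_high = cost_ratio(default_estimate, haiku_estimate)
print(f"Low end: {ratio_at_low}x, high end: {ratio_at_high}x")
```

Both ends of the range come out at roughly 19x to 20x, which is in line with the "~1/20th the cost" note on --model-haiku.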

CLI reference

Research pipeline

voa research demo [--offline]              # preset demo; --offline uses bundled cassette (no API key)
voa research quickstart                    # 3-question plain-English setup
voa research init [slug]                   # create research-config.yaml interactively
voa research validate-config [config]      # pre-flight check, no API calls
voa research run [config] [--dry-run]      # run pipeline (all stages or one)
  --model-haiku                            # use Haiku (~1/20th the cost)
  --stage [all|product-research|personas|workflows|journey]
  --session path/to/session.yaml           # resume a partial run
voa research status session.yaml           # show completion state
voa research export session.yaml           # emit RESEARCH-SUMMARY.md
voa research seed-eval session.yaml        # convert personas to eval Personas
voa research list-sessions                 # list all sessions

Eval pipeline

voa eval init --target http://localhost:3000 --api http://localhost:8420
voa eval run [--all]
voa eval status
voa eval backlog
voa eval capabilities
voa eval diff

The eval browser layer is LLM-driven — a Claude Sonnet 4.6 vision agent navigates the live app, selects elements, and decides when an exploration goal has succeeded or stalled. It is not a Playwright test runner.
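
To make the difference from a scripted test runner concrete, here is a minimal sketch of a goal-driven exploration loop. It is not the library's implementation: `FakeApp` replaces the browser, `choose_action` replaces the vision model, and every name in the block is hypothetical. The loop observes, checks the goal, acts, and reports "stalled" when it runs out of moves or stops making progress.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    url: str
    text: str


class FakeApp:
    """In-memory stand-in for a live browser session."""

    PAGES = {
        "/": {"text": "Welcome. Sign up to start.", "links": {"Sign up": "/signup"}},
        "/signup": {"text": "Create your account.", "links": {"Continue": "/dashboard"}},
        "/dashboard": {"text": "Your first project is ready.", "links": {}},
    }

    def __init__(self):
        self.url = "/"

    def observe(self) -> Observation:
        return Observation(self.url, self.PAGES[self.url]["text"])

    def links(self) -> dict:
        return self.PAGES[self.url]["links"]

    def click(self, label: str) -> None:
        self.url = self.PAGES[self.url]["links"][label]


def choose_action(obs: Observation, links: dict):
    # Stand-in for the model's decision. A real agent would reason over a
    # screenshot and the goal; this one just takes the first available link.
    return next(iter(links), None)


def explore(app: FakeApp, goal_text: str, max_steps: int = 10, stall_limit: int = 2):
    """Return (outcome, steps_taken), where outcome is 'succeeded' or 'stalled'."""
    unchanged = 0
    for step in range(max_steps):
        obs = app.observe()
        if goal_text in obs.text:
            return "succeeded", step
        action = choose_action(obs, app.links())
        if action is None:  # nothing left to try on this page
            return "stalled", step
        before = app.url
        app.click(action)
        unchanged = unchanged + 1 if app.url == before else 0
        if unchanged >= stall_limit:  # clicking is no longer changing anything
            return "stalled", step + 1
    return "stalled", max_steps


print(explore(FakeApp(), "first project"))
print(explore(FakeApp(), "billing settings"))
```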

Design phase

voa design persona list|generate-prompt|import|validate
voa design workflow list|generate-prompt|import
voa design analyze gaps|coverage

Bridge (cross-layer)

voa bridge status
voa bridge sync-gaps

Diagnostic

voa doctor [--offline]                     # pre-flight check of Python, API key, Playwright, disk

Programmatic API

# Primary API — one function, plain-English inputs
from voice_of_agents.research import quick_research_sync
result = quick_research_sync(what=..., who=..., understand=...)

# Full pipeline — for complete research sessions
from pathlib import Path
from voice_of_agents.research import run_full_pipeline_sync
from voice_of_agents.research.config import ResearchConfig
config = ResearchConfig.from_file(Path("research-config.yaml"))
session = run_full_pipeline_sync(config)

# Plain-English config — no methodology vocabulary required
import asyncio
config = asyncio.run(ResearchConfig.from_plain_english(
    what="a Slack bot that tracks action items",
    who="engineering managers at startups",
    understand="why teams stop using action item trackers",
))

# Real signal ingestion — augment synthetic research with real data
from voice_of_agents.research.signals import from_transcripts, from_csv
signals = from_transcripts(["interview1.txt", "interview2.txt"])
signals2 = from_csv("nps_responses.csv", text_column="comment")

# Research → eval bridge
from voice_of_agents.research.bridge import session_to_personas
personas = session_to_personas(session)  # list[Persona] ready for eval

# Cost estimation before any API calls
from voice_of_agents.research.cost import estimate_run_cost
estimate = estimate_run_cost(model="claude-opus-4-7", subject_count=12)
print(estimate.display())
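
For readers wondering what pulling a text column out of a CSV involves, the standard-library sketch below shows one way to do it. It does not reproduce `from_csv`; the helper name and the sample rows are made up for illustration.

```python
import csv
import io


def read_text_column(csv_file, text_column: str) -> list[str]:
    """Return the non-empty, stripped values of one column from a CSV file object."""
    reader = csv.DictReader(csv_file)
    if text_column not in (reader.fieldnames or []):
        raise KeyError(f"column {text_column!r} not found")
    values = []
    for row in reader:
        text = (row[text_column] or "").strip()
        if text:  # skip responses that left the comment blank
            values.append(text)
    return values


# Made-up survey rows standing in for a real export.
sample = io.StringIO(
    "score,comment\n"
    "9,Setup took two minutes\n"
    "3,\n"
    "6,  Hard to tell what it changed  \n"
)
comments = read_text_column(sample, "comment")
print(comments)
```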

Data model

Everything is file-based and git-friendly:

  • research-sessions/*.yaml — ResearchSession state (resumable after any stage)
  • research-sessions/SYNTHETIC-DATA-NOTICE.md — honest framing + validation questions
  • research-sessions/DECISION-REPORT.md — what to build, kill, and validate
  • data/personas/P-*.yaml — canonical persona definitions (Pydantic-validated)
  • data/results/{id}-{name}/{timestamp}/ — per-persona, timestamped eval results
  • data/backlog.jsonl — append-only backlog event log
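
An append-only JSONL log like the backlog file can be sketched in a few lines. The event fields (`id`, `status`) and both helper names are assumptions made for this example, not the library's actual schema. The log is only ever appended to, and the current state is derived by replaying it.

```python
import json
import tempfile
from pathlib import Path


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as a single JSON line; existing lines are never rewritten."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")


def current_state(log_path: Path) -> dict:
    """Replay the log from the top; the latest event for each id wins."""
    state = {}
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                event = json.loads(line)
                state[event["id"]] = event["status"]
    return state


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "data" / "backlog.jsonl"
    append_event(log, {"id": "B-1", "status": "open"})
    append_event(log, {"id": "B-2", "status": "open"})
    append_event(log, {"id": "B-1", "status": "done"})  # an update is a new line
    state = current_state(log)
    line_count = len(log.read_text(encoding="utf-8").splitlines())

print(state, line_count)
```

One line per event also keeps git diffs small, which fits the git-friendly goal stated above.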

Design principles

  • Research is a forcing function, not a conclusion. Every finding is a hypothesis until validated with a real user.
  • Include the users who left. The sampling frame mandates a minimum of abandoners, rejecters, and critics.
  • Epistemic honesty is non-negotiable. Every session writes SYNTHETIC-DATA-NOTICE.md. There is no option to suppress it.
  • Personas are explorers, not test scripts. Objectives are fixed; journeys adapt.
  • Append-only persistence. Nothing is ever deleted.
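
The "include the users who left" principle amounts to a quota check on the sampling frame. The sketch below shows the idea; the minimum counts and the helper name are hypothetical, since the library's real thresholds are not stated here.

```python
from collections import Counter

# Hypothetical minimums per dissenting group, chosen only for illustration.
MINIMUMS = {"abandoner": 2, "rejecter": 1, "critic": 1}


def missing_quota(subject_types, minimums=MINIMUMS) -> dict:
    """Return how many more subjects of each required type the frame still needs."""
    counts = Counter(subject_types)
    return {
        kind: need - counts[kind]
        for kind, need in minimums.items()
        if counts[kind] < need
    }


frame = ["adopter", "adopter", "abandoner", "critic", "adopter"]
shortfall = missing_quota(frame)
print(shortfall)
```

An empty result means the frame meets every minimum; a non-empty one says which voices still need to be added before the run.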

Contributing

Read the Manifesto to understand the worldview. Fork it, argue with it, open a PR.

git clone https://github.com/blakeaber/voice-of-agents.git
cd voice-of-agents
pip install -e ".[dev]"
pre-commit install
pytest

See CONTRIBUTING.md for branch conventions and PR guidelines. Report security issues as described in SECURITY.md, not through public GitHub issues.


License

MIT © 2026 Blake Aber. See LICENSE.

