Autonomous AI agent that builds, tests, and deploys web and native iOS applications from ideas

Product Agent v12.4

An autonomous AI agent that builds, tests, and deploys web and native iOS applications from plain English descriptions.

What It Does

product-agent "Build me a todo app with user authentication"

Output:

Product Agent v12.4 — Building: "Build me a todo app with user authentication"

[1/9] Enriching prompt...                    done   12s
[2/9] Analyzing stack... → nextjs-supabase   done    8s
[3/9] Designing architecture...              done   45s
[4/9] Reviewing design... APPROVED           done   15s
[5/9] Building application...                done 3m22s
[6/9] Auditing spec... 12/12 met             done   20s  (parallel)
[7/9] Running tests... 14/14 passed          done   35s  (parallel)
[8/9] Deploying to Vercel...                 done   45s
[9/9] Verifying deployment...                done   10s

BUILD COMPLETE  5m 42s
  URL: https://todo-app-abc123.vercel.app
  Tests: 14/14 passed
  Spec: 12/12 requirements met
  Quality: A (95%)

One prompt in, production app out. No human intervention required.

What's New in v12.4

Enterprise Security Audit (v12.4)

Comprehensive security hardening for production/demo readiness:

Review validation fix — Missing or garbled REVIEW.md no longer silently auto-approves designs. Failed review calls trigger revision instead of bypass.
Sanitization hardened — 11 injection patterns (synced with Shipwright hooks), zero-width unicode stripping, NFC normalization, HTML entity detection.
Recovery patterns — Added Next.js 16 (proxy.ts migration, 'use cache') and Tailwind CSS 4.x error recovery.
Retry backoff — Linear backoff (5s, 10s, 15s...) between build attempts to avoid hammering transient failures.
JSONL rotation — Build history auto-rotates at 500 records with fcntl.flock() file locking for concurrent safety.
Checkpoint cleanup — Old checkpoints cleaned up after successful builds.
Strict artifact verification — New STRICT_ARTIFACT_VERIFICATION config flag aborts builds on tampered checkpoint artifacts (enterprise hardening).
Dependency pinning — All dependencies have upper bounds to prevent supply chain attacks.
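
The sanitization steps listed above can be sketched as follows. This is a minimal illustration, not the project's `sanitize.py`: the function name `sanitize_idea`, the zero-width character set, and the two regular expressions are assumptions, since the README does not list the actual 11 patterns. The 5,000-character cap is the one stated under Safety Features.

```python
import re
import unicodedata

# Assumed values: only the 5,000-character cap comes from the README.
MAX_LEN = 5000
ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))
INJECTION_PATTERNS = [  # placeholder patterns, not the project's real list
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"&#x?[0-9a-f]+;", re.IGNORECASE),  # numeric HTML entities
]


def sanitize_idea(text: str) -> tuple[str, list[str]]:
    """Return the cleaned text plus the patterns that matched it."""
    cleaned = text.translate(ZERO_WIDTH)             # strip zero-width characters
    cleaned = unicodedata.normalize("NFC", cleaned)  # canonical composition
    cleaned = cleaned[:MAX_LEN]                      # enforce the length cap
    hits = [p.pattern for p in INJECTION_PATTERNS if p.search(cleaned)]
    return cleaned, hits
```

Stripping and normalizing before matching means a pattern cannot be hidden by splitting it with invisible characters or by using decomposed Unicode forms.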

Previous Versions

v12.3 — PyPI publish workflow, public release prep
v11.1 — AI app domain, CI/CD generation, Vercel Analytics observability
v11.0 — 3 new stacks: Django+HTMX, SvelteKit, Astro (8 stacks total)
v10.3 — Template modernization: Next.js 16 patterns, async APIs, Cache Components
v10.2 — Enhancement mode fully wired into pipeline
v10.0 — Post-mortem fixes: dependency audit, data wiring, RLS circular deps, CRITICAL override
v9.1 — Crash recovery: --resume, atomic checkpoints, artifact verification
v9.0 — Reliability overhaul: quality scoring, YAML contracts, SDK logging, timeouts
v8.0 — Phase-by-phase orchestration, build memory, quality scoring, public API
v7.0 — Swift/SwiftUI stack, plugin build mode, NCBSPlugin protocol, XCTest
v6.0 — Spec audit, prompt enrichment, content site domain
v5.0 — Deployment validation, verification, checkpoints, automated testing

Quick Start

1. Install

cd product-agent
pip install -e .

2. Authentication

Product Agent runs through Claude Code using your existing subscription (Pro or Max). No API key is needed — just make sure you're logged into Claude Code:

claude login

Optional environment variables for integrations:

export GITHUB_TOKEN=ghp_...          # Optional - GitHub MCP
export SUPABASE_ACCESS_TOKEN=...     # Optional - Supabase MCP
export VERCEL_TOKEN=...              # Optional - Vercel MCP

Note: Product Agent automatically uses your Claude Code subscription. It does NOT require an ANTHROPIC_API_KEY. If you have one set in your shell profile, it will be ignored to avoid unexpected API charges.

3. Run

product-agent "Build me a simple blog"

Available Stacks

The agent automatically selects the best stack for your product:

Stack	Best For	Database	Deploys To
nextjs-supabase (default)	SaaS, internal tools, dashboards, content sites	PostgreSQL (Supabase)	Vercel
nextjs-prisma	Marketplaces, multi-tenant, complex data	PostgreSQL	Vercel
rails	Rapid prototyping, admin-heavy apps	PostgreSQL	Railway
django-htmx	Admin panels, data apps, Python backends	PostgreSQL	Railway
sveltekit	Fast SaaS, dashboards, interactive apps	Any	Vercel
astro	Blogs, docs, landing pages, portfolios	None (static)	Vercel
expo-supabase	Mobile apps, consumer apps	PostgreSQL (Supabase)	App Stores
swift-swiftui	Native iOS apps, plugin modules	Local storage	TestFlight / SPM

Force a specific stack:

product-agent --stack django-htmx "Build a data management admin panel"
product-agent --stack astro "Build a documentation site"
product-agent --stack sveltekit "Build a fast dashboard app"

How It Works

Architecture

User → build_product() →
  ┌─────────────────────────────────────────────────────────────┐
  │                   PYTHON ORCHESTRATOR                        │
  │                  (agent/orchestrator.py)                     │
  │                                                             │
  │  run_phase(ENRICH)   → validate → checkpoint → progress     │
  │  run_phase(ANALYZE)  → validate → checkpoint → progress     │
  │  run_phase(DESIGN)  ←→ run_phase(REVIEW) loop              │
  │  run_phase(BUILD)    → validate → retry with error context  │
  │  run_phase(AUDIT)  ┐                                        │
  │  run_phase(TEST)   ┘→ parallel → validate → progress       │
  │  run_phase(DEPLOY)   → validate → checkpoint → progress     │
  │  run_phase(VERIFY)   → validate → checkpoint → progress     │
  └─────────────────────────────────────────────────────────────┘
→ BuildResult with URL, quality score, metrics

Enhancement mode replaces Analyze + Design/Review with a single ENHANCE phase that modifies an existing design.

Pipeline Phases

#	Phase	What It Does	Validates
1	Enrich (optional)	Researches domain, expands idea into detailed spec	PROMPT.md exists, 100+ chars
2	Analyze	Selects optimal tech stack	STACK_DECISION.md with valid stack ID
3	Design	Creates data model, pages, components	DESIGN.md with required sections
4	Review	Validates design (loop, max 3 revisions)	REVIEW.md with APPROVED/NEEDS_REVISION
—	Enhance (enhancement mode)	Modifies existing design with new features	DESIGN.md updated with (NEW) markers
5	Build	Implements full application (max 5 attempts)	Source files exist, entry point present
6	Audit	Verifies build matches requirements	SPEC_AUDIT.md with pass/fail counts
7	Test	Generates and runs tests	TEST_RESULTS.md with pass/fail counts
8	Deploy	Deploys to production + sets up CI/CD	Deployment URL extracted
9	Verify	Tests deployed app	VERIFICATION.md with status

Audit and Test run in parallel. Design loops with Review until approved.
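
The control flow described above (a Design/Review loop followed by concurrent Audit and Test) can be sketched with `asyncio`. Everything here is hypothetical: `run_phase` is a stand-in that fakes one rejected review, and the real orchestrator's signatures are not shown in this README. The sketch reads "max 3 revisions" as one initial design plus up to three revisions.

```python
import asyncio

MAX_REVISIONS = 3  # from the Review row of the phase table


async def run_phase(name: str, context: dict) -> dict:
    """Hypothetical stand-in for a phase call; the real one invokes a subagent."""
    await asyncio.sleep(0)  # yield control, as a real I/O-bound call would
    if name == "REVIEW":
        # Pretend the first review asks for a revision and the second approves.
        verdict = "APPROVED" if context.get("revision", 0) >= 1 else "NEEDS_REVISION"
        return {"phase": name, "verdict": verdict}
    return {"phase": name, "ok": True}


async def design_until_approved(context: dict) -> int:
    """Loop Design -> Review; return the number of revisions used."""
    for revision in range(MAX_REVISIONS + 1):
        context["revision"] = revision
        await run_phase("DESIGN", context)
        review = await run_phase("REVIEW", context)
        # Only an explicit APPROVED passes; anything else triggers a revision.
        if review.get("verdict") == "APPROVED":
            return revision
    raise RuntimeError("design not approved within the revision limit")


async def audit_and_test(context: dict) -> list[dict]:
    """Run Audit and Test concurrently and wait for both results."""
    return await asyncio.gather(
        run_phase("AUDIT", context),
        run_phase("TEST", context),
    )
```

Treating any verdict other than an explicit `APPROVED` as a rejection mirrors the v12.4 fix, where a missing or garbled review no longer approves a design by default.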

Subagents

Agent	Purpose	Max Turns
enricher	Researches domain, expands ideas into specs	20
analyzer	Selects stack, validates compatibility	15
designer	Creates DESIGN.md with architecture	25
reviewer	Validates design completeness	15
enhancer	Adds features to existing designs	40
builder	Implements app with cross-referencing	80
auditor	Audits build against original requirements	20
tester	Generates and runs tests	30
deployer	Deploys with pre-validation + observability setup	25
verifier	Tests deployed app	15

Usage Examples

Basic (Fully Autonomous)

product-agent "Build a task management app"

AI-Powered Apps

product-agent "Build a chatbot for customer support with conversation history"
product-agent "Build an AI-powered writing assistant"

With Prompt Enrichment

product-agent --enrich "Build a dental charity nonprofit website"

Research a reference site:

product-agent --enrich-url "https://example-nonprofit.org" \
  "Rebuild this nonprofit website"

Enhancement Mode

product-agent --design-file ./project/DESIGN.md \
  --enhance-features "board-views,dashboards" \
  "Enhance project management app"

Programmatic API

import asyncio

from agent.api import build, BuildConfig


async def main() -> None:
    # build() is awaited, so the call has to live inside an async function
    result = await build(
        idea="Create a marketplace for vintage guitars",
        config=BuildConfig(
            stack="nextjs-prisma",
            enrich=True,
            require_passing_tests=True,
        ),
    )

    print(result.url)          # https://vintage-guitars.vercel.app
    print(result.quality)      # A- (92%)
    print(result.test_count)   # 14/14
    print(result.duration_s)   # 342.5


asyncio.run(main())

Build Modes

# Standard web app
product-agent "Build a project management tool"

# Swift plugin module
product-agent --stack swift-swiftui --mode plugin \
  "Photo gallery plugin with compressed local albums"

# iOS host app
product-agent --stack swift-swiftui --mode host \
  "SaaS dashboard with team management and analytics"

Other Options

# Custom project directory
product-agent --project-dir ./my-app "Build a todo app"

# Force a specific stack
product-agent --stack rails "Build an admin dashboard"

# With checkpoints (human approval between phases)
product-agent --checkpoints "Build an e-commerce store"

# Resume from checkpoint
product-agent --resume "Build an e-commerce store"

# Legacy mode (v7.0 single-subprocess architecture)
product-agent --legacy "Build a simple todo app"

# Verbose output
product-agent --verbose "Build a blog"

Build Memory

Every build is logged to .agent_history/builds.jsonl:

{
  "id": "20260212_143022",
  "idea": "Team todo app with real-time sync",
  "stack": "nextjs-supabase",
  "outcome": "success",
  "total_duration_s": 342,
  "test_count": 14,
  "tests_passed": 14,
  "quality_grade": "B+",
  "failure_reasons": [],
  "lessons": ["Always verify Supabase RLS policies with auth.uid()"]
}
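
A record like this could be appended safely from several processes using the `fcntl.flock()` locking and 500-record rotation mentioned in the v12.4 notes. The sketch below is an assumption about how that might work, not the code in `history.py`: the name `append_record` is hypothetical, and it keeps only the newest records, whereas the project may archive older ones instead. `fcntl` is available on Unix-like systems only.

```python
import fcntl
import json
from pathlib import Path

MAX_RECORDS = 500  # rotation threshold stated in the release notes


def append_record(path: Path, record: dict, max_records: int = MAX_RECORDS) -> None:
    """Append one JSON line under an exclusive lock, keeping the newest records."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # "a+" creates the file if needed and allows both reading and writing.
    with open(path, "a+", encoding="utf-8") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # block until no other process holds the lock
        try:
            fh.seek(0)
            lines = [line for line in fh.read().splitlines() if line.strip()]
            lines.append(json.dumps(record))
            lines = lines[-max_records:]  # drop the oldest entries beyond the cap
            fh.seek(0)
            fh.truncate()
            fh.write("\n".join(lines) + "\n")
            fh.flush()
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
```

Holding the lock across the read, trim, and write keeps two concurrent builds from interleaving partial lines or losing each other's records.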

Before starting a new build, the agent searches for similar past builds using Jaccard similarity and injects patterns from successful builds into the pipeline context. Failure reasons and lessons are recorded per build and injected into builder prompts to prevent repeating mistakes.
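
Jaccard similarity is the size of the intersection of two sets divided by the size of their union. A minimal sketch of ranking past builds this way is shown below. The word-level tokenizer and the function names are assumptions, because the README does not say how ideas are tokenized or how many matches are used.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; a deliberately simple, assumed tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Intersection size over union size of the token sets (0.0 if both are empty)."""
    ta, tb = tokenize(a), tokenize(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def most_similar(idea: str, history: list[dict], top_k: int = 3) -> list[dict]:
    """Rank past build records by similarity of their "idea" field."""
    ranked = sorted(history, key=lambda r: jaccard(idea, r["idea"]), reverse=True)
    return ranked[:top_k]
```

For example, "team todo app" and "todo app with auth" share two of five distinct words, giving a similarity of 0.4.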

Quality Scoring

After all phases complete, a 5-factor quality score is computed:

Factor	Weight	What It Measures
Functional Verification	35 pts	Did deployed endpoints return expected results?
Test Pass Rate	25 pts	Were tests generated and did they pass?
Spec Coverage	20 pts	How many requirements were met in audit?
Build Efficiency	10 pts	How many build attempts were needed?
Design Quality	10 pts	How many design revisions were needed?

Hard caps: deployment_verified=False caps grade at C. tests_generated=False caps grade at B-. spec_audit_critical_count > 0 caps grade at B.

Grades: A (95+), A- (90+), B+ (85+), B (80+), B- (70+), C (60+), F (<60)
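
The weights, grade bands, and hard caps above combine roughly as in this sketch. The function names and the idea of passing each factor as a fraction between 0.0 and 1.0 are assumptions; the README gives the weights and thresholds but not how each factor is measured.

```python
WEIGHTS = {
    "functional_verification": 35,
    "test_pass_rate": 25,
    "spec_coverage": 20,
    "build_efficiency": 10,
    "design_quality": 10,
}
GRADE_BANDS = [(95, "A"), (90, "A-"), (85, "B+"), (80, "B"), (70, "B-"), (60, "C")]
GRADE_ORDER = ["F", "C", "B-", "B", "B+", "A-", "A"]  # worst to best


def total_score(fractions: dict[str, float]) -> float:
    """Weighted sum; each fraction is the assumed share of a factor's points earned."""
    return sum(WEIGHTS[name] * fractions.get(name, 0.0) for name in WEIGHTS)


def grade_for(score: float) -> str:
    """Map a 0-100 score to a letter grade using the published bands."""
    for threshold, grade in GRADE_BANDS:
        if score >= threshold:
            return grade
    return "F"


def apply_caps(
    grade: str,
    deployment_verified: bool,
    tests_generated: bool,
    spec_audit_critical_count: int,
) -> str:
    """Lower the grade to the strictest cap that applies; never raise it."""
    caps = []
    if not deployment_verified:
        caps.append("C")
    if not tests_generated:
        caps.append("B-")
    if spec_audit_critical_count > 0:
        caps.append("B")
    for cap in caps:
        if GRADE_ORDER.index(grade) > GRADE_ORDER.index(cap):
            grade = cap
    return grade
```

With these rules, a build that scores 96 points but was never verified after deployment is reported as C rather than A.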

Development

File Structure

agent/
├── main.py                 # CLI entry point, v8/legacy routing
├── orchestrator.py         # BuildConfig, BuildResult, build_product()
├── api.py                  # Clean public API
├── cli_runner.py           # SDK-based run_phase_call() with timeouts
├── validators.py           # Code-level output validation + YAML front-matter
├── progress.py             # Real-time progress streaming
├── history.py              # Build memory with failure learning
├── quality.py              # Outcome-based quality scoring
├── sanitize.py             # Input sanitization
├── config.py               # Environment configuration
├── state.py                # Phase and state management
├── checkpoints.py          # Checkpoint system with cleanup/archive
├── recovery.py             # Error recovery
├── test_validation.py      # Test result parsing
├── phases/                 # Phase modules with SDK logging
│   ├── __init__.py         # Phase registry, run_phase() dispatcher
│   ├── enrich.py           # Enricher phase
│   ├── analyze.py          # Stack analysis phase
│   ├── design.py           # Design phase
│   ├── review.py           # Design review phase
│   ├── enhance.py          # Enhancement phase (v10.2)
│   ├── build.py            # Build phase
│   ├── audit.py            # Spec audit phase
│   ├── test.py             # Test phase
│   ├── deploy.py           # Deploy phase
│   └── verify.py           # Verify phase
├── agents/
│   └── definitions.py      # 10 subagent prompts + per-stack template injection
├── stacks/
│   ├── criteria.py         # 8 stack definitions and scoring
│   ├── selector.py         # Stack selection logic
│   └── templates/          # Stack-specific templates
│       ├── nextjs-supabase/
│       ├── nextjs-prisma/
│       ├── rails/
│       ├── django-htmx/    # v11.0
│       ├── sveltekit/      # v11.0
│       ├── astro/          # v11.0
│       ├── expo-supabase/
│       └── swift-swiftui/
├── domains/
│   ├── __init__.py         # Domain registry
│   ├── marketplace/
│   ├── saas/
│   ├── internal_tool/
│   ├── content_site/
│   ├── plugin_host/
│   ├── plugin_module/
│   └── ai_app/             # v11.1
├── hooks/
│   ├── safety.py           # Safety hooks
│   └── progress.py         # Progress reporting
└── mcp/
    └── servers.py          # MCP configurations

Running Tests

pip install -e ".[dev]"
python3 -m pytest tests/ -v

1,544+ unit tests across 35 test files (plus stress tests requiring live SDK calls):

Test File	Tests	Coverage
`test_orchestrator_v8.py`	114	Full v8 pipeline, retry, quality gate, parallel phases
`test_safety.py`	183	Blocked commands, shell-aware splitting, path protection
`test_phases.py`	142	Phase registry, PhaseConfig, run_phase, SDK logging
`test_swift_modes.py`	100	Swift state, criteria, prompts, domains
`test_validators.py`	93+	All phase validators, YAML front-matter, extraction
`test_new_stacks.py`	80	Django, SvelteKit, Astro definitions, selection, templates
`test_quality.py`	77+	Outcome-based scoring, grade caps, report formatting
`test_history.py`	78	BuildRecord, failure learning, similarity search
`test_agent_prompts.py`	74	Registry, tools, all 10 agents, template injection
`test_progress.py`	55	PhaseResult, ProgressReporter, formatting
`test_checkpoints.py`	44	Save/load/resume, cleanup, archive, phase cap
`test_stack_selection.py`	44	Keyword analysis, scoring, selection
`test_enhancement.py`	35	Enhancement phase, validator, orchestrator, serialization
`test_config.py`	33	Env var loading, feature flags, defaults
`test_orchestration.py`	34	Legacy orchestration, build modes, prompt content
`test_sanitize.py`	25	Input sanitization, injection markers, edge cases
`test_pipeline_features.py`	21	AI domain, CI/CD, observability
+ 5 more	~200+	Recovery, validation, state v5/v6, CLI runner

Domain Patterns

Domain	Product Types	Key Patterns
marketplace	Marketplaces, two-sided platforms	Buyer/seller flows, listings, transactions
saas	SaaS, multi-tenant apps	Organizations, subscriptions, billing
internal_tool	Admin panels, dashboards	Data tables, CRUD, reporting
content_site	Nonprofits, portfolios, blogs	Static-first data, hero sections, FAQ accordion
ai_app	Chatbots, AI assistants, AI tools	AI SDK v6, streamText, useChat, chat history
plugin_host	iOS plugin host apps	Plugin registry, shared services, dynamic TabView
plugin_module	Swift Package plugins	NCBSPlugin protocol, MVVM, compressed storage

CLI Arguments

product-agent [OPTIONS] IDEA

Arguments:
  IDEA                      The product idea to build

Options:
  --project-dir DIR         Project directory (default: ./projects/new-product)
  --stack STACK             Force stack: nextjs-supabase, nextjs-prisma, rails, django-htmx, sveltekit, astro, expo-supabase, swift-swiftui
  --mode MODE               Build mode: standard, host, plugin
  --checkpoints             Enable checkpoints for human approval
  --resume                  Resume from most recent checkpoint
  --resume-from ID          Resume from specific checkpoint ID
  --list-checkpoints        List available checkpoints and exit
  --legacy                  Use legacy v7.0 mode (single subprocess)
  --design-file PATH        Existing DESIGN.md for enhancement mode
  --enhance-features LIST   Comma-separated: board-views,dashboards,automations
  --enrich                  Enable prompt enrichment phase
  --enrich-url URL          Reference URL for enrichment research (implies --enrich)
  --verbose                 Show detailed progress

Safety Features

The agent includes enterprise-grade safety hooks:

Block dangerous commands (rm -rf /, fork bombs, disk writes, sudo, eval, piped downloads) with shell-aware splitting that respects quoted strings
Protect system directories (/etc, /usr, /System) and credential files (.ssh, .aws, .pem, .key, .p12, .pfx)
Auto-approve safe operations (npm, git, file writes in project)
Validate deployment compatibility (SQLite + Vercel = blocked)
Sanitize user input with 11 injection patterns, zero-width unicode stripping, NFC normalization, HTML entity detection, and 5,000-char length cap
Limit total tool turns per build (300) and per phase to prevent infinite loops
Rotate build history logs at 500 records with file locking for concurrent safety
Verify checkpoint artifact integrity with SHA-256 hashes (optional strict mode aborts on mismatch)
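
The last item, checkpoint artifact verification, can be illustrated with the standard `hashlib` module. This is a sketch under assumptions: the manifest shape (file name mapped to hex digest) and the names `sha256_of` and `verify_artifacts` are hypothetical, and the `strict` argument only imitates what the `STRICT_ARTIFACT_VERIFICATION` flag is described as doing.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large artifacts are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifacts(expected: dict[str, str], root: Path, strict: bool = False) -> list[str]:
    """Return names of missing or modified artifacts; raise instead when strict."""
    bad = []
    for name, want in expected.items():
        path = root / name
        if not path.is_file() or sha256_of(path) != want:
            bad.append(name)
    if bad and strict:
        raise RuntimeError(f"artifact verification failed: {bad}")
    return bad
```

In non-strict mode the caller can log the mismatches and continue, while strict mode stops the build as soon as any artifact differs from its recorded hash.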

Version History

Version	Description
v12.4	Enterprise audit: security hardening, review validation fix, encoded attack detection, retry backoff, JSONL rotation, dependency pinning
v12.3	PyPI publish workflow, public release prep
v11.1	AI app domain, CI/CD generation, Vercel Analytics observability
v11.0	3 new stacks: Django+HTMX, SvelteKit, Astro. 8 stacks total
v10.3	Template modernization: Next.js 16 patterns, async APIs, Cache Components
v10.2	Enhancement mode fully wired into pipeline
v10.0	Post-mortem fixes: dependency audit, data wiring, RLS circular deps, CRITICAL override
v9.1	Crash recovery: `--resume`, atomic checkpoints, artifact verification
v9.0	Reliability overhaul: quality scoring, YAML contracts, SDK logging, timeouts
v8.0	Phase-by-phase orchestration, build memory, quality scoring, public API
v7.0	Swift/SwiftUI stack, plugin build mode, NCBSPlugin protocol, XCTest
v6.0	Spec audit, prompt enrichment, content site domain
v5.0	Deployment validation, verification, checkpoints, automated testing

Requirements

Python 3.10+
Claude Code CLI (npm install -g @anthropic-ai/claude-code)
Claude Pro or Max subscription
Node.js 20+ (for generated web apps)
Swift 5.9+ / Xcode 15+ (for Swift/SwiftUI builds)
Ruby 3.1+ / Rails 7+ (for Rails builds)
Python 3.10+ (for Django builds)

Release history

12.4.1

Apr 6, 2026

12.4.0

Apr 6, 2026

Download files

Source Distribution

product_agent-12.4.1.tar.gz (182.1 kB)

Uploaded Apr 6, 2026 Source

Built Distribution

product_agent-12.4.1-py3-none-any.whl (210.5 kB)

Uploaded Apr 6, 2026 Python 3

File details

Details for the file product_agent-12.4.1.tar.gz.

File metadata

Download URL: product_agent-12.4.1.tar.gz
Upload date: Apr 6, 2026
Size: 182.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for product_agent-12.4.1.tar.gz
Algorithm	Hash digest
SHA256	`03bd5c456379826bda86f577f96f5d6c1a649c79fcfeee900ed4e4001ed194d7`
MD5	`fbf371a72f6a2273355cb2baf73102a8`
BLAKE2b-256	`644b0aa2469c1ebb8c7a2c8702717ab46d8630ec47c5e8dfb41108bec4735e13`

File details

Details for the file product_agent-12.4.1-py3-none-any.whl.

File metadata

Download URL: product_agent-12.4.1-py3-none-any.whl
Upload date: Apr 6, 2026
Size: 210.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for product_agent-12.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`edde1a19733c40b882ec4d3277c424f9787435d8d162cc324da90ae62c2a9ef8`
MD5	`97d3efe7cfa8f3c92969a248f6184f25`
BLAKE2b-256	`9bffe3de1419362bc0b18e5d438a3a6b3763920c381b5c8b482faa95dbb4e193`