Voice/text runner for FlowStore v0 specs — Pipecat-based dispatcher with browser audio bridge.

Project description

flowstore-runner

Voice/text runner for FlowStore v0 specs. Pipecat-based dispatcher with self-contained browser test pages for both modes.

Plan and rationale: RUNNER-PLAN.md. Strategy: ../whatsupp2/STRATEGY.md. Editor integration is wired through the /api/chat/* endpoints below.

Status

Phase 0 ✅ — bare audio loop end-to-end (Pipecat WebRTC + Google STT/LLM/TTS).
Phase 1 ✅ — v0 dispatcher: spec-driven flow interpretation, three-method routing (direct / calculation / llm), interrupts with goto: "RETURN", post-exit assigns + capability dispatch, event emission.
Phase 1.5 ✅ — text I/O adapter: /api/chat/{session,turn,end} endpoints, BYOK or env-fallback auth, context_vars for placeholder substitution + initial variable seeding.
Phase 2 (text path) ✅ — editor canvas highlighting from event stream. FlowNode rings the active flow; edges pulse on exit_path_taken; variables stream into the simulate panel. Wired through HTTP (Phase 1.5's /api/chat/*), not SSE. Voice-mode → editor integration (SSE broker, /api/offer event subscription) deferred until a voice consumer needs it.
Session event log persistence ✅ — set UXFLOWS_EVENT_LOG_DIR and every session writes a {session_id}.jsonl file alongside the live emitter. One file per session, so a session can be replayed from a single file. Voice and text both honor it.
Voice-mode parity with text ✅ — silent-take_exit follow-up (tool-less re-inference when Gemini fires a routing tool with no spoken text) and context_vars on POST /api/offer (mirrors /api/chat/session).
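
To illustrate the per-session event log, here is a minimal sketch of reading one {session_id}.jsonl file back for replay or inspection. It assumes each line holds one JSON object and that events carry a "type" field (session_started, flow_entered, and exit_path_taken are named in this README, but the field name itself is a guess). The helper names read_event_log and count_by_type are hypothetical and not part of the runner.

```python
import json
from pathlib import Path
from typing import Iterator


def read_event_log(log_dir: str, session_id: str) -> Iterator[dict]:
    """Yield events from <log_dir>/<session_id>.jsonl in the order they were written."""
    path = Path(log_dir) / f"{session_id}.jsonl"
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


def count_by_type(events) -> dict:
    """Tally events per type. ASSUMPTION: events expose a "type" key."""
    counts = {}
    for event in events:
        key = event.get("type", "<unknown>")
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Pointing read_event_log at the directory named by UXFLOWS_EVENT_LOG_DIR and a known session id should be enough to step through a session offline.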

Setup

Install uv if not already installed.
Sync dependencies:
```
uv sync
```
Auth — pick one (or both):
- Vertex (default, used by voice mode and text-mode env-fallback): drop your GCP service-account JSON at data/credentials.json. The data/ directory is gitignored. Service account needs Vertex AI User (Gemini), Cloud Speech Client (STT), Cloud Text-to-Speech User (TTS).
- AI Studio (BYOK, text mode only): get a key from https://aistudio.google.com/app/apikey. No setup file — passed per-session via the API.
Copy the env template and fill in your project ID:
```
cp .env.example .env
$EDITOR .env
```

Run

uv run flowstore-runner serve

Then pick the mode you want:

| Mode | URL | What it does |
| --- | --- | --- |
| Voice | http://localhost:8000/ | Real-time voice via WebRTC. Mic in, TTS out. Full dispatcher loop. |
| Text | http://localhost:8000/text.html | Text chat against the dispatcher. No audio, no STT/TTS. Same routing/state/events as voice. |
| Bare audio | http://localhost:8000/audio-test.html | Audio loop with a hardcoded prompt (no spec, no dispatcher). Use to debug voice/STT/VAD when something feels off. |

curl localhost:8000/health returns {"ok": true} once the server has loaded.

Quick sanity test (text, headless)

SPEC=$(cat examples/coffee.json)
curl -s -X POST http://localhost:8000/api/chat/session \
  -H "Content-Type: application/json" \
  -d "{\"spec\": $SPEC}" | python3 -m json.tool

You should get back a session_id, an opening agent turn, and session_started + flow_entered events. Send a turn:

curl -s -X POST http://localhost:8000/api/chat/turn \
  -H "Content-Type: application/json" \
  -d '{"session_id":"<paste-id>","user_text":"i'"'"'d like a latte"}' | python3 -m json.tool
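
The same two calls can be made from Python. The sketch below uses only the standard library and mirrors the curl commands above; build_post, post_json, and run_demo are hypothetical helper names, and the base URL assumes the default localhost:8000 address from the Run section.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # default address from the Run section


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_post(path, payload)) as resp:
        return json.load(resp)


def run_demo(spec_path: str = "examples/coffee.json") -> None:
    """Start a session with the example spec, then send one user turn."""
    with open(spec_path, encoding="utf-8") as fh:
        spec = json.load(fh)
    session = post_json("/api/chat/session", {"spec": spec})
    print(session["agent_text"])
    reply = post_json(
        "/api/chat/turn",
        {"session_id": session["session_id"], "user_text": "i'd like a latte"},
    )
    print(reply["agent_text"])
```

Call run_demo() from the repository root while the server is running. Nothing in the block contacts the server until that call is made.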

API surface

Three endpoints power the text mode (canonical implementations in server/app.py):

POST /api/chat/session — start a session. Body: {spec, api_key?, model?, language?, context_vars?}. Returns {session_id, agent_text, events, ended}. When api_key is omitted, the server falls back to the Vertex credentials from the environment. context_vars seeds the variable bag and fills {KEY} placeholders in the system prompt; matching is case-insensitive, and placeholders with no matching key are left as the literal {KEY}.
POST /api/chat/turn — send a user turn. Body: {session_id, user_text}. Returns {agent_text, events, ended}.
POST /api/chat/end — explicit cleanup. Idle sessions get GC'd after 30 min regardless.
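
The {KEY} substitution rule described for context_vars can be sketched as follows. This is an illustration of the stated behavior (case-insensitive match, unfilled placeholders left untouched), not the code in prompt_builder.py. The function name and the placeholder grammar (identifier-like names inside braces) are assumptions.

```python
import re

# ASSUMPTION: placeholder names look like identifiers, e.g. {NAME} or {order_id}.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def substitute_placeholders(prompt: str, context_vars: dict) -> str:
    """Fill {KEY} placeholders case-insensitively; leave unknown ones as-is."""
    lookup = {key.lower(): str(value) for key, value in context_vars.items()}

    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Fall back to the original text, braces included, when no key matches.
        return lookup.get(name.lower(), match.group(0))

    return _PLACEHOLDER.sub(replace, prompt)
```

For example, substitute_placeholders("Hi {Name}, order {ORDER_ID}", {"name": "Ada"}) fills {Name} and leaves {ORDER_ID} in place.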

Voice uses WebRTC SDP exchange via /api/offer (Pipecat's SmallWebRTCRequestHandler).
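
The 30-minute idle cleanup noted for text sessions under POST /api/chat/end can be pictured as an in-memory registry that tracks when each session was last used. The sketch below is a hypothetical stand-in for text_registry.py: the class name, method names, and injectable clock are assumptions made for illustration.

```python
import time

IDLE_TIMEOUT_S = 30 * 60  # 30 minutes, as stated for idle sessions


class SessionRegistry:
    """In-memory session store with idle garbage collection (illustrative only)."""

    def __init__(self, timeout_s: float = IDLE_TIMEOUT_S, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._sessions = {}  # session_id -> (session, last_used_timestamp)

    def put(self, session_id, session):
        self._sessions[session_id] = (session, self._clock())

    def get(self, session_id):
        """Return the session and refresh its last-used time, or None if absent."""
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        session, _ = entry
        self._sessions[session_id] = (session, self._clock())
        return session

    def end(self, session_id):
        """Explicit cleanup; a no-op if the session is already gone."""
        self._sessions.pop(session_id, None)

    def collect_idle(self):
        """Drop sessions idle for at least the timeout; return their ids."""
        now = self._clock()
        stale = [
            sid
            for sid, (_, last_used) in self._sessions.items()
            if now - last_used >= self._timeout_s
        ]
        for sid in stale:
            del self._sessions[sid]
        return stale
```

Passing the clock in makes the timeout testable without waiting. A real server would call collect_idle periodically from a background task.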

Layout

src/flowstore_runner/
  cli.py                         # `flowstore-runner serve`
  config.py                      # env-driven config
  spec/
    types.py, loader.py          # pydantic v0 spec types + per-spec lookup tables
  dispatcher/
    flow_state.py                # stack-based FlowState (interrupts push/pop) + variable bag
    expressions.py               # calculation engine (simpleeval-based)
    methods.py                   # three-method evaluator (direct / calculation / llm)
    routing.py                   # plan() + resolve() — exit-path eval, interrupt collection
    assigns.py                   # exit-fired variable assignment
    capabilities.py              # HTTP fire-and-forget + retrieval stub
    prompt_builder.py            # per-flow system prompt + per-turn tool schema + {KEY} substitution
    processor.py                 # Pipecat seam — PreLLMPlanner, tool handlers, PostLLMResolver, apply_tool_call
    session.py                   # per-connection Session (FlowState + LLMContext + emitter)
  events/
    schema.py                    # pydantic event types — runner-side contract
    emitter.py                   # LoggingEmitter / QueueEmitter / BufferingEmitter
  server/
    app.py                       # FastAPI: /api/offer (voice), /api/chat/* (text), static /web mount
    pipeline.py                  # Pipecat voice pipeline
    pipeline_raw.py              # Bare audio pipeline (no spec/dispatcher)
    text_session.py              # Text adapter — drives dispatcher per-turn without Pipecat audio
    text_registry.py             # In-memory TextSession registry + idle GC
web/
  index.html, client.js          # voice debug page (vanilla RTCPeerConnection)
  text.html, text.js, text.css   # text debug page (vanilla fetch)
  audio-test.html, audio-test.js # bare audio debug page
  style.css
examples/
  coffee.json                    # self-contained order-bot spec
data/
  credentials.json               # GCP service-account JSON (gitignored)
tests/                           # 95 passing — dispatcher core + text-mode e2e
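
As a rough picture of the stack-based FlowState listed above, the sketch below shows how interrupts could push a flow and how an exit with goto: "RETURN" could pop back to the interrupted one. It is not the contents of flow_state.py: the method names (goto, interrupt, resolve_exit) and the flow ids in the usage note are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FlowState:
    """Illustrative flow stack plus variable bag; the last stack item is active."""

    stack: list = field(default_factory=list)
    variables: dict = field(default_factory=dict)

    @property
    def active(self):
        return self.stack[-1] if self.stack else None

    def goto(self, flow_id: str) -> None:
        """Ordinary transition: replace the active flow (or enter the first one)."""
        if self.stack:
            self.stack[-1] = flow_id
        else:
            self.stack.append(flow_id)

    def interrupt(self, flow_id: str) -> None:
        """Interrupt: push a flow on top, keeping the interrupted one below it."""
        self.stack.append(flow_id)

    def resolve_exit(self, goto: str):
        """Apply an exit path target; "RETURN" pops back to the interrupted flow."""
        if goto == "RETURN":
            if len(self.stack) < 2:
                raise ValueError("RETURN used with no interrupted flow to return to")
            self.stack.pop()
        else:
            self.goto(goto)
        return self.active
```

With this shape, entering "take_order", interrupting with "faq", and then resolving an exit whose goto is "RETURN" leaves "take_order" active again.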

Why three debug pages

Each page isolates a layer:

voice page (/) — full stack, the canonical voice surface.
text page (/text.html) — same dispatcher as voice, no audio. Use when you want to iterate on a spec's logic without burning STT/TTS minutes, when you're on a flaky mic, or when reviewing flow transitions visually beats listening. Real designers use this through the editor's Simulate panel (../flowstore/components/runtime/SimulatePanel.tsx); the standalone page is for runner-side dev work.
bare audio page (/audio-test.html) — no spec, no dispatcher, hardcoded prompt. First-pass triage for "is the audio path or the dispatcher misbehaving?" If the issue reproduces here, it's audio.

All three stay around as debug surfaces forever. They don't disappear when the editor integration lands.

Known rough edges

Walkaway gap — Gemini sometimes responds to a graceful goodbye with text only, no take_exit_path. Session stays "live" until idle GC. Documented in RUNNER-PLAN §"Live-test follow-up". Click Reset (or close the tab) to recover.
Single user, single host — v0 deployment model. Concurrent sessions within one process should work but are untested at scale.

Project details


This version

0.1.0

May 25, 2026

Download files


Source Distribution

flowstore_runner-0.1.0.tar.gz (49.2 kB view details)

Uploaded May 25, 2026 Source

Built Distribution


flowstore_runner-0.1.0-py3-none-any.whl (63.5 kB view details)

Uploaded May 25, 2026 Python 3

File details

Details for the file flowstore_runner-0.1.0.tar.gz.

File metadata

Download URL: flowstore_runner-0.1.0.tar.gz
Upload date: May 25, 2026
Size: 49.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.17

File hashes

Hashes for flowstore_runner-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`09880128820fe3cf59bce4a9ed52a9b655e3fab76ba6636aa8157d978a988fae`
MD5	`5ec407441e8e8090b7061e83ef5afa46`
BLAKE2b-256	`4f658d0c34497c36371c06f9a1c24275d351adf741956a831f71aea842e1c78b`


File details

Details for the file flowstore_runner-0.1.0-py3-none-any.whl.

File metadata

Download URL: flowstore_runner-0.1.0-py3-none-any.whl
Upload date: May 25, 2026
Size: 63.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.8.17

File hashes

Hashes for flowstore_runner-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1f3be03f9524034288dc1a3342539e2bc2f563a5c3e4c4d993a3f62eb4a5a392`
MD5	`ba1cc1d6e14a75ab7edc90d306da0c68`
BLAKE2b-256	`0a70aa58a16f97bd70d442ed0c1f438d3aa82e61e91ae2b11096bd10ef44eec5`

