Standalone dialog state machine framework — text in, text out.

These details have not been verified by PyPI

Project links

Project description

SuperDialog

Standalone dialog framework. Pure text in, pure text out.

SuperDialog engine contract: host platforms (LiveKit, PipeCat, FastAPI, WebSocket, CLI) connect through superdialog.adapters and SessionWorker to one Agent protocol, behind which sit two interchangeable engines — PlaybookAgent (default) and DialogMachine (legacy).

SuperDialog is the brain layer for conversational systems. It ships two engines behind one text interface: Playbook, the default - a checkpoint-based compound runtime (a fast streaming Talker plus an async Director) for fluid, outcome-driven conversations - and DialogMachine, the supported legacy engine, a graph-railed dialog state machine that executes flow graphs deterministically. Both manage turn-by-turn logic, tool calls, transitions, and conversation memory; both speak the same Agent protocol, so every host adapter runs either one unchanged.

User text → agent.turn() → reply text — so every dialog is a plain, unit-testable function: no audio fixtures, no phone number, no API keys to test a conversation.

Audio, STT, TTS, telephony, and media servers are out of scope - those belong to voice infrastructure like LiveKit, PipeCat, or the Unpod Voice Platform. SuperDialog ends at text in, text out.

SuperDialog is to conversation flow what n8n is to integration workflow - a small, composable, eval-able runtime for orchestrating turn-by-turn logic. Where LangChain and LangGraph expose general agent primitives, SuperDialog focuses narrowly on the conversational runtime: who speaks next, what to extract, when to call a tool, when to escalate.

Why standalone

The brain has natural reuse beyond voice. A dialog brain that runs a customer-onboarding conversation works the same whether the user is on a phone, a WhatsApp thread, an Intercom widget, or a CLI test harness. Coupling it to telephony forecloses every non-voice use case.

The dependency direction matters. Voice infrastructure should depend on SuperDialog (as one brain option), not the other way around - keeping the framework portable and the platform composable.

Because the interface is text-only, every dialog is a unit-testable function. No audio fixtures, no API keys, no phone number to test a conversation.

Install

pip install superdialog

Install only the extras you need:

pip install superdialog[livekit]    # LiveKit adapter
pip install superdialog[pipecat]    # PipeCat adapter
pip install superdialog[fastapi]    # FastAPI adapter + uvicorn
pip install superdialog[ws]         # WebSocket runner
pip install superdialog[mcp]        # MCP tool support
pip install superdialog[langchain]  # LangChainAgent

Quickstart A - Playbook engine (default)

A Playbook declares a conversation as journeys of checkpoints - goal, typed slots, guidance prose, advance rules - plus a process layer of tools, pipelines, handlers, and policies. Checkpoints gate outcomes, not utterances: a fast Talker LLM streams every spoken turn while the async Director extracts slots, judges advancement, and runs tools, both over an append-only event log that doubles as the audit/replay artifact.

Start with the simple format - prose steps, a structured persona, and reference data as plain YAML. It is what superdialog generate produces, and every loader and command accepts it directly:

# playbook.yaml (simple format)
goal: "Book a haircut and confirm it."
persona:
  name: Mira
  language: ["en", "hi"]
  identity: "You are Mira, a booking assistant for Glow Studio."
  voice_style: "Warm and brief. One question at a time."
playbook:
  - id: greet
    purpose: "Open the call."
    say: "Greet the caller and ask how you can help."
    done_when: "Caller is ready to book."
  - id: collect
    purpose: "Get the booking details."
    say: "Ask for their name and preferred service."
    collect: [name, service]
    done_when: "Name and service are captured."
  - id: confirm
    purpose: "Confirm and close."
    say: "Read back the booking and confirm."
    done_when: "Caller has confirmed."
facts:
  canonical_pricing: {haircut: "₹400"}
boundaries: ["NEVER invent prices."]
interrupts:
  - {when: "Caller says goodbye.", to: main.confirm}

When you need precision the simple format can't express - tools, pipelines, hard gates, typed slots, multiple outcomes - graduate to the full format (same engine, same loader; see docs/04-playbook-guide.md Part 1):

# booking.yaml (full format)
persona: "You are a booking assistant."
env: {API_BASE_URL: "https://api.example.com"}
journeys:
  booking:
    checkpoints:
      - id: collect
        goal: "Have city and date"
        slots:
          city: {type: str, required: true}
          date: {type: date, required: true}
        guidance: "Collect naturally."
        advance_when:
          - {when: "details complete", judge: llm, to: booking.confirm,
             requires: [city, date]}
      - id: confirm
        gate: hard
        say_verbatim: "Your booking is held."
        pipeline: confirm_and_hold
        advance_when:
          - {when: "pipeline.ok", judge: expr, to: booking.close}
      - id: close
        terminal: true
        outcome: confirmed
tools:
  - id: hold_slot
    method: POST
    url: "{{ env.API_BASE_URL }}/slots/hold"
    store_response_as: hold_result
pipelines:
  - id: confirm_and_hold
    steps:
      - tool: hold_slot
        on: {ok: continue, failed: {retry: 1, on_exhaust: booking.collect}}

DialogMachine is the recommended entry point: one class, one model URI. It runs the Playbook engine by default; pass engine="flow" for the legacy graph runtime.

import asyncio
from superdialog import DialogMachine

agent = DialogMachine("booking.yaml", llm="openai/gpt-4.1-mini")

async def chat():
    reply = await agent.turn("Hi, I'd like to book something.")
    print(reply.text)

asyncio.run(chat())

The single llm= is the Talker, and the Director too unless you split them with director_llm= (the cheap-Talker / strong-Director latency split). Pass any Tool via tools= - each runs its own execute(), both engines.

Streaming is real, not cosmetic: await agent.turn(text, stream=True) yields StreamChunks as the Talker produces tokens, and barge-in (aborting the stream) interrupts speech without losing the Director's decision. The event log replays offline - superdialog.playbook.replay re-runs the Director over a recorded session and diffs every decision.

Advanced: explicit Talker/Director. Drop to PlaybookAgent when you need to supply scripted LLMs, a custom HTTP executor, or inspect the runtime directly. Adapting providers is two lambdas - or use the bundled superdialog.playbook.provider_adapters(provider), which returns the (director, talker) pair for any LLMProvider:

from superdialog.playbook import Playbook, PlaybookAgent, httpx_http

agent = PlaybookAgent(
    playbook=Playbook.load("booking.yaml"),
    talker_llm=talker,      # StreamsLLM: stream(messages) -> AsyncIterator[str]
    director_llm=director,  # CompletesLLM: async complete(messages) -> str
    http=httpx_http,        # HTTP executor for declared tools (Jinja-sandboxed templates)
)

Or skip the Python entirely - generate a playbook from a description and chat against it. The Playbook engine is the default for every format (full, simple, and flow JSON, which is compiled automatically):

superdialog generate "Book a demo call; capture day and time."   # -> playbook.yaml
superdialog chat                              # picks up ./playbook.yaml
superdialog chat --playbook booking.yaml      # explicit (any format)
superdialog chat --flow appointment.json      # flow JSON, compiled onto the engine
superdialog chat --flow appointment.json --mode flow   # legacy DialogMachine
superdialog optimize --playbook playbook.yaml  # reflective prose optimizer

Quickstart B - the legacy graph engine

The original engine: a flow graph executed as a deterministic state machine. Still driven through DialogMachine - pass engine="flow" (or --mode flow on the CLI) to select it. New agents should start with Quickstart A.

import asyncio
from superdialog import create_dialog_flow, DialogMachine, Flow

# 1. Bootstrap a flow from a prompt (one-shot LLM call at construction).
#    The build LLM is used ONLY here - never at runtime.
async def build():
    flow = await create_dialog_flow(
        prompt="Confirm appointment. Ask if Friday 4pm works; offer 5pm if not.",
        llm="openai/gpt-5.1",
    )
    flow.save("appointment.json")        # JSON, version-controllable

asyncio.run(build())

# 2. Build the runtime machine (runtime model can differ from the build model).
#    engine="flow" selects the legacy graph runtime; the default is Playbook.
dialog_machine = DialogMachine(
    Flow.load("appointment.json"),
    llm="anthropic/claude-haiku-4-5",
    engine="flow",
)

# 3. Run a conversation.
async def chat():
    reply = await dialog_machine.turn("Hi, I'm calling about my appointment.")
    print(reply.text)

asyncio.run(chat())

Or skip the Python and use the bundled CLI - legacy mode is an explicit opt-in:

superdialog chat --flow appointment.json --mode flow

Tools plug in as PythonTool / HttpTool / MCPTool; models are picked per machine with litellm-style URIs (openai/gpt-5.1, vllm/<model>@<host>, custom/<name>/<model>, …) and swapped at runtime with set_llm(uri). See docs/02-api-reference.md.

Which engine?

You are building	Use
IVR-style scripts: deterministic, graph-railed, every path enumerable	DialogMachine (legacy, opt-in via `--mode flow`)
Fluid conversations where the model owns fluidity and checkpoints gate outcomes	Playbook
Real token streaming with a compound Talker/Director turn model	Playbook
Event-sourced audit log, deterministic replay, persona-driven eval	Playbook
An existing flow JSON in production	Playbook - every loader auto-compiles flow JSON; `--mode flow` keeps legacy DialogMachine behaviour

Flows keep working; playbooks are where new investment goes. Existing flow graphs compile down losslessly:

from superdialog import Flow
from superdialog.playbook import compile_flow, coverage_report

flow = Flow.load("appointment.json")
pb = compile_flow(flow)               # ConversationFlow -> Playbook
report = coverage_report(flow, pb)    # proves every node/edge/action mapped

Deploy anywhere

DialogMachine (and the lower-level PlaybookAgent) implement the same superdialog.agent.Agent protocol (turn / assist / chat_ctx), so the same object drops into every host. The host varies; the SuperDialog code is identical.

Host	Adapter	Approx. LoC
CLI	none - `superdialog chat` (playbooks or flows) or an `input()`/`print()` loop	~5
LiveKit	`superdialog.adapters.livekit.DialogMachineLLM` (`Agent(llm=...)` plugin)	~6
PipeCat	`superdialog.adapters.pipecat.make_processor(agent)`	~2
FastAPI	direct `/turn` route, or a `SessionWorker` for multi-user	~6
Unpod Voice	`superdialog.adapters.websocket.WebSocketRunner`	~6
Slack / Discord / IRC / etc.	none - direct callback	~3

# LiveKit - same agent object, ~6 lines
from livekit.agents import Agent, AgentSession
from superdialog.adapters.livekit import DialogMachineLLM

async def entrypoint(ctx):
    agent = Agent(llm=DialogMachineLLM(playbook_agent))
    await AgentSession().start(agent=agent, room=ctx.room)

When a conversation must outlive the process, hand any agent factory to a SessionWorker - it builds one agent per session and multiplexes N concurrent sessions, serializing same-session requests via a per-session lock. LLMAgent and LangChainAgent are drop-in non-state-machine brains for the same machinery.

What it is not

Not a UI flow designer - that belongs to a downstream tool.
Not a voice framework - audio, STT, TTS are out of scope.
Not multi-modal - text only at the interface (vision/audio via tools).
Not a hosted service - a library. Hosting is offered by the Unpod Voice Platform for those who want it.

Documentation

Doc	Contents
docs/00-overview.md	What it is, why standalone, positioning
docs/01-architecture.md	Engine internals - the Playbook runtime (default) and the legacy DialogMachine; contracts, data shapes
docs/02-api-reference.md	Every class and method
docs/03-embedding-guides.md	Host-by-host integration walkthroughs
docs/04-playbook-guide.md	Part 1: authoring formats (simple + full); Part 2: technical design - compiling flows, replay/eval

Roadmap

Future work - none of this is in the current release:

Voice-event plumbing (silence, barge-in signals) from live adapters into playbook external events
Distributed session stores (Redis / File / SQLite)

License

Apache-2.0. See LICENSE.

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.2.16

Jun 27, 2026

This version

0.2.15

Jun 24, 2026

0.2.14

Jun 23, 2026

0.2.13

Jun 22, 2026

0.2.12

Jun 21, 2026

0.2.11

Jun 20, 2026

0.2.10

Jun 19, 2026

0.2.9

Jun 18, 2026

0.2.8

Jun 15, 2026

0.2.7

Jun 13, 2026

0.2.6

Jun 10, 2026

0.2.5

Jun 10, 2026

0.2.4

Jun 10, 2026

0.2.3

Jun 4, 2026

0.2.2

Jun 4, 2026

0.2.1

Jun 3, 2026

0.2.0a1 pre-release

Jun 2, 2026

0.2.0a0 pre-release

Jun 2, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

superdialog-0.2.15.tar.gz (742.6 kB view details)

Uploaded Jun 24, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

superdialog-0.2.15-py3-none-any.whl (314.5 kB view details)

Uploaded Jun 24, 2026 Python 3

File details

Details for the file superdialog-0.2.15.tar.gz.

File metadata

Download URL: superdialog-0.2.15.tar.gz
Upload date: Jun 24, 2026
Size: 742.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for superdialog-0.2.15.tar.gz
Algorithm	Hash digest
SHA256	`43a1038f349942e1538963b64d40f7b73205e83ca799a206979040fc1058a403`
MD5	`97d63a1cbd0e5af2648b36978c3b6e91`
BLAKE2b-256	`93de0ee042b8aa7fad97d75e6adfe42a9f5588648d6d8c3cf48f2e002aa2f1cc`

See more details on using hashes here.

File details

Details for the file superdialog-0.2.15-py3-none-any.whl.

File metadata

Download URL: superdialog-0.2.15-py3-none-any.whl
Upload date: Jun 24, 2026
Size: 314.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.10.12

File hashes

Hashes for superdialog-0.2.15-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1c8983dabbec96fb558b0cceb7dcd16a72b29ef9a3f07913d066b8969b77e181`
MD5	`f5c5882edd46ad8d5a06bec1ecfc2726`
BLAKE2b-256	`66cf02e234058846ee13c33be5dad4ce0b81d2760cb6b059fcbed8ae2bfa005f`

See more details on using hashes here.

superdialog 0.2.15

Navigation

Verified details

Maintainers

Unverified details

Project links

Meta

Project description

SuperDialog

Why standalone

Install

Quickstart A - Playbook engine (default)

Quickstart B - the legacy graph engine

Which engine?

Deploy anywhere

What it is not

Documentation

Roadmap

License

Project details

Verified details

Maintainers

Unverified details

Project links

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes