
Pluggable voice chat runtime


TryVoice

Hands-free voice runtime for AI agents. Talk to your AI coding assistant without touching the keyboard.

TryVoice wraps AI agents (like OpenClaw and Claude Code) into a voice interface with wake word activation, push-to-talk, and real-time streaming — all running in your browser.

Early Preview (v0.1.0-alpha) — actively developed, expect rough edges.

TryVoice Demo — multi-bot voice interaction with cross-device sync

What It Does

  • Wake word activation — say a keyword to start talking, no hands needed (powered by OpenWakeWord)
  • Push-to-talk — hold a button to speak, release to send
  • Real-time streaming — hear the AI respond as it generates, with interruptible playback
  • Multi-bot slots — run multiple independent agent sessions side by side
  • Mobile-ready — PWA support, works on phone browsers
  • Pluggable adapters — connect any AI agent via the Adapter SDK

Prerequisites

TryVoice is a voice layer on top of existing AI agents. You need at least one of:

  • Claude Code — installed on the same machine (claude CLI available in PATH)
  • OpenClaw — running with a gateway endpoint

More agent adapters are coming soon. See Building an Adapter to connect your own agent.

Quick Start

Option A: Install from PyPI (recommended)

pip install tryvoice
tryvoice            # Start the server and open browser
# First launch shows Setup Wizard in browser — configure adapter, TTS, etc.

Option B: Install from source

git clone https://github.com/AaronZ021/tryvoice-oss.git
cd tryvoice-oss
bash scripts/setup.sh   # Creates venv, installs packages, builds frontend
source .venv/bin/activate
tryvoice                 # Start the server and open browser
# First launch shows Setup Wizard in browser

Configure

On first launch, the browser opens a Setup Wizard that walks you through:

  1. API Keys (optional but recommended) — enter a Groq API key for faster speech-to-text (lower latency than local Whisper), and an Azure Speech key for high-quality text-to-speech
  2. Adapter — choose Claude Code or OpenClaw and enter connection details
  3. Wake word — pick a keyword (e.g., "jarvis", "americano") for hands-free voice activation

All settings can be changed later from the in-app settings panel.

Docker

git clone https://github.com/AaronZ021/tryvoice-oss.git
cd tryvoice-oss
docker compose up
# Open https://localhost:7860 — Setup Wizard runs on first launch

Architecture

┌─────────────┐     WebSocket      ┌──────────────────┐
│  Browser UI  │◄──────────────────►│   TryVoice       │
│  (PWA)       │                    │   Runtime         │
│              │                    │                   │
│  Wake Word   │                    │  ┌────────────┐   │
│  STT / TTS   │                    │  │  Adapter    │   │──► Claude Code
│  Audio I/O   │                    │  │  Registry   │   │──► OpenClaw
│              │                    │  │  (plugin)   │   │──► Your adapter
└─────────────┘                    └──┴────────────┴───┘

Voice flow: Wake word / PTT → STT (browser Web Speech API or Groq Whisper) → Adapter → Agent → Streaming text → TTS (Edge TTS) → Audio playback
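The "Streaming text → TTS" step implies that text reaches speech synthesis before the agent has finished its reply. One common way to do this is to buffer streamed tokens and flush them at sentence boundaries. Whether TryVoice does exactly this is an assumption; `sentence_chunks` and `fake_agent_stream` below are hypothetical names used only for illustration.

```python
import asyncio
from typing import AsyncIterator, List

# Punctuation treated as a sentence boundary (ASCII plus common CJK marks).
SENTENCE_END = (".", "!", "?", "。", "!", "?")


async def sentence_chunks(tokens: AsyncIterator[str]) -> AsyncIterator[str]:
    """Group streamed tokens into sentence-sized chunks for a TTS engine."""
    buffer = ""
    async for token in tokens:
        buffer += token
        if buffer.rstrip().endswith(SENTENCE_END):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():
        # Flush trailing text that has no closing punctuation.
        yield buffer.strip()


async def fake_agent_stream() -> AsyncIterator[str]:
    """Stand-in for an agent's streamed reply."""
    for token in ["Build ", "passed. ", "Two ", "tests ", "were ", "skipped"]:
        yield token


async def demo() -> List[str]:
    return [chunk async for chunk in sentence_chunks(fake_agent_stream())]


if __name__ == "__main__":
    print(asyncio.run(demo()))  # ['Build passed.', 'Two tests were skipped']
```

Flushing per sentence keeps the delay before the first audio short while still giving the TTS engine enough context for natural prosody.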

Configuration

Variable                 Default               Description
TRYVOICE_ACTIVE_ADAPTER  echo                  Active adapter (claude-code, openclaw, or custom)
GROQ_API_KEY             (none)                Groq API key for server-side STT (optional, browser fallback)
EDGE_TTS_VOICE           zh-CN-XiaoxiaoNeural  Edge TTS voice (300+ voices available)
PORT                     7860                  Server port

See .env.example for all options, or run tryvoice --setup for an interactive wizard.
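As a rough illustration of how the variables in the table could be read with their documented defaults, here is a minimal sketch. The `Settings` dataclass and `load_settings` helper are hypothetical and not part of TryVoice; only the variable names and default values come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    active_adapter: str
    groq_api_key: Optional[str]
    edge_tts_voice: str
    port: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        active_adapter=env.get("TRYVOICE_ACTIVE_ADAPTER", "echo"),
        # No default: an unset or empty key means browser-side STT is used.
        groq_api_key=env.get("GROQ_API_KEY") or None,
        edge_tts_voice=env.get("EDGE_TTS_VOICE", "zh-CN-XiaoxiaoNeural"),
        port=int(env.get("PORT", "7860")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check in isolation.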

Built-in Adapters

Adapter      Use Case
claude-code  Voice control for Claude Code terminal sessions
openclaw     Voice interface to OpenClaw agent gateway
echo         Testing and demo (echoes your speech back)

Building an Adapter

Connect TryVoice to any AI agent by implementing the Adapter protocol:

from backend.adapter_sdk import AdapterCapabilities, AdapterEvent

class MyAdapter:
    def report_capabilities(self) -> AdapterCapabilities:
        # "..." stands for further capability fields omitted here
        return AdapterCapabilities(supports_stream=True, ...)

    async def stream_user_turn(self, session_key, text, ...):
        # "..." stands for further parameters omitted here
        # Call your agent, then yield AdapterEvent chunks as they arrive
        yield AdapterEvent(kind="token", text="Hello!")
        # Signal that the turn is complete
        yield AdapterEvent(kind="turn_end")
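To try the shape of this protocol without TryVoice installed, the following self-contained sketch uses local stand-in types. The `AdapterCapabilities` and `AdapterEvent` dataclasses here carry only the fields visible in the snippet above (`supports_stream`, `kind`, `text`); the real SDK classes may define more. `UppercaseAdapter` and `collect_reply` are hypothetical names.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List


# Stand-ins for the backend.adapter_sdk types; the real classes may differ.
@dataclass
class AdapterCapabilities:
    supports_stream: bool = False


@dataclass
class AdapterEvent:
    kind: str
    text: str = ""


class UppercaseAdapter:
    """Hypothetical adapter that streams the user's words back in upper case."""

    def report_capabilities(self) -> AdapterCapabilities:
        return AdapterCapabilities(supports_stream=True)

    async def stream_user_turn(
        self, session_key: str, text: str
    ) -> AsyncIterator[AdapterEvent]:
        for word in text.split():
            await asyncio.sleep(0)  # yield control, as a real agent call would
            yield AdapterEvent(kind="token", text=word.upper() + " ")
        yield AdapterEvent(kind="turn_end")


async def collect_reply(adapter: UppercaseAdapter, text: str) -> str:
    """Consume events until turn_end and join the token text."""
    parts: List[str] = []
    async for event in adapter.stream_user_turn("demo-session", text):
        if event.kind == "turn_end":
            break
        if event.kind == "token":
            parts.append(event.text)
    return "".join(parts).strip()


if __name__ == "__main__":
    print(asyncio.run(collect_reply(UppercaseAdapter(), "hello there")))  # HELLO THERE
```

Because `stream_user_turn` is an async generator, the caller can start acting on the first token while the agent is still producing the rest of the reply.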

Register via entry point in pyproject.toml:

[project.entry-points."tryvoice.adapters"]
my-agent = "my_package.adapter:MyAdapter"
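For context, entry points declared under a group can be discovered with the standard library's `importlib.metadata`. How TryVoice's Adapter Registry loads them internally is not shown in this README, so the `discover_adapters` helper below is a hypothetical sketch rather than the project's actual loader.

```python
from importlib.metadata import entry_points
from typing import Any, Dict


def discover_adapters(group: str = "tryvoice.adapters") -> Dict[str, Any]:
    """Return a mapping of entry point name to EntryPoint for the given group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable entry points API.
        selected = eps.select(group=group)
    else:
        # Python 3.9: entry_points() returns a dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


if __name__ == "__main__":
    for name, ep in discover_adapters().items():
        adapter_cls = ep.load()  # imports the module and resolves the class
        print(name, adapter_cls)
```

With the `pyproject.toml` entry above installed, the mapping would be expected to contain a `my-agent` key whose `load()` returns `MyAdapter`.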

Development

Prerequisites

  • Python 3.9+ (3.11 recommended)
  • Node.js 20+ (for frontend build)

Setup

git clone https://github.com/AaronZ021/tryvoice-oss.git
cd tryvoice-oss
bash scripts/setup.sh
source .venv/bin/activate
tryvoice

Project structure

tryvoice/
├── apps/
│   ├── host-runtime/      # Python FastAPI backend (adapter layer, session FSM, voice providers)
│   └── client-web/        # TypeScript frontend (Vite, state machine, wake word, audio)
├── scripts/               # Setup and build scripts
├── pyproject.toml          # Python package config
├── Dockerfile              # Multi-stage build (Node + Python)
└── docker-compose.yml      # Single-command deployment

License

Apache License 2.0 — see LICENSE.
