Hands-free voice operator for Claude Code

Voxa

Voxa lets you call into your laptop from a phone browser, talk to a Gemini Live "operator," and have it drive Claude Code by voice.

MVP scope (drive mode only): pick a working directory by voice, send spoken instructions, hear Claude's final result read back. Attach mode, voice folder-browsing, and barge-in interruption are V2 backlog items (see docs/superpowers/specs/2026-06-27-loop-design.md and docs/superpowers/plans/2026-06-27-loop-mvp.md).


Prerequisites

  • Python 3.11+ on the laptop.
  • Tailscale installed and logged in on both the laptop and the phone (free personal plan is fine). The phone must be on the same tailnet as the laptop, and MagicDNS must be enabled so the laptop's HTTPS hostname resolves on the phone.
  • A Gemini API key from Google AI Studio with Gemini Live access.
  • Claude Code logged in on the laptop (claude CLI authenticated). The agent SDK reuses your existing Claude Code credentials; no separate ANTHROPIC_API_KEY is needed unless you prefer to supply one.
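
As a rough preflight for the list above, the following sketch checks the Python version and whether the tailscale and claude commands are on PATH. It is not part of Voxa; the function name and the messages are illustrative, and it cannot tell whether either tool is actually logged in.

```python
import shutil
import sys

# Commands named in the prerequisites. Being on PATH does not prove they are logged in.
REQUIRED_COMMANDS = ("tailscale", "claude")
MIN_PYTHON = (3, 11)


def missing_prerequisites(which=shutil.which, version=sys.version_info):
    """Return a list of human-readable problems; an empty list means the basics are present."""
    problems = []
    if tuple(version[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found {version[0]}.{version[1]}"
        )
    for command in REQUIRED_COMMANDS:
        if which(command) is None:
            problems.append(f"'{command}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("missing:", problem)
```

The which and version parameters exist only so the check can be exercised without touching the real environment.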

Quickstart

Install Voxa with one command on the laptop you want to control.

macOS / Linux:

curl -fsSL https://voxa.space/install.sh | sh

Windows (PowerShell):

irm https://voxa.space/install.ps1 | iex

Prefer a package runner? These work on any OS:

npx voxa-code            # Node users
uvx voxa-code            # Python users (or: pipx install voxa-code)

Then start it:

voxa

Voxa is zero-config by default: it uses the hosted relay, so there are no API keys to set up. Running voxa starts the server and prints a pairing QR code. Scan it with the Voxa phone app (or open the printed URL in your phone browser) to connect.


Develop from source

Contributors who want to hack on Voxa can run it from a checkout instead of the published package.

1. Create the virtual environment and install Voxa in editable mode

python3 -m venv .venv
.venv/bin/pip install -e ".[dev]"

2. Configure secrets

cp .env.example .env

Open .env and fill in:

Key               Value
GEMINI_API_KEY    Your Google AI Studio key
VOXA_AUTH_TOKEN   Any random secret string (protects the WebSocket endpoint on your tailnet)

GEMINI_LIVE_MODEL, VOXA_HOST, and VOXA_PORT have sensible defaults and can be left as-is.
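
To make the split between required and optional settings concrete, here is a minimal sketch of how such values could be read. It is not the contents of server/config.py: the VoxaConfig class and load_config function are hypothetical names, the 127.0.0.1 and 8787 defaults are inferred from the serve script description, and GEMINI_LIVE_MODEL is left out because its default is not stated here.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class VoxaConfig:
    gemini_api_key: str
    auth_token: str
    host: str
    port: int


def load_config(env=None):
    """Build a config from environment-style variables, failing fast on missing secrets."""
    env = os.environ if env is None else env
    missing = [key for key in ("GEMINI_API_KEY", "VOXA_AUTH_TOKEN") if not env.get(key)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    return VoxaConfig(
        gemini_api_key=env["GEMINI_API_KEY"],
        auth_token=env["VOXA_AUTH_TOKEN"],
        host=env.get("VOXA_HOST", "127.0.0.1"),  # assumed default
        port=int(env.get("VOXA_PORT", "8787")),  # assumed default
    )
```

Passing a plain dict instead of os.environ keeps the sketch easy to try without editing a real .env file.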

3. Start the server

bash scripts/serve.sh

The script:

  1. Starts the Voxa FastAPI server on 127.0.0.1:8787 (or $VOXA_PORT).
  2. Calls tailscale serve to expose it over HTTPS on your tailnet (required because the phone browser needs a secure context for microphone access).
  3. Prints the full HTTPS URL including your auth token.
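
The third step can be pictured with a small sketch that generates a random token, appends it to a URL as a token query parameter, and checks a supplied token in constant time. This is an illustration under assumptions, not what scripts/serve.sh or the server actually do: the function names are hypothetical and laptop.example.ts.net is a placeholder hostname.

```python
import hmac
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def new_auth_token():
    """Generate a random URL-safe secret suitable for VOXA_AUTH_TOKEN."""
    return secrets.token_urlsafe(32)


def connect_url(hostname, token):
    """Build the HTTPS URL the phone opens, with the token as a query parameter."""
    return f"https://{hostname}/?" + urlencode({"token": token})


def token_matches(url, expected):
    """Compare the token in the URL with the expected one without leaking timing."""
    supplied = parse_qs(urlsplit(url).query).get("token", [""])[0]
    return hmac.compare_digest(supplied.encode(), expected.encode())


if __name__ == "__main__":
    token = new_auth_token()
    print(connect_url("laptop.example.ts.net", token))  # placeholder hostname
```

hmac.compare_digest is used instead of == so that a mismatch takes the same time regardless of how many leading characters agree.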

4. Connect from the phone

Open the printed URL in your phone's browser. Tap Connect, then speak.


Architecture (brief)

Phone browser (static/)
    |  HTTPS WebSocket (auth token required)
    v
FastAPI server (server/app.py)
    |  audio bytes (16 kHz PCM)
    v
GeminiOperator (server/gemini_operator.py)  <-->  Gemini Live API
    |  tool calls (start_claude_session, send_to_claude, …)
    v
Orchestrator (server/orchestrator.py)
    |
    v
ClaudeController (server/claude_controller.py)  -->  Claude Code (agent SDK, bypassPermissions)

Config is loaded from .env via server/config.py.
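
The hand-off from operator tool calls to the controller in the diagram above could look roughly like the sketch below. Only the tool names start_claude_session and send_to_claude come from the diagram; the argument names (cwd, instruction), the result shape, and the FakeClaudeController stand-in are assumptions made for illustration and do not reflect server/orchestrator.py.

```python
class FakeClaudeController:
    """Stand-in for the real controller; records calls instead of running Claude Code."""

    def __init__(self):
        self.cwd = None
        self.sent = []

    def start(self, cwd):
        self.cwd = cwd
        return f"session started in {cwd}"

    def send(self, text):
        if self.cwd is None:
            raise RuntimeError("no active session")
        self.sent.append(text)
        return f"done: {text}"


class Orchestrator:
    """Routes a named tool call to the matching controller method."""

    def __init__(self, controller):
        self._tools = {
            "start_claude_session": lambda args: controller.start(args["cwd"]),
            "send_to_claude": lambda args: controller.send(args["instruction"]),
        }

    def handle_tool_call(self, name, args):
        handler = self._tools.get(name)
        if handler is None:
            return {"ok": False, "error": f"unknown tool: {name}"}
        try:
            return {"ok": True, "result": handler(args)}
        except (KeyError, RuntimeError) as exc:
            # Missing arguments or a missing session become a spoken-friendly error.
            return {"ok": False, "error": str(exc)}
```

Returning an error dictionary rather than raising lets the operator read the failure back to the caller instead of dropping the voice session.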


Running the test suite

.venv/bin/python -m pytest -v

Expected: 22 tests pass, no warnings.


Manual end-to-end smoke test

The smoke test requires a real phone, real Tailscale connectivity, and real API keys. Run it against a scratch directory, not a real project.

Before you start:

  • .env is fully filled in (real GEMINI_API_KEY and VOXA_AUTH_TOKEN).
  • Tailscale is running on both the laptop and the phone.
  • Claude Code is logged in on the laptop.

Procedure:

  1. Open a terminal on the laptop and run:

    bash scripts/serve.sh
    

    Wait for the line "Voxa is live. On your phone open: https://..." to appear.

  2. Copy the printed HTTPS URL (it already includes ?token=...).

  3. On the phone, open the URL in Safari or Chrome. You should see the Voxa interface. Grant microphone permission when prompted.

  4. Tap Connect. The button should change state to indicate an active session.

  5. Speak: "Start a session in /tmp/loop-smoke and create a file called hello.txt that says hi."

  6. Verify:

    • Gemini acknowledges the instruction verbally (you hear a response through the phone speaker).
    • On the laptop terminal you see Claude Code start with bypassPermissions active (no permission prompts appear).
    • After Claude finishes, /tmp/loop-smoke/hello.txt exists on the laptop and contains hi.
    • Gemini speaks the final result back to you.

  7. To stop: press Ctrl-C in the laptop terminal. The trap in serve.sh will kill the server and tear down tailscale serve.

Warning: Use a throwaway scratch directory (like /tmp/loop-smoke) for your first smoke test. Claude Code runs with bypassPermissions, so it will write files without asking.
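
The file check in the Verify step can also be done with a few lines of Python on the laptop. This is a convenience sketch, not part of the project; the function name is hypothetical, and it treats "contains hi" as a case-insensitive substring match.

```python
from pathlib import Path


def smoke_test_passed(directory="/tmp/loop-smoke", filename="hello.txt", expected="hi"):
    """Return True if the file exists in the scratch directory and contains the expected text."""
    target = Path(directory) / filename
    if not target.is_file():
        return False
    return expected.lower() in target.read_text(encoding="utf-8").lower()


if __name__ == "__main__":
    print("smoke test passed" if smoke_test_passed() else "smoke test failed")
```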
