Hands-free voice operator for Claude Code
Voxa
Voxa lets you call into your laptop from a phone browser, talk to a Gemini Live "operator," and have it drive Claude Code by voice.
MVP scope (drive mode only): pick a working directory by voice, send spoken
instructions, hear Claude's final result read back. Attach mode, voice
folder-browsing, and barge-in interruption are V2 backlog items (see
docs/superpowers/specs/2026-06-27-loop-design.md and
docs/superpowers/plans/2026-06-27-loop-mvp.md).
Prerequisites
- Python 3.11+ on the laptop.
- Tailscale installed and logged in on both the laptop and the phone (free personal plan is fine). The phone must be on the same tailnet as the laptop, or MagicDNS must be enabled.
- A Gemini API key from Google AI Studio with Gemini Live access.
- Claude Code logged in on the laptop (claude CLI authenticated). The agent SDK reuses your existing Claude Code credentials; no separate ANTHROPIC_API_KEY is needed unless you prefer to supply one.
Quickstart
Install Voxa on the laptop you want to control with one command.
macOS / Linux:
curl -fsSL https://voxa.space/install.sh | sh
Windows (PowerShell):
irm https://voxa.space/install.ps1 | iex
Prefer a package runner? These work on any OS:
npx voxa-code # Node users
uvx voxa-code # Python users (or: pipx install voxa-code)
Then start it:
voxa
Voxa is zero-config by default: it uses the hosted relay, so there are no API
keys to set up. Running voxa starts the server and prints a pairing QR code. Scan it
with the Voxa phone app (or open the printed URL in your phone browser) to
connect.
Develop from source
Contributors who want to hack on Voxa can run it from a checkout instead of the published package.
1. Create and activate the virtual environment
python3 -m venv .venv
.venv/bin/pip install -e ".[dev]"
2. Configure secrets
cp .env.example .env
Open .env and fill in:
| Key | Value |
|---|---|
| GEMINI_API_KEY | Your Google AI Studio key |
| VOXA_AUTH_TOKEN | Any random secret string (protects the WebSocket endpoint on your tailnet) |
GEMINI_LIVE_MODEL, VOXA_HOST, and VOXA_PORT have sensible defaults and
can be left as-is.
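One way to produce a suitable VOXA_AUTH_TOKEN, and to check a presented token on the server side, is sketched below. The function names generate_token and token_matches are hypothetical and are not taken from the Voxa source; the sketch only shows the standard-library calls that fit this job.

```python
import hmac
import secrets


def generate_token(num_bytes: int = 32) -> str:
    """Return a URL-safe random string suitable for VOXA_AUTH_TOKEN."""
    return secrets.token_urlsafe(num_bytes)


def token_matches(expected: str, presented: str) -> bool:
    """Compare two tokens in constant time (hypothetical helper, not Voxa's own)."""
    return hmac.compare_digest(expected.encode(), presented.encode())


if __name__ == "__main__":
    # Paste the printed line into .env.
    print(f"VOXA_AUTH_TOKEN={generate_token()}")
```

A URL-safe token matters here because the token travels in the query string of the printed URL.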
3. Start the server
bash scripts/serve.sh
The script:
- Starts the Voxa FastAPI server on 127.0.0.1:8787 (or $VOXA_PORT).
- Calls tailscale serve to expose it over HTTPS on your tailnet (required because the phone browser needs a secure context for microphone access).
- Prints the full HTTPS URL including your auth token.
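The shape of the printed URL can be illustrated with a short sketch. The function name pairing_url, the example hostname, and the root path "/" are assumptions for illustration; the README only establishes that the URL is HTTPS and carries a ?token=... query parameter.

```python
from urllib.parse import urlencode, urlunsplit


def pairing_url(tailnet_host: str, token: str) -> str:
    """Build an HTTPS URL with the auth token as a query parameter.

    The "/" path is an assumption; serve.sh may print a different path.
    """
    query = urlencode({"token": token})  # percent-encodes unsafe characters
    return urlunsplit(("https", tailnet_host, "/", query, ""))


if __name__ == "__main__":
    # "laptop.example.ts.net" is a placeholder hostname.
    print(pairing_url("laptop.example.ts.net", "s3cret"))
```

Encoding the token with urlencode keeps the URL valid even if the secret contains characters such as spaces or ampersands.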
4. Connect from the phone
Open the printed URL on your phone browser. Tap Connect, then speak.
Architecture (brief)
Phone browser (static/)
| HTTPS WebSocket (auth token required)
v
FastAPI server (server/app.py)
| audio bytes (16 kHz PCM)
v
GeminiOperator (server/gemini_operator.py) <--> Gemini Live API
| tool calls (start_claude_session, send_to_claude, …)
v
Orchestrator (server/orchestrator.py)
|
v
ClaudeController (server/claude_controller.py) --> Claude Code (agent SDK, bypassPermissions)
Config is loaded from .env via server/config.py.
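As a rough illustration of what such a loader might do, the sketch below reads the documented keys from an environment mapping. It is not the contents of server/config.py: the Settings class and load_settings function are hypothetical, it assumes the .env values have already been loaded into the environment, and the VOXA_HOST default of 127.0.0.1 is inferred from the bind address that serve.sh reports. GEMINI_LIVE_MODEL is left out because its default is not stated in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    gemini_api_key: str
    auth_token: str
    host: str
    port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read required secrets and optional overrides from an environment mapping."""
    required = ("GEMINI_API_KEY", "VOXA_AUTH_TOKEN")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise RuntimeError(f"missing required settings: {', '.join(missing)}")
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        auth_token=env["VOXA_AUTH_TOKEN"],
        host=env.get("VOXA_HOST", "127.0.0.1"),  # assumed default
        port=int(env.get("VOXA_PORT", "8787")),
    )
```

Failing fast on missing secrets gives a clear error at startup rather than an authentication failure in the middle of a call.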
Running the test suite
.venv/bin/python -m pytest -v
Expected: 22 tests pass, no warnings.
Manual end-to-end smoke test
The smoke test requires a real phone, real Tailscale connectivity, and real API keys. Run it against a scratch directory, not a real project.
Before you start:
- .env is fully filled in (real GEMINI_API_KEY and VOXA_AUTH_TOKEN).
- Tailscale is running on both the laptop and the phone.
- Claude Code is logged in on the laptop.
Procedure:
1. Open a terminal on the laptop and run:
   bash scripts/serve.sh
   Wait for the line:
   Voxa is live. On your phone open: https://...
2. Copy the printed HTTPS URL (it already includes ?token=...).
3. On the phone, open the URL in Safari or Chrome. You should see the Voxa interface. Grant microphone permission when prompted.
4. Tap Connect. The button should change state to indicate an active session.
5. Speak: "Start a session in /tmp/loop-smoke and create a file called hello.txt that says hi."
6. Verify:
   - Gemini acknowledges the instruction verbally (you hear a response through the phone speaker).
   - On the laptop terminal you see Claude Code start with bypassPermissions active (no permission prompts appear).
   - After Claude finishes, /tmp/loop-smoke/hello.txt exists on the laptop and contains hi.
   - Gemini speaks the final result back to you.
7. To stop: press Ctrl-C in the laptop terminal. The trap in serve.sh will kill the server and tear down tailscale serve.
Warning: Use a throwaway scratch directory (like /tmp/loop-smoke) for
your first smoke test. Claude Code runs with bypassPermissions, so it will
write files without asking.
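The file check from the Verify step can also be scripted on the laptop. This is a minimal sketch and not part of the Voxa repository; the helper name smoke_file_ok is hypothetical, and the comparison ignores case and surrounding whitespace on the assumption that Claude may write "Hi" or add a trailing newline.

```python
from pathlib import Path


def smoke_file_ok(path: Path, expected: str = "hi") -> bool:
    """Return True if the file exists and its trimmed content matches, ignoring case."""
    if not path.is_file():
        return False
    content = path.read_text(encoding="utf-8").strip()
    return content.lower() == expected.lower()


if __name__ == "__main__":
    target = Path("/tmp/loop-smoke/hello.txt")
    print("PASS" if smoke_file_ok(target) else "FAIL")
```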
File details
Details for the file voxa_code-0.1.0.tar.gz.
File metadata
- Download URL: voxa_code-0.1.0.tar.gz
- Upload date:
- Size: 96.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b704902d3f6944c64bfe25901aa3431b3d3b72433f791bd7289d3b81415e2b13 |
| MD5 | 423699a1bc2cfe4859c8135eaea809ec |
| BLAKE2b-256 | 98269d9462f9c623bbbb6a9267453c6058f0e8eafab5da17359c18b4f557f2e2 |
File details
Details for the file voxa_code-0.1.0-py3-none-any.whl.
File metadata
- Download URL: voxa_code-0.1.0-py3-none-any.whl
- Upload date:
- Size: 109.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 25e6c0f9294d13e50a9577ddcc1d5b4e6bf09cfb6c3385e869c830683607bb98 |
| MD5 | 43ec7fb74cf0d1161e6472b098bffb63 |
| BLAKE2b-256 | 916ef3ee7c2d2a8e0da553993438871364abfbd1a6b0d7f8e722aba2cf2b2e99 |