EdgeVox

Sub-second local voice AI for robots and edge devices.

No cloud APIs. No internet after setup. Fully private. Powered by Gemma 4.

    ______    __         _    __
   / ____/___/ /___ ____| |  / /___  _  __
  / __/ / __  / __ `/ _ \ | / / __ \| |/_/
 / /___/ /_/ / /_/ /  __/ |/ / /_/ />  <
/_____/\__,_/\__, /\___/|___/\____/_/|_|
            /____/

Stack: Silero VAD -> faster-whisper (STT) -> Gemma 4 E2B IT via llama.cpp (LLM) -> Kokoro 82M (TTS)

Tested latency: 0.80s end-to-end on RTX 3080 (STT 0.40s + LLM 0.33s + TTS 0.08s)

Features

  • Streaming pipeline — speaks first sentence while LLM generates the rest
  • Interrupt support — speak while bot is talking to cut it off
  • Wake word detection — "Hey Jarvis" / "Lily" (optional, via OpenWakeWord)
  • Beautiful TUI — ASCII logo, sparkline waveform, latency history, GPU/RAM monitor, model info panel
  • ROS2 bridge — publishes STT/TTS/state to ROS2 topics for robotics integration
  • Slash commands — /reset, /lang, /voice, /say, /mictest, /model in the TUI
  • Chat export — Ctrl+S to save conversation as markdown
  • 15 languages — English, Vietnamese, French, Spanish, Hindi, Italian, Portuguese, Japanese, Chinese, Korean, German, Thai, Russian, Arabic, Indonesian
  • Auto-detects hardware — GPU layers, model size, STT model
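The streaming pipeline's overlap comes from cutting the LLM's token stream at sentence boundaries and handing each finished sentence to TTS while generation continues. A minimal sketch of that idea (this splitter is illustrative, not EdgeVox's actual implementation):

```python
import re

def split_sentences(stream):
    """Yield complete sentences from an incremental text stream.

    Buffers incoming chunks and emits a sentence as soon as a
    terminator (. ! ?) followed by whitespace appears, so TTS can
    start speaking while the LLM is still generating.
    """
    buffer = ""
    for chunk in stream:
        buffer += chunk
        while True:
            match = re.search(r"[.!?]\s+", buffer)
            if not match:
                break
            yield buffer[:match.end()].strip()
            buffer = buffer[match.end():]
    if buffer.strip():
        yield buffer.strip()  # flush any trailing partial sentence

# Tokens arrive a few characters at a time, as from a streaming LLM
tokens = ["Hello", " there. ", "I can ", "hear you.", " Go ahead."]
print(list(split_sentences(tokens)))
# ['Hello there.', 'I can hear you.', 'Go ahead.']
```

The first sentence is ready for TTS after the second chunk, long before the stream ends — that gap is where the sub-second perceived latency comes from.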

Hardware Requirements

| Device | RAM | GPU | Expected Latency |
|---|---|---|---|
| PC (i9 + RTX 3080 16GB) | 64GB | CUDA | ~0.8s |
| Jetson Orin Nano | 8GB | CUDA | ~1.5-2s |
| MacBook Air M1 | 8GB | Metal | ~2-3s |
| Any modern laptop | 16GB+ | CPU only | ~2-4s |
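Hardware auto-detection reduces to a selection rule like the table above: pick the largest STT model and GPU offload the device can afford. A hypothetical sketch of that logic (thresholds and names are illustrative, not EdgeVox's actual defaults):

```python
def pick_config(ram_gb: float, has_gpu: bool) -> dict:
    """Choose an STT model and LLM GPU offload for the device.

    Mirrors the hardware table: big-RAM GPU machines get the large
    Whisper model; 8GB devices fall back to whisper-small.
    """
    if has_gpu and ram_gb >= 16:
        stt = "large-v3-turbo"
        gpu_layers = -1          # offload all LLM layers to the GPU
    elif ram_gb >= 16:
        stt = "medium"
        gpu_layers = 0           # CPU-only inference
    else:
        stt = "small"
        gpu_layers = -1 if has_gpu else 0
    return {"stt_model": stt, "gpu_layers": gpu_layers}

print(pick_config(64, True))   # {'stt_model': 'large-v3-turbo', 'gpu_layers': -1}
print(pick_config(8, False))   # {'stt_model': 'small', 'gpu_layers': 0}
```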

Quick Start

# 1. Install uv (fast Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Create virtual environment
uv venv --python 3.12
source .venv/bin/activate

# 3. Install llama-cpp-python with CUDA (prebuilt wheels)
uv pip install llama-cpp-python \
    --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124

# For Apple Silicon (Metal):
# CMAKE_ARGS="-DGGML_METAL=on" uv pip install llama-cpp-python

# For CPU only:
# uv pip install llama-cpp-python

# 4. Install EdgeVox
uv pip install -e .

# 5. Download all models (~3GB total)
edgevox-setup

# 6. Run!
edgevox

Usage

# TUI mode (default, recommended)
edgevox

# With wake word
edgevox --wakeword "hey jarvis"

# With ROS2 bridge (for robotics)
edgevox --ros2

# CLI mode (simpler, no TUI)
edgevox-cli

# Text mode (no microphone)
edgevox-cli --text-mode

# Custom options
edgevox \
    --whisper-model large-v3-turbo \
    --voice am_adam \
    --language en

TUI Controls

| Key | Action |
|---|---|
| Q | Quit |
| R | Reset conversation |
| M | Mute/Unmute mic |
| / | Open command input |
| Ctrl+S | Export chat to markdown |

Slash Commands

| Command | Action |
|---|---|
| /reset | Reset conversation |
| /lang XX | Switch language (en, vi, fr, ko, ...) |
| /langs | List all supported languages |
| /say TEXT | TTS preview — speak text directly |
| /mictest | Record 3s + playback to test audio |
| /model SIZE | Switch Whisper model (small/medium/large-v3-turbo) |
| /voice XX | Switch TTS voice |
| /voices | List available voices |
| /export | Export chat to markdown |
| /mute | Mute microphone |
| /unmute | Unmute microphone |
| /help | Show all commands |

ROS2 Integration

EdgeVox can publish voice pipeline events to ROS2 topics, making it easy to add voice interaction to any robot.

# Install with ROS2 support
uv pip install -e ".[ros2]"

# Run with ROS2 bridge
edgevox --ros2

Published Topics

| Topic | Type | Description |
|---|---|---|
| /edgevox/transcription | std_msgs/String | User's speech (STT output) |
| /edgevox/response | std_msgs/String | Bot's response text |
| /edgevox/state | std_msgs/String | Pipeline state (listening, thinking, speaking) |
| /edgevox/audio_level | std_msgs/Float32 | Mic level (0.0-1.0) |
| /edgevox/metrics | std_msgs/String | JSON latency metrics |
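Because `/edgevox/metrics` carries JSON as a plain string, a subscriber just decodes it. A small sketch of such a callback — the field names (`stt_s`, `llm_s`, `tts_s`) are hypothetical placeholders, so inspect a real message for the actual schema:

```python
import json

def total_latency(msg_data: str) -> float:
    """Decode a metrics payload and return total latency in seconds.

    The keys used here (stt_s, llm_s, tts_s) are illustrative, not
    EdgeVox's documented schema.
    """
    metrics = json.loads(msg_data)
    return metrics["stt_s"] + metrics["llm_s"] + metrics["tts_s"]

payload = '{"stt_s": 0.40, "llm_s": 0.33, "tts_s": 0.08}'
print(round(total_latency(payload), 2))  # 0.81
```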

Subscribed Topics

| Topic | Type | Description |
|---|---|---|
| /edgevox/tts_request | std_msgs/String | Send text for the bot to speak |
| /edgevox/command | std_msgs/String | Commands: reset, mute, unmute |

Example: Robot Integration

import rclpy
from rclpy.node import Node
from std_msgs.msg import String

rclpy.init()
node = Node('my_robot')

# Listen to what the user says
def on_user_speech(msg):
    print(f"User said: {msg.data}")

node.create_subscription(String, '/edgevox/transcription', on_user_speech, 10)

# Make the robot say something
pub = node.create_publisher(String, '/edgevox/tts_request', 10)
msg = String()
msg.data = "I detected an obstacle ahead."
pub.publish(msg)

rclpy.spin(node)

Architecture

                        EdgeVox Pipeline
 +-----------+     +------------+     +----------------+
 | Microphone|---->| Silero VAD |---->| faster-whisper |
 |           |     | (32ms)     |     | (STT)          |
 +-----------+     +------------+     +--------+-------+
                                               |
                                               v
                                      +----------------+
                                      | Gemma 4 E2B IT |
                                      | (streaming)    |
                                      +--------+-------+
                                               | sentence by sentence
                                               v
 +-----------+     +------------+     +----------------+
 |  Speaker  |<----| Kokoro 82M |<----| Sentence       |
 |           |     | (TTS)      |     | Splitter       |
 +-----------+     +------------+     +----------------+
                         |
                         v (optional)
                   +------------+
                   | ROS2 Bridge|----> /edgevox/* topics
                   +------------+

Model Sizes

| Component | Model | Size | RAM |
|---|---|---|---|
| VAD | Silero VAD v6 | ~2MB | ~10MB |
| STT | whisper-small | 500MB | ~600MB |
| STT | whisper-large-v3-turbo | 1.5GB | ~2GB |
| LLM | Gemma 4 E2B IT Q4_K_M | 1.8GB | ~2.5GB |
| TTS | Kokoro 82M | 200MB | ~300MB |
| Wake | OpenWakeWord | ~2MB | ~10MB |

Typical totals — M1 Air (8GB): whisper-small + Q4_K_M = 3.4GB. PC with GPU: whisper-large-v3-turbo + Q4_K_M = 5.8GB.

Documentation

Full docs: EdgeVox Docs (built with VitePress)

cd website && npm run dev

License

MIT

Download files

Download the file for your platform.

Source Distribution

edgevox-0.1.2.tar.gz (128.2 kB)

Built Distribution


edgevox-0.1.2-py3-none-any.whl (136.5 kB)

File details

Details for the file edgevox-0.1.2.tar.gz.

File metadata

  • Download URL: edgevox-0.1.2.tar.gz
  • Size: 128.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for edgevox-0.1.2.tar.gz
| Algorithm | Hash digest |
|---|---|
| SHA256 | aa0bc56d539e38030820faa9bbedb5401f372c191b40414616c3ebcbfae6efa6 |
| MD5 | 8f4ba7f2f811d352fc145f308e26340f |
| BLAKE2b-256 | b5a25ffa00321ff01d1ae8c708c57670c9c2c816e430a45e386d8bba837bf9c9 |


Provenance

The following attestation bundles were made for edgevox-0.1.2.tar.gz:

Publisher: release.yml on vietanhdev/edgevox

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file edgevox-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: edgevox-0.1.2-py3-none-any.whl
  • Size: 136.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for edgevox-0.1.2-py3-none-any.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6ad72590bff8e6998d18b063ba2b4b83204cbebf1d1828f57b2af6d1bc731d13 |
| MD5 | 63f03e941c17ceb665f4cab40917963e |
| BLAKE2b-256 | 387075e866116c349dac1403b16ec035c0d3805779ad0b9b616ba5b1322013f2 |


Provenance

The following attestation bundles were made for edgevox-0.1.2-py3-none-any.whl:

Publisher: release.yml on vietanhdev/edgevox

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
