Pipecat for the edge — edge-native, local-first real-time voice conversation library

voxedge

Native TensorRT · RKNN · sherpa-onnx voice pipelines for Jetson, Rockchip, and Raspberry Pi — fully on-device, verified on real hardware, zero cloud.

What is voxedge?

voxedge is an embeddable Python library that drives real-time, on-device voice conversations by calling directly into each platform's native inference runtime: TensorRT on Jetson Orin, RKNN on RK3576/RK3588, and sherpa-onnx on CPU. It uses no cloud STT/TTS APIs, needs no internet at runtime, and adds no intermediate abstraction layer. The same ConversationEngine API works across all three backends; you swap only the backend constructor. Two concurrent sessions (N=2) have been verified on an Orin Nano 8 GB, with output byte-identical to single-stream runs and zero CUDA errors.

voxedge is the open-core engine behind OpenVoiceStream — the deployable FastAPI/WebSocket server, device profiles, and agent gallery. Want a container? Start there. Want to embed real-time edge voice in your own app? You're in the right place.

Key Features

Native runtimes, full performance — calls directly into TensorRT (Jetson), RKNN (Rockchip), and sherpa-onnx (CPU); no wrapper overhead, no cross-platform abstraction tax
Fully on-device — no speech API key, no per-call bill, no internet dependency at runtime
Verified on real hardware — N=2 concurrent sessions on Orin Nano 8 GB: byte-identical output vs. single-stream, zero CUDA errors
Streaming + barge-in — partial + final ASR while the user speaks; sentence-level TTS streaming with first-audio latency low enough for live dialogue and cooperative barge-in
Swap hardware, not code — same ConversationEngine API across Jetson, Rockchip, and sherpa-onnx CPU; only the backend constructor changes
Test on any machine — mock backends require only numpy; the whole engine runs end-to-end on a Mac with no CUDA or GPU

Quickstart

Runs on any machine — no GPU needed. Swap the backend constructors for a real device; the engine, transport, and event contract never change.

pip install voxedge

import asyncio
from voxedge.engine import ConversationEngine
from voxedge.transport import InProcessTransport
from voxedge.backends.mock import MockASR, MockTTS, MockVAD

engine = ConversationEngine(
    backends={"asr": MockASR(transcript="hello world"), "tts": MockTTS(), "vad": MockVAD()},
    multi_utterance=True,
)

async def main():
    t = InProcessTransport()
    await t.feed_audio(b"\x01\x02" * 8000)   # speech frames (int16 PCM)
    await t.feed_audio(b"\x00\x00" * 8000)   # silence → VAD endpoints the utterance
    t.end_input()
    await engine.run(t)                       # drives ASR → (LLM) → TTS
    for ev in t.drain_events_nowait():        # asr_final / tts_* / ...
        print(ev["type"], ev.get("text", ""))

asyncio.run(main())

On a real device, swap only the backend constructors — everything else is identical:

# Jetson Orin — pip install voxedge[jetson]
from voxedge.engine import ConversationEngine
from voxedge.backends.jetson import (
    TRTEdgeLLMASRBackend, TRTEdgeLLMASRConfig,
    TRTEdgeLLMTTSBackend, TRTEdgeLLMTTSConfig,
)

engine = ConversationEngine(backends={
    "asr": TRTEdgeLLMASRBackend(TRTEdgeLLMASRConfig(...)),   # Qwen3-ASR, native TRT
    "tts": TRTEdgeLLMTTSBackend(TRTEdgeLLMTTSConfig(...)),   # Qwen3-TTS, streaming
}, multi_utterance=True)

`import voxedge` depends only on numpy; TensorRT, RKNN, and sherpa-onnx are lazy-imported by their backend adapters and pulled in via extras. The example above imports cleanly on a Mac even though the TRT engine only runs on a Jetson.
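The lazy-import behaviour described above can be sketched roughly as follows. This is an illustration of the general pattern, not voxedge's adapter code: the class `LazyRuntimeBackend`, its `transcribe` method, and the module name `hypothetical_npu_runtime` are all invented for the example.

```python
import importlib


class LazyRuntimeBackend:
    """Hypothetical adapter: the heavy runtime is imported on first use."""

    def __init__(self, runtime_module: str):
        # Explicit parameter only; nothing heavy is imported here.
        self._runtime_module = runtime_module
        self._runtime = None

    def _load_runtime(self):
        # Import the runtime once, the first time inference is requested.
        if self._runtime is None:
            try:
                self._runtime = importlib.import_module(self._runtime_module)
            except ImportError as exc:
                raise RuntimeError(
                    f"{self._runtime_module!r} is not installed; "
                    "install the platform runtime to use this backend"
                ) from exc
        return self._runtime

    def transcribe(self, pcm: bytes) -> str:
        runtime = self._load_runtime()  # deferred import happens here
        return runtime.transcribe(pcm)  # hypothetical runtime call


# Constructing the backend works on any machine.
backend = LazyRuntimeBackend("hypothetical_npu_runtime")

# The missing runtime only surfaces when inference is requested.
try:
    backend.transcribe(b"\x00\x00")
except RuntimeError as err:
    print(err)
```

Deferring the import to the first call keeps module import cheap and lets test suites run without the platform runtime, at the cost of reporting a missing runtime later than an eager import would.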

Install

pip install voxedge            # pure-Python core (numpy only)
pip install voxedge[sherpa]    # sherpa-onnx CPU ASR/TTS
pip install voxedge[jetson]    # Jetson TensorRT backends (aarch64)
pip install voxedge[rk]        # Rockchip RK3576/RK3588 NPU (aarch64)
pip install voxedge[llm]       # OpenAI-compatible LLM backend (httpx)

The jetson / rk extras declare only pure-Python deps; the CUDA/TensorRT and RKNN runtime wheels ship from the platform (JetPack L4T / Rockchip NPU userspace) or the engine repos — you bring the platform runtime.

Architecture

Four layers, all importable without CUDA.

Backends (`voxedge/backends/`)

Clean ABCs in backends/base.py — every constructor takes explicit params only, no env coupling:

ASRBackend / ASRStream — streaming recognition
TTSBackend — synthesize() (batch) + generate_streaming() (sentence-level chunks, cooperative cancel via cancel_token for barge-in)
VADBackend / VADSession — voice-activity detection for speech / barge-in segmentation
LLMBackend / LLMEvent — token-streaming LLM for the conversation loop
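To make the cooperative-cancel idea in the TTSBackend entry concrete, here is a minimal sketch. The method names `synthesize`, `generate_streaming`, and `cancel_token` come from the list above; everything else (the class names, the synchronous generator, the use of `threading.Event` as the token) is an assumption and may differ from the real ABCs in backends/base.py.

```python
import threading
from abc import ABC, abstractmethod
from typing import Iterator, List, Optional


class SketchTTSBackend(ABC):
    """Illustrative TTS interface; signatures are assumptions."""

    @abstractmethod
    def synthesize(self, text: str) -> bytes:
        """Batch synthesis: return PCM for the whole text."""

    def generate_streaming(
        self,
        sentences: List[str],
        cancel_token: Optional[threading.Event] = None,
    ) -> Iterator[bytes]:
        """Yield one chunk per sentence, checking the token between sentences."""
        for sentence in sentences:
            if cancel_token is not None and cancel_token.is_set():
                return  # cooperative cancel: stop before the next sentence
            yield self.synthesize(sentence)


class SilentTTS(SketchTTSBackend):
    """Toy backend: emits one silent 16-bit sample per character."""

    def synthesize(self, text: str) -> bytes:
        return b"\x00\x00" * len(text)


cancel = threading.Event()
chunks = []
for chunk in SilentTTS().generate_streaming(
    ["Hi there.", "How are you?", "Bye."], cancel
):
    chunks.append(chunk)
    cancel.set()  # simulate a barge-in right after the first chunk

print(len(chunks))  # 1
```

The backend never gets interrupted mid-sentence here; it simply declines to start the next one, which is what makes the cancellation cooperative.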

Concrete adapters live under backends/{jetson,rk,sherpa}/ and import their heavy runtimes lazily (inside methods), so all modules import on any machine:

Backend	Platform	Models	Extra	Source engine
`backends/jetson/`	Jetson Orin (TensorRT)	Qwen3-ASR/TTS, Matcha, Kokoro, Paraformer, SenseVoice, MOSS-TTS-Nano	`voxedge[jetson]` aarch64	jetson-voice-engine
`backends/rk/`	Rockchip RK3576/RK3588 (RKNN)	Qwen3-ASR, Matcha, Piper, Kokoro, Paraformer, SenseVoice	`voxedge[rk]` aarch64	rkvoice-stream
`backends/sherpa/`	CPU (any arch)	Paraformer, Zipformer, SenseVoice, Matcha, Kokoro ONNX	`voxedge[sherpa]`	—
`backends/llm/`	Any	OpenAI-compatible LLM over httpx	`voxedge[llm]`	—
`backends/mock.py`	Dev / CI	MockASR, MockTTS, MockVAD, MockLLM	core	—

Transport (`voxedge/transport/`)

Transport ABC + two implementations:

InProcessTransport — zero-IPC asyncio queues; default, used everywhere in tests
WebSocketTransport — duck-typed ws adapter with no FastAPI dependency; idle-watchdog timeout injected by caller, reads no env
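As a rough picture of what "zero-IPC asyncio queues" can look like, the sketch below builds a tiny in-process transport. The method names `feed_audio`, `end_input`, and `drain_events_nowait` mirror the Quickstart; `QueueTransport`, `recv_audio`, `send_event`, and the `echo_engine` stand-in are hypothetical and are not voxedge's implementation.

```python
import asyncio
from typing import List, Optional


class QueueTransport:
    """Hypothetical in-process transport built on two asyncio queues."""

    def __init__(self) -> None:
        self._audio: asyncio.Queue = asyncio.Queue()
        self._events: asyncio.Queue = asyncio.Queue()

    async def feed_audio(self, pcm: bytes) -> None:
        await self._audio.put(pcm)

    def end_input(self) -> None:
        self._audio.put_nowait(None)  # sentinel: no more audio

    async def recv_audio(self) -> Optional[bytes]:
        return await self._audio.get()

    async def send_event(self, event: dict) -> None:
        await self._events.put(event)

    def drain_events_nowait(self) -> List[dict]:
        # Collect everything queued so far without awaiting.
        events = []
        while not self._events.empty():
            events.append(self._events.get_nowait())
        return events


async def echo_engine(transport: QueueTransport) -> None:
    """Stand-in for an engine: reports the size of each audio frame."""
    while (frame := await transport.recv_audio()) is not None:
        await transport.send_event({"type": "audio_seen", "bytes": len(frame)})
    await transport.send_event({"type": "done"})


async def demo() -> List[dict]:
    t = QueueTransport()
    await t.feed_audio(b"\x01\x02" * 4)
    t.end_input()
    await echo_engine(t)
    return t.drain_events_nowait()


events = asyncio.run(demo())
print(events)
```

Because both sides share one event loop, tests can feed audio and inspect events without sockets or subprocesses.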

Conversation Engine (`voxedge/engine/`)

ConversationEngine plus a per-connection Session coordinator, split into focused collaborators: audio_dispatcher (VAD → speech / barge-in), asr_loop, client_events, tts_sequencer / tts_buffer, and session_state. The LLM↔tool loop is llm_turn, which runs over the provider-agnostic turn_driver.run_turn pump, with tool_registry (@tool → JSON schema). coordinator / concurrency_capability handle multi-stream concurrency.
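The "@tool → JSON schema" idea can be illustrated with a small decorator that derives a schema from a function signature. This is a sketch under stated assumptions: the `tool` decorator, the `TOOLS` dict, `call_tool`, and the schema layout below are invented for the example and are not taken from voxedge's tool_registry.

```python
import inspect
from typing import Callable, Dict, get_type_hints

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}
TOOLS: Dict[str, dict] = {}


def tool(fn: Callable) -> Callable:
    """Register fn and derive a JSON-schema-style description from its signature."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the argument is required
    TOOLS[fn.__name__] = {
        "fn": fn,
        "schema": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }
    return fn


@tool
def set_volume(level: int, room: str = "kitchen") -> str:
    """Set the speaker volume in a room."""
    return f"volume in {room} set to {level}"


def call_tool(name: str, arguments: dict) -> str:
    """Dispatch a tool call the way an LLM turn loop might."""
    return TOOLS[name]["fn"](**arguments)


print(TOOLS["set_volume"]["schema"])
print(call_tool("set_volume", {"level": 7}))
```

Deriving the schema from the signature keeps the tool definition and its description to the LLM in one place, so they cannot drift apart.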

Capabilities (`voxedge/capabilities/`)

Optional, default-off, stateless add-ons (punctuation, speaker embedding) via sherpa-onnx. Opt in explicitly; byte-level no-op when off.

Design Constraints

Pure Python core — import voxedge is numpy-only. Heavy adapters live under backends/{jetson,rk,sherpa}/ with deferred runtime imports.
No env reads in the library — all config injected as explicit params. Profiles and deployment knobs are the product's job (OpenVoiceStream).
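One way to read the "no env reads" constraint is sketched below: library-side code receives a plain config object, and only application code touches the environment. All names here (`WatchdogConfig`, `IdleWatchdog`, `config_from_env`, the `APP_IDLE_TIMEOUT_S` variable) are hypothetical and chosen only to illustrate the split.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class WatchdogConfig:
    """Hypothetical library-side config: plain values, no environment lookups."""

    idle_timeout_s: float = 30.0


class IdleWatchdog:
    """Library code: sees only the config it was given."""

    def __init__(self, config: WatchdogConfig):
        self.config = config

    def is_idle(self, seconds_since_last_frame: float) -> bool:
        return seconds_since_last_frame >= self.config.idle_timeout_s


def config_from_env(env: Mapping[str, str] = os.environ) -> WatchdogConfig:
    """Application code: the only place that reads the environment."""
    return WatchdogConfig(idle_timeout_s=float(env.get("APP_IDLE_TIMEOUT_S", "30")))


# A dict stands in for the process environment so the example is deterministic.
watchdog = IdleWatchdog(config_from_env({"APP_IDLE_TIMEOUT_S": "5"}))
print(watchdog.is_idle(6.0))  # True
```

Keeping the environment lookup in the caller means the same library object behaves identically in tests, notebooks, and containers, as long as it is handed the same config.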

Status

In production — the open-core engine behind a shipped edge voice stack. ~270 mock-based tests; the whole engine runs end-to-end on a Mac with no CUDA.

Contributing

Issues and PRs welcome. The mock backend suite runs on any machine with no hardware:

pip install voxedge
uv run pytest

Ecosystem

voxedge is one layer in a family of repos:

Repo	Role	When to go there
voxedge (this repo)	Embeddable Python engine	Embedding real-time voice in your own app
openvoicestream	Deployable FastAPI/WebSocket server, Docker profiles, agent gallery	Deployed use-cases and end-to-end demos; ready-to-run containers
rkvoice-stream	Rockchip NPU engine (`backends/rk/` wraps this)	RK3576/RK3588 model formats, RKNN perf numbers, TTS/ASR backend internals
jetson-voice-engine	Jetson TensorRT build scripts, model export, artifacts (`backends/jetson/` wraps this)	Jetson model conversion, TRT engine build, Orin-specific optimisations

Acknowledgements

sherpa-onnx — CPU ASR/TTS runtime
OpenVoiceStream — the deployable server product built on this engine

License

Apache-2.0. See LICENSE.

Release history

0.0.2a0 (pre-release, this version): Jun 23, 2026
0.0.1a0 (pre-release): Jun 3, 2026
