Real-time voice service for persona-core — WebRTC transport via LiveKit OSS Server.

persona-voice

Real-time voice service for Open Persona — WebRTC transport via LiveKit OSS. Source-available; noncommercial use only.

Status: PolyForm Noncommercial 1.0.0 · Source Available (Noncommercial Use Only) · V1–V3 shipped; V4–V6 pending

What it is

persona-voice is the real-time voice trunk: a LiveKit OSS substrate, a WebRTC transport facade, a session lifecycle state machine, a streaming-loop skeleton with V2 / V3 / V4 / V5 Protocol seams, per-user advisory-lock concurrency, and a structured VoiceLog. It runs in-process with persona-core (no separate language, no cross-process IPC) so the typed-memory stores, audit log, and credits service compose directly.
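As an illustration of the per-user advisory-lock idea, the sketch below derives a stable signed 64-bit key from a user id and pairs it with PostgreSQL's `pg_try_advisory_xact_lock(bigint)`. The hashing scheme, the `voice-call` namespace prefix, and both helper names are assumptions made for this example; they are not taken from the package.

```python
import hashlib
import struct

# Tries to take a transaction-scoped advisory lock without blocking.
# PostgreSQL returns true when the lock was acquired and releases it
# automatically when the surrounding transaction ends.
TRY_LOCK_SQL = "SELECT pg_try_advisory_xact_lock(%s)"


def advisory_lock_key(user_id: str, namespace: str = "voice-call") -> int:
    """Map a user id to a signed 64-bit integer (hypothetical scheme)."""
    digest = hashlib.sha256(f"{namespace}:{user_id}".encode("utf-8")).digest()
    # ">q" reads the first 8 bytes as a big-endian signed 64-bit integer,
    # which fits the bigint argument of the lock function.
    (key,) = struct.unpack(">q", digest[:8])
    return key


def try_acquire_voice_slot(cursor, user_id: str) -> bool:
    """Return True if this transaction now holds the user's voice-call lock.

    `cursor` is any DB-API cursor already inside a transaction (for example
    one from psycopg); this helper does not open or commit anything.
    """
    cursor.execute(TRY_LOCK_SQL, (advisory_lock_key(user_id),))
    return bool(cursor.fetchone()[0])
```

Because the lock is scoped to the transaction, a second call for the same user returns false until the first transaction commits or rolls back, which is one way to cap a user at a single concurrent voice call.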

Shipped sub-trunks:

V1 — WebRTC transport (persona-voice 0.1.0) — LiveKit OSS substrate via livekit>=1.1, POST /v1/voice/token JWT-authed AccessToken endpoint, VoiceRoom facade with inbound resample to canonical PCM16 mono 16 kHz + outbound 24 kHz publish, Session state machine, per-user voice-call concurrency via pg_try_advisory_xact_lock, full-duplex binary criterion proven on live LiveKit Server.
V2 — Streaming STT (persona-voice 0.V2.0) — provider-independent StreamingSTT Protocol mirroring Spec 02 ChatBackend, Deepgram Nova-3 concrete backend, Silero VAD ONNX-only adapter with mandatory SileroFramer reframer, V1 STTStream seam adapter, VoiceLog extended with 4 additive STT fields (stt_partial_first_at, stt_audio_pushed_at, stt_provider_cost_cents_per_minute, stt_total_cents), content-hash-only audit, PERSONA_STT_* env block.
V3 — Streaming TTS — provider-independent StreamingTTS Protocol, Cartesia concrete backend (cartesia[websockets]>=3,<4), voice resolution from persona schema, V1 outbound-rail seam adapter, mid-utterance cancel() with discard-on-cancel for the future V4 barge-in foundation, in-process integration spine through STT → mocked-V5 → TTS → outbound.
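The V2 entry mentions a reframer in front of the Silero VAD adapter. A reframer is generally needed when a model consumes fixed-size windows but transport frames arrive in arbitrary sizes. The sketch below shows that general technique only: the class name `FixedWindowReframer` is hypothetical, and the 512-sample default (the window size assumed here for Silero VAD at 16 kHz) is an assumption rather than a value taken from the package's SileroFramer.

```python
class FixedWindowReframer:
    """Re-chunk arbitrary-length PCM16 mono audio into fixed-size windows."""

    def __init__(self, window_samples: int = 512) -> None:
        if window_samples <= 0:
            raise ValueError("window_samples must be positive")
        # PCM16 uses 2 bytes per sample.
        self._window_bytes = window_samples * 2
        self._buffer = bytearray()

    def push(self, pcm16: bytes) -> list[bytes]:
        """Add incoming audio and return every complete window now available."""
        self._buffer.extend(pcm16)
        windows = []
        while len(self._buffer) >= self._window_bytes:
            windows.append(bytes(self._buffer[: self._window_bytes]))
            del self._buffer[: self._window_bytes]
        return windows

    @property
    def pending_bytes(self) -> int:
        """Bytes held back until enough audio arrives to fill a window."""
        return len(self._buffer)
```

With these defaults, a 20 ms frame at 16 kHz (320 samples, 640 bytes) is too short to fill a window, so the first `push` returns nothing and the second returns one 1024-byte window while 256 bytes stay buffered.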

Not yet shipped (sub-trunks in research / planning):

V4 — Turn-taking + barge-in — interrupt handling, end-of-utterance detection, lifecycle hooks on SessionEventListener.
V5 — Model reply producer — streams runtime token output into V3 with the canonical first-token-latency measurement convention.
V6 — Frontend voice experience — the browser-side audio plumbing and UI in persona-web.

Install

From PyPI (planned, once V4–V6 close):

pip install persona-voice

Workspace development:

git clone https://github.com/yasinhessnawi1/Open-Persona.git
cd Open-Persona
uv sync --all-packages

Prerequisites for V2 / V3 wire behaviour: a Deepgram API key (PERSONA_STT_API_KEY) and a Cartesia API key (PERSONA_TTS_API_KEY). The in-process integration spines run without real provider connectivity.
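A quick way to see which provider keys are missing before running anything that talks to Deepgram or Cartesia is sketched below. The two variable names come from the paragraph above; the helper name `missing_provider_keys` and the treatment of blank values as missing are assumptions for this example.

```python
import os
from collections.abc import Mapping

# Environment variables named in the prerequisites, with the provider each
# one unlocks.
REQUIRED_PROVIDER_KEYS = {
    "PERSONA_STT_API_KEY": "Deepgram (V2 streaming STT)",
    "PERSONA_TTS_API_KEY": "Cartesia (V3 streaming TTS)",
}


def missing_provider_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of provider key variables that are unset or blank."""
    return [
        name
        for name in REQUIRED_PROVIDER_KEYS
        if not env.get(name, "").strip()
    ]
```

Calling `missing_provider_keys()` with no arguments inspects the current process environment; passing a plain dict makes the check easy to exercise in a unit test.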

Run

persona-voice is a service consumed by persona-api; there is no standalone CLI. The token-issuance HTTP app boots from persona_voice.http.app:

uv run uvicorn persona_voice.http.app:create_app --factory --port 8001

You also need a running LiveKit OSS Server (see docker-compose.yml) and persona-api. In production, POST /v1/voice/token on persona-api is the entry point; the persona-voice route exists for development.
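For local development, a request against the development route might be built as in the sketch below. Only the path POST /v1/voice/token and port 8001 come from the text above. The bearer-token header scheme, the empty JSON body, and the helper name `build_token_request` are assumptions, and the response shape is not documented here, so the sketch does not try to parse it.

```python
import json
import urllib.request


def build_token_request(base_url: str, jwt: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the voice token endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/voice/token",
        data=json.dumps({}).encode("utf-8"),  # body shape is an assumption
        headers={
            "Authorization": f"Bearer {jwt}",  # assumed header scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result of `build_token_request("http://localhost:8001", jwt)` to `urllib.request.urlopen` sends it to the uvicorn process started above.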

Test

uv run pytest packages/voice                            # unit (default)
uv run pytest packages/voice -m integration             # live LiveKit + Postgres
uv run pytest packages/voice -m external                # live Deepgram / Cartesia
uv run mypy packages/voice/src
uv run ruff check packages/voice

The integration tests bring up a real LiveKit Server and prove full-duplex (V1), STT pipe (V2), and end-to-end TTS through the V1 outbound rail (V3). External smoke tests are skipped unless the provider key env vars are set.

Architecture role

persona-voice sits beside persona-runtime as a sibling consumer of persona-core. It does not depend on persona-runtime and is not imported by it — voice routes through the API which composes both. The voice trunk owns: the LiveKit substrate, audio frame plumbing, the streaming STT and TTS Protocols + concrete backends, the session lifecycle, voice-call concurrency, and the additive VoiceLog. Per-minute billing, V4 turn-taking, V5 model wiring, and the V6 frontend land post-V3.
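The "Protocols + concrete backends" split can be pictured with `typing.Protocol`. The sketch below is a hypothetical shape: `StreamingTTSLike`, its `synthesize` and `cancel` methods, the `FakeTTS` backend, and the `speak` helper are all invented for illustration and do not reproduce the package's StreamingTTS Protocol. It also shows one possible reading of discard-on-cancel: once `cancel()` is called, chunks not yet delivered are dropped instead of yielded.

```python
import asyncio
from collections.abc import AsyncIterator
from typing import Protocol


class StreamingTTSLike(Protocol):
    """Hypothetical provider-independent TTS seam."""

    def synthesize(self, text: str) -> AsyncIterator[bytes]: ...

    def cancel(self) -> None: ...


class FakeTTS:
    """Stand-in backend: emits one fake audio chunk per word."""

    def __init__(self) -> None:
        self._cancelled = False

    async def synthesize(self, text: str) -> AsyncIterator[bytes]:
        for word in text.split():
            await asyncio.sleep(0)  # yield control, as a network read would
            if self._cancelled:
                return  # discard everything not yet delivered
            yield word.encode("utf-8")

    def cancel(self) -> None:
        self._cancelled = True


async def speak(tts: StreamingTTSLike, text: str, stop_after: int) -> list[bytes]:
    """Collect chunks, cancelling mid-utterance after `stop_after` chunks."""
    delivered: list[bytes] = []
    async for chunk in tts.synthesize(text):
        delivered.append(chunk)
        if len(delivered) == stop_after:
            tts.cancel()
    return delivered
```

Because callers depend only on the Protocol, a real provider backend and a fake like this one are interchangeable, which is what lets an in-process integration spine run without provider connectivity.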

Contribute

Contributions welcome under the same PolyForm Noncommercial 1.0.0 license. The package is source-available for noncommercial use; commercial use requires a separate license — contact the rights holder. Issues and pull requests welcome at github.com/yasinhessnawi1/Open-Persona. See CHANGELOG.md for the spec-by-spec history.

Project details

Release history

This version

0.1.0

Jun 13, 2026

Download files

Source Distribution

persona_voice-0.1.0.tar.gz (145.2 kB view details)

Uploaded Jun 13, 2026 Source

Built Distribution

persona_voice-0.1.0-py3-none-any.whl (97.6 kB view details)

Uploaded Jun 13, 2026 Python 3

File details

Details for the file persona_voice-0.1.0.tar.gz.

File metadata

Download URL: persona_voice-0.1.0.tar.gz
Upload date: Jun 13, 2026
Size: 145.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.9

File hashes

Hashes for persona_voice-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`7cb5213f481a96cecb859c8e5b3eba28f631d6f779473616a618623a38960726`
MD5	`9a4aa78acfd270071db567827bc8c19b`
BLAKE2b-256	`e3d5cad7e86a815af58bb8d642730a661809fc1b14f480c2e82c829482fcf128`

File details

Details for the file persona_voice-0.1.0-py3-none-any.whl.

File metadata

Download URL: persona_voice-0.1.0-py3-none-any.whl
Upload date: Jun 13, 2026
Size: 97.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.9

File hashes

Hashes for persona_voice-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8f059a955d74ac0b43bb9a6f2465c0bffec9d3ef67de49e92c2d850a191ae741`
MD5	`8fe528ad7a7b1eb96329ede54cfede2d`
BLAKE2b-256	`8a8bbef392a6494d25599e4b22e0293b090f733e9aa5208c98325bbebec5d3ac`
