
Pocket TTS integration for Vision Agents - lightweight CPU-based text-to-speech


Pocket TTS Plugin

A lightweight text-to-speech (TTS) plugin for Vision Agents, powered by Kyutai's Pocket TTS model. It runs on CPU with low latency (about 200 ms to first audio) and supports voice cloning.

Features

  • Runs on CPU - no GPU required
  • Small model size (100M parameters)
  • Low latency (~200ms to first audio)
  • Voice cloning support
  • Built-in voice selection

Installation

uv add "vision-agents[pocket]"

Usage

from vision_agents.plugins import pocket

# Create TTS with default voice
tts = pocket.TTS()

# Or specify a built-in voice
tts = pocket.TTS(voice="marius")

# Or clone a voice from your own WAV file
tts = pocket.TTS(voice="path/to/your/voice.wav")
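Because `voice` is a plain string, it can be supplied from deployment configuration instead of being hard-coded. The sketch below is one way to do that, not part of the plugin: the environment variable name `POCKET_TTS_VOICE` and the helper `voice_from_env` are hypothetical, and the constructor call is left as a comment so the snippet runs without the plugin installed.

```python
import os

DEFAULT_VOICE = "alba"  # the documented default voice
VOICE_ENV_VAR = "POCKET_TTS_VOICE"  # hypothetical name, not defined by the plugin


def voice_from_env(environ=None):
    """Return the voice named in the environment, or the default if unset or blank."""
    env = os.environ if environ is None else environ
    value = env.get(VOICE_ENV_VAR, "").strip()
    return value or DEFAULT_VOICE


# With the plugin installed, the result goes straight to the constructor:
# from vision_agents.plugins import pocket
# tts = pocket.TTS(voice=voice_from_env())
```

The value can be either a built-in voice name or a path to a WAV file, matching the two forms shown above.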

Configuration

| Parameter | Description | Values |
| --- | --- | --- |
| voice | Built-in voice name, or path to a custom WAV file | "alba" (default), "marius", "javert", "jean", "fantine", "cosette", "eponine", "azelma", or a custom path |

Built-in Voices

  • alba - Default voice
  • marius
  • javert
  • jean
  • fantine
  • cosette
  • eponine
  • azelma
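A typo in a voice name is easy to confuse with a file path, so it can help to check the value before constructing the TTS object. The following is a minimal sketch under stated assumptions: `check_voice` is a hypothetical helper, and its rules (an exact built-in name, or an existing `.wav` file) are stricter than anything the plugin is documented to enforce.

```python
from pathlib import Path

# Built-in voice names as listed in this README.
BUILT_IN_VOICES = (
    "alba", "marius", "javert", "jean",
    "fantine", "cosette", "eponine", "azelma",
)


def check_voice(voice: str) -> str:
    """Return "built-in" or "custom"; raise ValueError for anything else.

    Assumption: a custom voice is an existing file with a .wav extension.
    """
    if voice in BUILT_IN_VOICES:
        return "built-in"
    path = Path(voice)
    if path.suffix.lower() == ".wav" and path.is_file():
        return "custom"
    raise ValueError(
        f"{voice!r} is neither a built-in voice ({', '.join(BUILT_IN_VOICES)}) "
        "nor an existing .wav file"
    )
```

Calling `check_voice("marius")` returns `"built-in"`, while a misspelling such as `"marias"` raises a `ValueError` that lists the valid names.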

Dependencies

  • pocket-tts>=0.1.0
  • PyTorch 2.5+
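To confirm that an environment satisfies these minimums, the installed versions can be read with the standard library's `importlib.metadata`. This is a rough sketch rather than a full version parser: the helper names are hypothetical, only the leading numeric release components are compared (so a pre-release such as `2.5.0rc1` is treated like `2.5.0`), and it assumes PyTorch is installed under its usual distribution name, `torch`.

```python
import re
from importlib import metadata

# Minimum versions from the list above; "torch" is PyTorch's distribution name.
REQUIREMENTS = {"pocket-tts": "0.1.0", "torch": "2.5"}


def version_tuple(version: str) -> tuple[int, ...]:
    """Return the leading numeric release components of a version string."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare release components, padding the shorter tuple with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(requirements=None) -> list[str]:
    """Return a human-readable problem for each unmet requirement."""
    problems = []
    for name, minimum in (requirements or REQUIREMENTS).items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed (need >= {minimum})")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems
```

An empty list from `check_requirements()` means both packages were found at or above the listed versions.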


