Text-to-speech package with Windows audio routing support.

VoiceConductor

A Python package for generating and routing synthesized voice lines. Audio can be played through speakers, virtual microphones, or both at once.

Entry point: TTSManager

Features

  • Provider fallback across ElevenLabs, Kokoro, Azure Speech, Windows Speech, and the built-in demo provider.
  • SQLite phrase caching so repeated lines do not need to be synthesized again.
  • Named playback routes for speakers and virtual mic devices.
  • Background playback tasks for non-blocking speech.
  • Playback lifecycle hooks for push-to-talk workflows.
  • JSON/JSONC configuration for provider credentials, default voices, route settings, and cache paths.

Requirements

  • Python 3.11 or newer.
  • Audio playback support via sounddevice.
  • Optional provider dependencies and credentials depending on the backend you want to use.

Installation

Install the package:

pip install VoiceConductor

Install with Kokoro support:

pip install "VoiceConductor[kokoro]"

For local development from a checkout:

pip install -e ".[kokoro]"

Quick Start

from voice_conductor import TTSManager

tts = TTSManager()
tts.speak("This is a test.", routes="speakers")

Route to a virtual mic:

from voice_conductor import TTSManager

tts = TTSManager()
tts.speak("Now, to the virtual microphone.", routes="mic")

Route to both speakers and mic:

tts.speak("Routed to both output devices.", routes=["speakers", "mic"])

Synthesize once, then route the resulting audio:

audio = tts.synthesize_voice("This audio sample is stored in audio.")
result = tts.route(audio, routes=["speakers", "mic"])
print(result.routes)

Configuration

By default, voice_conductor looks for one of these files in the current working directory:

  • voice_conductor.config.jsonc
  • voice_conductor.config.json

If neither file exists, defaults are used. To create a config file you can edit, save the current settings:

from voice_conductor import load_settings

settings = load_settings()
settings.save_settings("voice_conductor.config.jsonc")

Provider selection follows voice_conductor.provider_chain. When speak() or synthesize_voice() does not specify a provider, the manager uses the first available provider in that chain.

Providers

Built-in providers:

| Provider | Use case | Availability |
| --- | --- | --- |
| elevenlabs | Hosted high-quality voices. | Requires API key. |
| kokoro | Local Kokoro synthesis. | Requires the kokoro extra and model access. |
| azure | Azure neural voices. | Requires Speech key and region. |
| windows | Installed Windows System.Speech voices. | Requires Windows speech support. |
| demo | Offline test voice. | No external service. |

List available providers:

from voice_conductor import TTSManager

tts = TTSManager()
print(tts.list_providers())

List voices for a provider:

for voice in tts.list_voices("windows"):
    print(voice.id, voice.name)

Audio Routes

Routes are named outputs. The default route names are:

  • speakers
  • mic

You can pass one route name or a list of route names:

tts.speak("Hello, world!", routes="speakers")
tts.speak("Hello, world!", routes=["speakers", "mic"])

The mic route resolves to an output device because virtual microphone tools such as VB-CABLE and VoiceMeeter expose playback endpoints that chat apps receive as microphone input. See docs/mic-setup.md for a virtual microphone setup guide.

Cache

Synthesized phrases are cached in SQLite. Cache entries are keyed by:

  • text
  • provider
  • normalized voice key
  • provider settings that affect audio output

Useful cache methods:

tts.invalidate_synthesis_cache(text="Hello, world!")
tts.invalidate_synthesis_cache(provider="elevenlabs")
tts.clear_synthesis_cache()

Pass refresh_cache=True to regenerate a phrase and replace the cached entry:

tts.speak("New take.", refresh_cache=True)

Background Playback

Use background=True when the caller should continue immediately:

task = tts.speak("Now we're not blocking the main thread.", routes="mic", background=True)
result = task.result(timeout=10)

Push-To-Talk Hooks

Playback hooks run at two points: once when the audio and routes are ready, and again after playback completes. They are useful for pressing and releasing a push-to-talk key around virtual mic playback.

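from voice_conductor import PlaybackHooks, TTSManager


def press_push_to_talk():
    ...  # placeholder: hold down your push-to-talk key here


def release_push_to_talk():
    ...  # placeholder: release the key here


tts = TTSManager()

tts.speak(
    "Check out the playback hooks.",
    routes="mic",
    hooks=PlaybackHooks(
        on_audio_ready=lambda event: press_push_to_talk(),
        on_playback_complete=lambda event: release_push_to_talk(),
    ),
)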
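The ordering of the two hooks can be sketched without the package. Everything below is hypothetical: `Hooks`, `play_with_hooks`, and the event dict are stand-ins, and the `try`/`finally` is a choice of this sketch so that the key is released even if playback fails. Whether VoiceConductor fires the completion hook on errors is not stated here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hooks:
    """Stand-in for a pair of lifecycle callbacks; both are optional."""

    on_audio_ready: Optional[Callable[[dict], None]] = None
    on_playback_complete: Optional[Callable[[dict], None]] = None


def play_with_hooks(
    audio: bytes,
    routes: list[str],
    play: Callable[[bytes, list[str]], None],
    hooks: Hooks,
) -> None:
    """Call on_audio_ready, play the audio, then call on_playback_complete."""
    event = {"routes": routes, "num_bytes": len(audio)}
    if hooks.on_audio_ready:
        hooks.on_audio_ready(event)
    try:
        play(audio, routes)
    finally:
        # Sketch-level choice: always release, even when play() raises.
        if hooks.on_playback_complete:
            hooks.on_playback_complete(event)


# Record the call order with a fake player instead of real audio output.
calls: list[str] = []
play_with_hooks(
    b"\x00" * 4,
    ["mic"],
    play=lambda audio, routes: calls.append("play"),
    hooks=Hooks(
        on_audio_ready=lambda event: calls.append("press"),
        on_playback_complete=lambda event: calls.append("release"),
    ),
)
```

For push-to-talk, the important property is that the press happens before any audio reaches the virtual mic and the release happens only after the last sample has been played.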

Documentation

  • docs/general.md: architecture, settings, providers, and shared types.
  • docs/mic-setup.md: virtual microphone setup and troubleshooting.
  • examples/: small runnable examples.

Development

Install the project locally:

pip install -e ".[kokoro]"

Run tests:

pytest

Run a focused test file:

pytest tests/test_manager.py
