Importable text-to-speech package with Windows audio routing support.

VoiceConductor

A Python package for generating and routing synthesized voice lines. It can send audio to speakers, to virtual microphones, or to both at once.

Entry point: TTSManager

Features

  • Provider fallback across ElevenLabs, Kokoro, Azure Speech, Windows Speech, and the built-in demo provider.
  • SQLite phrase caching so repeated lines do not need to be synthesized again.
  • Named playback routes for speakers and virtual mic devices.
  • Background playback tasks for non-blocking speech.
  • Playback lifecycle hooks for push-to-talk workflows.
  • JSON/JSONC configuration for provider credentials, default voices, route settings, and cache paths.

Requirements

  • Python 3.11 or newer.
  • Audio playback support via sounddevice.
  • Optional provider dependencies and credentials depending on the backend you want to use.

Installation

Install the package:

pip install VoiceConductor

Install with Kokoro support:

pip install "VoiceConductor[kokoro]"

For local development from a checkout:

pip install -e ".[kokoro]"

Quick Start

from voice_conductor import TTSManager

tts = TTSManager()
tts.speak("This is a test.", routes="speakers")

Route to a virtual mic:

from voice_conductor import TTSManager

tts = TTSManager()
tts.speak("Now, to the virtual microphone.", routes="mic")

Route to both speakers and mic:

tts.speak("Routed to both output devices.", routes=["speakers", "mic"])

Synthesize once, then route the resulting audio:

audio = tts.synthesize_voice("This audio sample is stored in audio.")
result = tts.route(audio, routes=["speakers", "mic"])
print(result.routes)

Configuration

By default, voice_conductor looks for one of these files in the current working directory:

  • voice_conductor.config.jsonc
  • voice_conductor.config.json

If neither file exists, defaults are used. To create a config file you can edit, save the current settings:

from voice_conductor import load_settings

settings = load_settings()
settings.save_settings("voice_conductor.config.jsonc")
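The file lookup described above can be sketched with the standard library. This is an illustration only, not the package's implementation: the helper name `find_config_file` is hypothetical, and the assumption that the `.jsonc` file takes precedence when both exist is inferred from the order in which the two names are listed.

```python
from pathlib import Path

# Candidate file names, in the order the README lists them.
CONFIG_NAMES = ("voice_conductor.config.jsonc", "voice_conductor.config.json")


def find_config_file(directory=None):
    """Return the first existing config file in `directory`, or None.

    Hypothetical helper: the precedence (.jsonc before .json) is an assumption.
    """
    base = Path(directory) if directory is not None else Path.cwd()
    for name in CONFIG_NAMES:
        candidate = base / name
        if candidate.is_file():
            return candidate
    return None  # the caller would fall back to built-in defaults
```

A `None` result corresponds to the "defaults are used" case.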

Provider selection follows voice_conductor.provider_chain. When speak() or synthesize_voice() does not specify a provider, the manager uses the first available provider in that chain.
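As a rough sketch of the "first available provider in the chain" rule, the selection can be written as a short loop. The function `pick_provider`, the `is_available` callback, and the chain order shown here are all assumptions made for illustration. The real chain comes from your settings, and the package's own availability checks are not shown.

```python
def pick_provider(chain, is_available, requested=None):
    """Return `requested` if given, else the first available provider in `chain`."""
    if requested is not None:
        return requested
    for name in chain:
        if is_available(name):
            return name
    raise LookupError("no provider in the chain is available")


# Example: pretend only the local providers are usable on this machine.
chain = ["elevenlabs", "kokoro", "azure", "windows", "demo"]  # assumed order
usable = {"windows", "demo"}
chosen = pick_provider(chain, is_available=lambda name: name in usable)
print(chosen)  # windows
```

Putting the demo provider last in the chain means the lookup always resolves to something, since it needs no external service.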

Providers

Built-in providers:

Provider    Use case                                   Availability
elevenlabs  Hosted high-quality voices.                Requires API key.
kokoro      Local Kokoro synthesis.                    Requires the kokoro extra and model access.
azure       Azure neural voices.                       Requires Speech key and region.
windows     Installed Windows System.Speech voices.    Requires Windows speech support.
demo        Offline test voice.                        No external service.

List available providers:

from voice_conductor import TTSManager

tts = TTSManager()
print(tts.list_providers())

List voices for a provider:

for voice in tts.list_voices("windows"):
    print(voice.id, voice.name)

Audio Routes

Routes are named outputs. The default route names are:

  • speakers
  • mic

You can pass one route name or a list of route names:

tts.speak("Hello, world!", routes="speakers")
tts.speak("Hello, world!", routes=["speakers", "mic"])

The mic route resolves to an output (playback) device, not an input device. Virtual microphone tools such as VB-CABLE and VoiceMeeter expose a playback endpoint, and audio played into that endpoint reaches chat apps as microphone input. See docs/mic-setup.md for a virtual microphone setup guide.
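To make the "mic is an output device" idea concrete, here is a minimal sketch that picks an output-capable device by name. It is not how VoiceConductor resolves routes. The helper `find_output_device` is hypothetical and the device list is made up. The dict keys match the entries returned by `sounddevice.query_devices()`.

```python
def find_output_device(devices, name_fragment):
    """Return the index of the first output-capable device whose name matches.

    `devices` is a sequence of dicts shaped like the entries returned by
    sounddevice.query_devices(): each has "name" and "max_output_channels".
    """
    needle = name_fragment.lower()
    for index, device in enumerate(devices):
        if device["max_output_channels"] > 0 and needle in device["name"].lower():
            return index
    return None


# Made-up device list for illustration; real names vary by driver and system.
sample_devices = [
    {"name": "Microphone (USB Audio)", "max_input_channels": 2, "max_output_channels": 0},
    {"name": "Speakers (Realtek Audio)", "max_input_channels": 0, "max_output_channels": 2},
    {"name": "CABLE Input (VB-Audio Virtual Cable)", "max_input_channels": 0, "max_output_channels": 2},
    {"name": "CABLE Output (VB-Audio Virtual Cable)", "max_input_channels": 2, "max_output_channels": 0},
]
print(find_output_device(sample_devices, "cable input"))  # 2
```

With VB-CABLE, "CABLE Input" is the playback side you send audio to, and "CABLE Output" is the recording side that a chat app selects as its microphone. On a real machine you could pass `sounddevice.query_devices()` in place of `sample_devices`.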

Cache

Synthesized phrases are cached in SQLite. Cache entries are keyed by:

  • text
  • provider
  • normalized voice key
  • provider settings that affect audio output
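One way to picture a key built from these four inputs is a hash over a canonical JSON payload. This is a sketch under stated assumptions, not the package's actual scheme: the function name `cache_key`, the use of SHA-256, and the lowercase/strip voice normalization are all hypothetical.

```python
import hashlib
import json


def cache_key(text, provider, voice_key, settings):
    """Build a deterministic key from the four inputs listed above."""
    payload = json.dumps(
        {
            "text": text,
            "provider": provider,
            "voice": voice_key.strip().lower(),  # assumed normalization
            "settings": settings,
        },
        sort_keys=True,  # dict ordering must not change the key
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because every input feeds the hash, changing the provider, the voice, or an output-affecting setting produces a different key, so the same text is synthesized again rather than served from a stale entry.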

Useful cache methods:

tts.invalidate_synthesis_cache(text="Hello, world!")
tts.invalidate_synthesis_cache(provider="elevenlabs")
tts.clear_synthesis_cache()

Pass refresh_cache=True to regenerate a phrase and replace the cached entry:

tts.speak("New take.", refresh_cache=True)

Background Playback

Use background=True when the caller should continue immediately:

task = tts.speak("Now we're not blocking the main thread.", routes="mic", background=True)
result = task.result(timeout=10)
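The `task.result(timeout=10)` call above resembles the `concurrent.futures.Future` interface. Whether VoiceConductor returns an actual `Future` is not stated here, so the following is only a standard-library analogy of the same pattern, with `fake_playback` as a made-up stand-in for real synthesis and playback.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_playback(text):
    """Stand-in for synthesis plus playback; sleeps instead of playing audio."""
    time.sleep(0.05)
    return f"played: {text}"


with ThreadPoolExecutor(max_workers=1) as pool:
    task = pool.submit(fake_playback, "hello")  # returns immediately
    # The caller is free to do other work here while playback runs.
    result = task.result(timeout=10)  # blocks until done or the timeout expires

print(result)  # played: hello
```

With a `Future`, `result()` raises `TimeoutError` if the work has not finished within the timeout, so a caller can decide whether to keep waiting or move on.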

Push-To-Talk Hooks

Playback hooks run at two points: once the audio and routes are ready, and again after playback completes. This makes them useful for pressing and releasing a push-to-talk key around virtual mic playback.

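from voice_conductor import PlaybackHooks, TTSManager


def press_push_to_talk():
    # Placeholder: replace with the key press your chat app expects.
    print("push-to-talk pressed")


def release_push_to_talk():
    # Placeholder: replace with the matching key release.
    print("push-to-talk released")


tts = TTSManager()

tts.speak(
    "Check out the playback hooks.",
    routes="mic",
    hooks=PlaybackHooks(
        on_audio_ready=lambda event: press_push_to_talk(),
        on_playback_complete=lambda event: release_push_to_talk(),
    ),
)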

Documentation

  • docs/general.md: architecture, settings, providers, and shared types.
  • docs/mic-setup.md: virtual microphone setup and troubleshooting.
  • examples/: small runnable examples.

Development

Install the project locally:

pip install -e ".[kokoro]"

Run tests:

pytest

Run a focused test file:

pytest tests/test_manager.py
