Importable text-to-speech package with Windows audio routing support.

VoiceConductor

A Python package for generating and routing synthesized voice lines. It supports output to speakers, virtual microphones, or both.

Entry point: TTSManager

Features

  • Provider fallback across ElevenLabs, Kokoro, Azure Speech, Windows Speech, and the built-in demo provider.
  • SQLite phrase caching so repeated lines do not need to be synthesized again.
  • Named playback routes for speakers and virtual mic devices.
  • Background playback tasks for non-blocking speech.
  • Playback lifecycle hooks for push-to-talk workflows.
  • JSON/JSONC configuration for provider credentials, default voices, route settings, and cache paths.

Requirements

  • Python 3.11 or newer.
  • Audio playback support via sounddevice.
  • Optional provider dependencies and credentials depending on the backend you want to use.

Installation

Install the package:

pip install VoiceConductor

Install with Kokoro support:

pip install "VoiceConductor[kokoro]"

For local development from a checkout:

pip install -e ".[kokoro]"

Quick Start

from voice_conductor import TTSManager

tts = TTSManager()
tts.speak("This is a test.", routes="speakers")

Route to a virtual mic:

from voice_conductor import TTSManager

tts = TTSManager()
tts.speak("Now, to the virtual microphone.", routes="mic")

Route to both speakers and mic:

tts.speak("Routed to both output devices.", routes=["speakers", "mic"])

Synthesize once, then route the resulting audio:

audio = tts.synthesize_voice("This audio sample is stored in audio.")
result = tts.route(audio, routes=["speakers", "mic"])
print(result.routes)

Configuration

By default, voice_conductor looks for one of these files in the current working directory:

  • voice_conductor.config.jsonc
  • voice_conductor.config.json

If neither file exists, defaults are used. To create a config file you can edit, save the current settings:

from voice_conductor import load_settings

settings = load_settings()
settings.save_settings("voice_conductor.config.jsonc")

Provider selection follows the voice_conductor.provider_chain setting. When a speak() or synthesize_voice() call does not specify a provider, the manager uses the first available provider in that chain.

Providers

Built-in providers:

| Provider | Use case | Availability |
| --- | --- | --- |
| elevenlabs | Hosted high-quality voices. | Requires API key. |
| kokoro | Local Kokoro synthesis. | Requires the kokoro extra and model access. |
| azure | Azure neural voices. | Requires Speech key and region. |
| windows | Installed Windows System.Speech voices. | Requires Windows speech support. |
| demo | Offline test voice. | No external service. |

List available providers:

from voice_conductor import TTSManager

tts = TTSManager()
print(tts.list_providers())

List voices for a provider:

for voice in tts.list_voices("windows"):
    print(voice.id, voice.name)

Audio Routes

Routes are named outputs. The default route names are:

  • speakers
  • mic

You can pass one route name or a list of route names:

tts.speak("Hello, world!", routes="speakers")
tts.speak("Hello, world!", routes=["speakers", "mic"])

The mic route resolves to an output device because virtual microphone tools such as VB-CABLE and VoiceMeeter expose playback endpoints that chat apps receive as microphone input. See docs/mic-setup.md for a virtual microphone setup guide.

Cache

Synthesized phrases are cached in SQLite. Cache entries are keyed by:

  • text
  • provider
  • normalized voice key
  • provider settings that affect audio output
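To make the key composition concrete, here is a minimal sketch that hashes the four components listed above and stores a phrase in an in-memory SQLite table. The table layout, the `cache_key` function, and the strip-and-lowercase voice normalization are all assumptions for illustration; the package's actual schema and normalization are not described in this README.

```python
import hashlib
import json
import sqlite3


def cache_key(text: str, provider: str, voice: str, settings: dict) -> str:
    """Hash the four key components into one stable hex digest."""
    payload = json.dumps(
        {
            "text": text,
            "provider": provider,
            "voice": voice.strip().lower(),  # stand-in for voice-key normalization
            "settings": settings,
        },
        sort_keys=True,  # stable ordering so equal inputs give equal keys
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE phrases (key TEXT PRIMARY KEY, audio BLOB)")

key = cache_key("Hello, world!", "demo", " Default ", {"speed": 1.0})
db.execute("INSERT OR REPLACE INTO phrases VALUES (?, ?)", (key, b"fake-audio"))

# Same phrase with an equivalent voice key: cache hit.
same = cache_key("Hello, world!", "demo", "default", {"speed": 1.0})
hit = db.execute("SELECT audio FROM phrases WHERE key = ?", (same,)).fetchone()

# A setting that affects audio output changes the key: cache miss.
other = cache_key("Hello, world!", "demo", "default", {"speed": 1.2})
miss = db.execute("SELECT audio FROM phrases WHERE key = ?", (other,)).fetchone()

print(hit is not None, miss is None)
```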

Useful cache methods:

tts.invalidate_synthesis_cache(text="Hello, world!")
tts.invalidate_synthesis_cache(provider="elevenlabs")
tts.clear_synthesis_cache()

Pass refresh_cache=True to regenerate a phrase and replace the cached entry:

tts.speak("New take.", refresh_cache=True)

Background Playback

Use background=True when the caller should continue immediately:

task = tts.speak("Now we're not blocking the main thread.", routes="mic", background=True)
result = task.result(timeout=10)
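The `task.result(timeout=10)` call resembles the `concurrent.futures.Future` interface. Assuming the returned task behaves like a future (an assumption, not something this README confirms), the pattern can be sketched with the standard library; `fake_playback` and `speak_in_background` are hypothetical stand-ins.

```python
from concurrent.futures import Future, ThreadPoolExecutor
import time

executor = ThreadPoolExecutor(max_workers=1)


def fake_playback(text: str) -> str:
    """Stand-in for synthesis plus playback; sleeps instead of playing audio."""
    time.sleep(0.05)
    return f"played: {text}"


def speak_in_background(text: str) -> Future:
    """Submit playback to a worker thread and return immediately."""
    return executor.submit(fake_playback, text)


task = speak_in_background("hello")
# The caller is free to do other work here while playback runs.
outcome = task.result(timeout=10)  # blocks until done, or raises TimeoutError
executor.shutdown()
print(outcome)  # played: hello
```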

Push-To-Talk Hooks

Playback hooks run at two points: once the audio and routes are ready, and again after playback completes. They are useful for pressing and releasing a push-to-talk key around virtual mic playback.

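from voice_conductor import PlaybackHooks, TTSManager

def press_push_to_talk() -> None:
    """Placeholder: hold down the push-to-talk key in your chat app."""

def release_push_to_talk() -> None:
    """Placeholder: release the push-to-talk key."""

tts = TTSManager()

# on_audio_ready runs once audio and routes are ready;
# on_playback_complete runs after playback finishes.
tts.speak(
    "Check out the playback hooks.",
    routes="mic",
    hooks=PlaybackHooks(
        on_audio_ready=lambda event: press_push_to_talk(),
        on_playback_complete=lambda event: release_push_to_talk(),
    ),
)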
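The ordering of the two hooks around playback can be illustrated without any audio hardware. The `Hooks` dataclass and `play_with_hooks` function below are hypothetical and only mirror the hook names used above; in particular, running the completion hook in a `finally` block is a design choice of this sketch, not a documented guarantee of the package.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hooks:
    """Hypothetical stand-in that mirrors the two hook names used above."""
    on_audio_ready: Optional[Callable[[str], None]] = None
    on_playback_complete: Optional[Callable[[str], None]] = None


def play_with_hooks(text: str, play: Callable[[str], None], hooks: Hooks) -> None:
    """Call the ready hook, play, then always call the completion hook."""
    if hooks.on_audio_ready:
        hooks.on_audio_ready(text)
    try:
        play(text)
    finally:
        # Runs even if play() raises, so the key is not left pressed.
        if hooks.on_playback_complete:
            hooks.on_playback_complete(text)


events: list[str] = []
play_with_hooks(
    "hi",
    play=lambda text: events.append("play"),
    hooks=Hooks(
        on_audio_ready=lambda event: events.append("press"),
        on_playback_complete=lambda event: events.append("release"),
    ),
)
print(events)  # ['press', 'play', 'release']
```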

Documentation

  • docs/general.md: architecture, settings, providers, and shared types.
  • docs/mic-setup.md: virtual microphone setup and troubleshooting.
  • examples/: small runnable examples.

Development

Install the project locally:

pip install -e ".[kokoro]"

Run tests:

pytest

Run a focused test file:

pytest tests/test_manager.py
