Importable text-to-speech package with Windows audio routing support.
VoiceConductor
A Python package for generating and routing synthesized voice lines. Supports output to speakers or virtual microphones.
Entry point: TTSManager
Features
- Provider fallback across ElevenLabs, Kokoro, Azure Speech, Windows Speech, and the built-in demo provider.
- SQLite phrase caching so repeated lines do not need to be synthesized again.
- Named playback routes for speakers and virtual mic devices.
- Background playback tasks for non-blocking speech.
- Playback lifecycle hooks for push-to-talk workflows.
- JSON/JSONC configuration for provider credentials, default voices, route settings, and cache paths.
Requirements
- Python 3.11 or newer.
- Audio playback support via sounddevice.
- Optional provider dependencies and credentials, depending on the backend you want to use.
Installation
Install the package:
pip install VoiceConductor
Install with Kokoro support:
pip install "VoiceConductor[kokoro]"
For local development from a checkout:
pip install -e ".[kokoro]"
Quick Start
from voice_conductor import TTSManager
tts = TTSManager()
tts.speak("This is a test.", routes="speakers")
Route to a virtual mic:
from voice_conductor import TTSManager
tts = TTSManager()
tts.speak("Now, to the virtual microphone.", routes="mic")
Route to both speakers and mic:
tts.speak("Routed to both output devices.", routes=["speakers", "mic"])
Synthesize once, then route the resulting audio:
audio = tts.synthesize_voice("This audio sample is stored in audio.")
result = tts.route(audio, routes=["speakers", "mic"])
print(result.routes)
Configuration
By default, voice_conductor looks for one of these files in the current working directory:
- voice_conductor.config.jsonc
- voice_conductor.config.json
If neither file exists, defaults are used. To create a config file you can edit, save the current settings:
from voice_conductor import load_settings
settings = load_settings()
settings.save_settings("voice_conductor.config.jsonc")
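The lookup described above can be pictured with a small standalone sketch. It does not use the package; `find_config` and `CONFIG_NAMES` are hypothetical names, and trying the `.jsonc` file first is an assumption based only on the order in which the two files are listed.

```python
from pathlib import Path
from typing import Optional

# The two file names from the list above. Trying .jsonc first is an assumption.
CONFIG_NAMES = ("voice_conductor.config.jsonc", "voice_conductor.config.json")


def find_config(directory: Path) -> Optional[Path]:
    """Return the first config file found in `directory`, or None."""
    for name in CONFIG_NAMES:
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None  # caller falls back to built-in defaults


print(find_config(Path.cwd()))
```

A `None` result corresponds to the "defaults are used" case.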
Provider selection follows voice_conductor.provider_chain. When speak() or synthesize_voice() does not specify a provider, the manager uses the first available provider in that chain.
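As a rough illustration of "first available provider in the chain", here is a standalone sketch. It is not the package's implementation; `pick_provider` and its parameters are hypothetical, and raising `LookupError` when nothing matches is an assumption.

```python
from typing import Iterable, Optional


def pick_provider(chain: Iterable[str], available: set[str],
                  requested: Optional[str] = None) -> str:
    """Return the requested provider, or the first available one in the chain."""
    if requested is not None:
        # An explicit choice bypasses the chain entirely.
        if requested not in available:
            raise LookupError(f"provider {requested!r} is not available")
        return requested
    for name in chain:
        if name in available:
            return name
    raise LookupError("no provider in the chain is available")


chain = ["elevenlabs", "kokoro", "azure", "windows", "demo"]
available = {"windows", "demo"}  # e.g. no API keys and no kokoro extra installed
print(pick_provider(chain, available))  # prints "windows"
```

Keeping the demo provider last in the chain means a call can still produce audio when no other backend is configured.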
Providers
Built-in providers:
| Provider | Use case | Availability |
|---|---|---|
| elevenlabs | Hosted high-quality voices. | Requires API key. |
| kokoro | Local Kokoro synthesis. | Requires the kokoro extra and model access. |
| azure | Azure neural voices. | Requires Speech key and region. |
| windows | Installed Windows System.Speech voices. | Requires Windows speech support. |
| demo | Offline test voice. | No external service. |
List available providers:
from voice_conductor import TTSManager
tts = TTSManager()
print(tts.list_providers())
List voices for a provider:
for voice in tts.list_voices("windows"):
    print(voice.id, voice.name)
Audio Routes
Routes are named outputs. The default route names are:
- speakers
- mic
You can pass one route name or a list of route names:
tts.speak("Hello, world!", routes="speakers")
tts.speak("Hello, world!", routes=["speakers", "mic"])
The mic route resolves to an output device because virtual microphone tools such as VB-CABLE and VoiceMeeter expose playback endpoints that chat apps receive as microphone input. See docs/mic-setup.md for a virtual microphone setup guide.
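To make the "mic route is really an output device" point concrete, the sketch below picks a playback endpoint by name from a device list. `find_output_device` is a hypothetical helper, not part of the package, and the device names are illustrative. The dictionaries mirror the `name` and `max_output_channels` fields that `sounddevice.query_devices()` reports, so a real device list could be passed instead.

```python
from typing import Optional


def find_output_device(devices: list[dict], name_fragment: str) -> Optional[int]:
    """Return the index of the first playback-capable device matching the name."""
    needle = name_fragment.lower()
    for index, device in enumerate(devices):
        # Only devices with output channels can be played to.
        if device["max_output_channels"] > 0 and needle in device["name"].lower():
            return index
    return None


# Illustrative device list; real names depend on the installed drivers.
devices = [
    {"name": "Microphone (USB Audio)", "max_input_channels": 1, "max_output_channels": 0},
    {"name": "Speakers (Realtek Audio)", "max_input_channels": 0, "max_output_channels": 2},
    {"name": "CABLE Input (VB-Audio Virtual Cable)", "max_input_channels": 0, "max_output_channels": 2},
    {"name": "CABLE Output (VB-Audio Virtual Cable)", "max_input_channels": 2, "max_output_channels": 0},
]

print(find_output_device(devices, "cable"))  # prints 2
```

Audio written to the "CABLE Input" playback endpoint shows up on the "CABLE Output" recording endpoint, which is the device a chat app would select as its microphone.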
Cache
Synthesized phrases are cached in SQLite. Cache entries are keyed by:
- text
- provider
- normalized voice key
- provider settings that affect audio output
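One way to picture a cache keyed on these four inputs is the standalone sketch below, which uses only the standard library. The table layout, the `cache_key` and `get_or_synthesize` helpers, and the voice normalization (trim and lowercase) are all assumptions made for illustration, not the package's actual schema.

```python
import hashlib
import json
import sqlite3


def cache_key(text: str, provider: str, voice_key: str, settings: dict) -> str:
    """Hash the four inputs that affect the audio into one stable key."""
    payload = json.dumps(
        {
            "text": text,
            "provider": provider,
            # Assumed normalization: trim whitespace and lowercase.
            "voice": voice_key.strip().lower(),
            "settings": settings,
        },
        sort_keys=True,  # dict ordering must not change the hash
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE phrases (key TEXT PRIMARY KEY, audio BLOB)")


def get_or_synthesize(text, provider, voice_key, settings, synthesize):
    """Return cached audio if present; otherwise synthesize and store it."""
    key = cache_key(text, provider, voice_key, settings)
    row = db.execute("SELECT audio FROM phrases WHERE key = ?", (key,)).fetchone()
    if row is not None:
        return row[0]
    audio = synthesize(text)
    db.execute("INSERT INTO phrases (key, audio) VALUES (?, ?)", (key, audio))
    return audio


calls = []


def fake_synthesize(text: str) -> bytes:
    """Stand-in for a real provider; records each call."""
    calls.append(text)
    return b"fake-audio"


get_or_synthesize("Hello, world!", "demo", "Default", {"speed": 1.0}, fake_synthesize)
get_or_synthesize("Hello, world!", "demo", " default ", {"speed": 1.0}, fake_synthesize)  # cache hit
get_or_synthesize("Hello, world!", "demo", "default", {"speed": 1.2}, fake_synthesize)  # new key
print(len(calls))  # prints 2
```

Changing any of the four inputs yields a different key, which is why a changed provider setting triggers fresh synthesis instead of replaying stale audio.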
Useful cache methods:
tts.invalidate_synthesis_cache(text="Hello, world!")
tts.invalidate_synthesis_cache(provider="elevenlabs")
tts.clear_synthesis_cache()
Pass refresh_cache=True to regenerate a phrase and replace the cached entry:
tts.speak("New take.", refresh_cache=True)
Background Playback
Use background=True when the caller should continue immediately:
task = tts.speak("Now we're not blocking the main thread.", routes="mic", background=True)
result = task.result(timeout=10)
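The `task.result(timeout=10)` call resembles the standard `concurrent.futures.Future` interface. The sketch below shows that general pattern with a fake playback function. It is an analogy only: whether the package returns a `Future` is an assumption, and `fake_play` and this `speak` are hypothetical stand-ins.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor

# A single worker keeps background lines from overlapping each other.
_executor = ThreadPoolExecutor(max_workers=1)


def fake_play(text: str) -> str:
    """Stand-in for synthesis plus playback."""
    time.sleep(0.05)
    return f"played: {text}"


def speak(text: str, background: bool = False):
    """Play inline, or hand the work to the executor and return a Future."""
    if background:
        return _executor.submit(fake_play, text)
    return fake_play(text)


task = speak("hello", background=True)
# The caller is free to do other work here while playback runs.
result = task.result(timeout=10)  # blocks for at most 10 seconds
print(result)  # prints "played: hello"
```

Passing a timeout to `result()` keeps a stalled audio device from hanging the caller indefinitely.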
Push-To-Talk Hooks
Playback hooks fire at two points: once the audio and routes are ready, and again after playback completes. This makes them useful for pressing and releasing a push-to-talk key around virtual mic playback.
from voice_conductor import PlaybackHooks, TTSManager
tts = TTSManager()
# press_push_to_talk() and release_push_to_talk() are functions you supply.
tts.speak(
    "Check out the playback hooks.",
    routes="mic",
    hooks=PlaybackHooks(
        on_audio_ready=lambda event: press_push_to_talk(),
        on_playback_complete=lambda event: release_push_to_talk(),
    ),
)
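The ordering that the hooks provide can be illustrated without the package. In this sketch, `Hooks` and `play_with_hooks` are hypothetical stand-ins for the real types, and the push-to-talk functions only record events. Releasing the key in a `finally` block is a design choice of the sketch, not confirmed behavior of the library.

```python
from dataclasses import dataclass
from typing import Callable, Optional

events: list[str] = []


def press_push_to_talk() -> None:
    events.append("press")  # a real version would send a key-down event


def release_push_to_talk() -> None:
    events.append("release")  # a real version would send a key-up event


@dataclass
class Hooks:
    """Hypothetical stand-in for the two lifecycle callbacks."""
    on_audio_ready: Optional[Callable[[object], None]] = None
    on_playback_complete: Optional[Callable[[object], None]] = None


def play_with_hooks(audio: bytes, hooks: Hooks) -> None:
    """Call the ready hook, play, then always call the completion hook."""
    if hooks.on_audio_ready:
        hooks.on_audio_ready(audio)
    try:
        events.append("play")  # stand-in for writing audio to the device
    finally:
        # Release even if playback fails, so the key is never left held down.
        if hooks.on_playback_complete:
            hooks.on_playback_complete(audio)


play_with_hooks(
    b"fake-audio",
    Hooks(
        on_audio_ready=lambda event: press_push_to_talk(),
        on_playback_complete=lambda event: release_push_to_talk(),
    ),
)
print(events)  # prints ['press', 'play', 'release']
```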
Documentation
- docs/general.md: architecture, settings, providers, and shared types.
- docs/mic-setup.md: virtual microphone setup and troubleshooting.
- examples/: small runnable examples.
Development
Install the project locally:
pip install -e ".[kokoro]"
Run tests:
pytest
Run a focused test file:
pytest tests/test_manager.py