Smart TTS library with ElevenLabs and OpenRouter text enhancement
elevenlabs-smart-tts
High-level Python library for expressive text-to-speech with ElevenLabs and LLM-powered text enhancement via OpenRouter.
Pass raw text plus task context (language, style, emotion, use case) — the library picks a voice, enriches the text with Eleven v3 audio tags, and returns synthesized audio.
Features
- SmartTTS facade: one pipeline from text to audio
- Voice caching: local diskcache catalog with offline list_voices() / get_voice()
- Automatic voice selection: by voice_id, description, use case, style, and language
- LLM text enhancement: audio tags, punctuation, and normalization via OpenRouter
- Eleven v3 first: expressive tags like [whispers], [excited], [short pause]
- Typed errors & retries: resilient HTTP clients for ElevenLabs and OpenRouter
Installation
pip install elevenlabs-smart-tts
Or from source:
git clone https://github.com/vpuhoff/elevenlabs-smart-tts.git
cd elevenlabs-smart-tts
uv sync --dev
Quick start
- Copy .env.example to .env and fill in your API keys:
cp .env.example .env
- Run synthesis:
from pathlib import Path

from elevenlabs_smart_tts import SmartTTS, SynthesisTask

tts = SmartTTS.from_env()
tts.sync_voices()

result = tts.synthesize_to_file(
    SynthesisTask(
        text="Welcome to our customer support service.",
        language="en",
        style="professional",
        emotion="warm",
        use_case="conversational",
        voice_description="warm professional conversational",
    ),
    Path("output.mp3"),
)
print(result.enhanced_text)
See example.py for a full runnable example.
Task parameters
SynthesisTask accepts free-text hints that guide voice selection and LLM text enhancement. After sync_voices(), inspect your cached catalog:
for voice in tts.list_voices():
    print(voice.name, voice.labels.get("use_case"), voice.description)
The examples below come from the ElevenLabs premade voice catalog.
use_case
Used for voice matching against the ElevenLabs voice label labels.use_case (exact match scores highest).
| Value | Typical voices |
|---|---|
| conversational | Casual, agentic, podcast-style voices (e.g. Roger, Eric, Juniper) |
| informative_educational | Clear educators, broadcasters (e.g. Alice, Matilda, Daniel) |
| narrative_story | Storytellers, audiobook voices (e.g. George, Daria Reels) |
| advertisement | Promo and ad reads (e.g. Bill) |
| social_media | Short-form, trendy content |
| characters_animation | Character and animation voices |
| entertainment_tv | TV and entertainment narration |
customer_support is not an ElevenLabs label — it still helps the LLM, but for voice selection prefer conversational or pass voice_description="professional support warm".
# Support-style call center message
SynthesisTask(
    text="Thanks for calling. How can I help you today?",
    language="en",
    use_case="conversational",
    style="professional",
    emotion="warm",
    voice_description="trustworthy professional",
)

# Audiobook / long-form narration
SynthesisTask(
    text="Chapter one. It was a dark and stormy night.",
    use_case="narrative_story",
    style="warm",
    emotion="calm",
)

# E-learning explainer
SynthesisTask(
    text="Today we'll learn how photosynthesis works.",
    use_case="informative_educational",
    style="professional",
    emotion="neutral",
)
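The matching rule described in this section (an exact labels.use_case match scores highest, with voice_description acting as a softer hint) can be illustrated with a small standalone sketch. This is not the library's actual scoring code: the `Voice` dataclass, the `score_voice` function, and the weights 10 and 1 are assumptions chosen only to show the idea, and the sample descriptions are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Voice:
    """Hypothetical stand-in for a cached voice record."""
    name: str
    description: str = ""
    labels: Dict[str, str] = field(default_factory=dict)


def score_voice(voice: Voice, use_case: Optional[str], voice_description: Optional[str]) -> int:
    """Score a voice: an exact use_case label match dominates, keywords add a little."""
    score = 0
    if use_case and voice.labels.get("use_case") == use_case:
        score += 10  # assumed weight for an exact label match
    if voice_description:
        haystack = f"{voice.name} {voice.description}".lower()
        # One point per hint keyword found in the voice name or description.
        score += sum(1 for word in voice_description.lower().split() if word in haystack)
    return score


# Sample data: names follow the table above, descriptions are invented for the demo.
voices = [
    Voice("Roger", "casual conversational voice", {"use_case": "conversational"}),
    Voice("George", "warm storyteller", {"use_case": "narrative_story"}),
]
best = max(voices, key=lambda v: score_voice(v, "conversational", "warm professional"))
print(best.name)
```

With these assumed weights, a single exact label match outweighs several keyword hits, which is why preferring a real ElevenLabs label such as conversational over customer_support matters for selection.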
style
Free-form delivery hint. Affects the LLM enhancement prompt and weak voice matching against voice name, description, and custom tags.
Common values that match premade voice descriptions:
| Value | Effect |
|---|---|
| professional | Formal, clear delivery |
| casual / conversational | Relaxed, everyday tone |
| warm | Friendly, inviting tone |
| neutral | Balanced, informative |
| dramatic | Strong emphasis, expressive pacing |
| playful | Light, energetic tone |
| sympathetic | Soft, empathetic delivery |
SynthesisTask(text="...", style="professional") # business / IVR
SynthesisTask(text="...", style="casual") # laid-back chat
SynthesisTask(text="...", style="dramatic") # emotional scene
emotion
Free-form mood hint for LLM text enhancement only (drives audio tags like [excited], [whispers], [sighs]). Does not filter voices.
| Value | Typical audio tag behavior |
|---|---|
| warm | Friendly, reassuring tone |
| calm | Steady, subdued delivery |
| excited | Higher energy, [excited] tags |
| sympathetic | Soft, caring tone |
| curious | Questioning, engaged tone |
| appalled / sarcastic | Strong expressive tags |
| neutral | Minimal emotional markup |
SynthesisTask(text="...", emotion="warm") # customer greeting
SynthesisTask(text="...", emotion="excited") # product launch
SynthesisTask(text="...", emotion="sympathetic") # apology or support
SynthesisTask(text="...", emotion="neutral") # plain narration
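Because emotion only influences the audio tags the LLM inserts, it can be useful to check which tags ended up in the enhanced text. The following is a minimal sketch, assuming tags are written as square-bracketed phrases like the [excited] and [whispers] examples above. `extract_audio_tags` and `strip_audio_tags` are hypothetical helpers, not part of the library, and the sample string is made up.

```python
import re
from typing import List

# Matches a square-bracketed phrase such as [excited] or [short pause].
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def extract_audio_tags(enhanced_text: str) -> List[str]:
    """Return the bracketed audio tags in order of appearance, lowercased."""
    return [tag.strip().lower() for tag in TAG_PATTERN.findall(enhanced_text)]


def strip_audio_tags(enhanced_text: str) -> str:
    """Remove the tags and collapse leftover whitespace, e.g. for captions."""
    return " ".join(TAG_PATTERN.sub(" ", enhanced_text).split())


sample = "[excited] We just launched! [short pause] Come take a look."
print(extract_audio_tags(sample))
print(strip_audio_tags(sample))
```

In practice the input would be result.enhanced_text or the return value of enhance_text_only(), which makes it easy to compare how different emotion values change the markup.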
Combining parameters
| Scenario | Example values |
|---|---|
| Customer support (EN) | use_case="conversational", style="professional", emotion="warm" |
| News / podcast intro | use_case="informative_educational", style="neutral", emotion="calm" |
| Audiobook chapter | use_case="narrative_story", style="warm", emotion="calm" |
| Social reel | use_case="social_media", style="playful", emotion="excited" |
| Ad read | use_case="advertisement", style="confident", emotion="excited" |
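The combinations in the table can be kept in one place as a preset mapping, so call sites pick a scenario by name. This is a sketch only: `PRESETS`, its keys, and `task_kwargs` are hypothetical names, and the resulting dictionary is meant to be unpacked as SynthesisTask(**kwargs), which assumes SynthesisTask takes these fields as keyword arguments, as in the examples above.

```python
# Scenario presets copied from the table above; the keys are hypothetical names.
PRESETS = {
    "customer_support": {"use_case": "conversational", "style": "professional", "emotion": "warm"},
    "news_intro": {"use_case": "informative_educational", "style": "neutral", "emotion": "calm"},
    "audiobook": {"use_case": "narrative_story", "style": "warm", "emotion": "calm"},
    "social_reel": {"use_case": "social_media", "style": "playful", "emotion": "excited"},
    "ad_read": {"use_case": "advertisement", "style": "confident", "emotion": "excited"},
}


def task_kwargs(text: str, scenario: str, **overrides: str) -> dict:
    """Build keyword arguments for a task from a preset plus per-call overrides."""
    if scenario not in PRESETS:
        raise KeyError(f"unknown scenario {scenario!r}; choose from {sorted(PRESETS)}")
    # Later entries win, so overrides replace preset values with the same key.
    return {"text": text, **PRESETS[scenario], **overrides}


kwargs = task_kwargs("Chapter one.", "audiobook", language="en")
print(kwargs)
```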
Configuration
Required environment variables
| Variable | Description |
|---|---|
| ELEVENLABS_API_KEY | ElevenLabs API key |
| OPENROUTER_API_KEY | OpenRouter API key |
| OPENROUTER_API_TTS_PROMPT_MODEL | LLM for text enhancement (e.g. anthropic/claude-3.5-sonnet) |
Optional environment variables
| Variable | Default | Description |
|---|---|---|
| ELEVENLABS_CACHE_DIR | ~/.cache/elevenlabs-smart-tts | Local cache directory |
| ELEVENLABS_DEFAULT_MODEL | eleven_v3 | Default TTS model |
| ELEVENLABS_DEFAULT_OUTPUT_FORMAT | mp3_44100_128 | Audio output format |
| ELEVENLABS_DEFAULT_VOICE_ID | tnSpp4vdxKPjI9w0GnoV | Fallback voice for TTS (may work even if absent from sync_voices) |
| OPENROUTER_BASE_URL | https://openrouter.ai/api/v1 | OpenRouter API base URL |
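Before calling SmartTTS.from_env(), it can help to confirm that the required variables are set and to see which defaults would apply. The pre-flight check below is a sketch, not the library's own loader: `load_settings` is a hypothetical helper, and the variable names and defaults are copied from the tables above.

```python
import os
from typing import Mapping

REQUIRED = (
    "ELEVENLABS_API_KEY",
    "OPENROUTER_API_KEY",
    "OPENROUTER_API_TTS_PROMPT_MODEL",
)

# Defaults copied from the optional-variables table.
DEFAULTS = {
    "ELEVENLABS_CACHE_DIR": "~/.cache/elevenlabs-smart-tts",
    "ELEVENLABS_DEFAULT_MODEL": "eleven_v3",
    "ELEVENLABS_DEFAULT_OUTPUT_FORMAT": "mp3_44100_128",
    "ELEVENLABS_DEFAULT_VOICE_ID": "tnSpp4vdxKPjI9w0GnoV",
    "OPENROUTER_BASE_URL": "https://openrouter.ai/api/v1",
}


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return required values plus optional ones with defaults applied."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED}
    for name, default in DEFAULTS.items():
        # Empty strings are treated as unset and fall back to the default.
        settings[name] = env.get(name) or default
    return settings


# Example with an in-memory mapping instead of the real environment.
demo = load_settings({
    "ELEVENLABS_API_KEY": "test-key",
    "OPENROUTER_API_KEY": "test-key",
    "OPENROUTER_API_TTS_PROMPT_MODEL": "anthropic/claude-3.5-sonnet",
})
print(demo["ELEVENLABS_DEFAULT_MODEL"])
```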
Programmatic configuration is also supported:
from elevenlabs_smart_tts import SmartTTS, SmartTTSConfig, TTSModel

config = SmartTTSConfig(
    elevenlabs_api_key="...",
    openrouter_api_key="...",
    openrouter_tts_prompt_model="anthropic/claude-3.5-sonnet",
    default_model=TTSModel.ELEVEN_V3,
)
tts = SmartTTS(config)
Usage
Synthesis pipeline
from elevenlabs_smart_tts import SmartTTS, SynthesisTask, TTSModel

tts = SmartTTS.from_env()
tts.sync_voices()

result = tts.synthesize(
    SynthesisTask(
        text="Are you serious? I can't believe you did that!",
        voice_id="your-voice-id",
        model=TTSModel.ELEVEN_V3,
        style="dramatic",
        emotion="appalled",
    )
)
audio_bytes = result.audio
enhanced_text = result.enhanced_text
Preview enhanced text without TTS
enhanced = tts.enhance_text_only(
    SynthesisTask(
        text="Thanks for calling. How can I help?",
        language="en",
        style="sympathetic",
    )
)
One-liner
from elevenlabs_smart_tts import synthesize

result = synthesize(
    "Hello world",
    language="en",
    style="neutral",
)
Async API
import asyncio
from pathlib import Path

from elevenlabs_smart_tts import AsyncSmartTTS, SynthesisTask, asynthesize


async def main() -> None:
    async with AsyncSmartTTS.from_env() as tts:
        await tts.sync_voices()
        result = await tts.synthesize_to_file(
            SynthesisTask(text="Hello world", language="en"),
            Path("output.mp3"),
        )
        print(result.enhanced_text)


asyncio.run(main())

# Or as a one-liner:
result = asyncio.run(asynthesize("Hello world", language="en"))
Voice management
voices = tts.list_voices(language="en", tags=["narration"])
voice = tts.get_voice("voice-id")
tts.sync_voices(force=True) # refresh cache from ElevenLabs API
Supported TTS models
| Model | Best for |
|---|---|
| eleven_v3 | Expressive speech, audio tags, emotions |
| eleven_multilingual_v2 | Multilingual, high voice similarity |
| eleven_flash_v2_5 | Low latency, conversational agents |
Development
uv sync --dev
uv run pytest
uv run ruff check .
License
MIT — see LICENSE.