Inworld TTS SDK – generate, stream, and voice management
inworld-tts
Python SDK for the Inworld TTS API — generate, stream, and manage voices.
API Reference · Changelog · Platform
Install
pip install inworld-tts
Requires Python 3.10+.
Authentication
Pass your API key directly or set INWORLD_API_KEY in your environment:
export INWORLD_API_KEY=your_api_key
from inworld_tts import InworldTTS
tts = InworldTTS() # reads INWORLD_API_KEY from env
tts = InworldTTS(api_key="your_api_key") # or pass directly
Get your key at platform.inworld.ai.
Quickstart
from inworld_tts import InworldTTS
tts = InworldTTS()
tts.generate("Hello, world!", voice="Dennis", output_file="hello.mp3")
Models
| Model ID | Quality | Default for |
|---|---|---|
| inworld-tts-1.5-max | Higher quality | generate() |
| inworld-tts-1.5-mini | Lower latency | stream() |
Use max when quality is the priority (e.g. audiobooks, voiceovers). Use mini for real-time use cases (e.g. voice assistants).
Constructor
tts = InworldTTS(
    api_key="your_key",
    timeout=120,                 # HTTP timeout in seconds (default: per-method)
    max_concurrent_requests=4,   # parallel chunk requests for long text (default: 2)
    max_retries=2,               # retry on network errors / 5xx with exponential backoff (default: 2)
    debug=True,                  # log requests, responses, and timing
)
See Constructor in the API Reference for full parameter details and per-method timeout defaults.
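The max_retries comment above mentions exponential backoff. The SDK's actual delays and jitter are not documented here, so the following is only a generic sketch of the technique. The names retry_with_backoff, base_delay, and TransientError are hypothetical and not part of inworld-tts.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a network error or 5xx response."""


def retry_with_backoff(call, max_retries=2, base_delay=0.5, sleep=time.sleep):
    """Run call(); on TransientError wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            delay = base_delay * (2 ** attempt)
            # Up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
```

With max_retries=2 a call is attempted at most three times, waiting roughly 0.5 s and then 1 s between attempts under these assumed defaults.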
generate()
Synthesize speech from text of any length. Blocks until all audio is ready.
# Save to file
tts.generate("Hello!", voice="Dennis", output_file="hello.mp3")
# Get bytes for further processing
audio = tts.generate("Hello!", voice="Dennis")
# Generate, save, and play
tts.generate("Hello!", voice="Dennis", output_file="hello.mp3", play=True)
stream()
Async streaming: the first audio chunk arrives sooner than with generate(). Maximum 2000 characters per call.
import asyncio
async def main():
    async for chunk in tts.stream("Hello, world!", voice="Dennis"):
        pass  # process chunk (bytes) as it arrives

asyncio.run(main())
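Because stream() accepts at most 2000 characters per call, longer text has to be split before streaming. Below is a minimal sketch of one way to do that, preferring sentence boundaries. The helper split_for_stream is hypothetical and not part of the SDK; generate() already handles text of any length on its own.

```python
import re

MAX_STREAM_CHARS = 2000  # per-call limit stated above


def split_for_stream(text, limit=MAX_STREAM_CHARS):
    """Split text into pieces of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself over the limit.
        while len(sentence) > limit:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate  # sentence still fits in the current piece
        else:
            pieces.append(current)
            current = sentence
    if current:
        pieces.append(current)
    return pieces
```

Each returned piece could then be passed to tts.stream() in turn inside the same async function.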
Timestamps
generate_with_timestamps() and stream_with_timestamps() return word- or character-level timing alongside audio.
result = tts.generate_with_timestamps("Hello, world!", voice="Dennis", timestamp_type="WORD")
wa = result["timestamps"]["wordAlignment"]
for word, start, end in zip(wa["words"], wa["wordStartTimeSeconds"], wa["wordEndTimeSeconds"]):
    print(f"{word}: {start:.2f}s – {end:.2f}s")
See generate_with_timestamps() and stream_with_timestamps() for full details.
play()
Play audio from bytes or a file path. Encoding is auto-detected from magic bytes.
audio = tts.generate("Hello!", voice="Dennis")
tts.play(audio)
tts.play("hello.mp3") # file path also accepted
tts.play(pcm_bytes, encoding="PCM") # encoding hint required for raw PCM/ALAW/MULAW
See play() for platform player details.
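To illustrate what detection from magic bytes means, here is a minimal sketch that inspects the first bytes of a buffer. It is not the SDK's implementation: the function name and the set of formats checked are assumptions, and the returned labels are illustrative rather than the SDK's encoding names.

```python
def sniff_audio_encoding(data):
    """Guess a container/codec label from the leading bytes, or None if unrecognised."""
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "WAV"
    if data[:4] == b"OggS":
        return "OGG"
    if data[:4] == b"fLaC":
        return "FLAC"
    if data[:3] == b"ID3":
        return "MP3"  # an ID3v2 tag precedes the audio frames
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "MP3"  # MPEG audio frame sync: 11 set bits (heuristic)
    return None  # headerless data such as raw PCM cannot be identified this way
```

The final branch shows why raw PCM/ALAW/MULAW need an explicit encoding hint: those streams carry no header to inspect.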
list_voices()
List voices in your workspace, with optional language filter.
voices = tts.list_voices()
voices = tts.list_voices(lang="EN_US")
voices = tts.list_voices(lang=["EN_US", "ES_ES"])
get_voice()
Get details of a specific voice.
voice = tts.get_voice("workspace__my_clone")
update_voice()
Update a voice's display name, description, or tags.
tts.update_voice("workspace__my_clone", display_name="Narrator", tags=["calm"])
delete_voice()
Delete a voice from your workspace.
tts.delete_voice("workspace__my_clone")
clone_voice()
Clone a voice from one or more audio recordings (WAV/MP3).
result = tts.clone_voice(["sample.wav"], display_name="My Clone")
voice_id = result["voice"]["voiceId"]
design_voice()
Design a voice from a text description (no recording needed), then publish the preview.
result = tts.design_voice(
    design_prompt="A warm, friendly narrator",
    preview_text="Hello, welcome to our audiobook.",
)
voice_id = result["previewVoices"][0]["voiceId"]
publish_voice()
Publish a designed or cloned voice preview to your library.
tts.publish_voice(voice_id, display_name="My Custom Voice")
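The design and publish steps can be combined into one helper. This is a sketch that uses only the calls and response keys shown above. The function design_and_publish is hypothetical, and the guard against an empty preview list is a defensive assumption, not documented behavior.

```python
def design_and_publish(tts, design_prompt, preview_text, display_name):
    """Design a voice, publish its first preview, and return the voice ID."""
    result = tts.design_voice(design_prompt=design_prompt, preview_text=preview_text)
    previews = result["previewVoices"]
    if not previews:
        # Assumption: treat an empty preview list as an error.
        raise ValueError("design_voice() returned no preview voices")
    voice_id = previews[0]["voiceId"]
    tts.publish_voice(voice_id, display_name=display_name)
    return voice_id
```

Passing the client in as an argument keeps the helper easy to exercise with a stub object in tests.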
migrate_from_elevenlabs()
Migrate a voice from ElevenLabs to your Inworld workspace. No ElevenLabs SDK required.
result = tts.migrate_from_elevenlabs("el_api_key", "el_voice_id")
print(result["elevenlabs_name"], "→", result["inworld_voice_id"])
See Voice Management in the API Reference for all parameters.
Errors
| Exception | When |
|---|---|
| MissingApiKeyError | No API key found at construction |
| ApiError | API returned 4xx/5xx; has .code and .details |
| NetworkError | Connection or timeout failure |
All inherit from InworldTTSError.
from inworld_tts import ApiError, MissingApiKeyError, NetworkError
try:
    audio = tts.generate("Hello!", voice="Dennis")
except MissingApiKeyError as e:
    print(f"Missing API key: {e}")
except ApiError as e:
    print(f"HTTP {e.code}: {e}")
except NetworkError as e:
    print(f"Network error: {e}")
CLI
The API key is read from INWORLD_API_KEY or passed with --api-key. Voice defaults to Dennis; use --voice to choose another. Run inworld-tts --help for all options.
# synthesize text (voice defaults to Dennis)
inworld-tts "Hello, world!" -o hello.mp3
# choose a voice
inworld-tts "Hello" -o hello.mp3 --voice Sarah
# read from a text file (any length)
inworld-tts story.txt -o story.mp3 --voice Dennis
# choose a model
inworld-tts "Hello" -o hello.mp3 --voice Dennis --model inworld-tts-1.5-max
# stream (lower latency to first audio)
inworld-tts "Hello" -o hello.mp3 --voice Dennis --stream
# play audio immediately (no output file needed)
inworld-tts "Hello world" --voice Dennis --play
# save and play
inworld-tts story.txt --voice Dennis --play -o story.mp3
# other formats
inworld-tts "Hello" -o hello.wav --voice Dennis --encoding WAV
# audio quality options
inworld-tts "Hello" -o hello.mp3 --voice Dennis --bit-rate 192000
inworld-tts "Hello" -o hello.wav --voice Dennis --encoding LINEAR16 --sample-rate 44100
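When the CLI is driven from a script, building the argument list in code avoids shell quoting problems. The sketch below assembles a command from the flags shown above (-o, --voice, --model, --encoding). The helper build_cli_command is hypothetical and covers only that subset of options.

```python
def build_cli_command(source, output, voice="Dennis", model=None, encoding=None):
    """Return an argv list for the inworld-tts CLI using flags from the examples above."""
    argv = ["inworld-tts", str(source), "-o", str(output), "--voice", voice]
    if model:
        argv += ["--model", model]
    if encoding:
        argv += ["--encoding", encoding]
    return argv
```

The resulting list can be passed to subprocess.run(argv, check=True), which raises CalledProcessError if the command exits with a non-zero status.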
List voices (CLI)
inworld-tts list-voices
inworld-tts list-voices --lang EN_US
Migrate from ElevenLabs (CLI)
inworld-tts migrate-from-elevenlabs --elevenlabs-key el_... --voice-id abc123
# preview first (no cloning)
inworld-tts migrate-from-elevenlabs --elevenlabs-key el_... --voice-id abc123 --dry-run
Examples
Runnable examples are in the examples/ directory:
| File | What it shows |
|---|---|
| hello_world.py | Text → MP3 in 3 lines |
| stream_audio.py | Real-time streaming: play each chunk as it arrives |
| list_voices.py | List all available voices, with optional language filter |
| clone_voice.py | Clone a voice from a WAV/MP3 recording |
| design_voice.py | Design a voice from a text description, preview, and publish |
| generate_timestamps.py | Word-level timestamps: print each word's start/end time |
| stream_timestamps.py | Per-chunk timestamps while streaming |
Troubleshooting
MissingApiKeyError / ApiError 401
Set INWORLD_API_KEY or pass api_key= directly. If the key is set but rejected, regenerate it at platform.inworld.ai.
stream() requires async
stream() is an async generator — call it inside an async function:
import asyncio
async def main():
    async for chunk in tts.stream("Hello", voice="Dennis"):
        ...

asyncio.run(main())
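For scripts that are otherwise synchronous, a small wrapper can drain any async iterator of byte chunks. This is a sketch: collect_chunks and collect_chunks_sync are hypothetical helpers, and whether the joined chunks form a playable file depends on the encoding, which is not stated here.

```python
import asyncio


async def collect_chunks(chunks):
    """Drain an async iterator of bytes chunks into one bytes object."""
    buffer = bytearray()
    async for chunk in chunks:
        buffer.extend(chunk)
    return bytes(buffer)


def collect_chunks_sync(chunks):
    """Blocking wrapper for code that has no event loop running."""
    return asyncio.run(collect_chunks(chunks))
```

A call such as collect_chunks_sync(tts.stream("Hello", voice="Dennis")) would then work from plain synchronous code. asyncio.run() raises RuntimeError when an event loop is already running (for example in a notebook), so in that setting await collect_chunks(...) directly instead.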
License