Inworld TTS SDK: generate, stream, and manage voices

inworld-tts

Python SDK for the Inworld TTS API — generate, stream, and manage voices.

API Reference · Changelog · Platform


Install

pip install inworld-tts

Requires Python 3.10+.


Authentication

Pass your API key directly or set INWORLD_API_KEY in your environment:

export INWORLD_API_KEY=your_api_key

from inworld_tts import InworldTTS

tts = InworldTTS()                        # reads INWORLD_API_KEY from env
tts = InworldTTS(api_key="your_api_key")  # or pass directly

Get your key at platform.inworld.ai.
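
If a script should fail early with a clear message when no key is configured, the key can be resolved before the client is constructed. The sketch below is illustrative only: resolve_api_key is a hypothetical helper, not part of the SDK, and the "explicit argument first, then environment" order is an assumption rather than documented SDK behavior.

```python
import os


def resolve_api_key(explicit_key=None, environ=None):
    """Return the key to use: the explicit argument if given, else INWORLD_API_KEY.

    Hypothetical helper; the precedence shown here is an assumption.
    """
    env = os.environ if environ is None else environ
    key = explicit_key or env.get("INWORLD_API_KEY")
    if not key:
        raise RuntimeError("Set INWORLD_API_KEY or pass api_key= explicitly")
    return key


# Intended use (requires the inworld-tts package):
#   from inworld_tts import InworldTTS
#   tts = InworldTTS(api_key=resolve_api_key())
```

Passing environ explicitly makes the helper easy to exercise in tests without touching the real environment.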


Quickstart

from inworld_tts import InworldTTS

tts = InworldTTS()
tts.generate("Hello, world!", voice="Dennis", output_file="hello.mp3")

Models

Model ID              Notes
inworld-tts-2         Recommended. Latest generation. Supports delivery_mode ("STABLE" / "BALANCED" / "EXPRESSIVE") for output variability; temperature is ignored.
inworld-tts-1.5-max   Previous generation. Higher quality. Default for generate() / generate_with_timestamps().
inworld-tts-1.5-mini  Previous generation. Lower latency. Default for stream() / stream_with_timestamps().

Use inworld-tts-2 for new applications — pass model="inworld-tts-2" to any of the synthesis methods. The 1.5 family remains available and is the default for backwards compatibility.
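
Because inworld-tts-2 ignores temperature and uses delivery_mode instead, code that targets both model families may want to build its keyword arguments in one place. The following is a minimal sketch under stated assumptions: synthesis_options is a hypothetical helper, and it assumes delivery_mode and temperature are accepted as keyword arguments by the synthesis methods and that delivery_mode applies only to inworld-tts-2. Check the API Reference before relying on it.

```python
DELIVERY_MODES = {"STABLE", "BALANCED", "EXPRESSIVE"}


def synthesis_options(model, delivery_mode=None, temperature=None):
    """Build keyword arguments for a synthesis call (hypothetical helper).

    Assumption: delivery_mode is only meaningful for inworld-tts-2, and
    temperature is only meaningful for the 1.5 family.
    """
    options = {"model": model}
    if model == "inworld-tts-2":
        if delivery_mode is not None:
            if delivery_mode not in DELIVERY_MODES:
                raise ValueError(f"unknown delivery_mode: {delivery_mode!r}")
            options["delivery_mode"] = delivery_mode
        # temperature is ignored by inworld-tts-2, so it is not forwarded.
    elif temperature is not None:
        options["temperature"] = temperature
    return options


# Intended use:
#   tts.generate("Hello!", voice="Dennis",
#                **synthesis_options("inworld-tts-2", delivery_mode="STABLE"))
```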


Constructor

tts = InworldTTS(
    api_key="your_key",
    timeout=120,                 # HTTP timeout in seconds (default: per-method)
    max_concurrent_requests=4,   # parallel chunk requests for long text (default: 2)
    max_retries=2,               # retry on network errors / 5xx with exponential backoff (default: 2)
    debug=True,                  # log requests, responses, and timing
)

See Constructor in the API Reference for full parameter details and per-method timeout defaults.
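
To get a feel for what max_retries with exponential backoff implies for worst-case latency, the wait schedule can be sketched as below. The base delay of 0.5 s and the doubling factor are assumptions chosen for illustration; the SDK's actual delays are not stated here, and backoff_delays is a hypothetical helper.

```python
def backoff_delays(max_retries, base_delay=0.5, factor=2.0):
    """Return the wait (in seconds) before each retry attempt.

    Illustrative only: base_delay and factor are assumed values,
    not the SDK's documented behavior.
    """
    return [base_delay * factor ** attempt for attempt in range(max_retries)]


if __name__ == "__main__":
    delays = backoff_delays(2)
    print(delays, "total wait:", sum(delays))
```

Under these assumed values, each additional retry doubles the previous wait, so the total added wait grows quickly as max_retries increases.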


generate()

Synthesize speech from text of any length. Blocks until all audio is ready.

# Save to file
tts.generate("Hello!", voice="Dennis", output_file="hello.mp3")

# Get bytes for further processing
audio = tts.generate("Hello!", voice="Dennis")

# Generate, save, and play
tts.generate("Hello!", voice="Dennis", output_file="hello.mp3", play=True)

stream()

Asynchronous streaming: the first audio chunk arrives sooner than with generate(). Each call accepts at most 2000 characters.

import asyncio

async def main():
    async for chunk in tts.stream("Hello, world!", voice="Dennis"):
        pass  # process chunk (bytes) as it arrives

asyncio.run(main())
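
Text longer than the 2000-character limit has to be split before it is passed to stream(). One possible approach, sketched here, splits on sentence boundaries and streams each piece in turn. split_for_stream and stream_long_text are hypothetical helpers, not SDK functions, and whether audio chunks from separate calls can simply be concatenated depends on the encoding, so treat this as a starting point.

```python
import re

STREAM_CHAR_LIMIT = 2000  # per-call limit stated for stream()


def split_for_stream(text, limit=STREAM_CHAR_LIMIT):
    """Split text into pieces of at most `limit` characters.

    Prefers sentence boundaries; a single sentence longer than the
    limit is hard-wrapped.
    """
    if limit < 1:
        raise ValueError("limit must be positive")
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces, current = [], ""
    for sentence in sentences:
        # Hard-wrap any sentence that cannot fit in one piece.
        while len(sentence) > limit:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            pieces.append(current)
            current = sentence
    if current:
        pieces.append(current)
    return pieces


async def stream_long_text(tts, text, voice):
    """Yield audio chunks for text of any length, one stream() call per piece."""
    for piece in split_for_stream(text):
        async for chunk in tts.stream(piece, voice=voice):
            yield chunk
```

For text of arbitrary length where latency to first audio does not matter, generate() already handles long input and is the simpler choice.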

Timestamps

generate_with_timestamps() and stream_with_timestamps() return word- or character-level timing alongside audio.

result = tts.generate_with_timestamps("Hello, world!", voice="Dennis", timestamp_type="WORD")
wa = result["timestamps"]["wordAlignment"]
for word, start, end in zip(wa["words"], wa["wordStartTimeSeconds"], wa["wordEndTimeSeconds"]):
    print(f"{word}: {start:.2f}s – {end:.2f}s")

See generate_with_timestamps() and stream_with_timestamps() for full details.
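
Word-level timings are a natural input for captions. The sketch below turns a result shaped like the one above into SubRip (SRT) text. It relies only on the wordAlignment fields shown in the example; words_to_srt, to_srt_time, and the words_per_cue grouping are hypothetical choices made for illustration.

```python
def to_srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(result, words_per_cue=5):
    """Group word timings into numbered SRT cues (hypothetical helper)."""
    wa = result["timestamps"]["wordAlignment"]
    timings = list(
        zip(wa["words"], wa["wordStartTimeSeconds"], wa["wordEndTimeSeconds"])
    )
    cues = []
    for index in range(0, len(timings), words_per_cue):
        group = timings[index:index + words_per_cue]
        start, end = group[0][1], group[-1][2]
        text = " ".join(word for word, _, _ in group)
        cues.append(
            f"{len(cues) + 1}\n{to_srt_time(start)} --> {to_srt_time(end)}\n{text}\n"
        )
    return "\n".join(cues)


# Intended use:
#   result = tts.generate_with_timestamps("Hello, world!", voice="Dennis",
#                                         timestamp_type="WORD")
#   with open("hello.srt", "w", encoding="utf-8") as f:
#       f.write(words_to_srt(result))
```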


play()

Play audio from bytes or a file path. Encoding is auto-detected from magic bytes.

audio = tts.generate("Hello!", voice="Dennis")
tts.play(audio)

tts.play("hello.mp3")               # file path also accepted
tts.play(pcm_bytes, encoding="PCM") # encoding hint required for raw PCM/ALAW/MULAW

See play() for platform player details.


list_voices()

List voices in your workspace, with an optional language filter.

voices = tts.list_voices()
voices = tts.list_voices(lang="EN_US")
voices = tts.list_voices(lang=["EN_US", "ES_ES"])

get_voice()

Get details of a specific voice.

voice = tts.get_voice("workspace__my_clone")

update_voice()

Update a voice's display name, description, or tags.

tts.update_voice("workspace__my_clone", display_name="Narrator", tags=["calm"])

delete_voice()

Delete a voice from your workspace.

tts.delete_voice("workspace__my_clone")

clone_voice()

Clone a voice from one or more audio recordings (WAV/MP3).

result = tts.clone_voice(["sample.wav"], display_name="My Clone")
voice_id = result["voice"]["voiceId"]

design_voice()

Design a voice from a text description (no recording needed), then publish the preview.

result = tts.design_voice(
    design_prompt="A warm, friendly narrator",
    preview_text="Hello, welcome to our audiobook.",
)
voice_id = result["previewVoices"][0]["voiceId"]

publish_voice()

Publish a designed or cloned voice preview to your library.

tts.publish_voice(voice_id, display_name="My Custom Voice")
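
The design and publish steps can be chained in a small wrapper. This is a sketch under stated assumptions: design_and_publish is a hypothetical helper, it uses only the design_voice() and publish_voice() arguments shown above, and it assumes the result is a dict whose previewVoices list may be empty.

```python
def design_and_publish(tts, design_prompt, preview_text, display_name):
    """Design a voice, publish its first preview, and return the voice ID.

    Hypothetical helper; `tts` is expected to be an InworldTTS client.
    """
    result = tts.design_voice(
        design_prompt=design_prompt,
        preview_text=preview_text,
    )
    previews = result.get("previewVoices") or []
    if not previews:
        raise ValueError("design_voice() returned no preview voices")
    voice_id = previews[0]["voiceId"]
    tts.publish_voice(voice_id, display_name=display_name)
    return voice_id


# Intended use:
#   voice_id = design_and_publish(
#       tts,
#       design_prompt="A warm, friendly narrator",
#       preview_text="Hello, welcome to our audiobook.",
#       display_name="My Custom Voice",
#   )
```

In practice you may want to listen to the preview before publishing; the wrapper always takes the first one for brevity.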

migrate_from_elevenlabs()

Migrate a voice from ElevenLabs to your Inworld workspace. No ElevenLabs SDK required.

result = tts.migrate_from_elevenlabs("el_api_key", "el_voice_id")
print(result["elevenlabs_name"], "→", result["inworld_voice_id"])

See Voice Management in the API Reference for all parameters.


Errors

Exception           When
MissingApiKeyError  No API key found at construction
ApiError            API returned 4xx/5xx; has .code and .details
NetworkError        Connection or timeout failure

All inherit from InworldTTSError.

from inworld_tts import ApiError, MissingApiKeyError, NetworkError

try:
    audio = tts.generate("Hello!", voice="Dennis")
except MissingApiKeyError as e:
    print(f"Missing API key: {e}")
except ApiError as e:
    print(f"HTTP {e.code}: {e}")
except NetworkError as e:
    print(f"Network error: {e}")

CLI

The API key is read from INWORLD_API_KEY or passed with --api-key. Voice defaults to Dennis; use --voice to choose another. Run inworld-tts --help for all options.

# synthesize text (voice defaults to Dennis)
inworld-tts "Hello, world!" -o hello.mp3

# choose a voice
inworld-tts "Hello" -o hello.mp3 --voice Sarah

# read from a text file (any length)
inworld-tts story.txt -o story.mp3 --voice Dennis

# choose a model (inworld-tts-2 is the recommended latest generation)
inworld-tts "Hello" -o hello.mp3 --voice Dennis --model inworld-tts-2

# stream (lower latency to first audio)
inworld-tts "Hello" -o hello.mp3 --voice Dennis --stream

# play audio immediately (no output file needed)
inworld-tts "Hello world" --voice Dennis --play

# save and play
inworld-tts story.txt --voice Dennis --play -o story.mp3

# other formats
inworld-tts "Hello" -o hello.wav --voice Dennis --encoding WAV

# audio quality options
inworld-tts "Hello" -o hello.mp3 --voice Dennis --bit-rate 192000
inworld-tts "Hello" -o hello.wav --voice Dennis --encoding LINEAR16 --sample-rate 44100

List voices (CLI)

inworld-tts list-voices
inworld-tts list-voices --lang EN_US

Migrate from ElevenLabs (CLI)

inworld-tts migrate-from-elevenlabs --elevenlabs-key el_... --voice-id abc123

# preview first (no cloning)
inworld-tts migrate-from-elevenlabs --elevenlabs-key el_... --voice-id abc123 --dry-run

Examples

Runnable examples are in the examples/ directory:

File                    What it shows
hello_world.py          Text → MP3 in 3 lines
stream_audio.py         Real-time streaming: play each chunk as it arrives
list_voices.py          List all available voices, with an optional language filter
clone_voice.py          Clone a voice from a WAV/MP3 recording
design_voice.py         Design a voice from a text description, preview, and publish
generate_timestamps.py  Word-level timestamps: print each word's start/end time
stream_timestamps.py    Per-chunk timestamps while streaming

Troubleshooting

MissingApiKeyError / ApiError 401

Set INWORLD_API_KEY or pass api_key= directly. If the key is set but rejected, regenerate it at platform.inworld.ai.

stream() requires async

stream() is an async generator — call it inside an async function:

import asyncio

async def main():
    async for chunk in tts.stream("Hello", voice="Dennis"):
        ...

asyncio.run(main())
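
When the calling code is synchronous, an async stream can be drained into a single bytes object behind a small wrapper. collect_stream is a hypothetical helper, not part of the SDK; it accepts any zero-argument callable that returns an async iterator of bytes. It uses asyncio.run(), so it will not work from inside an already running event loop (for example, some notebook environments).

```python
import asyncio


async def _collect(make_stream):
    """Drain an async iterator of bytes into one bytes object."""
    buffer = bytearray()
    async for chunk in make_stream():
        buffer.extend(chunk)
    return bytes(buffer)


def collect_stream(make_stream):
    """Run an async byte stream to completion from synchronous code."""
    return asyncio.run(_collect(make_stream))


# Intended use:
#   audio = collect_stream(lambda: tts.stream("Hello", voice="Dennis"))
```

Collecting everything gives up the latency benefit of streaming, so this is mainly useful for scripts and tests; generate() is the simpler choice when chunks are not needed as they arrive.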

License

MIT
