Inworld TTS SDK – generate, stream, and voice management

inworld-tts

Python SDK for the Inworld TTS API — generate, stream, and manage voices.

API Reference · Changelog · Platform


Install

pip install inworld-tts

Requires Python 3.10+.


Authentication

Pass your API key directly or set INWORLD_API_KEY in your environment:

export INWORLD_API_KEY=your_api_key

from inworld_tts import InworldTTS

tts = InworldTTS()                        # reads INWORLD_API_KEY from env
tts = InworldTTS(api_key="your_api_key")  # or pass directly

Get your key at platform.inworld.ai.
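
If you want to fail fast before constructing the client, the lookup described above can be sketched as a small helper. This is only an illustration: it assumes an explicitly passed key takes precedence over the environment, and `resolve_api_key` is a hypothetical name that is not part of the SDK (the SDK itself raises MissingApiKeyError).

```python
import os


def resolve_api_key(api_key=None, env=None):
    """Return the explicit key if given, otherwise INWORLD_API_KEY from the environment.

    Hypothetical helper, not part of inworld-tts. The precedence
    (explicit argument first) is an assumption.
    """
    env = os.environ if env is None else env
    key = api_key or env.get("INWORLD_API_KEY")
    if not key:
        # The SDK raises MissingApiKeyError; RuntimeError is a stand-in here.
        raise RuntimeError("No API key: pass api_key= or set INWORLD_API_KEY")
    return key
```

The result can then be passed along as `InworldTTS(api_key=resolve_api_key())`.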


Quickstart

from inworld_tts import InworldTTS

tts = InworldTTS()
tts.generate("Hello, world!", voice="Dennis", output_file="hello.mp3")

Models

| Model ID | Notes |
| --- | --- |
| inworld-tts-2 | Recommended. Latest generation. Supports delivery_mode ("STABLE" / "BALANCED" / "CREATIVE") for output variability; temperature is ignored. |
| inworld-tts-1.5-max | Previous generation. Higher quality. Default for generate() / generate_with_timestamps(). |
| inworld-tts-1.5-mini | Previous generation. Lower latency. Default for stream() / stream_with_timestamps(). |

Use inworld-tts-2 for new applications — pass model="inworld-tts-2" to any of the synthesis methods. The 1.5 family remains available and is the default for backwards compatibility.
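
Because inworld-tts-2 ignores temperature and uses delivery_mode instead, it can help to build the per-model options in one place. The sketch below is an assumption-heavy illustration: `synthesis_options` is a hypothetical helper, the keyword names `delivery_mode` and `temperature` are taken from the table above rather than from a confirmed signature, and forwarding delivery_mode only for inworld-tts-2 is a choice made here, not documented behavior.

```python
VALID_DELIVERY_MODES = {"STABLE", "BALANCED", "CREATIVE"}


def synthesis_options(model, delivery_mode=None, temperature=None):
    """Build keyword arguments for a synthesis call (hypothetical helper)."""
    opts = {"model": model}
    if model == "inworld-tts-2":
        if delivery_mode is not None:
            if delivery_mode not in VALID_DELIVERY_MODES:
                raise ValueError(f"unknown delivery_mode: {delivery_mode!r}")
            opts["delivery_mode"] = delivery_mode
        # temperature is ignored by inworld-tts-2, so it is not forwarded.
    elif temperature is not None:
        # Assumption: the 1.5 models are the ones that honor temperature.
        opts["temperature"] = temperature
    return opts


# Intended use (not executed here; keyword names are assumptions):
# tts.generate("Hello!", voice="Dennis",
#              **synthesis_options("inworld-tts-2", delivery_mode="STABLE"))
```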


Constructor

tts = InworldTTS(
    api_key="your_key",
    timeout=120,                 # HTTP timeout in seconds (default: per-method)
    max_concurrent_requests=4,   # parallel chunk requests for long text (default: 2)
    max_retries=2,               # retry on network errors / 5xx with exponential backoff (default: 2)
    debug=True,                  # log requests, responses, and timing
)

See Constructor in the API Reference for full parameter details and per-method timeout defaults.
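
To make "retry with exponential backoff" concrete, here is a generic sketch of the pattern. It is not the SDK's implementation: the base delay, factor, and cap are invented for illustration, and `backoff_delays` and `call_with_retries` are hypothetical names.

```python
import time


def backoff_delays(max_retries=2, base=0.5, factor=2.0, cap=8.0):
    """Delays (seconds) before each retry: base, base*factor, ... capped at cap."""
    return [min(cap, base * factor ** attempt) for attempt in range(max_retries)]


def call_with_retries(fn, max_retries=2,
                      retry_on=(ConnectionError, TimeoutError),
                      sleep=time.sleep):
    """Call fn(); on a retryable error, wait and try again up to max_retries times."""
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(delays[attempt])
```

With `max_retries=2` a call is attempted at most three times in total. The `sleep` parameter exists so the waiting can be replaced in tests.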


generate()

Synthesize speech from text of any length. Blocks until all audio is ready.

# Save to file
tts.generate("Hello!", voice="Dennis", output_file="hello.mp3")

# Get bytes for further processing
audio = tts.generate("Hello!", voice="Dennis")

# Generate, save, and play
tts.generate("Hello!", voice="Dennis", output_file="hello.mp3", play=True)

stream()

Async streaming: the first audio chunk arrives sooner than with generate(). Maximum 2000 characters per call.

import asyncio

async def main():
    async for chunk in tts.stream("Hello, world!", voice="Dennis"):
        pass  # process chunk (bytes) as it arrives

asyncio.run(main())
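
Since stream() accepts at most 2000 characters per call, longer text has to be split before streaming (generate() handles any length on its own). The following is one possible splitter, assuming sentence boundaries are acceptable cut points; `split_for_stream` is a hypothetical helper, not an SDK function.

```python
import re


def split_for_stream(text, limit=2000):
    """Split text into pieces of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-cut.
        while len(sentence) > limit:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            pieces.append(current)
            current = sentence
    if current:
        pieces.append(current)
    return pieces
```

Each piece can then be passed to `tts.stream(piece, voice="Dennis")` in turn inside an async function.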

Timestamps

generate_with_timestamps() and stream_with_timestamps() return word- or character-level timing alongside audio.

result = tts.generate_with_timestamps("Hello, world!", voice="Dennis", timestamp_type="WORD")
wa = result["timestamps"]["wordAlignment"]
for word, start, end in zip(wa["words"], wa["wordStartTimeSeconds"], wa["wordEndTimeSeconds"]):
    print(f"{word}: {start:.2f}s – {end:.2f}s")

See generate_with_timestamps() and stream_with_timestamps() for full details.
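
As an illustration of what the word alignment can be used for, the sketch below groups words into SubRip-style caption cues. It relies only on the `wordAlignment` keys shown above; `format_srt_time`, `words_to_srt`, and the five-words-per-cue grouping are assumptions made for this example.

```python
def format_srt_time(seconds):
    """Format seconds as HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def words_to_srt(word_alignment, words_per_cue=5):
    """Turn a wordAlignment dict into caption text, words_per_cue words per cue."""
    words = word_alignment["words"]
    starts = word_alignment["wordStartTimeSeconds"]
    ends = word_alignment["wordEndTimeSeconds"]
    cues = []
    for i in range(0, len(words), words_per_cue):
        j = min(i + words_per_cue, len(words))
        cues.append(
            f"{len(cues) + 1}\n"
            f"{format_srt_time(starts[i])} --> {format_srt_time(ends[j - 1])}\n"
            f"{' '.join(words[i:j])}\n"
        )
    return "\n".join(cues)
```

Called as `words_to_srt(result["timestamps"]["wordAlignment"])`, it returns a string that can be written to a `.srt` file.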


play()

Play audio from bytes or a file path. Encoding is auto-detected from magic bytes.

audio = tts.generate("Hello!", voice="Dennis")
tts.play(audio)

tts.play("hello.mp3")               # file path also accepted
tts.play(pcm_bytes, encoding="PCM") # encoding hint required for raw PCM/ALAW/MULAW

See play() for platform player details.
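
For readers unfamiliar with magic-byte detection, here is a rough sketch of the idea. It is not the SDK's detector, and which formats the SDK actually recognizes is not stated here; the OGG and FLAC checks in particular are assumptions. Headerless data such as raw PCM has no signature to match, which is why an explicit encoding hint is needed in that case.

```python
def sniff_audio_format(data):
    """Guess a container/codec from the leading bytes, or return None if unknown."""
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "WAV"
    if data[:4] == b"OggS":
        return "OGG"
    if data[:4] == b"fLaC":
        return "FLAC"
    if data[:3] == b"ID3":
        return "MP3"  # MP3 file starting with an ID3v2 tag
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "MP3"  # MPEG audio frame sync (11 set bits)
    return None  # no recognizable header, e.g. raw PCM
```

Note that raw samples can begin with bytes that happen to look like an MPEG frame sync, so a heuristic like this can misfire on headerless audio.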


list_voices()

List voices in your workspace, with optional language filter.

voices = tts.list_voices()
voices = tts.list_voices(lang="EN_US")
voices = tts.list_voices(lang=["EN_US", "ES_ES"])

get_voice()

Get details of a specific voice.

voice = tts.get_voice("workspace__my_clone")

update_voice()

Update a voice's display name, description, or tags.

tts.update_voice("workspace__my_clone", display_name="Narrator", tags=["calm"])

delete_voice()

Delete a voice from your workspace.

tts.delete_voice("workspace__my_clone")

clone_voice()

Clone a voice from one or more audio recordings (WAV/MP3).

result = tts.clone_voice(["sample.wav"], display_name="My Clone")
voice_id = result["voice"]["voiceId"]

design_voice()

Design a voice from a text description (no recording needed), then publish the preview.

result = tts.design_voice(
    design_prompt="A warm, friendly narrator",
    preview_text="Hello, welcome to our audiobook.",
)
voice_id = result["previewVoices"][0]["voiceId"]

publish_voice()

Publish a designed or cloned voice preview to your library.

tts.publish_voice(voice_id, display_name="My Custom Voice")
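
The design and publish steps can be combined into one helper. This sketch uses only the calls and response keys shown above and takes the client as a parameter; `design_and_publish` is a hypothetical name, and picking the first preview is an arbitrary choice. Whether the published voice keeps the preview's ID is not stated here, so the helper returns the publish result alongside it.

```python
def design_and_publish(tts, design_prompt, preview_text, display_name):
    """Design a voice, take the first preview, and publish it.

    Returns (preview_voice_id, publish_result).
    """
    result = tts.design_voice(design_prompt=design_prompt, preview_text=preview_text)
    previews = result.get("previewVoices") or []
    if not previews:
        raise ValueError("design_voice() returned no preview voices")
    voice_id = previews[0]["voiceId"]
    published = tts.publish_voice(voice_id, display_name=display_name)
    return voice_id, published
```

In practice you would probably listen to each preview before choosing which one to publish.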

migrate_from_elevenlabs()

Migrate a voice from ElevenLabs to your Inworld workspace. No ElevenLabs SDK required.

result = tts.migrate_from_elevenlabs("el_api_key", "el_voice_id")
print(result["elevenlabs_name"], "→", result["inworld_voice_id"])

See Voice Management in the API Reference for all parameters.


Errors

| Exception | When |
| --- | --- |
| MissingApiKeyError | No API key found at construction |
| ApiError | API returned 4xx/5xx; has .code and .details |
| NetworkError | Connection or timeout failure |

All inherit from InworldTTSError.

from inworld_tts import ApiError, InworldTTS, MissingApiKeyError, NetworkError

try:
    tts = InworldTTS()  # MissingApiKeyError is raised here, at construction
    audio = tts.generate("Hello!", voice="Dennis")
except MissingApiKeyError as e:
    print(f"Missing API key: {e}")
except ApiError as e:
    print(f"HTTP {e.code}: {e}")
except NetworkError as e:
    print(f"Network error: {e}")

CLI

The API key is read from INWORLD_API_KEY or passed with --api-key. Voice defaults to Dennis; use --voice to choose another. Run inworld-tts --help for all options.

# synthesize text (voice defaults to Dennis)
inworld-tts "Hello, world!" -o hello.mp3

# choose a voice
inworld-tts "Hello" -o hello.mp3 --voice Sarah

# read from a text file (any length)
inworld-tts story.txt -o story.mp3 --voice Dennis

# choose a model (inworld-tts-2 is the recommended latest generation)
inworld-tts "Hello" -o hello.mp3 --voice Dennis --model inworld-tts-2

# stream (lower latency to first audio)
inworld-tts "Hello" -o hello.mp3 --voice Dennis --stream

# play audio immediately (no output file needed)
inworld-tts "Hello world" --voice Dennis --play

# save and play
inworld-tts story.txt --voice Dennis --play -o story.mp3

# other formats
inworld-tts "Hello" -o hello.wav --voice Dennis --encoding WAV

# audio quality options
inworld-tts "Hello" -o hello.mp3 --voice Dennis --bit-rate 192000
inworld-tts "Hello" -o hello.wav --voice Dennis --encoding LINEAR16 --sample-rate 44100

List voices (CLI)

inworld-tts list-voices
inworld-tts list-voices --lang EN_US

Migrate from ElevenLabs (CLI)

inworld-tts migrate-from-elevenlabs --elevenlabs-key el_... --voice-id abc123

# preview first (no cloning)
inworld-tts migrate-from-elevenlabs --elevenlabs-key el_... --voice-id abc123 --dry-run

Examples

Runnable examples are in the examples/ directory:

| File | What it shows |
| --- | --- |
| hello_world.py | Text → MP3 in 3 lines |
| stream_audio.py | Real-time streaming: play each chunk as it arrives |
| list_voices.py | List all available voices, with optional language filter |
| clone_voice.py | Clone a voice from a WAV/MP3 recording |
| design_voice.py | Design a voice from a text description, preview, and publish |
| generate_timestamps.py | Word-level timestamps: print each word's start/end time |
| stream_timestamps.py | Per-chunk timestamps while streaming |

Troubleshooting

MissingApiKeyError / ApiError 401

Set INWORLD_API_KEY or pass api_key= directly. If the key is set but rejected, regenerate it at platform.inworld.ai.

stream() requires async

stream() is an async generator — call it inside an async function:

import asyncio

async def main():
    async for chunk in tts.stream("Hello", voice="Dennis"):
        ...

asyncio.run(main())
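
When the calling code is synchronous and only needs the complete audio, the async generator can be drained behind a small wrapper. This is a generic asyncio sketch; `collect_chunks` and `stream_to_bytes` are hypothetical names, and it assumes each chunk is a bytes object as in the stream() example.

```python
import asyncio


async def collect_chunks(chunks):
    """Concatenate all byte chunks from an async iterator."""
    buffer = bytearray()
    async for chunk in chunks:
        buffer.extend(chunk)
    return bytes(buffer)


def stream_to_bytes(make_stream):
    """Run an async chunk stream to completion from synchronous code.

    make_stream is a zero-argument callable returning the async iterator,
    so the stream is created inside the running event loop.
    """
    async def runner():
        return await collect_chunks(make_stream())

    return asyncio.run(runner())
```

For example, `stream_to_bytes(lambda: tts.stream("Hello", voice="Dennis"))`. `asyncio.run()` cannot be called while another event loop is already running (for instance in a notebook); there, `await collect_chunks(...)` directly instead.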

License

MIT
