Inworld TTS SDK – generate, stream, and manage voices

inworld-tts

Python SDK for the Inworld TTS API — generate, stream, and manage voices.

API Reference · Changelog · Platform


Install

pip install inworld-tts

Requires Python 3.10+.


Authentication

Pass your API key directly or set INWORLD_API_KEY in your environment:

export INWORLD_API_KEY=your_api_key

from inworld_tts import InworldTTS

tts = InworldTTS()                        # reads INWORLD_API_KEY from env
tts = InworldTTS(api_key="your_api_key")  # or pass directly

Get your key at platform.inworld.ai.
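
To fail early with a clear message before constructing the client, the lookup described above can be sketched as follows. `resolve_api_key` is a hypothetical helper, not part of the SDK, and it assumes an explicitly passed key takes precedence over the environment variable.

```python
import os


def resolve_api_key(explicit_key=None):
    """Return the explicit key if given, else INWORLD_API_KEY from the environment.

    Hypothetical helper for illustration; not part of inworld-tts.
    """
    key = explicit_key or os.environ.get("INWORLD_API_KEY")
    if not key:
        raise RuntimeError("Set INWORLD_API_KEY or pass api_key= explicitly")
    return key
```

The returned value can then be passed as `InworldTTS(api_key=...)`.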


Quickstart

from inworld_tts import InworldTTS

tts = InworldTTS()
tts.generate("Hello, world!", voice="Dennis", output_file="hello.mp3")

Models

Model ID              Quality         Default for
inworld-tts-1.5-max   Higher quality  generate()
inworld-tts-1.5-mini  Lower latency   stream()

Use max when quality is the priority (e.g. audiobooks, voiceovers). Use mini for real-time use cases (e.g. voice assistants).
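
The rule of thumb above can be written down as a small selector. This is only a sketch: `pick_model` and the two constants are hypothetical names, and only the model IDs themselves come from the table.

```python
MODEL_MAX = "inworld-tts-1.5-max"    # higher quality
MODEL_MINI = "inworld-tts-1.5-mini"  # lower latency


def pick_model(realtime: bool) -> str:
    """Return the mini model for real-time use, otherwise the max model.

    Hypothetical helper for illustration; not part of inworld-tts.
    """
    return MODEL_MINI if realtime else MODEL_MAX
```

The returned ID can be used wherever a model is accepted, for example with the CLI's --model option.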


Constructor

tts = InworldTTS(
    api_key="your_key",
    timeout=120,                 # HTTP timeout in seconds (default: per-method)
    max_concurrent_requests=4,   # parallel chunk requests for long text (default: 2)
    max_retries=2,               # retry on network errors / 5xx with exponential backoff (default: 2)
    debug=True,                  # log requests, responses, and timing
)

See Constructor in the API Reference for full parameter details and per-method timeout defaults.
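
The max_retries comment mentions exponential backoff, but the actual delays are not stated here. As an illustration of the general technique only, the sketch below assumes a base delay of 0.5 seconds that doubles on each retry; `backoff_delays` and both default values are hypothetical and may not match what the SDK does.

```python
def backoff_delays(max_retries: int, base: float = 0.5, factor: float = 2.0) -> list:
    """Return the wait (in seconds) before each retry attempt.

    Illustrative only: base and factor are assumed values, not SDK defaults.
    """
    return [base * factor**attempt for attempt in range(max_retries)]
```

With these assumed values, max_retries=2 would wait 0.5 s before the first retry and 1.0 s before the second.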


generate()

Synthesize speech from text of any length. Blocks until all audio is ready.

# Save to file
tts.generate("Hello!", voice="Dennis", output_file="hello.mp3")

# Get bytes for further processing
audio = tts.generate("Hello!", voice="Dennis")

# Generate, save, and play
tts.generate("Hello!", voice="Dennis", output_file="hello.mp3", play=True)

stream()

Async streaming: the first audio chunk arrives sooner than with generate(), which blocks until all audio is ready. Each call accepts at most 2000 characters.

import asyncio

async def main():
    async for chunk in tts.stream("Hello, world!", voice="Dennis"):
        pass  # process chunk (bytes) as it arrives

asyncio.run(main())
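
Because each stream() call is limited to 2000 characters, longer text has to be split first if you want to stream it. The sketch below is one possible way to do that, preferring sentence boundaries; `split_for_stream` is a hypothetical helper, not part of the SDK.

```python
import re

MAX_STREAM_CHARS = 2000  # per-call limit stated above


def split_for_stream(text: str, limit: int = MAX_STREAM_CHARS) -> list:
    """Split text into pieces of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pieces, current = [], ""
    for sentence in sentences:
        # Hard-wrap any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                pieces.append(current)
                current = ""
            pieces.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            pieces.append(current)
            current = sentence
    if current:
        pieces.append(current)
    return pieces
```

Each piece can then be passed to tts.stream() in turn. If low latency is not needed, generate() already accepts text of any length.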

Timestamps

generate_with_timestamps() and stream_with_timestamps() return word- or character-level timing alongside audio.

result = tts.generate_with_timestamps("Hello, world!", voice="Dennis", timestamp_type="WORD")
wa = result["timestamps"]["wordAlignment"]
for word, start, end in zip(wa["words"], wa["wordStartTimeSeconds"], wa["wordEndTimeSeconds"]):
    print(f"{word}: {start:.2f}s – {end:.2f}s")

See generate_with_timestamps() and stream_with_timestamps() in the API Reference for full details.
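
One common use for word-level timing is captions. The sketch below turns a wordAlignment dictionary with the three keys shown in the example above into SubRip (SRT) text, one cue per word. `srt_time` and `words_to_srt` are hypothetical helpers, not part of the SDK.

```python
def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(word_alignment: dict) -> str:
    """Build SRT text with one numbered cue per word."""
    rows = zip(
        word_alignment["words"],
        word_alignment["wordStartTimeSeconds"],
        word_alignment["wordEndTimeSeconds"],
    )
    cues = [
        f"{i}\n{srt_time(start)} --> {srt_time(end)}\n{word}"
        for i, (word, start, end) in enumerate(rows, start=1)
    ]
    return "\n\n".join(cues) + "\n"
```

Usage would look like `words_to_srt(result["timestamps"]["wordAlignment"])`, assuming the result was requested with timestamp_type="WORD".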


play()

Play audio from bytes or a file path. Encoding is auto-detected from magic bytes.

audio = tts.generate("Hello!", voice="Dennis")
tts.play(audio)

tts.play("hello.mp3")               # file path also accepted
tts.play(pcm_bytes, encoding="PCM") # encoding hint required for raw PCM/ALAW/MULAW

See play() in the API Reference for platform player details.
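
To show what detection from magic bytes means in practice, here is a minimal sketch covering only WAV and MP3. It is not the SDK's implementation; `sniff_encoding` is a hypothetical helper. Headerless formats such as raw PCM have no signature to match, which is why they need an explicit encoding hint.

```python
from typing import Optional


def sniff_encoding(data: bytes) -> Optional[str]:
    """Guess the container from leading bytes; return None if unrecognized."""
    # WAV: RIFF container with a WAVE form type at offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "WAV"
    # MP3 with an ID3v2 tag at the start.
    if data[:3] == b"ID3":
        return "MP3"
    # Bare MPEG audio frame: the first 11 bits are all set (frame sync).
    if len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0:
        return "MP3"
    return None
```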


list_voices()

List voices in your workspace, with an optional language filter.

voices = tts.list_voices()
voices = tts.list_voices(lang="EN_US")
voices = tts.list_voices(lang=["EN_US", "ES_ES"])

get_voice()

Get details of a specific voice.

voice = tts.get_voice("workspace__my_clone")

update_voice()

Update a voice's display name, description, or tags.

tts.update_voice("workspace__my_clone", display_name="Narrator", tags=["calm"])

delete_voice()

Delete a voice from your workspace.

tts.delete_voice("workspace__my_clone")

clone_voice()

Clone a voice from one or more audio recordings (WAV/MP3).

result = tts.clone_voice(["sample.wav"], display_name="My Clone")
voice_id = result["voice"]["voiceId"]

design_voice()

Design a voice from a text description (no recording needed), then publish the preview.

result = tts.design_voice(
    design_prompt="A warm, friendly narrator",
    preview_text="Hello, welcome to our audiobook.",
)
voice_id = result["previewVoices"][0]["voiceId"]

publish_voice()

Publish a designed or cloned voice preview to your library.

tts.publish_voice(voice_id, display_name="My Custom Voice")
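
The design and publish steps can be chained in one function. This is a sketch that uses only the calls and response keys shown above; `design_and_publish` is a hypothetical wrapper, and picking the first preview is an assumption made for brevity.

```python
def design_and_publish(tts, prompt: str, preview_text: str, display_name: str) -> str:
    """Design a voice, publish the first preview, and return its voice ID.

    `tts` is expected to be an InworldTTS client (or any object with the
    same design_voice / publish_voice methods).
    """
    result = tts.design_voice(design_prompt=prompt, preview_text=preview_text)
    previews = result["previewVoices"]
    if not previews:
        raise ValueError("design_voice returned no preview voices")
    voice_id = previews[0]["voiceId"]  # assumption: take the first preview
    tts.publish_voice(voice_id, display_name=display_name)
    return voice_id
```

In real use you would likely listen to the previews before choosing which one to publish.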

migrate_from_elevenlabs()

Migrate a voice from ElevenLabs to your Inworld workspace. No ElevenLabs SDK required.

result = tts.migrate_from_elevenlabs("el_api_key", "el_voice_id")
print(result["elevenlabs_name"], "→", result["inworld_voice_id"])

See Voice Management in the API Reference for all parameters.


Errors

Exception           When
MissingApiKeyError  No API key found at construction
ApiError            API returned 4xx/5xx; exposes .code and .details
NetworkError        Connection or timeout failure

All inherit from InworldTTSError.

from inworld_tts import ApiError, InworldTTS, MissingApiKeyError, NetworkError

try:
    tts = InworldTTS()  # MissingApiKeyError is raised here if no key is found
    audio = tts.generate("Hello!", voice="Dennis")
except MissingApiKeyError as e:
    print(f"Missing API key: {e}")
except ApiError as e:
    print(f"HTTP {e.code}: {e}")
except NetworkError as e:
    print(f"Network error: {e}")

CLI

The API key is read from INWORLD_API_KEY or passed with --api-key. Voice defaults to Dennis; use --voice to choose another. Run inworld-tts --help for all options.

# synthesize text (voice defaults to Dennis)
inworld-tts "Hello, world!" -o hello.mp3

# choose a voice
inworld-tts "Hello" -o hello.mp3 --voice Sarah

# read from a text file (any length)
inworld-tts story.txt -o story.mp3 --voice Dennis

# choose a model
inworld-tts "Hello" -o hello.mp3 --voice Dennis --model inworld-tts-1.5-max

# stream (lower latency to first audio)
inworld-tts "Hello" -o hello.mp3 --voice Dennis --stream

# play audio immediately (no output file needed)
inworld-tts "Hello world" --voice Dennis --play

# save and play
inworld-tts story.txt --voice Dennis --play -o story.mp3

# other formats
inworld-tts "Hello" -o hello.wav --voice Dennis --encoding WAV

# audio quality options
inworld-tts "Hello" -o hello.mp3 --voice Dennis --bit-rate 192000
inworld-tts "Hello" -o hello.wav --voice Dennis --encoding LINEAR16 --sample-rate 44100

List voices (CLI)

inworld-tts list-voices
inworld-tts list-voices --lang EN_US

Migrate from ElevenLabs (CLI)

inworld-tts migrate-from-elevenlabs --elevenlabs-key el_... --voice-id abc123

# preview first (no cloning)
inworld-tts migrate-from-elevenlabs --elevenlabs-key el_... --voice-id abc123 --dry-run

Examples

Runnable examples are in the examples/ directory:

File                    What it shows
hello_world.py          Text → MP3 in 3 lines
stream_audio.py         Real-time streaming: play each chunk as it arrives
list_voices.py          List all available voices, with optional language filter
clone_voice.py          Clone a voice from a WAV/MP3 recording
design_voice.py         Design a voice from a text description, preview, and publish
generate_timestamps.py  Word-level timestamps: print each word's start/end time
stream_timestamps.py    Per-chunk timestamps while streaming

Troubleshooting

MissingApiKeyError / ApiError 401

Set INWORLD_API_KEY or pass api_key= directly. If the key is set but rejected, regenerate it at platform.inworld.ai.

stream() requires async

stream() is an async generator — call it inside an async function:

import asyncio

async def main():
    async for chunk in tts.stream("Hello", voice="Dennis"):
        ...

asyncio.run(main())
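
If you want the streamed audio as a single bytes object from synchronous code, the chunks can be gathered in a small coroutine and run with asyncio.run(). `collect_audio` is a hypothetical helper, not part of the SDK, and it assumes the chunks are consecutive pieces of one audio stream that can simply be concatenated.

```python
import asyncio


async def collect_audio(chunks) -> bytes:
    """Consume an async iterator of bytes chunks and return them joined."""
    buffer = bytearray()
    async for chunk in chunks:
        buffer.extend(chunk)
    return bytes(buffer)
```

For example: `audio = asyncio.run(collect_audio(tts.stream("Hello", voice="Dennis")))`. This gives up the latency benefit of streaming, so it mainly suits scripts that cannot be made async.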

License

MIT
