Inworld AI integration for Vision Agents (TTS + Realtime WebRTC)

Inworld AI Plugin

Inworld AI integration for Vision Agents. Provides both text-to-speech and a WebRTC-based Realtime speech-to-speech conversational API.

Installation

uv add "vision-agents[inworld]"
# or directly
uv add vision-agents-plugins-inworld

Get your API key from the Inworld Portal and set INWORLD_API_KEY in your environment (or pass api_key= explicitly).
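
The lookup order described above (explicit argument first, then the environment) can be sketched as follows. This is only an illustration: resolve_api_key is a hypothetical helper, not part of the plugin, and the plugin's own resolution logic may differ.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(explicit: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else INWORLD_API_KEY."""
    # Fall back to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env
    key = explicit or env.get("INWORLD_API_KEY", "")
    if not key:
        # Fail early with a readable message instead of at the first request.
        raise RuntimeError("Set INWORLD_API_KEY or pass api_key= explicitly")
    return key
```

Failing at startup rather than on the first synthesis request tends to make a missing key easier to diagnose.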

TTS

High-quality text-to-speech with streaming support. The plugin defaults to Inworld's TTS-2 model (currently in research preview). Compared with the previous inworld-tts-1.5-* generation, TTS-2 adds natural-language steering, support for 100+ languages (15 GA, 90+ experimental), and high-quality instant voice cloning.

from vision_agents.plugins import inworld

# Defaults to model_id="inworld-tts-2", voice_id="Sarah"
tts = inworld.TTS()

# Or specify explicitly
tts = inworld.TTS(
    api_key="your_inworld_api_key",
    voice_id="Ashley",
    model_id="inworld-tts-2",
    temperature=1.1,
)

TTS options

api_key: Inworld AI API key (default: reads from INWORLD_API_KEY)
voice_id: Voice to use (default: "Sarah"; "Dennis", "Ashley", "Olivia", "Clive" and custom/cloned voices also supported)
model_id: "inworld-tts-2" (default), "inworld-tts-1.5-max", "inworld-tts-1.5-mini". "inworld-tts-1" and "inworld-tts-1-max" are deprecated by Inworld — migrate to inworld-tts-2 or inworld-tts-1.5-*.
temperature: 0–2 (default: 1.1)
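
As a rough sketch of how the constraints listed above could be checked before constructing a TTS instance: check_tts_options and the two model sets are hypothetical names built only from the values in this list, and the plugin itself may validate differently or not at all.

```python
# Model IDs copied from the options list above; not read from the plugin.
SUPPORTED_MODELS = {"inworld-tts-2", "inworld-tts-1.5-max", "inworld-tts-1.5-mini"}
DEPRECATED_MODELS = {"inworld-tts-1", "inworld-tts-1-max"}


def check_tts_options(model_id: str = "inworld-tts-2",
                      temperature: float = 1.1) -> list[str]:
    """Hypothetical pre-flight check; returns a list of warning strings."""
    if not 0 <= temperature <= 2:
        raise ValueError(f"temperature must be within 0-2, got {temperature}")
    warnings = []
    if model_id in DEPRECATED_MODELS:
        warnings.append(
            f"{model_id} is deprecated; migrate to inworld-tts-2 or inworld-tts-1.5-*"
        )
    elif model_id not in SUPPORTED_MODELS:
        warnings.append(f"{model_id} is not one of the model IDs listed in this README")
    return warnings
```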

The plugin requests LINEAR16 (16-bit PCM WAV) chunks from Inworld, so each streamed chunk is self-contained and decodes cleanly on its own. No extra configuration is needed.
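
To illustrate what "self-contained" means here, the sketch below decodes a single WAV chunk with Python's standard wave module. It assumes each chunk carries a regular RIFF/WAV header with valid size fields; decode_wav_chunk is a hypothetical helper, and the plugin does this decoding for you internally.

```python
import io
import wave


def decode_wav_chunk(chunk: bytes) -> tuple[bytes, int, int]:
    """Return (pcm_frames, sample_rate, channels) for one WAV chunk."""
    with wave.open(io.BytesIO(chunk), "rb") as wav:
        # LINEAR16 means 2 bytes per sample; reject anything else.
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        frames = wav.readframes(wav.getnframes())
        return frames, wav.getframerate(), wav.getnchannels()
```

Because every chunk has its own header, no decoder state needs to be carried from one chunk to the next.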

Steering (TTS-2)

TTS-2 takes natural-language stage directions inline with your text. Place the instruction in square brackets before the segment it should apply to:

text = (
    "[whisper in a hushed style] I have to tell you something. "
    "[laugh] Just kidding! [say with force] Now let's get to work."
)
async for chunk in await tts.stream_audio(text):
    ...

Steering covers articulation, intonation, volume, pitch, range, speed, and vocal style — and supports non-verbal sounds like [laugh], [breathe], [clear throat], [sigh], [cough], [yawn]. Combining dimensions ([whisper in a hushed style], [say playfully and very fast]) produces better results than bare single-word tags. See Inworld's steering docs and prompting guide for the full reference.
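
If you assemble steered text programmatically, a pair of small helpers can keep the bracket syntax consistent. This is a sketch only: steer, list_tags, and strip_tags are hypothetical names, not plugin APIs, and they rely solely on the square-bracket convention described above.

```python
import re

# Matches one bracketed direction such as "[laugh]" and captures its contents.
TAG = re.compile(r"\[([^\[\]]+)\]")


def steer(direction: str, text: str) -> str:
    """Prefix a text segment with a bracketed stage direction."""
    return f"[{direction.strip()}] {text.strip()}"


def list_tags(text: str) -> list[str]:
    """Return the directions found in the text, in order."""
    return TAG.findall(text)


def strip_tags(text: str) -> str:
    """Remove directions, e.g. to show plain captions alongside the audio."""
    without_tags = TAG.sub("", text)
    return re.sub(r"\s{2,}", " ", without_tags).strip()
```

For example, steer("whisper in a hushed style", "I have to tell you something.") produces the first segment of the text shown earlier in this section.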

Agent example

A complete example wiring inworld.TTS() into a Stream-edge agent with Deepgram STT, Gemini LLM, and smart-turn detection lives at example/inworld_tts_example.py. The companion example/inworld-audio-guide.md is loaded as the agent's system prompt and teaches the LLM how to emit TTS-2 steering tags so replies sound expressive out of the box.

Realtime (WebRTC)

Low-latency speech-to-speech via Inworld's Realtime API. This transport uses WebRTC (UDP, native Opus) for lower latency than the WebSocket alternative. Requires a WebRTC-capable edge transport — pair with getstream.Edge() as shown below.

from vision_agents.core import Agent, User
from vision_agents.plugins import getstream, inworld, smart_turn

agent = Agent(
    edge=getstream.Edge(),
    agent_user=User(name="My Agent", id="agent"),
    llm=inworld.Realtime(
        model="openai/gpt-4o-mini",
        voice="Dennis",
        instructions="You are a friendly voice assistant.",
    ),
    turn_detection=smart_turn.TurnDetection(),
)

Realtime options

model: provider-prefixed model ID. Examples: "openai/gpt-4o-mini" (default), "google-ai-studio/gemini-2.5-flash", "inworld/<router-id>" for an Inworld router
voice: voice for audio responses (default: "Dennis"; "Clive", "Olivia" and custom voices also supported)
api_key: Inworld AI API key (default: reads from INWORLD_API_KEY)
instructions: system prompt
realtime_session: advanced — pass a full RealtimeSessionCreateRequestParam for session fields not exposed by the primary args (custom turn-detection, tool_choice, etc.)
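
The provider-prefixed model format above splits on the first slash. The sketch below shows one way to separate the two parts, for example for logging or routing; split_model_id is a hypothetical helper, and the plugin may pass the string through without parsing it.

```python
def split_model_id(model: str) -> tuple[str, str]:
    """Split "provider/model-name" into (provider, model_name)."""
    # partition splits on the first "/" only, so any later slashes
    # stay inside the model name.
    provider, sep, name = model.partition("/")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider/model', got {model!r}")
    return provider, name
```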

Registering tools

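from vision_agents.plugins import inworld

realtime = inworld.Realtime()

# The decorator registers the coroutine as a tool the model can call.
@realtime.register_function(description="Get the current weather for a city.")
async def get_weather(city: str) -> str:
    return f"It's sunny in {city}."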

Tools follow the OpenAI function-calling schema. Inworld's Realtime API is protocol-compatible with OpenAI's Realtime API, so registered functions flow through the same response.function_call_arguments.done path.
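
For orientation, the sketch below derives an OpenAI-style function tool definition from a Python signature. It uses the flat shape found in OpenAI Realtime session tools ("type", "name", "description", "parameters"). The to_function_tool helper is hypothetical; how the plugin actually builds its schema is not shown in this README.

```python
import inspect
from typing import Any, Callable

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def to_function_tool(fn: Callable[..., Any], description: str) -> dict[str, Any]:
    """Hypothetical helper: describe fn as a function-calling tool."""
    params = inspect.signature(fn).parameters
    properties = {
        # Unknown or missing annotations fall back to "string" in this sketch.
        name: {"type": _JSON_TYPES.get(p.annotation, "string")}
        for name, p in params.items()
    }
    # Parameters without a default value are treated as required.
    required = [
        name for name, p in params.items()
        if p.default is inspect.Parameter.empty
    ]
    return {
        "type": "function",
        "name": fn.__name__,
        "description": description,
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }


async def get_weather(city: str) -> str:
    return f"It's sunny in {city}."


tool = to_function_tool(get_weather, "Get the current weather for a city.")
```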

Notes

v1 is WebRTC only; a WebSocket transport may be added later.
Video input is not currently supported by Inworld's Realtime API.

Requirements

Python 3.10+
httpx>=0.28, av>=10, aiortc>=1.9, openai[realtime]>=2.26,<3
