Agent Framework plugin for voice synthesis and speech-to-text with Inworld's API.
Inworld plugin for LiveKit Agents
Support for voice synthesis and speech-to-text with Inworld TTS and Inworld STT.
See Inworld TTS and Inworld STT for more information.
Installation
pip install livekit-plugins-inworld
Authentication
Set INWORLD_API_KEY in your .env file. API keys are issued by Inworld.
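A startup check can surface a missing key before the first request fails. The sketch below uses only the standard library. The helper name require_inworld_api_key is hypothetical and not part of the plugin, and it assumes the .env file has already been loaded into the process environment (for example with load_dotenv() from python-dotenv).

```python
import os
from typing import Mapping, Optional


def require_inworld_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return INWORLD_API_KEY, or raise a clear error if it is unset or blank.

    Hypothetical helper for illustration only; the plugin does not provide it.
    """
    source = os.environ if env is None else env
    key = source.get("INWORLD_API_KEY", "").strip()
    if not key:
        raise RuntimeError("INWORLD_API_KEY is not set; add it to your .env file")
    return key
```

Calling require_inworld_api_key() once at startup turns a confusing authentication failure later on into an immediate, readable error.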
Usage
TTS
Use Inworld TTS within an AgentSession or as a standalone speech generator.
from livekit.plugins import inworld
tts = inworld.TTS()
Or with options:
from livekit.plugins import inworld
tts = inworld.TTS(
    voice="Hades",  # voice ID (default or custom cloned voice)
    model="inworld-tts-1",  # or "inworld-tts-1-max"
    encoding="OGG_OPUS",  # LINEAR16, MP3, OGG_OPUS, ALAW, MULAW, FLAC
    sample_rate=48000,  # 8000-48000 Hz
    bit_rate=64000,  # bits per second (for compressed formats)
    speaking_rate=1.0,  # 0.5-1.5
    temperature=1.1,  # 0-2
    timestamp_type="WORD",  # WORD, CHARACTER, or TIMESTAMP_TYPE_UNSPECIFIED
    text_normalization="OFF",  # ON, OFF, or APPLY_TEXT_NORMALIZATION_UNSPECIFIED
)
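When these options come from user configuration, it can help to check them against the documented ranges before constructing the TTS object. The following is a minimal sketch under assumptions: check_tts_options is a hypothetical helper (not part of the plugin), the allowed values are copied from the comments in the options example, and the ranges are assumed to be inclusive at both ends.

```python
from typing import List

# Encodings listed in the options example (assumed to be the complete set).
ENCODINGS = {"LINEAR16", "MP3", "OGG_OPUS", "ALAW", "MULAW", "FLAC"}


def check_tts_options(
    encoding: str,
    sample_rate: int,
    speaking_rate: float,
    temperature: float,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK.

    Hypothetical helper; ranges are assumed inclusive.
    """
    problems: List[str] = []
    if encoding not in ENCODINGS:
        problems.append(f"unknown encoding: {encoding!r}")
    if not 8000 <= sample_rate <= 48000:
        problems.append(f"sample_rate out of range (8000-48000): {sample_rate}")
    if not 0.5 <= speaking_rate <= 1.5:
        problems.append(f"speaking_rate out of range (0.5-1.5): {speaking_rate}")
    if not 0 <= temperature <= 2:
        problems.append(f"temperature out of range (0-2): {temperature}")
    return problems
```

Returning a list rather than raising on the first problem lets a caller report every invalid setting at once.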
TTS Streaming
Inworld TTS supports WebSocket streaming for lower-latency, real-time synthesis. Use the
stream() method to send text as it is generated:
from livekit.plugins import inworld

tts = inworld.TTS(
    voice="Hades",
    model="inworld-tts-1",
    buffer_char_threshold=100,  # chars before triggering synthesis (default: 100)
    max_buffer_delay_ms=3000,  # max buffer time in ms (default: 3000)
)


async def synthesize() -> None:
    # `async for` is only valid inside a coroutine, so the stream is used here.
    # Create a stream for real-time synthesis
    stream = tts.stream()
    # Push text incrementally
    stream.push_text("Hello, ")
    stream.push_text("how are you today?")
    stream.flush()  # Flush any remaining buffered text
    stream.end_input()  # Signal end of input
    # Consume audio as it's generated
    async for audio in stream:
        # Process audio frames
        pass
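The two buffering options, buffer_char_threshold and max_buffer_delay_ms, suggest a "flush on size or on age" policy. The toy class below sketches that idea with the standard library only. It is an illustration of the general technique, not the plugin's or the service's implementation: TextBuffer, its method names, and the exact flush conditions are assumptions, and the clock is injectable purely to make the behavior easy to test.

```python
import time
from typing import Callable, List, Optional


class TextBuffer:
    """Toy buffer that releases text on a character threshold or a max delay.

    Hypothetical sketch; not how the plugin is implemented.
    """

    def __init__(
        self,
        char_threshold: int = 100,
        max_delay_ms: int = 3000,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._char_threshold = char_threshold
        self._max_delay_s = max_delay_ms / 1000
        self._clock = clock
        self._parts: List[str] = []
        self._size = 0
        self._first_push: Optional[float] = None

    def push(self, text: str) -> Optional[str]:
        """Add text; return the buffered chunk if a flush condition is met."""
        now = self._clock()
        if self._first_push is None:
            self._first_push = now  # age is measured from the oldest pending text
        self._parts.append(text)
        self._size += len(text)
        too_big = self._size >= self._char_threshold
        too_old = now - self._first_push >= self._max_delay_s
        return self.flush() if (too_big or too_old) else None

    def flush(self) -> Optional[str]:
        """Return all pending text and reset, or None if nothing is pending."""
        if not self._parts:
            return None
        chunk = "".join(self._parts)
        self._parts = []
        self._size = 0
        self._first_push = None
        return chunk
```

In this sketch the age check only runs when new text arrives. A production buffer would typically also flush from a timer, so that pending text is not held indefinitely when input pauses.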
STT
Use Inworld STT for streaming speech-to-text. Multiple models are supported.
from livekit.agents import AgentSession
from livekit.plugins import inworld

session = AgentSession(
    stt=inworld.STT(),
    # ... llm, tts, etc.
)
With a specific model and voice profile detection:
from livekit.agents import AgentSession
from livekit.plugins import inworld

session = AgentSession(
    stt=inworld.STT(
        model="inworld/inworld-stt-1",
        enable_voice_profile=True,
    ),
    # ... llm, tts, etc.
)
Example
A full voice agent using Inworld for both STT and TTS:
"""Inworld STT + TTS voice agent example.

Demonstrates using Inworld for both speech-to-text and text-to-speech
in a LiveKit voice agent. Save this as ``inworld_agent.py`` and run:

    uv run inworld_agent.py console  # local console mode
    uv run inworld_agent.py dev      # LiveKit Cloud (requires LIVEKIT_URL,
                                     # LIVEKIT_API_KEY, LIVEKIT_API_SECRET)

Then connect via https://agents-playground.livekit.io
"""

import logging

from dotenv import load_dotenv

from livekit.agents import (
    Agent,
    AgentServer,
    AgentSession,
    JobContext,
    JobProcess,
    cli,
    metrics,
    room_io,
)
from livekit.plugins import inworld, silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

logger = logging.getLogger("inworld-agent")

load_dotenv()


class InworldAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "Your name is Nova. You interact with users via voice. "
                "Keep your responses concise and to the point. "
                "Do not use emojis, asterisks, markdown, or other special characters. "
                "You are helpful, curious, and friendly."
            ),
        )

    async def on_enter(self):
        self.session.generate_reply()


server = AgentServer()


def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()


server.setup_fnc = prewarm


@server.rtc_session()
async def entrypoint(ctx: JobContext):
    ctx.log_context_fields = {"room": ctx.room.name}
    session = AgentSession(
        stt=inworld.STT(model="inworld/inworld-stt-1"),
        llm="openai/gpt-4.1-mini",
        tts=inworld.TTS(voice="Clive"),
        turn_detection=MultilingualModel(),
        vad=ctx.proc.userdata["vad"],
    )
    usage_collector = metrics.UsageCollector()

    @session.on("metrics_collected")
    def _on_metrics(ev):
        metrics.log_metrics(ev.metrics)
        usage_collector.collect(ev.metrics)

    async def log_usage():
        logger.info(f"Usage: {usage_collector.get_summary()}")

    ctx.add_shutdown_callback(log_usage)

    await session.start(
        agent=InworldAgent(),
        room=ctx.room,
        room_options=room_io.RoomOptions(),
    )


if __name__ == "__main__":
    cli.run_app(server)
Combined TTS + STT
from livekit.agents import AgentSession
from livekit.plugins import inworld

session = AgentSession(
    tts=inworld.TTS(voice="Hades"),
    stt=inworld.STT(),
    # ... llm, etc.
)