Skip to main content

Agent Framework plugin for voice synthesis and speech-to-text with Inworld's API.

Project description

Inworld plugin for LiveKit Agents

Support for voice synthesis and speech-to-text with Inworld TTS and Inworld STT.

See Inworld TTS and Inworld STT for more information.

Installation

pip install livekit-plugins-inworld

Authentication

Set INWORLD_API_KEY in your .env file (get one here).

Usage

TTS

Use Inworld TTS within an AgentSession or as a standalone speech generator.

from livekit.plugins import inworld

tts = inworld.TTS()

Or with options:

from livekit.plugins import inworld

tts = inworld.TTS(
    voice="Hades",                 # voice ID (default or custom cloned voice)
    model="inworld-tts-1",         # or "inworld-tts-1-max"
    encoding="OGG_OPUS",           # LINEAR16, MP3, OGG_OPUS, ALAW, MULAW, FLAC
    sample_rate=48000,             # 8000-48000 Hz
    bit_rate=64000,                # bits per second (for compressed formats)
    speaking_rate=1.0,             # 0.5-1.5
    temperature=1.1,               # 0-2
    timestamp_type="WORD",         # WORD, CHARACTER, or TIMESTAMP_TYPE_UNSPECIFIED
    text_normalization="OFF",      # ON, OFF, or APPLY_TEXT_NORMALIZATION_UNSPECIFIED
)

TTS Streaming

Inworld TTS supports WebSocket streaming for lower latency real-time synthesis. Use the stream() method for streaming text as it's generated:

from livekit.plugins import inworld

tts = inworld.TTS(
    voice="Hades",
    model="inworld-tts-1",
    buffer_char_threshold=100,     # chars before triggering synthesis (default: 100)
    max_buffer_delay_ms=3000,      # max buffer time in ms (default: 3000)
)

# Create a stream for real-time synthesis
stream = tts.stream()

# Push text incrementally
stream.push_text("Hello, ")
stream.push_text("how are you today?")
stream.flush()  # Flush any remaining buffered text
stream.end_input()  # Signal end of input

# Consume audio as it's generated
async for audio in stream:
    # Process audio frames
    pass

STT

Use Inworld STT for streaming speech-to-text. Multiple models are supported.

from livekit.plugins import inworld

session = AgentSession(
   stt=inworld.STT()
   # ... llm, tts, etc.
)

With a specific model and voice profile detection:

from livekit.plugins import inworld

session = AgentSession(
   stt=inworld.STT(
       model="inworld/inworld-stt-1",
       enable_voice_profile=True,
   )
   # ... llm, tts, etc.
)

Example

A full voice agent using Inworld for both STT and TTS:

"""Inworld STT + TTS voice agent example.

Demonstrates using Inworld for both speech-to-text and text-to-speech
in a LiveKit voice agent. Save this as ``inworld_agent.py`` and run:

    uv run inworld_agent.py console   # local console mode
    uv run inworld_agent.py dev       # LiveKit Cloud (requires LIVEKIT_URL,
                                      # LIVEKIT_API_KEY, LIVEKIT_API_SECRET)

Then connect via https://agents-playground.livekit.io
"""

import logging

from dotenv import load_dotenv

from livekit.agents import (
    Agent,
    AgentServer,
    AgentSession,
    JobContext,
    cli,
    inference,
    metrics,
    room_io,
)
from livekit.agents.inference import TurnDetector
from livekit.plugins import inworld

logger = logging.getLogger("inworld-agent")

load_dotenv()


class InworldAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                "Your name is Nova. You interact with users via voice. "
                "Keep your responses concise and to the point. "
                "Do not use emojis, asterisks, markdown, or other special characters. "
                "You are helpful, curious, and friendly."
            ),
        )

    async def on_enter(self):
        self.session.generate_reply()


server = AgentServer()


@server.rtc_session()
async def entrypoint(ctx: JobContext):
    ctx.log_context_fields = {"room": ctx.room.name}

    session = AgentSession(
        stt=inworld.STT(model="inworld/inworld-stt-1"),
        llm="openai/gpt-4.1-mini",
        tts=inworld.TTS(voice="Clive"),
        turn_detection=TurnDetector(),
        vad=inference.VAD(),
    )

    usage_collector = metrics.UsageCollector()

    @session.on("metrics_collected")
    def _on_metrics(ev):
        metrics.log_metrics(ev.metrics)
        usage_collector.collect(ev.metrics)

    async def log_usage():
        logger.info(f"Usage: {usage_collector.get_summary()}")

    ctx.add_shutdown_callback(log_usage)

    await session.start(
        agent=InworldAgent(),
        room=ctx.room,
        room_options=room_io.RoomOptions(),
    )


if __name__ == "__main__":
    cli.run_app(server)

Combined TTS + STT

from livekit.plugins import inworld

session = AgentSession(
   tts=inworld.TTS(voice="Hades"),
   stt=inworld.STT(),
   # ... llm, etc.
)

Project details


Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

livekit_plugins_inworld-1.6.1.tar.gz (21.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

livekit_plugins_inworld-1.6.1-py3-none-any.whl (23.5 kB view details)

Uploaded Python 3

File details

Details for the file livekit_plugins_inworld-1.6.1.tar.gz.

File metadata

  • Download URL: livekit_plugins_inworld-1.6.1.tar.gz
  • Upload date:
  • Size: 21.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for livekit_plugins_inworld-1.6.1.tar.gz
Algorithm Hash digest
SHA256 e8c99df1fc5243d47b56ce99609815d18f44e7a73d1f0879584f124535396b39
MD5 eb7e0f2b2ed271112db67533a93dfaa6
BLAKE2b-256 fc003be3dc8cdd3ab8f9d3a6e1bbadda98bc61cc8cebe18915050a66c5bc78b4

See more details on using hashes here.

Provenance

The following attestation bundles were made for livekit_plugins_inworld-1.6.1.tar.gz:

Publisher: publish.yml on livekit/agents

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file livekit_plugins_inworld-1.6.1-py3-none-any.whl.

File metadata

File hashes

Hashes for livekit_plugins_inworld-1.6.1-py3-none-any.whl
Algorithm Hash digest
SHA256 1571e62799a056827638751b6d9cbb2abe4d5faed404f73667c96d9ceb098ddb
MD5 bf408943971ba1cba66ee348fcac12db
BLAKE2b-256 1803fe8d06b30553076b31e25815dd0eed1e6c83b03f36ff4cb6d76897835b2e

See more details on using hashes here.

Provenance

The following attestation bundles were made for livekit_plugins_inworld-1.6.1-py3-none-any.whl:

Publisher: publish.yml on livekit/agents

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page