Rumik text-to-speech services for Pipecat

pipecat-rumik

Text-to-speech service implementations for Pipecat using Rumik AI's TTS APIs.

Overview

pipecat-rumik provides two Pipecat TTS services:

  • RumikTTSService for WebSocket-based synthesis in interactive voice pipelines.
  • RumikHttpTTSService for HTTP request/response synthesis.

The package follows Pipecat's service conventions: constructor-level provider configuration, runtime-configurable Settings, raw PCM audio frames, metrics, and standard service connection events.

from pipecat_rumik import RumikHttpTTSService, RumikTTSService, RumikTTSSettings

Compatibility and Maintenance

This package is maintained by Rumik AI. The current implementation is tested with Pipecat 1.3.0 and supports Pipecat >=1.0.0,<2.

Installation

pip install pipecat-rumik

To run the included voice pipeline examples, install the examples extra:

pip install "pipecat-rumik[examples]"

For local development:

uv sync --extra dev --extra examples

Quick Start

import os

from pipecat_rumik import RumikTTSService

tts = RumikTTSService(
    api_key=os.environ["RUMIK_API_KEY"],
    gateway_url=os.environ["RUMIK_GATEWAY_URL"],
    settings=RumikTTSService.Settings(
        model="muga",
    ),
)

For Mulberry expressive voices:

import os

from pipecat_rumik import RumikTTSService

tts = RumikTTSService(
    api_key=os.environ["RUMIK_API_KEY"],
    gateway_url=os.environ["RUMIK_GATEWAY_URL"],
    settings=RumikTTSService.Settings(
        model="mulberry",
        voice="speaker_1",
        description=(
            "warm expressive Indian woman with clear Hinglish diction, natural "
            "pauses, gentle energy, and a friendly conversational delivery"
        ),
        f0_up_key=3,
    ),
)

Prerequisites

Before using either service, configure access to the Rumik AI gateway:

export RUMIK_API_KEY=...
export RUMIK_GATEWAY_URL=...

Create API keys from the Rumik AI dashboard. Use the gateway URL provided for your Rumik AI deployment.

The full Pipecat voice pipeline examples also require:

export DEEPGRAM_API_KEY=...
export OPENAI_API_KEY=...

Optional environment variables used by the smoke-test examples:

export RUMIK_MODEL=muga
export RUMIK_SPEAKER=
export RUMIK_DESCRIPTION=
export RUMIK_F0_UP_KEY=
export RUMIK_TEMPERATURE=
export RUMIK_TOP_P=
export RUMIK_TOP_K=
export RUMIK_REPETITION_PENALTY=
export RUMIK_MAX_NEW_TOKENS=

Leave optional values empty to use Rumik's API defaults. Set RUMIK_MODEL=mulberry with RUMIK_SPEAKER, RUMIK_DESCRIPTION, or RUMIK_F0_UP_KEY when testing expressive voices.
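
One way to follow the same leave-empty-for-defaults convention in your own scripts is to skip unset variables when building settings. The sketch below is illustrative and is not the smoke tests' actual code: the helper name settings_kwargs_from_env and the mapping of RUMIK_SPEAKER to the voice setting are assumptions, and the types are taken from the Settings table.

```python
import os
from typing import Any, Callable, Mapping

# Hypothetical mapping from the smoke-test environment variables to
# settings field names, with the type each field expects.
_ENV_FIELDS: dict[str, tuple[str, Callable[[str], Any]]] = {
    "RUMIK_MODEL": ("model", str),
    "RUMIK_SPEAKER": ("voice", str),  # assumption: maps to the `voice` setting
    "RUMIK_DESCRIPTION": ("description", str),
    "RUMIK_F0_UP_KEY": ("f0_up_key", int),
    "RUMIK_TEMPERATURE": ("temperature", float),
    "RUMIK_TOP_P": ("top_p", float),
    "RUMIK_TOP_K": ("top_k", int),
    "RUMIK_REPETITION_PENALTY": ("repetition_penalty", float),
    "RUMIK_MAX_NEW_TOKENS": ("max_new_tokens", int),
}


def settings_kwargs_from_env(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Return keyword arguments for a settings object, skipping empty variables."""
    kwargs: dict[str, Any] = {}
    for name, (field, convert) in _ENV_FIELDS.items():
        raw = env.get(name, "").strip()
        if raw:  # unset or empty means "use Rumik's API default"
            kwargs[field] = convert(raw)
    return kwargs
```

Under these assumptions, the result could be passed along as RumikTTSService.Settings(**settings_kwargs_from_env()).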

Service Selection

Service              Transport  Recommended Use
RumikTTSService      WebSocket  Interactive Pipecat voice agents that need interruption-aware TTS.
RumikHttpTTSService  HTTP       Simpler synthesis flows where a request/response API is preferred.

Use the WebSocket service for conversational applications. Use the HTTP service for batch-style synthesis or integration tests that do not need a persistent TTS connection.

Configuration

RumikTTSService

Parameter                  Type                             Required  Description
api_key                    str                              yes       Rumik AI API key.
gateway_url                str                              yes       Rumik AI gateway base URL.
settings                   RumikTTSService.Settings | None  no        Runtime-configurable TTS settings.
sample_rate                int | None                       no        Output sample rate. Rumik currently emits 24 kHz PCM.
full_response_aggregation  bool                             no        Buffer a complete LLM response before sending it to Rumik. Defaults to True.

RumikHttpTTSService

Parameter        Type                                 Required  Description
api_key          str                                  yes       Rumik AI API key.
gateway_url      str                                  yes       Rumik AI gateway base URL.
aiohttp_session  aiohttp.ClientSession                yes       Caller-owned HTTP session.
settings         RumikHttpTTSService.Settings | None  no        Runtime-configurable TTS settings.
sample_rate      int | None                           no        Output sample rate. Rumik currently emits 24 kHz PCM.

Settings

Both services use Pipecat's service settings pattern:

settings=RumikTTSService.Settings(...)
settings=RumikHttpTTSService.Settings(...)

RumikTTSService.Settings and RumikHttpTTSService.Settings are aliases of RumikTTSSettings.

Setting             Type                   Default  Used By    Description
model               str | None             "muga"   HTTP, WS   Rumik model identifier.
voice               str | None             None     HTTP, WS   Preset speaker voice. Sent to Rumik as speaker.
language            Language | str | None  None     inherited  Reserved for Pipecat provider compatibility.
description         str | None             None     HTTP, WS   Natural-language voice/style description for expressive models.
f0_up_key           int | None             None     HTTP, WS   Pitch shift in semitones for preset speaker voices.
temperature         float | None           None     HTTP, WS   Sampling temperature. When omitted, Rumik uses its API default.
top_p               float | None           None     HTTP, WS   Nucleus sampling value. When omitted, Rumik uses its API default.
top_k               int | None             None     HTTP, WS   Top-k sampling value. When omitted, Rumik uses its API default.
repetition_penalty  float | None           None     HTTP, WS   Penalty applied to repeated tokens. When omitted, Rumik uses its API default.
max_new_tokens      int | None             None     HTTP, WS   Maximum generated audio tokens. When omitted, Rumik uses its API default.
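
To make the "sent to Rumik as speaker" and "when omitted" rules concrete, here is a minimal sketch of how settings could be turned into a request body. It does not show the package's internals: SketchSettings is a hypothetical stand-in for RumikTTSSettings, and the "text" field name and overall payload shape are assumptions.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class SketchSettings:
    """Hypothetical stand-in for RumikTTSSettings, using the documented fields."""

    model: Optional[str] = "muga"
    voice: Optional[str] = None
    description: Optional[str] = None
    f0_up_key: Optional[int] = None
    temperature: Optional[float] = None
    top_p: Optional[float] = None
    top_k: Optional[int] = None
    repetition_penalty: Optional[float] = None
    max_new_tokens: Optional[int] = None


def build_payload(text: str, settings: SketchSettings) -> dict[str, Any]:
    """Build a request body, omitting unset values and renaming voice to speaker."""
    payload: dict[str, Any] = {"text": text}  # assumption: the field is named "text"
    for key, value in asdict(settings).items():
        if value is None:
            continue  # omitted fields fall back to Rumik's API defaults
        payload["speaker" if key == "voice" else key] = value
    return payload
```

With default settings only model is included, so every other parameter is left to the API default.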

Runtime updates use Pipecat's TTSUpdateSettingsFrame with RumikTTSSettings.

Usage

WebSocket Service

import os

from pipecat_rumik import RumikTTSService

tts = RumikTTSService(
    api_key=os.environ["RUMIK_API_KEY"],
    gateway_url=os.environ["RUMIK_GATEWAY_URL"],
    settings=RumikTTSService.Settings(
        model="muga",
    ),
)

For expressive models, use Pipecat's inherited voice setting for preset speaker voices. The service sends it to Rumik as speaker.

tts = RumikTTSService(
    api_key=os.environ["RUMIK_API_KEY"],
    gateway_url=os.environ["RUMIK_GATEWAY_URL"],
    settings=RumikTTSService.Settings(
        model="mulberry",
        voice="speaker_1",
        description=(
            "warm expressive Indian woman with clear Hinglish diction, natural "
            "pauses, gentle energy, and a friendly conversational delivery"
        ),
        f0_up_key=3,
    ),
)

By default, RumikTTSService uses full-response aggregation: it sends one complete assistant response to Rumik instead of creating a separate TTS request for every sentence. To disable aggregation, pass full_response_aggregation=False:

tts = RumikTTSService(
    api_key=os.environ["RUMIK_API_KEY"],
    gateway_url=os.environ["RUMIK_GATEWAY_URL"],
    full_response_aggregation=False,
)
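
The difference between the two modes can be pictured as how streamed LLM text is grouped into TTS requests. The following is only an illustration of that grouping, not the service's implementation: tts_requests is a hypothetical helper, it works on already-collected text rather than a live stream, and the regex sentence splitter is deliberately naive.

```python
import re
from typing import Iterable


def tts_requests(chunks: Iterable[str], full_response_aggregation: bool = True) -> list[str]:
    """Group LLM text chunks into the strings that would each become one TTS request."""
    text = "".join(chunks).strip()
    if not text:
        return []
    if full_response_aggregation:
        return [text]  # one request for the whole assistant response
    # Naive split after ., !, or ? followed by whitespace (illustration only).
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]
```

For the chunks "Sure. I can ", "help! What do", " you need?" this yields one request with aggregation enabled and three requests (one per sentence) with it disabled.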

HTTP Service

import asyncio
import os

import aiohttp

from pipecat_rumik import RumikHttpTTSService


async def main():
    async with aiohttp.ClientSession() as session:
        tts = RumikHttpTTSService(
            api_key=os.environ["RUMIK_API_KEY"],
            gateway_url=os.environ["RUMIK_GATEWAY_URL"],
            aiohttp_session=session,
            settings=RumikHttpTTSService.Settings(
                model="mulberry",
                voice="speaker_2",
                description="calm female narrator",
                f0_up_key=0,
                temperature=0.6,
                top_p=0.95,
                top_k=50,
                repetition_penalty=1.2,
                max_new_tokens=2048,
            ),
        )
        # Build and run your Pipecat pipeline with `tts` here, while the session is still open.


asyncio.run(main())

The HTTP service uses the caller-owned aiohttp.ClientSession. Create and close the session in your application code.

Notes

  • WebSocket vs HTTP: The WebSocket service is intended for interactive Pipecat conversations. The HTTP service is useful for simpler batch-style synthesis.
  • Request lifecycle: The WebSocket service keeps one active synthesis request per connection so audio chunks are routed deterministically to the active Pipecat audio context.
  • Interruption handling: On interruption, the WebSocket service closes the active socket and opens a fresh session before accepting the next synthesis request.
  • Audio format: Rumik currently emits 24 kHz, mono, signed 16-bit PCM. The services validate provider responses against this audio contract before emitting Pipecat audio frames.
  • Voice selection: Use voice for preset speaker voices. The service sends this setting to Rumik as speaker.
  • Model steering: muga can be steered with tone tags in the input text. Expressive models can use description, optional preset voice, and f0_up_key.
  • HTTP response handling: The HTTP service validates the WAV response, removes the WAV container, and emits raw PCM frames.
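
The audio contract and WAV handling described above can be sketched with Python's standard wave module. This is a minimal illustration under stated assumptions, not the package's code: wav_to_pcm, pcm_duration_seconds, and make_wav are hypothetical names, and the sketch assumes the whole WAV response is available in memory.

```python
import io
import wave

EXPECTED_RATE = 24_000   # Hz
EXPECTED_CHANNELS = 1    # mono
EXPECTED_WIDTH = 2       # bytes per sample (signed 16-bit)


def wav_to_pcm(wav_bytes: bytes) -> bytes:
    """Validate a WAV payload against the audio contract and return raw PCM."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        actual = (wav.getframerate(), wav.getnchannels(), wav.getsampwidth())
        expected = (EXPECTED_RATE, EXPECTED_CHANNELS, EXPECTED_WIDTH)
        if actual != expected:
            raise ValueError(f"unexpected audio format {actual}, expected {expected}")
        return wav.readframes(wav.getnframes())  # frames without the WAV header


def pcm_duration_seconds(pcm: bytes) -> float:
    """Duration under the contract: 24000 samples/s * 2 bytes = 48000 bytes/s."""
    return len(pcm) / (EXPECTED_RATE * EXPECTED_CHANNELS * EXPECTED_WIDTH)


def make_wav(pcm: bytes, rate: int = EXPECTED_RATE) -> bytes:
    """Helper for trying the sketch: wrap raw PCM in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(EXPECTED_CHANNELS)
        wav.setsampwidth(EXPECTED_WIDTH)
        wav.setframerate(rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

Round-tripping PCM through make_wav and wav_to_pcm returns the original bytes, while a payload at another sample rate raises ValueError instead of producing audio at the wrong speed.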

Event Handlers

RumikTTSService supports Pipecat's standard service connection events:

Event                Description
on_connected         Connected to the Rumik WebSocket service.
on_disconnected      Disconnected from the Rumik WebSocket service.
on_connection_error  WebSocket connection error occurred.

@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Rumik")

Examples

The repository includes two full Pipecat voice pipeline examples:

uv run --extra examples examples/voice/voice-rumik-muga.py -t webrtc --host localhost --port 7860
uv run --extra examples examples/voice/voice-rumik-mulberry.py -t webrtc --host localhost --port 7861

Open the Pipecat WebRTC client for the selected port and talk to the bot:

  • Muga: http://localhost:7860/client/
  • Mulberry: http://localhost:7861/client/

The voice examples implement a simple, expressive Indian voice assistant for everyday questions, quick decisions, small plans, and short messages. The Muga example prompts the LLM to emit the required tone tags. The Mulberry example uses speaker_1, a warm expressive Indian woman voice description, and f0_up_key=3 by default.

The repository also includes lower-level provider smoke tests:

uv run python examples/smoke_rumik_http.py
uv run python examples/smoke_rumik_ws.py
uv run python examples/smoke_rumik_ws_suite.py

The smoke tests require real Rumik credentials. Unit tests do not.

Testing

Offline checks:

uv run --extra dev pytest -q
uv run --extra dev --extra examples ruff check src tests examples scripts
uv run --extra dev --extra examples python -m compileall -q src tests examples scripts
uv build
uv run --with twine twine check dist/*

See TESTING.md for the full test checklist and RELEASE.md for TestPyPI/PyPI release steps.

License

This package is released under the MIT License. See LICENSE.
