Shunyalabs ASR & TTS services for Pipecat

pipecat-shunyalabs

Shunyalabs STT and TTS services for Pipecat.

Provides ShunyalabsSTTService and ShunyalabsTTSService that integrate with Pipecat's pipeline framework, backed by the Shunyalabs Python SDK.

Key capabilities:

- Real-time streaming ASR with interim and final transcription frames
- High-fidelity voice synthesis with 46 speakers across 23 languages
- 11 emotion/delivery style tags for expressive voice responses
- Native Pipecat frame protocol: drop-in with any Pipecat pipeline
- Persistent WebSocket for STT; per-request WebSocket for TTS
- Output formats: PCM, WAV, MP3, OGG Opus, FLAC, mu-law, A-law

Installation

Requirements: Python 3.8+, Pipecat framework, a valid Shunyalabs API key.

pip install pipecat-shunyalabs

Install with a transport:

# Daily WebRTC transport
# (the extra is quoted so that shells such as zsh do not expand the brackets)
pip install pipecat-shunyalabs "pipecat-ai[daily]"

Authentication

Set your API key as an environment variable (recommended):

export SHUNYALABS_API_KEY="your-api-key"

Or pass it directly:

stt = ShunyalabsSTTService(api_key="your-api-key")
tts = ShunyalabsTTSService(api_key="your-api-key")

Security: Never commit API keys to source control. Use a secrets manager (GCP Secret Manager, AWS Secrets Manager, HashiCorp Vault) in production.
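
The fallback order described above (explicit argument first, then the environment variable) can be sketched as follows. This is an illustration only: `resolve_api_key` is a hypothetical helper, not part of the package, and the services perform their own lookup internally.

```python
import os
from typing import Optional

ENV_VAR = "SHUNYALABS_API_KEY"


def resolve_api_key(explicit: Optional[str] = None) -> str:
    """Return the explicit key if given, otherwise the environment variable.

    Hypothetical helper: it only illustrates the order of precedence.
    """
    key = explicit or os.environ.get(ENV_VAR)
    if not key:
        raise RuntimeError(f"No API key found: pass api_key=... or set {ENV_VAR}")
    return key
```

Failing early with a clear message at startup is usually easier to debug than an authentication error raised after the pipeline is already running.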

Quick Start

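import asyncio
import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import OpenAILLMContext
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.local.audio import (
    LocalAudioTransport,
    LocalAudioTransportParams,
)
from pipecat_shunyalabs import ShunyalabsSTTService, ShunyalabsTTSService

async def main():
    # Local microphone in, local speakers out. Audio must be enabled explicitly.
    transport = LocalAudioTransport(
        LocalAudioTransportParams(audio_in_enabled=True, audio_out_enabled=True)
    )

    stt = ShunyalabsSTTService(
        api_key=os.environ["SHUNYALABS_API_KEY"],
        language="en",
    )

    llm = OpenAILLMService(
        api_key=os.environ["OPENAI_API_KEY"],
        model="gpt-4o",
    )

    # The context aggregators turn final transcriptions into LLM requests
    # and record the assistant's replies.
    context = OpenAILLMContext(
        [{"role": "system", "content": "You are a helpful voice assistant."}]
    )
    context_aggregator = llm.create_context_aggregator(context)

    tts = ShunyalabsTTSService(
        api_key=os.environ["SHUNYALABS_API_KEY"],
        voice="Rajesh",
        language="en",
        style="<Conversational>",
    )

    pipeline = Pipeline([
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ])
    task = PipelineTask(pipeline, params=PipelineParams(allow_interruptions=True))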
    await PipelineRunner().run(task)

if __name__ == "__main__":
    asyncio.run(main())

STT — `ShunyalabsSTTService`

Real-time streaming speech-to-text over WebSocket. Maintains a persistent connection for the lifetime of the pipeline. Supports 23 Indian and international languages with automatic language detection.

Parameters

Parameter	Type	Default	Description
`api_key`	`str`	`None`	API key. Falls back to `SHUNYALABS_API_KEY` env var.
`language`	`str`	`"auto"`	Language code (e.g. `"en"`, `"hi"`) or `"auto"` for auto-detection.
`url`	`str`	`wss://asr.shunyalabs.ai/ws`	WebSocket endpoint URL.
`sample_rate`	`int`	`16000`	Expected audio sample rate in Hz. Must match transport input.

How It Works

1. On pipeline start, the service opens a WebSocket connection to the Shunyalabs ASR gateway.
2. Audio chunks from the pipeline input are forwarded via `send_audio()`.
3. The gateway's built-in VAD detects speech boundaries and emits transcription events.
4. Events are mapped to Pipecat frames and pushed into the pipeline.

Frame Mapping

Shunyalabs Event	Pipecat Frame
`PARTIAL`	`InterimTranscriptionFrame` — emitted continuously as speech is recognized
`FINAL_SEGMENT`	`TranscriptionFrame` — emitted at speech segment boundary
`FINAL`	`TranscriptionFrame` — emitted when full utterance is finalized
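
The mapping in the table can be written as a small lookup. This is a sketch rather than the package's actual code: frame classes are represented by name so the snippet runs without Pipecat installed, and `frame_name_for` and `is_final` are hypothetical helpers.

```python
from typing import Optional

# Shunyalabs event type -> name of the Pipecat frame class it produces.
EVENT_TO_FRAME = {
    "PARTIAL": "InterimTranscriptionFrame",
    "FINAL_SEGMENT": "TranscriptionFrame",
    "FINAL": "TranscriptionFrame",
}


def frame_name_for(event_type: str) -> Optional[str]:
    """Return the frame class name for an event, or None for unknown events."""
    return EVENT_TO_FRAME.get(event_type)


def is_final(event_type: str) -> bool:
    """True for events that produce a finalized TranscriptionFrame."""
    return frame_name_for(event_type) == "TranscriptionFrame"
```

Downstream processors that only act on finalized text can ignore `InterimTranscriptionFrame` and wait for `TranscriptionFrame`.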

Example

from pipecat_shunyalabs import ShunyalabsSTTService

stt = ShunyalabsSTTService(
    language="hi",  # Hindi; or 'auto' for detection
    sample_rate=16000,
)

Auto-Reconnect

If the WebSocket connection drops during audio streaming, the service automatically reconnects and resumes sending audio.
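
The source does not describe how the service reconnects. As a rough illustration of the general pattern, the sketch below retries a connection attempt with capped exponential delays. `connect_with_retry` and its parameters are hypothetical, and the built-in `OSError` stands in for whatever exception the real client raises.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def connect_with_retry(
    connect: Callable[[], Awaitable[T]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
) -> T:
    """Call `connect` until it succeeds or `max_attempts` is exhausted."""
    for attempt in range(max_attempts):
        try:
            return await connect()
        except OSError:  # ConnectionError is a subclass of OSError
            if attempt == max_attempts - 1:
                raise
            # Wait 0.5s, 1s, 2s, ... capped at max_delay before the next try.
            await asyncio.sleep(min(max_delay, base_delay * 2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

Audio that arrives while the connection is down is either buffered or dropped; which of the two the service does is not stated here.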

TTS — `ShunyalabsTTSService`

Streaming text-to-speech over WebSocket. Each synthesis request opens a new connection and streams audio chunks back as TTSAudioRawFrame frames. Supports 46 speakers across 23 languages; any speaker can synthesize in any language.

Parameters

Parameter	Type	Default	Description
`api_key`	`str`	`None`	API key. Falls back to `SHUNYALABS_API_KEY` env var.
`url`	`str`	`wss://tts.shunyalabs.ai/ws`	WebSocket endpoint URL.
`model`	`str`	`"zero-indic"`	TTS model identifier.
`voice`	`str`	`"Rajesh"`	Speaker voice. See Available Speakers.
`speaker`	`str`	`"Rajesh"`	Speaker identifier (typically same as `voice`).
`style`	`str`	`"<Neutral>"`	Emotion/delivery style tag. See Style Tags.
`language`	`str`	`"en"`	Output language code (e.g. `"en"`, `"hi"`, `"ta"`).
`output_format`	`str`	`"pcm"`	Audio encoding. See Output Formats.
`speed`	`float`	`1.0`	Speaking speed multiplier (0.25–4.0).

Output Formats

Format	Value	Recommended Use
PCM (raw 16-bit)	`pcm`	Real-time pipelines, Pipecat `TTSAudioRawFrame`
WAV	`wav`	Uncompressed storage, offline processing
MP3	`mp3`	Compressed storage, web delivery
OGG Opus	`ogg_opus`	Compressed web streaming
FLAC	`flac`	Lossless compressed storage
mu-law	`mulaw`	Telephony systems (G.711)
A-law	`alaw`	Telephony systems (G.711 European)
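
Checking options before constructing the service can surface typos early. The sketch below uses only the format values and the speed range listed in the tables above. `check_tts_options` is a hypothetical helper, and whether the service performs the same validation itself is not stated.

```python
# Values copied from the Output Formats table.
OUTPUT_FORMATS = {"pcm", "wav", "mp3", "ogg_opus", "flac", "mulaw", "alaw"}
# Range copied from the `speed` parameter description.
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def check_tts_options(output_format: str = "pcm", speed: float = 1.0) -> None:
    """Raise ValueError if the options fall outside the documented values."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(
            f"Unknown output_format {output_format!r}; "
            f"expected one of {sorted(OUTPUT_FORMATS)}"
        )
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be in [{MIN_SPEED}, {MAX_SPEED}], got {speed}")
```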

Style Tags

Tag	Description
`<Neutral>`	Clean read-speech — default
`<Happy>`	Joyful, upbeat tone
`<Sad>`	Somber, melancholic tone
`<Angry>`	Forceful, intense tone
`<Fearful>`	Anxious, trembling tone
`<Surprised>`	Exclamatory, astonished tone
`<Disgust>`	Repulsed, disapproving tone
`<News>`	Formal news-anchor style
`<Conversational>`	Casual, everyday speech — recommended for voice agents
`<Narrative>`	Storytelling / audiobook delivery style
`<Enthusiastic>`	Energetic, passionate tone

Text Formatting

The service automatically formats text as "<Style> text" before sending to the API:

tts = ShunyalabsTTSService(speaker="Rajesh", style="<Happy>")
# Input: "Welcome!"
# Sent:  "<Happy> Welcome!"
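
The formatting rule amounts to prefixing the text with the tag and a single space. A minimal sketch, assuming nothing beyond the rule stated above (`apply_style` is a hypothetical name; the service does this for you):

```python
def apply_style(text: str, style: str = "<Neutral>") -> str:
    """Prefix `text` with a style tag, as in "<Happy> Welcome!"."""
    return f"{style} {text}"
```

Because the service adds the prefix itself, text passed into the pipeline should not already carry a style tag.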

Available Speakers

46 speakers across 23 languages (1 male + 1 female per language). Every speaker can synthesize in any language.

Language	Male	Female
English	Varun	Nisha
Hindi	Rajesh (default)	Sunita
Bengali	Arjun	Priyanka
Tamil	Murugan	Thangam
Telugu	Vishnu	Lakshmi
Kannada	Kiran	Shreya
Malayalam	Krishnan	Deepa
Marathi	Siddharth	Ananya
Gujarati	Rakesh	Pooja
Punjabi	Gurpreet	Simran
Urdu	Salman	Fatima
Odia	Bijay	Sujata
Assamese	Bimal	Anjana
Maithili	Suresh	Meera
Nepali	Bikash	Sapana
Sanskrit	Vedant	Gayatri
Kashmiri	Farooq	Habba
Konkani	Mohan	Sarita
Dogri	Vishal	Neelam
Sindhi	Amjad	Kavita
Manipuri	Tomba	Ibemhal
Santali	Chandu	Roshni
Bodo	Daimalu	Hasina
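
To pick a speaker that matches the output language, the table can be turned into a lookup. The sketch below copies only three rows to stay short, and `pick_speaker` is a hypothetical helper. It falls back to the default voice, which works because every speaker can synthesize in any language.

```python
# A few rows copied from the speaker table; add the remaining languages as needed.
SPEAKERS = {
    "English": {"male": "Varun", "female": "Nisha"},
    "Hindi": {"male": "Rajesh", "female": "Sunita"},
    "Tamil": {"male": "Murugan", "female": "Thangam"},
}
DEFAULT_SPEAKER = "Rajesh"


def pick_speaker(language: str, gender: str = "male") -> str:
    """Return the listed speaker for a language, or the default voice."""
    return SPEAKERS.get(language, {}).get(gender, DEFAULT_SPEAKER)
```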

Frame Output

Frame	Description
`TTSStartedFrame`	Emitted when synthesis begins.
`TTSAudioRawFrame`	Emitted for each audio chunk (PCM, 16 kHz, mono).
`TTSStoppedFrame`	Emitted when synthesis completes.
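
For 16-bit mono PCM at 16 kHz, one second of audio is 16000 × 2 = 32000 bytes, so the duration of a chunk follows directly from its size. A small worked sketch (`pcm_duration_ms` is a hypothetical helper):

```python
def pcm_duration_ms(
    num_bytes: int,
    sample_rate: int = 16000,
    channels: int = 1,
    bytes_per_sample: int = 2,  # 16-bit samples
) -> float:
    """Duration in milliseconds of a raw PCM buffer of `num_bytes` bytes."""
    bytes_per_second = sample_rate * channels * bytes_per_sample
    return 1000.0 * num_bytes / bytes_per_second
```

For example, a 640-byte chunk holds 20 ms of audio at these settings.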

Example

from pipecat_shunyalabs import ShunyalabsTTSService

tts = ShunyalabsTTSService(
    model="zero-indic",
    voice="Nisha",
    speaker="Nisha",
    style="<Enthusiastic>",
    language="en",
    speed=1.1,
    output_format="pcm",
)

Full Pipeline Example

A complete voice agent that combines Shunyalabs STT and TTS with an OpenAI LLM over the Daily WebRTC transport:

import asyncio
import os

from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineParams, PipelineTask
from pipecat.processors.aggregators.openai_llm_context import (
    OpenAILLMContext, OpenAILLMContextAggregator,
)
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport
from pipecat_shunyalabs import ShunyalabsSTTService, ShunyalabsTTSService

async def run_voice_agent(room_url: str, token: str):
    transport = DailyTransport(
        room_url, token, "Shunyalabs Agent",
        # audio_in_enabled lets raw participant audio reach the STT service.
        DailyParams(
            audio_in_enabled=True,
            audio_out_enabled=True,
            transcription_enabled=False,
        ),
    )

    stt = ShunyalabsSTTService(
        api_key=os.environ["SHUNYALABS_API_KEY"],
        language="auto",
        sample_rate=16000,
    )

    llm = OpenAILLMService(
        api_key=os.environ["OPENAI_API_KEY"],
        model="gpt-4o",
    )

    messages = [{
        "role": "system",
        "content": (
            "You are a helpful voice assistant powered by Shunyalabs. "
            "Keep responses concise and natural for voice delivery."
        ),
    }]
    context = OpenAILLMContext(messages)
    context_aggregator = llm.create_context_aggregator(context)

    tts = ShunyalabsTTSService(
        api_key=os.environ["SHUNYALABS_API_KEY"],
        voice="Rajesh",
        language="hi",
        style="<Conversational>",
    )

    pipeline = Pipeline([
        transport.input(),
        stt,
        context_aggregator.user(),
        llm,
        tts,
        transport.output(),
        context_aggregator.assistant(),
    ])

    task = PipelineTask(
        pipeline,
        params=PipelineParams(allow_interruptions=True, enable_metrics=True),
    )

    @transport.event_handler("on_first_participant_joined")
    async def on_first_participant_joined(transport, participant):
        await task.queue_frames([context_aggregator.user().get_context_frame()])

    await PipelineRunner().run(task)

if __name__ == "__main__":
    asyncio.run(run_voice_agent(
        room_url=os.environ["DAILY_ROOM_URL"],
        token=os.environ["DAILY_TOKEN"],
    ))

Multilingual Example

# Hindi conversational bot
tts = ShunyalabsTTSService(
    voice="Rajesh",
    language="hi",
    style="<Conversational>",
)

# English news-style bot
tts = ShunyalabsTTSService(
    voice="Varun",
    language="en",
    style="<News>",
)

Error Reference

All Shunyalabs SDK exceptions inherit from ShunyalabsError.

Exception	HTTP Code	Description
`AuthenticationError`	401	Invalid or missing API key.
`PermissionDeniedError`	403	API key lacks permission for the resource.
`NotFoundError`	404	Requested resource not found.
`RateLimitError`	429	Rate limit exceeded. Implement exponential backoff.
`ServerError`	5xx	Server-side error. Retried automatically.
`TimeoutError`	—	Request exceeded timeout (default 60s).
`ConnectionError`	—	Network connectivity issue.
`TranscriptionError`	—	ASR-specific failure (e.g. unsupported audio format).
`SynthesisError`	—	TTS-specific failure (e.g. invalid voice parameter).

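from shunyalabs.exceptions import AuthenticationError, RateLimitError, ShunyalabsError

async def synthesize_safely(client, text, config):
    # `client`, `text`, and `config` are supplied by the caller.
    # Specific exceptions are caught before the ShunyalabsError base class.
    try:
        return await client.tts.synthesize(text, config=config)
    except AuthenticationError:
        print("Invalid API key: check SHUNYALABS_API_KEY")
    except RateLimitError as e:
        print(f"Rate limited: retry after {e.retry_after}s")
    except ShunyalabsError as e:
        print(f"Unexpected error: {e}")
    return None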
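
The table recommends exponential backoff for `RateLimitError`. One way to choose the wait time is sketched below: prefer the server's `retry_after` hint when it is present, otherwise double the delay on each attempt up to a cap. `backoff_delay` and its defaults are assumptions for illustration, not part of the SDK.

```python
from typing import Optional


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 1.0,
    cap: float = 30.0,
) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # Trust the server's hint over the local schedule.
        return max(0.0, float(retry_after))
    return min(cap, base * (2 ** attempt))
```

In practice, a small random jitter is often added to the computed delay so that many clients do not retry at the same instant.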

Troubleshooting

Symptom	Resolution
`AuthenticationError` on startup	Verify `SHUNYALABS_API_KEY` is set and valid.
WebSocket connection refused	Ensure outbound WSS (port 443) is open to `asr.shunyalabs.ai` and `tts.shunyalabs.ai`.
No transcription output	Check `sample_rate` matches your transport input. Verify audio source is active.
TTS audio silent or missing	Ensure `output_format="pcm"` is set so the audio matches what the transport output expects. Verify `TTSStartedFrame` is received.
High latency on first TTS chunk	Deploy closer to the Shunyalabs gateway region (`asia-south1`).
`RateLimitError`	Implement exponential backoff. Check `e.retry_after`.
`ImportError: pipecat_shunyalabs`	Run `pip install pipecat-shunyalabs`. Confirm virtual environment is activated.

License

MIT

Release history

Version	Release date
1.0.4	Apr 17, 2026
1.0.3 (this version)	Apr 16, 2026
1.0.2	Apr 16, 2026
1.0.1	Apr 13, 2026
1.0.0	Mar 16, 2026
0.1.0	Mar 8, 2026
