Shunyalabs ASR & TTS plugin for LiveKit Agents
livekit-plugins-shunyalabs
Shunyalabs STT and TTS plugin for LiveKit Agents.
Provides STT (speech-to-text) and TTS (text-to-speech) classes that integrate with LiveKit's agent framework, backed by the Shunyalabs Python SDK.
Installation
pip install livekit-plugins-shunyalabs
Authentication
Set your API key as an environment variable:
export SHUNYALABS_API_KEY="your-api-key"
Or pass it directly:
stt = shunyalabs.STT(api_key="your-api-key")
tts = shunyalabs.TTS(api_key="your-api-key")
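The lookup order described above (explicit argument first, then the SHUNYALABS_API_KEY environment variable) can be sketched roughly as follows. `resolve_api_key` is a hypothetical helper written for illustration only; it is not part of the plugin, and the plugin's internals may differ.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper: prefer the explicit key, else SHUNYALABS_API_KEY."""
    # Allow a custom mapping so the lookup can be exercised without
    # touching the real process environment.
    env = os.environ if env is None else env
    key = api_key or env.get("SHUNYALABS_API_KEY")
    if not key:
        raise ValueError(
            "No API key found: pass api_key=... or set SHUNYALABS_API_KEY"
        )
    return key
```

Failing early with a clear message, rather than at the first network call, tends to make misconfigured deployments easier to diagnose.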
Quick Start
from livekit.agents import AgentSession
from livekit.plugins import shunyalabs, silero

session = AgentSession(
    stt=shunyalabs.STT(language="en"),
    tts=shunyalabs.TTS(speaker="Rajesh", style="<Neutral>"),
    vad=silero.VAD.load(),
)
STT (Speech-to-Text)
shunyalabs.STT
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | None | API key. Falls back to SHUNYALABS_API_KEY env var. |
| language | str | "auto" | BCP-47 language code or "auto" for auto-detection. |
| api_url | str | https://asr.shunyalabs.ai | REST batch endpoint base URL. |
| ws_url | str | wss://asr.shunyalabs.ai/ws | WebSocket streaming endpoint URL. |
Capabilities
| Capability | Supported |
|---|---|
| Streaming (real-time) | Yes |
| Interim results | Yes |
| Offline/batch recognition | Yes |
Streaming STT
Real-time transcription over WebSocket. Audio frames from LiveKit are forwarded to the Shunyalabs ASR gateway; transcription events are pushed back as SpeechEvents.
from livekit.agents import AgentSession
from livekit.plugins import shunyalabs, silero

session = AgentSession(
    stt=shunyalabs.STT(language="en"),
    vad=silero.VAD.load(),
)

# AgentSession reports interim and final transcripts via "user_input_transcribed".
@session.on("user_input_transcribed")
def on_speech(ev):
    if ev.is_final:
        print(f"User said: {ev.transcript}")
Event mapping:
| Shunyalabs Event | LiveKit SpeechEventType |
|---|---|
| PARTIAL | INTERIM_TRANSCRIPT |
| FINAL_SEGMENT | FINAL_TRANSCRIPT + END_OF_SPEECH |
| FINAL | FINAL_TRANSCRIPT + RECOGNITION_USAGE |
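The event mapping above can be expressed as a small lookup table. This is only a sketch: event names are plain strings so it runs without the SDK, and `EVENT_MAP` and `livekit_events_for` are hypothetical names, not identifiers from the plugin.

```python
from typing import Dict, List

# Illustrative lookup built from the event mapping table.
# Each Shunyalabs event fans out to one or more LiveKit event types.
EVENT_MAP: Dict[str, List[str]] = {
    "PARTIAL": ["INTERIM_TRANSCRIPT"],
    "FINAL_SEGMENT": ["FINAL_TRANSCRIPT", "END_OF_SPEECH"],
    "FINAL": ["FINAL_TRANSCRIPT", "RECOGNITION_USAGE"],
}


def livekit_events_for(shunyalabs_event: str) -> List[str]:
    """Return the LiveKit event types emitted for one Shunyalabs event."""
    try:
        # Return a copy so callers cannot mutate the shared table.
        return list(EVENT_MAP[shunyalabs_event])
    except KeyError:
        raise ValueError(
            f"Unknown Shunyalabs event: {shunyalabs_event!r}"
        ) from None
```

Note that both FINAL_SEGMENT and FINAL produce a FINAL_TRANSCRIPT; they differ only in the second event that accompanies it.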
Batch STT
Single-shot transcription of an audio buffer. Uses POST /v1/audio/transcriptions via the SDK's AsyncBatchASR.
from livekit.plugins import shunyalabs
stt = shunyalabs.STT(language="en")
# In an agent context:
event = await stt.recognize(audio_buffer)
print(event.alternatives[0].text)
TTS (Text-to-Speech)
shunyalabs.TTS
| Parameter | Type | Default | Description |
|---|---|---|---|
| api_key | str | None | API key. Falls back to SHUNYALABS_API_KEY env var. |
| api_url | str | https://tts.shunyalabs.ai | HTTP batch endpoint base URL. |
| ws_url | str | wss://tts.shunyalabs.ai/ws | WebSocket streaming endpoint URL. |
| model | str | "zero-indic" | TTS model name. |
| voice | str | "Rajesh" | Voice name for the API. |
| speaker | str | "Rajesh" | Speaker name prefix for text formatting. |
| style | str | "<Neutral>" | Emotion style tag. See Style Tags. |
| language | str | "en" | Language code for transliteration. |
| sample_rate | int | 16000 | Output audio sample rate in Hz. |
| output_format | str | "pcm" | Audio format ("pcm", "wav", "mp3", "ogg_opus", "flac"). |
| speed | float | 1.0 | Speaking speed multiplier (0.25–4.0). |
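The documented ranges for output_format and speed can be checked before a TTS instance is created. The following is a minimal sketch using only the values from the table above; `TTSSettings` is a hypothetical container and not part of the plugin, which may validate (or not validate) these values differently.

```python
from dataclasses import dataclass

# Values taken from the parameter table above.
SUPPORTED_FORMATS = ("pcm", "wav", "mp3", "ogg_opus", "flac")
MIN_SPEED, MAX_SPEED = 0.25, 4.0


@dataclass(frozen=True)
class TTSSettings:
    """Hypothetical container; not part of the plugin's API."""

    sample_rate: int = 16000
    output_format: str = "pcm"
    speed: float = 1.0

    def __post_init__(self) -> None:
        # Reject values outside the documented ranges up front.
        if self.sample_rate <= 0:
            raise ValueError("sample_rate must be a positive number of Hz")
        if self.output_format not in SUPPORTED_FORMATS:
            raise ValueError(
                f"unsupported output_format: {self.output_format!r}"
            )
        if not MIN_SPEED <= self.speed <= MAX_SPEED:
            raise ValueError(
                f"speed must be between {MIN_SPEED} and {MAX_SPEED}"
            )
```

A validated instance could then be unpacked into the constructor, for example `shunyalabs.TTS(sample_rate=s.sample_rate, output_format=s.output_format, speed=s.speed)`.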
Style Tags
| Tag | Description |
|---|---|
| <Neutral> | Neutral tone |
| <Happy> | Happy/cheerful |
| <Sad> | Sad/melancholic |
| <Angry> | Angry/intense |
| <Fearful> | Fearful/anxious |
| <Surprised> | Surprised/excited |
| <Disgust> | Disgusted |
| <News> | News anchor style |
| <Conversational> | Casual conversational |
| <Narrative> | Storytelling/narration |
| <Enthusiastic> | Enthusiastic/energetic |
Text Formatting
The plugin automatically prefixes the configured style tag, formatting text as "<Style> text" before sending it to the API. For example:
tts = shunyalabs.TTS(speaker="Rajesh", style="<Happy>")
# Input: "Welcome to our platform"
# Sent: "<Happy> Welcome to our platform"
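The formatting step shown above amounts to a simple string prefix. The sketch below reproduces only what the example shows; `format_tts_text` is a hypothetical helper, and because the source does not show how the speaker value enters the formatted text, the sketch leaves it out.

```python
def format_tts_text(text: str, style: str = "<Neutral>") -> str:
    """Hypothetical helper: prefix the text with a style tag."""
    # Strip surrounding whitespace so the tag and text are separated
    # by exactly one space.
    return f"{style} {text.strip()}"


print(format_tts_text("Welcome to our platform", style="<Happy>"))
# prints: <Happy> Welcome to our platform
```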
Streaming TTS
Token-by-token streaming. The plugin collects incoming text tokens and, when the stream is flushed, synthesizes the accumulated text over the WebSocket streaming endpoint.
from livekit.agents import AgentSession
from livekit.plugins import shunyalabs

session = AgentSession(
    tts=shunyalabs.TTS(
        speaker="Nisha",
        style="<Conversational>",
        model="zero-indic",
        voice="Nisha",
    ),
)
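The collect-then-flush behavior of streaming TTS can be pictured with a small buffer. This is an illustrative sketch only: `TokenBuffer` is a hypothetical class, and the real plugin would send the flushed text to the WebSocket endpoint rather than return it.

```python
from typing import List


class TokenBuffer:
    """Hypothetical buffer: accumulate text tokens, release them on flush."""

    def __init__(self) -> None:
        self._tokens: List[str] = []

    def push(self, token: str) -> None:
        # Tokens typically arrive one at a time from an LLM stream.
        self._tokens.append(token)

    def flush(self) -> str:
        # Join everything collected so far and reset for the next segment.
        text = "".join(self._tokens).strip()
        self._tokens.clear()
        return text


buf = TokenBuffer()
for token in ["Hello", ", ", "how can I help?"]:
    buf.push(token)
segment = buf.flush()  # text that would be handed to the synthesizer
```

Synthesizing on flush rather than per token gives the synthesizer a whole phrase to work with, at the cost of waiting for the segment to complete.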
Chunked (Batch) TTS
Synthesizes a complete text string to audio in a single request via the HTTP batch API.
from livekit.plugins import shunyalabs
tts = shunyalabs.TTS(speaker="Varun", voice="Varun")
stream = tts.synthesize("Hello, how can I help you today?")
Full Agent Example
from livekit.agents import (
    Agent,
    AgentSession,
    JobContext,
    RoomInputOptions,
    WorkerOptions,
    cli,
)
from livekit.plugins import shunyalabs, silero


class MyAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions="You are a helpful voice assistant.",
        )


async def entrypoint(ctx: JobContext) -> None:
    # STT auto-detects the spoken language; TTS uses a conversational style.
    session = AgentSession(
        stt=shunyalabs.STT(language="auto"),
        tts=shunyalabs.TTS(
            model="zero-indic",
            voice="Rajesh",
            speaker="Rajesh",
            style="<Conversational>",
        ),
        vad=silero.VAD.load(),
    )
    await session.start(
        agent=MyAgent(),
        room=ctx.room,
        room_input_options=RoomInputOptions(),
    )


if __name__ == "__main__":
    # Standard LiveKit Agents worker entry point.
    cli.run_app(WorkerOptions(entrypoint_fn=entrypoint))
Multilingual Example
# Hindi speaker
tts_hindi = shunyalabs.TTS(
    speaker="Rajesh",
    voice="Rajesh",
    language="hi",
    style="<Neutral>",
)

# English speaker
tts_english = shunyalabs.TTS(
    speaker="Varun",
    voice="Varun",
    language="en",
    style="<Conversational>",
)
License