Python client for Gnani's Vachana speech AI platform (STT, TTS, and more)
gnani-vachana
Official Python client for Vachana Speech APIs by Gnani.ai. Build multilingual voice workflows with Speech-to-Text (STT) and Text-to-Speech (TTS) across REST, SSE streaming, and real-time WebSockets.
Vachana is a production-ready speech platform with high-accuracy STT and low-latency TTS for Indian languages, including multilingual and code-switching scenarios.
Installation
pip install gnani-vachana
Requires Python 3.9+.
Quick Start
STT REST (file-based transcription)
from gnani.stt import GnaniSTTClient

client = GnaniSTTClient(
    organization_id="your-organization-id",
    api_key="your-api-key",
    user_id="your-user-id",
)
result = client.transcribe("audio.wav", language_code="hi-IN")
print(result["transcript"])
Realtime Streaming (WebSocket)
import asyncio
from gnani.stt import GnaniSTTStreamClient, StreamTranscriptEvent

async def main():
    async with GnaniSTTStreamClient(api_key="your-api-key", language_code="hi-IN") as stream:
        # Send audio chunks (raw PCM, 16-bit LE, 16 kHz, mono)
        with open("audio.pcm", "rb") as f:
            while chunk := f.read(1024):
                await stream.send_audio(chunk)
                await asyncio.sleep(0.032)  # real-time pacing (32 ms per frame)
        # Iterate over events
        async for event in stream:
            if isinstance(event, StreamTranscriptEvent):
                print(event.text)

asyncio.run(main())
TTS REST (single response)
from gnani.tts import GnaniTTSClient, AudioConfig

client = GnaniTTSClient(api_key="your-api-key")
audio = client.synthesize(
    "नमस्ते, आप कैसे हैं?",
    voice="sia",
    audio_config=AudioConfig(sample_rate=44100, encoding="linear_pcm", container="wav"),
)
with open("tts_output.wav", "wb") as f:
    f.write(audio)
TTS Realtime (WebSocket)
import asyncio
from gnani.tts import GnaniTTSRealtimeClient

async def main():
    async with GnaniTTSRealtimeClient(api_key="your-api-key") as client:
        with open("tts_realtime.wav", "wb") as f:
            async for chunk in client.synthesize("Hello from Gnani TTS", voice="sia"):
                f.write(chunk)

asyncio.run(main())
Authentication
STT REST API
The REST API uses header-based authentication. Every request requires three credentials:
| Parameter | Header | Description |
|---|---|---|
| organization_id | X-Organization-ID | Your organisation identifier |
| api_key | X-API-Key-ID | Secret key for authentication |
| user_id | X-API-User-ID | Your user / organisation name |
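When debugging a request with curl or another HTTP tool, it can help to build the three headers by hand. The following is a minimal sketch using only the header names listed above; `stt_rest_headers` is a hypothetical helper and not part of gnani-vachana, which presumably sets these headers for you.

```python
def stt_rest_headers(organization_id: str, api_key: str, user_id: str) -> dict:
    """Map the three REST credentials to their documented header names."""
    headers = {
        "X-Organization-ID": organization_id,
        "X-API-Key-ID": api_key,
        "X-API-User-ID": user_id,
    }
    # Fail early instead of sending a request that is bound to be rejected.
    missing = [name for name, value in headers.items() if not value]
    if missing:
        raise ValueError("missing credential(s) for: " + ", ".join(missing))
    return headers


if __name__ == "__main__":
    print(stt_rest_headers("your-organization-id", "your-api-key", "your-user-id"))
```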
STT Realtime Streaming API
The WebSocket streaming API requires a single API key:
| Parameter | Header | Description |
|---|---|---|
| api_key | x-api-key-id | API key identifier for authentication |
TTS API (REST, SSE, Realtime)
All TTS interfaces require a single API key:
| Parameter | Header | Description |
|---|---|---|
| api_key | X-API-Key-ID | API key identifier for authentication |
Obtaining Credentials
Email speechstack@gnani.ai with your name, company, and use case. Credentials are typically provisioned within 1 business day, and all new accounts receive free credits -- no credit card required.
Passing Credentials
Option 1 -- Constructor arguments:
from gnani.stt import GnaniSTTClient, GnaniSTTStreamClient
from gnani.tts import GnaniTTSClient, GnaniTTSRealtimeClient, GnaniTTSStreamClient

# REST client
client = GnaniSTTClient(
    organization_id="your-organization-id",
    api_key="your-api-key",
    user_id="your-user-id",
)
# Streaming client
stream = GnaniSTTStreamClient(api_key="your-api-key")
# TTS clients
tts_rest = GnaniTTSClient(api_key="your-api-key")
tts_stream = GnaniTTSStreamClient(api_key="your-api-key")
tts_realtime = GnaniTTSRealtimeClient(api_key="your-api-key")
Option 2 -- Environment variables:
# REST client credentials
export GNANI_ORGANIZATION_ID="your-organization-id"
export GNANI_API_KEY="your-api-key"
export GNANI_USER_ID="your-user-id"
from gnani.stt import GnaniSTTClient, GnaniSTTStreamClient
from gnani.tts import GnaniTTSClient
client = GnaniSTTClient() # picks up all three env vars
stream = GnaniSTTStreamClient() # picks up GNANI_API_KEY
tts = GnaniTTSClient() # picks up GNANI_API_KEY
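If a variable is unset, it may be easier to catch that before constructing a client. The sketch below checks the three variable names documented above; `load_credentials` is a hypothetical helper, and how the library itself reports a missing variable is not described here.

```python
import os

# Constructor argument name -> environment variable name (from the section above).
ENV_VARS = {
    "organization_id": "GNANI_ORGANIZATION_ID",
    "api_key": "GNANI_API_KEY",
    "user_id": "GNANI_USER_ID",
}


def load_credentials(environ=None) -> dict:
    """Read the three variables and raise a clear error if any is unset."""
    environ = os.environ if environ is None else environ
    values = {arg: environ.get(var, "").strip() for arg, var in ENV_VARS.items()}
    missing = [ENV_VARS[arg] for arg, value in values.items() if not value]
    if missing:
        raise RuntimeError("unset environment variable(s): " + ", ".join(missing))
    return values
```

Because the keys match the constructor arguments shown in Option 1, the result could be passed along as `GnaniSTTClient(**load_credentials())`.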
Supported Languages
REST API
| Language | Code | Native Script |
|---|---|---|
| Bengali | bn-IN | বাংলা |
| English (India) | en-IN | Latin |
| Gujarati | gu-IN | ગુજરાતી |
| Hindi | hi-IN | हिन्दी |
| Kannada | kn-IN | ಕನ್ನಡ |
| Malayalam | ml-IN | മലയാളം |
| Marathi | mr-IN | मराठी |
| Punjabi | pa-IN | ਪੰਜਾਬੀ |
| Tamil | ta-IN | தமிழ் |
| Telugu | te-IN | తెలుగు |
For multilingual / code-switching audio (e.g. a Hindi-English mix), pass the language codes as a single comma-separated string:
result = client.transcribe("meeting.wav", language_code="en-IN,hi-IN")
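When the languages come from configuration or user input, a small helper can validate them before building the comma-separated string. This is a sketch under assumptions: `join_language_codes` is hypothetical, the code set is copied from the REST table above, and whether the server cares about order or accepts more than two codes is not stated in this README.

```python
# Codes copied from the REST language table.
REST_LANGUAGE_CODES = {
    "bn-IN", "en-IN", "gu-IN", "hi-IN", "kn-IN",
    "ml-IN", "mr-IN", "pa-IN", "ta-IN", "te-IN",
}


def join_language_codes(codes) -> str:
    """Validate codes against the REST table and join them with commas."""
    codes = list(codes)
    if not codes:
        raise ValueError("at least one language code is required")
    unknown = [c for c in codes if c not in REST_LANGUAGE_CODES]
    if unknown:
        raise ValueError(f"unsupported language code(s): {unknown}")
    # Drop duplicates while keeping the caller's order.
    return ",".join(dict.fromkeys(codes))


if __name__ == "__main__":
    print(join_language_codes(["en-IN", "hi-IN"]))
```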
Realtime Streaming API
All languages above plus experimental codes:
| Language | Code | Script |
|---|---|---|
| Hinglish (Latin) | en-hi-IN-latn | Latin (experimental) |
| Hinglish (Code-mixed) | en-hi-in-cm | Latin + Devanagari (experimental) |
| Auto-detect | AUTO_DETECT | All supported (experimental) |
from gnani.stt import GnaniSTTStreamClient
# Hinglish (Latin script)
stream = GnaniSTTStreamClient(api_key="key", language_code="en-hi-IN-latn")
# Auto-detect language
stream = GnaniSTTStreamClient(api_key="key", language_code=GnaniSTTStreamClient.AUTO_DETECT)
REST Usage
Transcribe a file by path
result = client.transcribe("meeting.wav", language_code="en-IN")
print(result["transcript"])
Transcribe from a file object
with open("meeting.mp3", "rb") as f:
    result = client.transcribe(f, language_code="ta-IN")
Transcribe raw bytes
audio_bytes = download_audio_from_somewhere()
result = client.transcribe_bytes(
    audio_bytes, filename="clip.wav", language_code="kn-IN"
)
Custom request ID
result = client.transcribe(
    "call.flac", language_code="hi-IN", request_id="my-trace-123"
)
List supported languages
for code, name in GnaniSTTClient.supported_languages().items():
    print(f"{code}: {name}")
Realtime Streaming Usage
Connection Flow
- Client opens a WebSocket connection to wss://api.vachana.ai/stt/v3/stream with auth headers.
- Server sends a connected event with the active configuration.
- Client sends binary PCM audio frames (1,024 bytes each = 512 samples at 16-bit).
- Server detects speech via VAD and responds with processing and transcript events.
- Either side may close the connection at any time.
Audio Format
| Property | 16 kHz | 8 kHz |
|---|---|---|
| Encoding | PCM signed 16-bit LE | PCM signed 16-bit LE |
| Sample Rate | 16,000 Hz | 8,000 Hz |
| Channels | 1 (mono) | 1 (mono) |
| Chunk Size | 512 samples (32 ms) | 512 samples (64 ms) |
| Frame Bytes | 1,024 | 1,024 |
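The 32 ms and 64 ms figures follow directly from the frame size: 1,024 bytes at 2 bytes per sample is 512 samples, and 512 samples divided by the sample rate gives the frame duration. A short sketch of that arithmetic (the function name is illustrative, not part of the library):

```python
BYTES_PER_SAMPLE = 2  # signed 16-bit PCM, mono
FRAME_BYTES = 1024    # 512 samples per frame


def frame_duration_s(sample_rate: int, frame_bytes: int = FRAME_BYTES) -> float:
    """Seconds of audio carried by one mono 16-bit PCM frame."""
    samples = frame_bytes // BYTES_PER_SAMPLE
    return samples / sample_rate


if __name__ == "__main__":
    for rate in (16000, 8000):
        print(f"{rate} Hz: {frame_duration_s(rate) * 1000:.0f} ms per frame")
```

When pacing frames by hand at 8 kHz, the sleep between frames would therefore be 0.064 seconds rather than the 0.032 seconds used for 16 kHz audio.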
Using the async context manager
import asyncio
from gnani.stt import GnaniSTTStreamClient, StreamTranscriptEvent

async def main():
    async with GnaniSTTStreamClient(
        api_key="your-api-key",
        language_code="hi-IN",
        sample_rate=16000,
    ) as stream:
        print(f"Connected! Sample rate: {stream.connected_config.sample_rate}")
        with open("audio.pcm", "rb") as f:
            while chunk := f.read(1024):
                await stream.send_audio(chunk)
                await asyncio.sleep(0.032)
        async for event in stream:
            if isinstance(event, StreamTranscriptEvent):
                print(f"[{event.segment_index}] {event.text}")
                print(f" Duration: {event.audio_duration_ms}ms, Latency: {event.latency}ms")

asyncio.run(main())
Manual connect / close
import asyncio
from gnani.stt import GnaniSTTStreamClient, StreamTranscriptEvent

async def main():
    stream = GnaniSTTStreamClient(api_key="your-api-key")
    config = await stream.connect()
    print(f"Server ready: {config.message}")
    # Placeholder input: one 1,024-byte frame of silence; replace with real PCM audio
    audio_chunk = b"\x00" * 1024
    await stream.send_audio(audio_chunk)
    # close() returns the transcripts collected during the session
    transcripts = await stream.close()
    for t in transcripts:
        print(t.text)

asyncio.run(main())
High-level stream_audio helper with callbacks
import asyncio
from gnani.stt import GnaniSTTStreamClient, StreamTranscriptEvent, StreamProcessingEvent

async def main():
    async with GnaniSTTStreamClient(api_key="your-api-key") as stream:
        with open("audio.pcm", "rb") as f:
            transcripts = await stream.stream_audio(
                f,
                on_transcript=lambda t: print(f"Transcript: {t.text}"),
                on_processing=lambda p: print(f"Processing at {p.timestamp}..."),
                realtime_pace=True,
            )
        print(f"Total segments: {len(transcripts)}")

asyncio.run(main())
Using 8 kHz audio
stream = GnaniSTTStreamClient(
    api_key="your-api-key",
    language_code="en-IN",
    sample_rate=8000,
)
Event Types
All events are typed dataclasses:
| Event | Fields | Description |
|---|---|---|
| StreamConnectedEvent | message, timestamp, sample_rate, chunk_size, raw | Handshake confirmation with server config |
| StreamProcessingEvent | timestamp, raw | VAD detected end-of-speech, transcribing |
| StreamTranscriptEvent | text, audio_duration_ms, segment_id, segment_index, latency, timestamp, raw | Completed transcription for a speech segment |
| StreamErrorEvent | message, timestamp, raw | Server-side error |
Accessing the raw JSON payload
Every event includes a raw field with the full server JSON:
async for event in stream:
    print(event.raw)  # dict with the complete server response
Text-to-Speech Usage
TTS REST
from gnani.tts import GnaniTTSClient

client = GnaniTTSClient(api_key="your-api-key")
audio = client.synthesize("यह एक टेस्ट है", voice="sia")
with open("tts_rest.wav", "wb") as f:
    f.write(audio)
TTS Streaming (SSE)
from gnani.tts import GnaniTTSStreamClient

client = GnaniTTSStreamClient(api_key="your-api-key")
with open("tts_sse.wav", "wb") as f:
    for chunk in client.synthesize_stream("Streaming TTS response", voice="raju"):
        f.write(chunk)
TTS Realtime (WebSocket)
import asyncio
from gnani.tts import GnaniTTSRealtimeClient

async def main():
    async with GnaniTTSRealtimeClient(api_key="your-api-key") as client:
        audio = await client.synthesize_and_collect("Realtime TTS response", voice="neha")
        with open("tts_realtime.wav", "wb") as f:
            f.write(audio)

asyncio.run(main())
TTS Voices
from gnani.tts import GnaniTTSClient
print(GnaniTTSClient.supported_voices())
Audio Requirements
REST API
| Constraint | Value |
|---|---|
| Formats | WAV, MP3, FLAC, OGG, M4A |
| Max duration | 60 seconds |
| Channels | Mono or stereo |
| Sample rate | Automatically converted to 16 kHz mono |
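Since the REST API caps audio at 60 seconds, a local duration check can avoid an upload that would be rejected. The sketch below uses only the standard-library `wave` module, so it covers WAV input only; MP3, FLAC, OGG, and M4A would need a third-party decoder. The helper names are hypothetical, and the README does not say whether a clip of exactly 60 seconds is accepted.

```python
import wave

MAX_REST_SECONDS = 60.0  # limit from the table above


def wav_duration_s(path_or_file) -> float:
    """Duration of a WAV file (path or binary file object) in seconds."""
    with wave.open(path_or_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def fits_rest_limit(path_or_file, limit_s: float = MAX_REST_SECONDS) -> bool:
    """True if the WAV is no longer than the REST duration limit."""
    return wav_duration_s(path_or_file) <= limit_s
```

Longer recordings could be split locally or sent through the realtime streaming interface instead.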
Realtime Streaming
| Constraint | Value |
|---|---|
| Encoding | Raw PCM, signed 16-bit little-endian |
| Sample rate | 16,000 Hz or 8,000 Hz |
| Channels | 1 (mono) |
| Frame size | 1,024 bytes (512 samples) |
| Pacing | Send frames at real-time cadence for best VAD accuracy |
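Audio stored as WAV has to be reduced to raw PCM frames before it can be streamed. A minimal sketch with the standard-library `wave` module follows; `pcm_frames_from_wav` is a hypothetical helper, and it assumes the WAV already matches the constraints above (it validates, but does not resample or downmix). The final frame may be shorter than 1,024 bytes; whether the server expects it to be padded is not stated here.

```python
import wave

FRAME_BYTES = 1024             # 512 samples of 16-bit mono PCM
ALLOWED_RATES = (16000, 8000)  # sample rates listed in the table above


def pcm_frames_from_wav(source, frame_bytes: int = FRAME_BYTES):
    """Yield raw PCM frames from a 16-bit mono WAV (path or file object)."""
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio")
        if wav.getframerate() not in ALLOWED_RATES:
            raise ValueError(f"unsupported sample rate: {wav.getframerate()}")
        samples_per_frame = frame_bytes // 2
        while True:
            # readframes() returns the stored little-endian sample bytes.
            chunk = wav.readframes(samples_per_frame)
            if not chunk:
                break
            yield chunk
```

Each yielded chunk can then be passed to `send_audio` in place of the `f.read(1024)` chunks used in the streaming examples.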
Response Format
REST
{
  "success": true,
  "request_id": "req_abc123",
  "timestamp": "20251226_143052.123",
  "transcript": "नमस्ते, आप कैसे हैं?"
}
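The timestamp in this sample appears to follow a YYYYMMDD_HHMMSS.mmm layout. Assuming that holds for all responses (the format is inferred from this one example, and no time zone is stated), it could be parsed into a naive `datetime` like this:

```python
from datetime import datetime

# Layout inferred from the sample value "20251226_143052.123".
REST_TIMESTAMP_FORMAT = "%Y%m%d_%H%M%S.%f"


def parse_rest_timestamp(value: str) -> datetime:
    """Parse a REST response timestamp into a naive datetime."""
    return datetime.strptime(value, REST_TIMESTAMP_FORMAT)


if __name__ == "__main__":
    print(parse_rest_timestamp("20251226_143052.123").isoformat())
```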
Realtime Streaming
Connected:
{
  "type": "connected",
  "message": "STT service ready — VAD service connected",
  "timestamp": "2024-01-15T10:30:00.000Z",
  "config": { "sample_rate": 16000, "chunk_size": 512 }
}
Transcript:
{
  "type": "transcript",
  "timestamp": "2024-01-15T10:30:05.987Z",
  "text": "Hello, how are you today?",
  "audio_duration_ms": 2340,
  "segment_id": "<segment_id>",
  "segment_index": "<segment_index>",
  "latency": 320
}
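The client already exposes these messages as typed events, but raw payloads saved to a log can also be processed without it. The following is a sketch based only on the two sample messages above; `parse_stream_message` and `summarize` are hypothetical helpers, and fields not shown in the samples are not handled.

```python
import json
from datetime import datetime


def parse_stream_message(raw: str) -> dict:
    """Decode one server message and convert its timestamp to a datetime."""
    msg = json.loads(raw)
    if "type" not in msg:
        raise ValueError("message has no 'type' field")
    stamp = msg.get("timestamp")
    if isinstance(stamp, str):
        # fromisoformat() only accepts a trailing 'Z' from Python 3.11 on,
        # so rewrite it as an explicit UTC offset.
        msg["timestamp"] = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
    return msg


def summarize(msg: dict) -> str:
    """One-line description of a parsed message."""
    if msg["type"] == "transcript":
        return (f"{msg['text']} "
                f"({msg['audio_duration_ms']} ms audio, {msg['latency']} ms latency)")
    if msg["type"] == "connected":
        cfg = msg.get("config", {})
        return f"connected at {cfg.get('sample_rate')} Hz, chunk {cfg.get('chunk_size')}"
    return msg["type"]
```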
Error Handling
import asyncio

from gnani.stt import (
    GnaniSTTClient,
    GnaniSTTStreamClient,
    AuthenticationError,
    InvalidAudioError,
    APIError,
    StreamConnectionError,
    StreamClosedError,
    StreamError,
)

client = GnaniSTTClient()  # credentials from environment variables

# REST errors
try:
    result = client.transcribe("audio.wav", language_code="hi-IN")
except AuthenticationError:
    print("Check your credentials")
except InvalidAudioError as e:
    print(f"Bad audio file: {e}")
except APIError as e:
    print(f"API error {e.status_code}: {e}")

# Streaming errors (async with must run inside a coroutine)
async def stream_once(chunk: bytes) -> None:
    try:
        async with GnaniSTTStreamClient(api_key="key") as stream:
            await stream.send_audio(chunk)
    except StreamConnectionError as e:
        print(f"Connection failed: {e}")
    except StreamClosedError as e:
        print(f"Stream already closed: {e}")
    except StreamError as e:
        print(f"Server error: {e} (at {e.timestamp})")

# Placeholder input: one 1,024-byte frame of silence
asyncio.run(stream_once(b"\x00" * 1024))
Documentation
Full API reference and guides are available at docs.inya.ai/vachana.
License
This project is licensed under the MIT License -- see the LICENSE file for details.