GL Speech Python Client - Language binding SDK for Prosa Speech API
GL Speech SDK
A Python SDK for interacting with the GL Speech API, providing speech-to-text (STT), text-to-speech (TTS), and webhook management capabilities.
Prerequisites
- Python ≥3.11,<3.13 (3.11 or 3.12)
Installation
pip install gl-speech
Or with uv:
uv add gl-speech
Quick Start
STT and TTS use separate API keys and may use separate base URLs, so create one client per service. Manage webhooks for STT jobs through the STT client and webhooks for TTS jobs through the TTS client.
from gl_speech_sdk import SpeechClient
stt_client = SpeechClient(api_key="your-stt-api-key", base_url="https://api.prosa.ai/v2/speech/")
tts_client = SpeechClient(api_key="your-tts-api-key", base_url="https://api.prosa.ai/v2/speech/")
# Speech-to-Text
result = stt_client.stt.transcribe(
    data="<base64-encoded-audio>",
    model="stt-general",
    wait=True
)
print(result.result)
# Text-to-Speech
result = tts_client.tts.synthesize(
    text="Hello, world!",
    model="tts-dimas-formal",
    wait=True
)
print(result.result)
# Webhooks: STT job events use STT client, TTS job events use TTS client
stt_endpoint = stt_client.webhooks.create_endpoint(
    url="https://your-server.com/webhook-stt",
    event_filters=["stt.job.completed"]
)
tts_endpoint = tts_client.webhooks.create_endpoint(
    url="https://your-server.com/webhook-tts",
    event_filters=["tts.job.completed"]
)
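The data argument in the Quick Start takes base64-encoded audio. A minimal sketch of producing that string from a local file, using only the standard library, might look like the following. The helper name encode_audio_file is hypothetical and not part of the SDK, and the sketch assumes the API expects plain standard base64 without any data-URI prefix.

```python
import base64
from pathlib import Path


def encode_audio_file(path: str) -> str:
    """Read an audio file and return its contents as a base64 ASCII string."""
    raw = Path(path).read_bytes()
    return base64.b64encode(raw).decode("ascii")


# Hypothetical usage with the client from the Quick Start:
# result = stt_client.stt.transcribe(
#     data=encode_audio_file("meeting.wav"),
#     model="stt-general",
#     wait=True,
# )
```

For large recordings it may be more practical to pass a uri instead of inline data, as the asynchronous transcription example does.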
Configuration
Environment Variables
Set separate credentials for STT and TTS:
- GLSPEECH_STT_API_KEY: API key for Speech-to-Text
- GLSPEECH_STT_BASE_URL: Base URL for STT (default: https://api.prosa.ai/v2/speech/)
- GLSPEECH_TTS_API_KEY: API key for Text-to-Speech
- GLSPEECH_TTS_BASE_URL: Base URL for TTS (default: https://api.prosa.ai/v2/speech/)
Webhook management is per job type: use the STT client for STT job webhooks, the TTS client for TTS job webhooks.
Client Initialization
from gl_speech_sdk import SpeechClient
# Two clients (STT and TTS have different keys)
stt_client = SpeechClient(
    api_key="your-stt-api-key",
    base_url="https://api.prosa.ai/v2/speech/",
    timeout=60.0,
    default_headers={"X-Custom-Header": "value"}
)
tts_client = SpeechClient(
    api_key="your-tts-api-key",
    base_url="https://api.prosa.ai/v2/speech/",
    timeout=60.0,
    default_headers={"X-Custom-Header": "value"}
)
# Or from environment variables
import os
os.environ["GLSPEECH_STT_API_KEY"] = "your-stt-api-key"
os.environ["GLSPEECH_STT_BASE_URL"] = "https://api.prosa.ai/v2/speech/"
os.environ["GLSPEECH_TTS_API_KEY"] = "your-tts-api-key"
os.environ["GLSPEECH_TTS_BASE_URL"] = "https://api.prosa.ai/v2/speech/"
stt_client = SpeechClient(api_key=os.getenv("GLSPEECH_STT_API_KEY"), base_url=os.getenv("GLSPEECH_STT_BASE_URL"))
tts_client = SpeechClient(api_key=os.getenv("GLSPEECH_TTS_API_KEY"), base_url=os.getenv("GLSPEECH_TTS_BASE_URL"))
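When the variables are set outside the program, os.getenv returns None for anything missing, and that None would be passed straight to the client. One possible way to fail early and apply the documented default base URL is sketched below. The helper client_kwargs is hypothetical and not part of the SDK; it only assembles the api_key and base_url arguments shown above.

```python
import os
from typing import Mapping, Optional

# Default base URL as listed under Environment Variables.
DEFAULT_BASE_URL = "https://api.prosa.ai/v2/speech/"


def client_kwargs(service: str, env: Optional[Mapping[str, str]] = None) -> dict:
    """Build SpeechClient keyword arguments for "stt" or "tts" from the environment."""
    env = os.environ if env is None else env
    prefix = f"GLSPEECH_{service.upper()}_"
    api_key = env.get(prefix + "API_KEY")
    if not api_key:
        # Fail here rather than sending requests with a missing key.
        raise KeyError(f"{prefix}API_KEY is not set")
    return {
        "api_key": api_key,
        "base_url": env.get(prefix + "BASE_URL", DEFAULT_BASE_URL),
    }


# Hypothetical usage:
# stt_client = SpeechClient(**client_kwargs("stt"))
# tts_client = SpeechClient(**client_kwargs("tts"))
```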
Speech-to-Text (STT)
Use stt_client for all STT operations.
List Available Models
models = stt_client.stt.list_models()
for model in models:
    print(f"{model['name']}: {model['label']}")
Transcribe Audio
# Synchronous (wait for result)
result = stt_client.stt.transcribe(
    model="stt-general",
    wait=True,
    data="<base64-encoded-audio>",
    label="My audio file"
)
print(result.result)
# Asynchronous (get job_id, poll later)
result = stt_client.stt.transcribe(
    model="stt-general",
    wait=False,
    uri="https://example.com/audio.wav"
)
job_id = result.job_id
# Check status
status = stt_client.stt.get_status(job_id)
print(f"Status: {status.status}, Progress: {status.progress}")
# Get result when complete
result = stt_client.stt.get_job(job_id)
print(result.result)
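The asynchronous flow above checks the status once. In practice the check is usually repeated until the job finishes, so a small polling loop with a timeout can help. The sketch below is generic and does not call the SDK itself; poll_until is a hypothetical helper, and the status values used in the usage comment are assumptions that should be checked against the API documentation.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
) -> T:
    """Call fetch() repeatedly until is_done(value) is true or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        value = fetch()
        if is_done(value):
            return value
        # Stop if another sleep would take us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job did not finish within {timeout} seconds")
        time.sleep(interval)


# Hypothetical usage; the terminal status names are assumptions:
# status = poll_until(
#     lambda: stt_client.stt.get_status(job_id),
#     lambda s: s.status in {"complete", "failed"},
# )
# result = stt_client.stt.get_job(job_id)
```

Passing the fetch and completion checks as callables keeps the loop independent of the exact status fields, and the same helper could be used with tts_client.tts.get_status.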
Advanced Configuration
result = stt_client.stt.transcribe(
    model="stt-general",
    wait=True,
    data="<base64-encoded-audio>",
    speaker_count=2,  # Expected number of speakers
    include_filler=True,  # Include filler words
    auto_punctuation=True,  # Auto-add punctuation
    enable_spoken_numerals=True,  # Convert "one" to "1"
    enable_speech_insights=True,  # Speech analytics
    enable_voice_insights=True,  # Voice analytics
)
List and Manage Jobs
# List jobs with filters
jobs = stt_client.stt.list_jobs(
    page=1,
    per_page=10,
    from_date="2024-01-01",
    until_date="2024-01-31",
    query_text="hello"
)
# Archive a job
stt_client.stt.archive(job_id)
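Because list_jobs takes page and per_page, walking through every job means requesting pages in turn. A minimal sketch of that loop follows. It assumes each call returns a plain sequence of jobs and that an empty page marks the end; the actual return type of list_jobs is not stated here, so the lambda in the usage comment may need adapting. The helper iter_pages is hypothetical.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def iter_pages(fetch_page: Callable[[int], Sequence[T]], start_page: int = 1) -> Iterator[T]:
    """Yield items page by page until fetch_page returns an empty page."""
    page = start_page
    while True:
        items = fetch_page(page)
        if not items:
            return
        yield from items
        page += 1


# Hypothetical usage (assumes list_jobs returns a sequence of jobs):
# for job in iter_pages(lambda p: stt_client.stt.list_jobs(page=p, per_page=10)):
#     print(job)
```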
Text-to-Speech (TTS)
Use tts_client for all TTS operations.
List Available Models
models = tts_client.tts.list_models()
for model in models:
    print(f"{model['name']}: {model['voice']} ({model['gender']})")
Synthesize Speech
# Synchronous (wait for result)
result = tts_client.tts.synthesize(
    text="Hello, world!",
    model="tts-dimas-formal",
    wait=True
)
audio_data = result.result["data"]  # Base64-encoded audio
# Get as signed URL instead
result = tts_client.tts.synthesize(
    text="Hello, world!",
    model="tts-dimas-formal",
    wait=True,
    as_signed_url=True
)
audio_url = result.result["path"]
# Asynchronous
result = tts_client.tts.synthesize(
    text="Long text content...",
    model="tts-dimas-formal",
    wait=False
)
job_id = result.job_id
# Fetch the job later; repeat (or check get_status first) until it has completed
result = tts_client.tts.get_job(job_id, as_signed_url=True)
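Since the synchronous result carries the audio as a base64 string in result.result["data"], it has to be decoded before it can be played or stored. A minimal sketch using only the standard library is shown below; save_audio is a hypothetical helper, not an SDK function.

```python
import base64
from pathlib import Path


def save_audio(b64_audio: str, path: str) -> int:
    """Decode base64 audio and write it to path; return the number of bytes written."""
    raw = base64.b64decode(b64_audio)
    return Path(path).write_bytes(raw)


# Hypothetical usage with the synchronous result above:
# save_audio(result.result["data"], "hello.wav")
```

The file extension should match the audio format that was actually requested or returned.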
Advanced Configuration
result = tts_client.tts.synthesize(
    text="Hello, world!",
    model="tts-dimas-formal",
    wait=True,
    pitch=0.5,  # Pitch adjustment (-1.0 to 1.0)
    tempo=1.2,  # Speed adjustment (0.5 to 2.0)
    audio_format="mp3",  # "opus", "mp3", or "wav"
    label="My synthesis"
)
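The ranges in the comments above can be checked locally before a request is sent, which gives a clearer error than a rejected API call. The following is a sketch only: check_tts_options is a hypothetical helper, and whether the SDK performs similar validation itself is not stated here.

```python
ALLOWED_FORMATS = {"opus", "mp3", "wav"}


def check_tts_options(*, pitch: float, tempo: float, audio_format: str) -> dict:
    """Validate options against the documented ranges and return them as kwargs."""
    if not -1.0 <= pitch <= 1.0:
        raise ValueError(f"pitch must be between -1.0 and 1.0, got {pitch}")
    if not 0.5 <= tempo <= 2.0:
        raise ValueError(f"tempo must be between 0.5 and 2.0, got {tempo}")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"audio_format must be one of {sorted(ALLOWED_FORMATS)}")
    return {"pitch": pitch, "tempo": tempo, "audio_format": audio_format}


# Hypothetical usage:
# options = check_tts_options(pitch=0.5, tempo=1.2, audio_format="mp3")
# result = tts_client.tts.synthesize(
#     text="Hello, world!", model="tts-dimas-formal", wait=True, **options
# )
```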
List and Manage Jobs
# List jobs
jobs = tts_client.tts.list_jobs(page=1, per_page=10)
# Count jobs
count = tts_client.tts.count_jobs(from_date="2024-01-01")
# Get job status
status = tts_client.tts.get_status(job_id)
# Archive a job
tts_client.tts.archive(job_id)
Webhook Management
Webhooks and API keys
Webhook management is split by job type:
- STT job webhooks (e.g. stt.job.completed) use the STT API key and base URL → use stt_client.webhooks.
- TTS job webhooks (e.g. tts.job.completed) use the TTS API key and base URL → use tts_client.webhooks.
Create and manage endpoints on the client that matches the events you want to receive.
Create a Webhook Endpoint
# Endpoint for STT job events (uses STT key)
stt_endpoint = stt_client.webhooks.create_endpoint(
    url="https://your-server.com/webhook-stt",
    event_filters=["stt.job.completed"],
    ssl_verification=True
)
# Endpoint for TTS job events (uses TTS key)
tts_endpoint = tts_client.webhooks.create_endpoint(
    url="https://your-server.com/webhook-tts",
    event_filters=["tts.job.completed"],
    ssl_verification=True
)
print(f"STT Endpoint ID: {stt_endpoint.id}, Secret: {stt_endpoint.secrets[0].key}")
print(f"TTS Endpoint ID: {tts_endpoint.id}, Secret: {tts_endpoint.secrets[0].key}")
List Endpoints
# List STT webhook endpoints
stt_endpoints = stt_client.webhooks.list_endpoints()
# List TTS webhook endpoints
tts_endpoints = tts_client.webhooks.list_endpoints()
for ep in tts_endpoints:
    print(f"{ep.id}: {ep.url}")
Update and Delete Endpoints
# Update/delete on the same client you used to create (STT or TTS)
stt_client.webhooks.update_endpoint(
    endpoint_id="endpoint-123",
    url="https://your-server.com/new-webhook",
    event_filters=[]
)
stt_client.webhooks.delete_endpoint("endpoint-123")
Rotate Secrets
tts_client.webhooks.rotate_secret(
    endpoint_id="endpoint-123",
    days=3,  # Old secret valid for 3 days
    hours=0
)
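While the old secret is still valid after a rotation, a receiver has to accept deliveries signed with either secret. The sketch below illustrates that idea only. It assumes deliveries are signed with HMAC-SHA256 over the raw request body and sent as a hex digest; the signing scheme is not described in this document, so confirm the algorithm and where the signature is carried before relying on it. The function signature_matches is hypothetical.

```python
import hashlib
import hmac
from typing import Iterable


def signature_matches(body: bytes, signature_hex: str, secrets: Iterable[str]) -> bool:
    """Return True if the signature matches the body under any currently valid secret.

    ASSUMPTION: HMAC-SHA256 over the raw body, hex-encoded. Verify against the API docs.
    """
    for secret in secrets:
        expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
        # compare_digest avoids leaking information through timing differences.
        if hmac.compare_digest(expected, signature_hex):
            return True
    return False


# During the rotation window, pass both the new and the old secret:
# ok = signature_matches(raw_body, received_signature, [new_secret, old_secret])
```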
Event Management
# List events (from the client whose webhooks you're querying)
events = tts_client.webhooks.list_events(
    from_date="2024-01-01",
    to_date="2024-01-31"
)
event = tts_client.webhooks.get_event("event-123")
print(event.data)
Delivery Management
deliveries = tts_client.webhooks.list_deliveries("endpoint-123")
ticket = tts_client.webhooks.replay_delivery("delivery-123")
tickets = tts_client.webhooks.replay_failed_deliveries("endpoint-123")
ticket = tts_client.webhooks.test_endpoint("endpoint-123")
Error Handling
import httpx
from gl_speech_sdk import SpeechClient
stt_client = SpeechClient(api_key="your-stt-api-key", base_url="https://api.prosa.ai/v2/speech/")
try:
    result = stt_client.stt.transcribe(model="stt-general", data="invalid")
except httpx.HTTPStatusError as e:
    print(f"HTTP Error: {e.response.status_code}")
    print(f"Response: {e.response.text}")
except ValueError as e:
    print(f"Validation Error: {e}")
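Some failures, such as a dropped connection or a timeout, are temporary and worth retrying, while validation errors are not. One possible pattern is a small retry wrapper with exponential backoff, sketched below with the standard library only. The helper with_retries is hypothetical; which exceptions count as retryable is a choice for the caller, and the usage comment assumes the SDK lets httpx transport errors propagate.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Run call(), retrying on the given exception types with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error.
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Hypothetical usage: retry only transport-level failures, not HTTP 4xx responses.
# result = with_retries(
#     lambda: stt_client.stt.transcribe(model="stt-general", data=audio_b64, wait=True),
#     retry_on=(httpx.TransportError,),
# )
```

Retrying every httpx.HTTPStatusError would also repeat requests that failed with a client error, which rarely succeeds on a second attempt.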
API Reference
License
MIT License.