Python SDK for Soniox speech-to-text API
Soniox Python SDK
Official Python SDK for Soniox Speech-to-Text API. Built with httpx for both synchronous and asynchronous support.
Features
- 🎯 Complete API Coverage: Full support for Soniox REST API
- ⚡ Async & Sync: Full support for both synchronous and asynchronous operations
- 🔒 Type Safe: Built with Pydantic v2 for robust type checking and validation
- 📝 Comprehensive Logging: Built-in logging with the soniox logger
- 🌍 60+ Languages: Transcribe speech in multiple languages with language hints
- 🎭 Speaker Diarization: Identify different speakers in audio
- 🔍 Language Identification: Automatic language detection
- 📊 Word-Level Timestamps: Get precise timing for each word
- 🎯 Context Support: Improve accuracy with domain-specific context
Installation
pip install soniox-sdk
Quick Start
Authentication
Set your API key as an environment variable:
export SONIOX_API_KEY="your-api-key-here"
Or pass it directly when initializing the client:
from soniox import SonioxClient
client = SonioxClient(api_key="your-api-key-here")
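When relying on the environment variable, it can help to fail early with a clear message if the key is missing. The helper below is a minimal sketch; `require_api_key` is a hypothetical name and not part of the SDK.

```python
import os


def require_api_key(env_var: str = "SONIOX_API_KEY") -> str:
    """Return the API key from the environment, or raise with a clear message."""
    # Hypothetical helper: the SDK may already perform a similar check internally.
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it or pass api_key= to the client"
        )
    return key
```

The returned value could then be passed as `SonioxClient(api_key=require_api_key())`.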
Basic Usage
Transcribe an Audio File
Synchronous:
import time
from soniox import SonioxClient
client = SonioxClient()
# Submit transcription job
job = client.transcribe_file("path/to/audio.wav")
print(f"Job ID: {job.id}")
print(f"Status: {job.status}")
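# Poll until the job finishes; stop on "error" too so the loop cannot run forever
while True:
    job = client.get_transcription_job(job.id)
    if job.status in ("completed", "error"):
        break
    time.sleep(1)
if job.status == "error":
    raise RuntimeError(f"Transcription failed: {job.error_message}")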
# Get the transcript
result = client.get_transcription_result(job.id)
print(f"Transcript: {result.text}")
print(f"Tokens: {len(result.tokens)}")
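For reuse across scripts, the polling step can be wrapped in a small helper with a timeout. This is a sketch only: `wait_for_job` and its parameters are hypothetical names, not SDK functions. It assumes only that the job object exposes `status` and, on failure, `error_message`, as described in the Models section.

```python
import time
from typing import Any, Callable


def wait_for_job(
    fetch_job: Callable[[], Any],
    poll_interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll fetch_job() until the job completes, fails, or the timeout expires.

    fetch_job is any zero-argument callable returning an object with a
    `status` attribute, e.g. lambda: client.get_transcription_job(job_id).
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job()
        if job.status == "completed":
            return job
        if job.status == "error":
            # Fall back to a generic message if error_message is missing or empty.
            message = getattr(job, "error_message", None) or "unknown error"
            raise RuntimeError(f"Transcription failed: {message}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still '{job.status}' after {timeout} seconds")
        sleep(poll_interval)
```

With the client from the example above, a call might look like `job = wait_for_job(lambda: client.get_transcription_job(job.id), timeout=600)`.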
Asynchronous:
import asyncio
from soniox import SonioxClient
async def transcribe():
    client = SonioxClient()
    # Submit transcription job
    job = await client.transcribe_file_async("path/to/audio.wav")
    print(f"Job ID: {job.id}")
    # Poll until the job finishes; stop on "error" too so the loop cannot run forever
    while True:
        job = await client.get_transcription_job_async(job.id)
        if job.status in ("completed", "error"):
            break
        await asyncio.sleep(1)
    if job.status == "error":
        raise RuntimeError(f"Transcription failed: {job.error_message}")
    # Get the transcript
    result = await client.get_transcription_result_async(job.id)
    print(f"Transcript: {result.text}")

asyncio.run(transcribe())
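The async methods make it practical to process several files at once. The sketch below bounds concurrency with a semaphore; `transcribe_many` and `transcribe_one` are hypothetical names, and `transcribe_one` stands for any coroutine function that takes a path and returns the transcript text (for example, a variant of `transcribe()` that accepts a path and returns `result.text`).

```python
import asyncio
from typing import Awaitable, Callable, Sequence


async def transcribe_many(
    paths: Sequence[str],
    transcribe_one: Callable[[str], Awaitable[str]],
    max_concurrency: int = 4,
) -> dict[str, str]:
    """Run transcribe_one(path) for every path, at most max_concurrency at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(path: str) -> str:
        # The semaphore caps how many transcriptions are in flight at once.
        async with semaphore:
            return await transcribe_one(path)

    # gather preserves input order, so results line up with paths.
    texts = await asyncio.gather(*(guarded(p) for p in paths))
    return dict(zip(paths, texts))
```

Keeping `max_concurrency` small is a conservative default here, since the service's actual rate limits are not stated in this document.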
Transcribe with Custom Configuration
You can pass configuration options either as a TranscriptionConfig object or as keyword arguments:
from soniox import SonioxClient
from soniox.languages import Language
from soniox.types import TranscriptionConfig
client = SonioxClient()
# Using TranscriptionConfig
config = TranscriptionConfig(
    model="stt-async-preview",
    language_hints=[Language.en],
    enable_speaker_diarization=True,
    context="Medical terminology context",
)
job = client.transcribe_file("audio.wav", config=config)
# Or using kwargs
job = client.transcribe_file(
    "audio.wav",
    model="stt-async-preview",
    enable_speaker_diarization=True,
)
Advanced Features
Speaker Diarization
Identify different speakers in your audio:
import time
from soniox import SonioxClient
client = SonioxClient()
# Submit job with speaker diarization
job = client.transcribe_file(
    "path/to/audio.wav",
    enable_speaker_diarization=True,
)
# Wait for completion; stop on "error" too so the loop cannot run forever
while True:
    job = client.get_transcription_job(job.id)
    if job.status in ("completed", "error"):
        break
    time.sleep(1)
if job.status == "error":
    raise RuntimeError(f"Transcription failed: {job.error_message}")
# Get results with speaker information
result = client.get_transcription_result(job.id)
for token in result.tokens:
    if token.speaker:
        print(f"Speaker {token.speaker}: {token.text}")
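Printing one line per token gets noisy quickly, so it is often more readable to merge consecutive tokens from the same speaker into turns. The sketch below uses a stand-in `Token` dataclass carrying only the `text` and `speaker` fields; `speaker_turns` is a hypothetical helper, and it assumes token text carries its own spacing.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Iterable, Optional


@dataclass
class Token:
    """Stand-in for the SDK's Token model, with only the fields used here."""
    text: str
    speaker: Optional[str] = None


def speaker_turns(tokens: Iterable[Token]) -> list[tuple[Optional[str], str]]:
    """Merge consecutive tokens from the same speaker into (speaker, text) turns."""
    turns = []
    for speaker, group in groupby(tokens, key=lambda t: t.speaker):
        # Assumption: token text includes its own spacing, so pieces are joined as-is.
        turns.append((speaker, "".join(t.text for t in group).strip()))
    return turns
```

Because the function only reads `text` and `speaker`, `speaker_turns(result.tokens)` should work on real results as long as those attributes exist.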
Language Identification
Automatically identify the language being spoken:
from soniox import SonioxClient
from soniox.languages import Language
client = SonioxClient()
job = client.transcribe_file(
    "multilingual_audio.wav",
    language_hints=[Language.en, Language.es, Language.fr],
    enable_language_identification=True,
)
Context for Improved Accuracy
Provide context to improve recognition of domain-specific terms:
from soniox import SonioxClient
client = SonioxClient()
job = client.transcribe_file(
    "medical_audio.wav",
    context="Medical terminology: hypertension, cardiovascular, stethoscope",
)
Configuration
Client Options
from soniox import SonioxClient
client = SonioxClient(
    api_key="your-api-key",  # API key (or use SONIOX_API_KEY env var)
    base_url="https://api.soniox.com",  # Custom base URL (optional)
    timeout=60.0,  # Request timeout in seconds
)
Logging
The SDK uses Python's standard logging module with the logger name soniox:
import logging
# Enable debug logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("soniox")
logger.setLevel(logging.DEBUG)
# Or configure it your way
import logging
handler = logging.StreamHandler()
handler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)
logger = logging.getLogger("soniox")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
API Reference
SonioxClient
Main client for interacting with Soniox API.
Methods
transcribe_file(file_path, config=None, **kwargs) → TranscriptionJob
Submit an audio file for transcription.
Parameters:
- file_path (str): Path to audio file
- config (TranscriptionConfig, optional): Configuration object
- **kwargs: Configuration options (used if config is None)
  - model (str): Model to use (default: "stt-async-preview")
  - language_hints (list[Language]): Language hints for better accuracy
  - enable_speaker_diarization (bool): Enable speaker diarization
  - enable_language_identification (bool): Enable language identification
  - context (str): Context for improved accuracy
  - webhook_url (str): Webhook URL for completion notification
  - client_reference_id (str): Your reference ID
Returns: TranscriptionJob - Job object with status information
Raises:
- FileNotFoundError: If file doesn't exist
- SonioxAPIError: If API returns an error
get_transcription_job(job_id) → TranscriptionJob
Get the status of a transcription job.
Parameters:
- job_id (str): Job ID from transcribe_file()
Returns: TranscriptionJob - Updated job status
get_transcription_result(job_id) → TranscriptionResult
Get the transcript once the job is completed.
Parameters:
- job_id (str): Job ID from completed transcription
Returns: TranscriptionResult - Transcript with tokens
Raises:
- SonioxAPIError: If job is not completed or not found
transcribe_file_async(file_path, config=None, **kwargs) → TranscriptionJob
Async version of transcribe_file().
get_transcription_job_async(job_id) → TranscriptionJob
Async version of get_transcription_job().
get_transcription_result_async(job_id) → TranscriptionResult
Async version of get_transcription_result().
Models
TranscriptionJob
Transcription job status and metadata.
Fields:
- id (str): Job ID (UUID)
- status (TranscriptionJobStatus): Job status ("queued", "processing", "completed", "error")
- created_at (datetime): Job creation timestamp
- filename (str): Original filename
- file_id (str | None): Uploaded file ID
- audio_url (str | None): Audio URL if provided
- audio_duration_ms (int | None): Audio duration in milliseconds
- error_message (str | None): Error message if failed
- All configuration fields from TranscriptionConfig
TranscriptionResult
Transcription result with full transcript.
Fields:
- id (str): Transcript ID (matches job ID)
- text (str): Full transcribed text
- tokens (list[Token]): Word-level tokens with timing
Token
Word-level transcription token.
Fields:
- text (str): Token text
- start_ms (int): Start time in milliseconds
- end_ms (int): End time in milliseconds
- confidence (float): Confidence score (0-1)
- speaker (str | None): Speaker ID if diarization enabled
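As an illustration of how the timing and confidence fields might be used together, the sketch below renders tokens as timestamped lines and skips tokens under a confidence threshold. `format_ms` and `timestamped_lines` are hypothetical helpers, not SDK functions; they rely only on the `text`, `start_ms`, `end_ms`, and `confidence` attributes listed above.

```python
from typing import Any, Iterable


def format_ms(ms: int) -> str:
    """Convert milliseconds to an HH:MM:SS.mmm string."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def timestamped_lines(tokens: Iterable[Any], min_confidence: float = 0.0) -> list[str]:
    """Render tokens as '[start - end] text' lines, skipping low-confidence tokens."""
    lines = []
    for token in tokens:
        if token.confidence < min_confidence:
            continue
        span = f"{format_ms(token.start_ms)} - {format_ms(token.end_ms)}"
        lines.append(f"[{span}] {token.text.strip()}")
    return lines
```

For a completed job, `timestamped_lines(result.tokens, min_confidence=0.5)` would list only the tokens the model was reasonably sure about; the 0.5 threshold is an arbitrary example value.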
TranscriptionConfig
Configuration for transcription jobs.
Fields:
- model (str): Model to use (default: "stt-async-preview")
- language_hints (list[Language] | None): Language hints
- enable_language_identification (bool): Enable language detection
- enable_speaker_diarization (bool): Enable speaker diarization
- context (str | None): Context for improved accuracy
- client_reference_id (str | None): Your reference ID
- webhook_url (str | None): Webhook URL
- webhook_auth_header_name (str | None): Webhook auth header name
- webhook_auth_header_value (str | None): Webhook auth header value
FileUploadResponse
Response from file upload.
Fields:
- id (str): File ID
- filename (str): Original filename
- size (int): File size in bytes
- created_at (datetime): Upload timestamp
- client_reference_id (str | None): Your reference ID
Exceptions
- SonioxError: Base exception for all Soniox errors
- SonioxAuthenticationError: Raised when authentication fails
- SonioxAPIError: Raised when API returns an error response
- SonioxRateLimitError: Raised when rate limit is exceeded
Error Handling
import time
from soniox import SonioxClient
from soniox.exceptions import (
    SonioxAPIError,
    SonioxAuthenticationError,
    SonioxRateLimitError,
)
client = SonioxClient()
try:
    # Submit transcription
    job = client.transcribe_file("audio.wav")
    # Wait for completion
    while True:
        job = client.get_transcription_job(job.id)
        if job.status == "completed":
            break
        elif job.status == "error":
            print(f"Transcription failed: {job.error_message}")
            break
        time.sleep(1)
    # Get result
    if job.status == "completed":
        result = client.get_transcription_result(job.id)
        print(result.text)
except FileNotFoundError:
    print("Audio file not found")
except SonioxAuthenticationError as e:
    print(f"Authentication failed: {e}")
except SonioxRateLimitError as e:
    print(f"Rate limit exceeded: {e}")
    print(f"Status code: {e.status_code}")
except SonioxAPIError as e:
    print(f"API error: {e}")
    print(f"Status code: {e.status_code}")
    print(f"Response: {e.response_body}")
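Rate-limit errors are usually transient, so one common pattern is to retry with exponential backoff rather than just reporting them. The sketch below is generic and does not depend on the SDK: `call_with_backoff` is a hypothetical helper, and the exception types to retry on are passed in by the caller (for example `(SonioxRateLimitError,)`).

```python
import random
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    retry_on: tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay,
            # plus random jitter so parallel clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")
```

A call might look like `job = call_with_backoff(lambda: client.transcribe_file("audio.wav"), retry_on=(SonioxRateLimitError,))`. Whether the API also returns a retry hint in its response is not stated here, so this sketch relies on fixed backoff only.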
Testing
Run tests with pytest:
# Install development dependencies
pip install -e ".[dev,test]"
# Run all tests
pytest
# Run with coverage
pytest --cov=src --cov-report=html --cov-report=term-missing
# Run specific test file
pytest tests/test_models.py
# Run with verbose output
pytest -v
See tests/README.md for more details on the test suite.
Development
# Clone the repository
git clone https://github.com/mahdikiani/soniox-sdk.git
cd soniox-sdk
# Install in editable mode with dev dependencies
pip install -e ".[dev,test]"
# Run linter
ruff check src/
# Run type checker
mypy src/
Resources
License
This project is licensed under the MIT License - see the LICENSE.txt file for details.
Support
- 📧 Email: mahdikiany@gmail.com
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
Changelog
See CHANGELOG.md for version history and updates.
Made with ❤️ by Mahdi Kiani