Python SDK for Soniox speech-to-text API

Soniox Python SDK

Official Python SDK for the Soniox Speech-to-Text API. Built with httpx for both synchronous and asynchronous support.

Features

🎯 Complete API Coverage: Full support for the Soniox REST API
⚡ Async & Sync: Full support for both synchronous and asynchronous operations
🔒 Type Safe: Built with Pydantic v2 for robust type checking and validation
📝 Comprehensive Logging: Built-in logging with the soniox logger
🌍 60+ Languages: Transcribe speech in multiple languages with language hints
🎭 Speaker Diarization: Identify different speakers in audio
🔍 Language Identification: Automatic language detection
📊 Word-Level Timestamps: Get precise timing for each word
🎯 Context Support: Improve accuracy with domain-specific context

Installation

pip install soniox-sdk

Quick Start

Authentication

Set your API key as an environment variable:

export SONIOX_API_KEY="your-api-key-here"

Or pass it directly when initializing the client:

from soniox import SonioxClient

client = SonioxClient(api_key="your-api-key-here")

Basic Usage

Transcribe an Audio File

Synchronous:

import time
from soniox import SonioxClient

client = SonioxClient()

# Submit transcription job
job = client.transcribe_file("path/to/audio.wav")
print(f"Job ID: {job.id}")
print(f"Status: {job.status}")

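# Poll until the job finishes (successfully or not)
while True:
    job = client.get_transcription_job(job.id)
    if job.status in ("completed", "error"):
        break
    time.sleep(1)

if job.status == "error":
    raise RuntimeError(f"Transcription failed: {job.error_message}")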

# Get the transcript
result = client.get_transcription_result(job.id)
print(f"Transcript: {result.text}")
print(f"Tokens: {len(result.tokens)}")

Asynchronous:

import asyncio
from soniox import SonioxClient

async def transcribe():
    client = SonioxClient()
    
    # Submit transcription job
    job = await client.transcribe_file_async("path/to/audio.wav")
    print(f"Job ID: {job.id}")
    
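    # Poll until the job finishes (successfully or not)
    while True:
        job = await client.get_transcription_job_async(job.id)
        if job.status in ("completed", "error"):
            break
        await asyncio.sleep(1)

    if job.status == "error":
        raise RuntimeError(f"Transcription failed: {job.error_message}")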
    
    # Get the transcript
    result = await client.get_transcription_result_async(job.id)
    print(f"Transcript: {result.text}")

asyncio.run(transcribe())
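
The polling loops above can be factored into a small helper that also enforces a timeout and backs off between requests. This is a minimal sketch, not part of the SDK: `wait_for_job`, its parameters, and the backoff values are hypothetical names chosen for illustration. It only assumes that the callable it is given returns an object with a `status` attribute, as `client.get_transcription_job` does.

```python
import time
from typing import Any, Callable

# Statuses after which polling should stop (taken from the TranscriptionJob docs).
TERMINAL_STATUSES = ("completed", "error")


def wait_for_job(
    fetch_job: Callable[[str], Any],
    job_id: str,
    timeout: float = 300.0,
    initial_delay: float = 1.0,
    max_delay: float = 10.0,
) -> Any:
    """Poll fetch_job(job_id) until the job reaches a terminal status.

    fetch_job is any callable returning an object with a .status attribute,
    for example client.get_transcription_job (hypothetical helper, not SDK API).
    """
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        job = fetch_job(job_id)
        if any(job.status == status for status in TERMINAL_STATUSES):
            return job
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"Job {job_id} still {job.status!r} after {timeout}s")
        time.sleep(delay)
        delay = min(delay * 2, max_delay)  # double the wait, capped at max_delay
```

With the client from the examples above, a call would look like `job = wait_for_job(client.get_transcription_job, job.id)`; the caller still has to check whether `job.status` is `"completed"` or `"error"`.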

Transcribe with Custom Configuration

You can pass configuration options either as a TranscriptionConfig object or as keyword arguments:

from soniox import SonioxClient
from soniox.languages import Language
from soniox.types import TranscriptionConfig

client = SonioxClient()

# Using TranscriptionConfig
config = TranscriptionConfig(
    model="stt-async-preview",
    language_hints=[Language.en],
    enable_speaker_diarization=True,
    context="Medical terminology context"
)
job = client.transcribe_file("audio.wav", config=config)

# Or using kwargs
job = client.transcribe_file(
    "audio.wav",
    model="stt-async-preview",
    enable_speaker_diarization=True
)

Advanced Features

Speaker Diarization

Identify different speakers in your audio:

import time
from soniox import SonioxClient

client = SonioxClient()

# Submit job with speaker diarization
job = client.transcribe_file(
    "path/to/audio.wav",
    enable_speaker_diarization=True
)

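# Wait until the job finishes (successfully or not)
while True:
    job = client.get_transcription_job(job.id)
    if job.status in ("completed", "error"):
        break
    time.sleep(1)

if job.status == "error":
    raise RuntimeError(f"Transcription failed: {job.error_message}")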

# Get results with speaker information
result = client.get_transcription_result(job.id)
for token in result.tokens:
    if token.speaker:
        print(f"Speaker {token.speaker}: {token.text}")
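
Printing one line per token gets noisy on longer recordings. The sketch below merges consecutive tokens from the same speaker into turns. `speaker_turns` is a hypothetical helper, and the `Token` dataclass is a stand-in for the SDK model that carries only the two fields used here. It assumes token text includes its own spacing; if your tokens are whole words without spaces, join with `" "` instead.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Optional, Tuple


@dataclass
class Token:
    """Stand-in for the SDK's Token model (only the fields used here)."""
    text: str
    speaker: Optional[str] = None


def speaker_turns(tokens: List[Token]) -> List[Tuple[Optional[str], str]]:
    """Merge consecutive tokens from the same speaker into (speaker, text) turns."""
    turns = []
    # groupby only groups adjacent items, which is exactly what a "turn" is.
    for speaker, group in groupby(tokens, key=lambda t: t.speaker):
        text = "".join(t.text for t in group).strip()
        if text:
            turns.append((speaker, text))
    return turns
```

Passing `result.tokens` from the example above should work the same way, since the helper only reads `text` and `speaker`.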

Language Identification

Automatically identify the language being spoken:

from soniox import SonioxClient
from soniox.languages import Language

client = SonioxClient()

job = client.transcribe_file(
    "multilingual_audio.wav",
    language_hints=[Language.en, Language.es, Language.fr],
    enable_language_identification=True
)

Context for Improved Accuracy

Provide context to improve recognition of domain-specific terms:

from soniox import SonioxClient

client = SonioxClient()

job = client.transcribe_file(
    "medical_audio.wav",
    context="Medical terminology: hypertension, cardiovascular, stethoscope"
)

Configuration

Client Options

from soniox import SonioxClient

client = SonioxClient(
    api_key="your-api-key",           # API key (or use SONIOX_API_KEY env var)
    base_url="https://api.soniox.com", # Custom base URL (optional)
    timeout=60.0                       # Request timeout in seconds
)

Logging

The SDK uses Python's standard logging module with the logger name soniox:

import logging

# Enable debug logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("soniox")
logger.setLevel(logging.DEBUG)

# Or attach your own handler and formatter

handler = logging.StreamHandler()
handler.setLevel(logging.INFO)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
handler.setFormatter(formatter)

logger = logging.getLogger("soniox")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

API Reference

SonioxClient

Main client for interacting with the Soniox API.

Methods

`transcribe_file(file_path, config=None, **kwargs)` → `TranscriptionJob`

Submit an audio file for transcription.

Parameters:

file_path (str): Path to audio file
config (TranscriptionConfig, optional): Configuration object
**kwargs: Configuration options (used if config is None)
- model (str): Model to use (default: "stt-async-preview")
- language_hints (list[Language]): Language hints for better accuracy
- enable_speaker_diarization (bool): Enable speaker diarization
- enable_language_identification (bool): Enable language identification
- context (str): Context for improved accuracy
- webhook_url (str): Webhook URL for completion notification
- client_reference_id (str): Your reference ID

Returns: TranscriptionJob - Job object with status information

Raises:

FileNotFoundError: If file doesn't exist
SonioxAPIError: If API returns an error

`get_transcription_job(job_id)` → `TranscriptionJob`

Get the status of a transcription job.

Parameters:

job_id (str): Job ID from transcribe_file()

Returns: TranscriptionJob - Updated job status

`get_transcription_result(job_id)` → `TranscriptionResult`

Get the transcript once the job is completed.

Parameters:

job_id (str): Job ID from completed transcription

Returns: TranscriptionResult - Transcript with tokens

Raises:

SonioxAPIError: If job is not completed or not found

`transcribe_file_async(file_path, config=None, **kwargs)` → `TranscriptionJob`

Async version of transcribe_file().

`get_transcription_job_async(job_id)` → `TranscriptionJob`

Async version of get_transcription_job().

`get_transcription_result_async(job_id)` → `TranscriptionResult`

Async version of get_transcription_result().
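
One reason to use the async methods is to submit several files at once. The following is a rough sketch of that pattern with a concurrency cap. `transcribe_many` and `max_concurrent` are hypothetical names, not SDK API; the function accepts any coroutine function that takes a path, such as `client.transcribe_file_async`.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence


async def transcribe_many(
    submit: Callable[[str], Awaitable[Any]],
    paths: Sequence[str],
    max_concurrent: int = 4,
) -> List[Any]:
    """Submit several files concurrently, at most max_concurrent at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def submit_one(path: str) -> Any:
        async with semaphore:  # wait for a free slot before submitting
            return await submit(path)

    # gather returns results in the same order as the input paths.
    return await asyncio.gather(*(submit_one(p) for p in paths))
```

Usage would be along the lines of `jobs = asyncio.run(transcribe_many(client.transcribe_file_async, ["a.wav", "b.wav"]))`. Keeping the cap low is a cautious default given that the SDK defines a rate-limit error.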

Models

TranscriptionJob

Transcription job status and metadata.

Fields:

id (str): Job ID (UUID)
status (TranscriptionJobStatus): Job status ("queued", "processing", "completed", "error")
created_at (datetime): Job creation timestamp
filename (str): Original filename
file_id (str | None): Uploaded file ID
audio_url (str | None): Audio URL if provided
audio_duration_ms (int | None): Audio duration in milliseconds
error_message (str | None): Error message if failed
All configuration fields from TranscriptionConfig

TranscriptionResult

Transcription result with full transcript.

Fields:

id (str): Transcript ID (matches job ID)
text (str): Full transcribed text
tokens (list[Token]): Word-level tokens with timing

Token

Word-level transcription token.

Fields:

text (str): Token text
start_ms (int): Start time in milliseconds
end_ms (int): End time in milliseconds
confidence (float): Confidence score (0-1)
speaker (str | None): Speaker ID if diarization enabled
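
Since `start_ms` and `end_ms` are plain millisecond integers, turning them into readable timestamps takes only a few lines. `format_timestamp` below is a hypothetical helper for illustration, not something the SDK provides.

```python
def format_timestamp(ms: int) -> str:
    """Format a millisecond offset as HH:MM:SS.mmm."""
    seconds, millis = divmod(ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


print(format_timestamp(61500))  # prints 00:01:01.500
```

For a token, `format_timestamp(token.start_ms)` and `format_timestamp(token.end_ms)` give the start and end of the word.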

TranscriptionConfig

Configuration for transcription jobs.

Fields:

model (str): Model to use (default: "stt-async-preview")
language_hints (list[Language] | None): Language hints
enable_language_identification (bool): Enable language detection
enable_speaker_diarization (bool): Enable speaker diarization
context (str | None): Context for improved accuracy
client_reference_id (str | None): Your reference ID
webhook_url (str | None): Webhook URL
webhook_auth_header_name (str | None): Webhook auth header name
webhook_auth_header_value (str | None): Webhook auth header value
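
The field list above does not spell out how the webhook auth fields are used. The sketch below assumes the service sends the configured header name and value with each webhook request, so the receiving endpoint can check them. That behaviour is an assumption, and `is_authorized` is a hypothetical helper that works on any mapping of request headers.

```python
import hmac
from typing import Mapping


def is_authorized(headers: Mapping[str, str], header_name: str, expected_value: str) -> bool:
    """Return True if the request headers carry the expected auth header value."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    received = normalized.get(header_name.lower(), "")
    # compare_digest avoids leaking the secret through timing differences.
    return hmac.compare_digest(received.encode(), expected_value.encode())
```

In a web framework handler this would be called with the incoming request's headers and the same name and value that were passed in `TranscriptionConfig`.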

FileUploadResponse

Response from file upload.

Fields:

id (str): File ID
filename (str): Original filename
size (int): File size in bytes
created_at (datetime): Upload timestamp
client_reference_id (str | None): Your reference ID

Exceptions

SonioxError: Base exception for all Soniox errors
SonioxAuthenticationError: Raised when authentication fails
SonioxAPIError: Raised when API returns an error response
SonioxRateLimitError: Raised when rate limit is exceeded

Error Handling

import time
from soniox import SonioxClient
from soniox.exceptions import (
    SonioxAPIError,
    SonioxAuthenticationError,
    SonioxRateLimitError,
)

client = SonioxClient()

try:
    # Submit transcription
    job = client.transcribe_file("audio.wav")
    
    # Wait for completion
    while True:
        job = client.get_transcription_job(job.id)
        if job.status == "completed":
            break
        elif job.status == "error":
            print(f"Transcription failed: {job.error_message}")
            break
        time.sleep(1)
    
    # Get result
    if job.status == "completed":
        result = client.get_transcription_result(job.id)
        print(result.text)

except FileNotFoundError:
    print("Audio file not found")
except SonioxAuthenticationError as e:
    print(f"Authentication failed: {e}")
except SonioxRateLimitError as e:
    print(f"Rate limit exceeded: {e}")
    print(f"Status code: {e.status_code}")
except SonioxAPIError as e:
    print(f"API error: {e}")
    print(f"Status code: {e.status_code}")
    print(f"Response: {e.response_body}")
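
Rate-limit errors are usually worth retrying rather than just reporting. Below is a minimal sketch of exponential backoff with jitter. `call_with_retry` and its parameters are hypothetical and not part of the SDK; the exception type to retry on is passed in, so `SonioxRateLimitError` can be used without the helper importing it.

```python
import random
import time
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    retry_on: Type[BaseException],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying with exponential backoff when retry_on is raised."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delays of base_delay * 1, 2, 4, ... plus up to 10% random jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

With the imports from the example above, a submission could be wrapped as `job = call_with_retry(lambda: client.transcribe_file("audio.wav"), SonioxRateLimitError)`. The `sleep` parameter exists only so the waiting can be replaced in tests.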

Testing

Run tests with pytest:

# Install development dependencies
pip install -e ".[dev,test]"

# Run all tests
pytest

# Run with coverage
pytest --cov=src --cov-report=html --cov-report=term-missing

# Run specific test file
pytest tests/test_models.py

# Run with verbose output
pytest -v

See tests/README.md for more details on the test suite.

Development

# Clone the repository
git clone https://github.com/mahdikiani/soniox-sdk.git
cd soniox-sdk

# Install in editable mode with dev dependencies
pip install -e ".[dev,test]"

# Run linter
ruff check src/

# Run type checker
mypy src/

License

This project is licensed under the MIT License - see the LICENSE.txt file for details.

Support

📧 Email: mahdikiany@gmail.com
🐛 Issues: GitHub Issues
💬 Discussions: GitHub Discussions

Changelog

See CHANGELOG.md for version history and updates.

Made with ❤️ by Mahdi Kiani

This version: 0.1.6 (Oct 2, 2025)
