Python SDK for Speechall API - Speech-to-text transcription service
Speechall Python SDK
Python SDK for the Speechall API - A powerful speech-to-text transcription service supporting multiple AI models and providers.
Features
- Multiple AI Models: Access speech-to-text models from different providers (OpenAI Whisper and others)
- Flexible Input: Transcribe local audio files or remote URLs
- Rich Output Formats: Get results in text, JSON, SRT, or VTT formats
- Speaker Diarization: Identify and separate different speakers in audio
- Custom Vocabulary: Improve accuracy with domain-specific terms
- Replacement Rules: Apply custom text transformations to transcriptions
- Language Support: Auto-detect the spoken language or specify one of the supported languages
- Async Support: Built with async/await support using httpx
Installation
pip install speechall
Quick Start
Basic Transcription
import os

from speechall import SpeechallApi

# Initialize the client
client = SpeechallApi(token=os.getenv("SPEECHALL_API_TOKEN"))

# Read a local audio file into memory
with open("audio.mp3", "rb") as audio_file:
    audio_data = audio_file.read()

# Transcribe the audio bytes
response = client.speech_to_text.transcribe(
    model="openai.whisper-1",
    request=audio_data,
    language="en",
    output_format="json",
    punctuation=True,
)
print(response.text)
Transcribe Remote Audio
import os

from speechall import SpeechallApi

client = SpeechallApi(token=os.getenv("SPEECHALL_API_TOKEN"))

# Transcribe audio hosted at a URL
response = client.speech_to_text.transcribe_remote(
    file_url="https://example.com/audio.mp3",
    model="openai.whisper-1",
    language="auto",  # Auto-detect language
    output_format="json",
)
print(response.text)
Advanced Features
Speaker Diarization
Identify different speakers in your audio:
# Reuses client and audio_data from the Basic Transcription example
response = client.speech_to_text.transcribe(
    model="openai.whisper-1",
    request=audio_data,
    language="en",
    output_format="json",
    diarization=True,
    speakers_expected=2,
)
for segment in response.segments:
    print(f"[Speaker {segment.speaker}] {segment.text}")
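Diarized output often contains several consecutive segments from the same speaker. The sketch below merges them into turns. It assumes only that each segment exposes the speaker and text attributes used above; merge_speaker_turns and the sample segments are hypothetical stand-ins, not part of the SDK.

```python
from itertools import groupby
from types import SimpleNamespace


def merge_speaker_turns(segments):
    """Collapse consecutive segments from the same speaker into one turn."""
    turns = []
    # groupby only groups adjacent items, which is what a "turn" means here.
    for speaker, group in groupby(segments, key=lambda seg: seg.speaker):
        text = " ".join(seg.text.strip() for seg in group)
        turns.append((speaker, text))
    return turns


# Stand-in data for illustration; real segments would come from response.segments.
sample_segments = [
    SimpleNamespace(speaker="0", text="Hello there."),
    SimpleNamespace(speaker="0", text="How are you?"),
    SimpleNamespace(speaker="1", text="Fine, thanks."),
]

for speaker, text in merge_speaker_turns(sample_segments):
    print(f"[Speaker {speaker}] {text}")
```

With a real response, the call would be merge_speaker_turns(response.segments).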
Custom Vocabulary
Improve accuracy for specific terms:
response = client.speech_to_text.transcribe(
    model="openai.whisper-1",
    request=audio_data,
    language="en",
    output_format="json",
    custom_vocabulary=["Kubernetes", "API", "Docker", "microservices"],
)
Replacement Rules
Apply custom text transformations:
from speechall import ReplacementRule, ExactRule

# Replace every exact occurrence of "API" in the transcript
replacement_rules = [
    ReplacementRule(
        rule=ExactRule(find="API", replace="Application Programming Interface")
    )
]

response = client.speech_to_text.transcribe_remote(
    file_url="https://example.com/audio.mp3",
    model="openai.whisper-1",
    language="en",
    output_format="json",
    replacement_ruleset=replacement_rules,
)
List Available Models
models = client.speech_to_text.list_speech_to_text_models()
for model in models:
    print(f"{model.model_identifier}: {model.display_name}")
    print(f"  Provider: {model.provider}")
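When the model list is long, grouping identifiers by provider can make it easier to scan. This is a minimal sketch that relies only on the model_identifier and provider attributes shown above; group_models_by_provider and the sample entries (including "example.model-a") are hypothetical placeholders.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_models_by_provider(models):
    """Return a dict mapping provider name to a list of model identifiers."""
    grouped = defaultdict(list)
    for model in models:
        # str() keeps this working if provider is an enum rather than a string.
        grouped[str(model.provider)].append(model.model_identifier)
    return dict(grouped)


# Stand-in data; real entries would come from list_speech_to_text_models().
sample_models = [
    SimpleNamespace(model_identifier="openai.whisper-1", provider="openai"),
    SimpleNamespace(model_identifier="example.model-a", provider="example"),
]

for provider, identifiers in group_models_by_provider(sample_models).items():
    print(f"{provider}: {', '.join(identifiers)}")
```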
Configuration
Authentication
Get your API token from speechall.com and set it as an environment variable:
export SPEECHALL_API_TOKEN="your-token-here"
Or pass it directly when initializing the client:
from speechall import SpeechallApi
client = SpeechallApi(token="your-token-here")
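If the environment variable is missing, os.getenv returns None and the problem may only surface on the first request. A small guard can fail earlier with a clearer message. This is a sketch, and load_token is a hypothetical helper rather than part of the SDK.

```python
import os


def load_token(env_var="SPEECHALL_API_TOKEN"):
    """Read the API token from the environment, or raise if it is unset or blank."""
    token = os.getenv(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Speechall API token."
        )
    return token


# Usage:
# from speechall import SpeechallApi
# client = SpeechallApi(token=load_token())
```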
Output Formats
- text: Plain text transcription
- json: JSON with detailed information (segments, timestamps, metadata)
- json_text: JSON with simplified text output
- srt: SubRip subtitle format
- vtt: WebVTT subtitle format
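When saving results next to the source audio, it helps to derive the file name from the chosen output format. The mapping below is an assumption made for illustration (the SDK does not prescribe file extensions), and output_path is a hypothetical helper.

```python
from pathlib import Path

# Assumed extension for each output format listed above.
EXTENSIONS = {
    "text": ".txt",
    "json": ".json",
    "json_text": ".json",
    "srt": ".srt",
    "vtt": ".vtt",
}


def output_path(audio_path, output_format):
    """Return the audio path with its suffix swapped for the format's extension."""
    if output_format not in EXTENSIONS:
        raise ValueError(f"Unknown output format: {output_format!r}")
    return Path(audio_path).with_suffix(EXTENSIONS[output_format])


print(output_path("audio.mp3", "srt"))
```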
Language Codes
Use ISO 639-1 language codes (e.g., en, es, fr, de) or auto for automatic detection.
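A quick local check can catch malformed language values before a request is sent. The sketch below validates only the shape of the value (two letters, or "auto"); it does not know which languages a given model supports. normalize_language is a hypothetical helper, and lowercasing the input is a choice made here, not documented SDK behavior.

```python
import re

# ISO 639-1 codes are two lowercase letters, e.g. "en" or "de".
_ISO_639_1 = re.compile(r"[a-z]{2}")


def normalize_language(code):
    """Return a lowercased two-letter code or 'auto'; raise ValueError otherwise."""
    normalized = code.strip().lower()
    if normalized == "auto" or _ISO_639_1.fullmatch(normalized):
        return normalized
    raise ValueError(
        f"Expected a two-letter ISO 639-1 code or 'auto', got {code!r}"
    )


print(normalize_language("EN"))
```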
API Reference
Client Classes
- SpeechallApi: Main client for the Speechall API
- AsyncSpeechallApi: Async client for the Speechall API
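The async client is useful when several files need to be transcribed concurrently. The sketch below assumes that AsyncSpeechallApi accepts the same token argument and exposes the same speech_to_text.transcribe_remote() method as the sync client, with calls awaited; verify this against the SDK before relying on it. transcribe_many is a hypothetical helper.

```python
import asyncio
import os


async def transcribe_many(urls, model="openai.whisper-1"):
    """Transcribe several remote files concurrently and return the responses."""
    # Imported lazily so this sketch can be loaded without the SDK installed.
    from speechall import AsyncSpeechallApi

    client = AsyncSpeechallApi(token=os.getenv("SPEECHALL_API_TOKEN"))
    # Assumption: the async method takes the same keyword arguments as the sync one.
    tasks = [
        client.speech_to_text.transcribe_remote(
            file_url=url,
            model=model,
            language="auto",
            output_format="json",
        )
        for url in urls
    ]
    # gather preserves the order of the input URLs in its result list.
    return await asyncio.gather(*tasks)


# Usage:
# responses = asyncio.run(transcribe_many(["https://example.com/audio.mp3"]))
# for response in responses:
#     print(response.text)
```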
Main Methods
speech_to_text.transcribe()
Transcribe a local audio file.
Parameters:
- model (str): Model identifier (e.g., "openai.whisper-1")
- request (bytes): Audio file content
- language (str): Language code or "auto"
- output_format (str): Output format (text, json, json_text, srt, vtt)
- punctuation (bool): Enable automatic punctuation
- diarization (bool): Enable speaker identification
- speakers_expected (int, optional): Expected number of speakers
- custom_vocabulary (list, optional): List of custom terms
- initial_prompt (str, optional): Context prompt for the model
- temperature (float, optional): Model temperature (0.0-1.0)
speech_to_text.transcribe_remote()
Transcribe audio from a URL.
Parameters: Same as transcribe() but with file_url instead of request
speech_to_text.list_speech_to_text_models()
List all available models.
Examples
Check out the examples directory for more detailed usage examples:
- transcribe_local_file.py - Transcribe local audio files
- transcribe_remote_file.py - Transcribe remote audio URLs
Requirements
- Python 3.8+
- httpx >= 0.27.0
- pydantic >= 2.0.0
- typing-extensions >= 4.0.0
Development
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Type checking
mypy .
Support
- Documentation: docs.speechall.com
- GitHub: github.com/speechall/speechall-python-sdk
- Issues: github.com/speechall/speechall-python-sdk/issues
License
MIT License - see LICENSE file for details
Download files
Source Distribution
Built Distribution
File details
Details for the file speechall-0.3.0.tar.gz.
File metadata
- Download URL: speechall-0.3.0.tar.gz
- Upload date:
- Size: 32.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 67ebd024c26fa6a7bbcc0fd07437be65461e35c79e658cf8b783847f8dcd28e3 |
| MD5 | 1686884f1ae7ce580fc5bd294407108c |
| BLAKE2b-256 | 36a7fb738d44fc7bea86011b0687e35e0d175f7d0f8c8129d4a7301d132b92ac |
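To check a downloaded archive against the published SHA256 digest, the file can be hashed locally with the standard library. This is a generic sketch; sha256_of is a hypothetical helper, and the file path in the usage comment assumes the archive was saved to the current directory.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes at EOF.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage:
# expected = "67ebd024c26fa6a7bbcc0fd07437be65461e35c79e658cf8b783847f8dcd28e3"
# assert sha256_of("speechall-0.3.0.tar.gz") == expected
```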
File details
Details for the file speechall-0.3.0-py3-none-any.whl.
File metadata
- Download URL: speechall-0.3.0-py3-none-any.whl
- Upload date:
- Size: 56.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f1f46f3c221ea3b316365d04885992458c263ea213183675047e5fe13d80da23 |
| MD5 | 0492469359202221065423cf75d3b2fc |
| BLAKE2b-256 | 0a0c59cde247e7c379829c149f36104efbe7e89e86eb820d473248d33df0a13e |