Python SDK for Speechall API - Speech-to-text transcription service

Speechall Python SDK

Python SDK for the Speechall API - A powerful speech-to-text transcription service supporting multiple AI models and providers.

Features

Multiple AI Models: Access speech-to-text models from different providers (OpenAI Whisper and others)
Flexible Input: Transcribe local audio files or remote URLs
Rich Output Formats: Get results as plain text, JSON, SRT, or VTT
Speaker Diarization: Identify and separate different speakers in audio
Custom Vocabulary: Improve accuracy with domain-specific terms
Replacement Rules: Apply custom text transformations to transcriptions
Language Support: Auto-detect languages or specify from a wide range of supported languages
Async Support: Built with async/await support using httpx

Installation

pip install speechall

Quick Start

Basic Transcription

import os
from speechall import SpeechallApi

# Initialize the client
client = SpeechallApi(token=os.getenv("SPEECHALL_API_TOKEN"))

# Transcribe a local audio file
with open("audio.mp3", "rb") as audio_file:
    audio_data = audio_file.read()

response = client.speech_to_text.transcribe(
    model="openai.whisper-1",
    request=audio_data,
    language="en",
    output_format="json",
    punctuation=True
)

print(response.text)

Transcribe Remote Audio

import os
from speechall import SpeechallApi

client = SpeechallApi(token=os.getenv("SPEECHALL_API_TOKEN"))

response = client.speech_to_text.transcribe_remote(
    file_url="https://example.com/audio.mp3",
    model="openai.whisper-1",
    language="auto",  # Auto-detect language
    output_format="json"
)

print(response.text)

Advanced Features

Speaker Diarization

Identify different speakers in your audio (the snippets in this section reuse client and audio_data from Quick Start):

response = client.speech_to_text.transcribe(
    model="openai.whisper-1",
    request=audio_data,
    language="en",
    output_format="json",
    diarization=True,
    speakers_expected=2
)

for segment in response.segments:
    print(f"[Speaker {segment.speaker}] {segment.text}")
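Diarized output often contains several consecutive segments from the same speaker. The following is a minimal sketch of collapsing them into speaker turns. It assumes only that each segment exposes the speaker and text attributes used above; merge_speaker_turns and the demo segments are hypothetical and not part of the SDK.

```python
from types import SimpleNamespace


def merge_speaker_turns(segments):
    """Collapse consecutive segments from the same speaker into one turn."""
    turns = []
    for segment in segments:
        text = segment.text.strip()
        if turns and turns[-1][0] == segment.speaker:
            # Same speaker as the previous turn: extend that turn.
            turns[-1] = (segment.speaker, f"{turns[-1][1]} {text}")
        else:
            turns.append((segment.speaker, text))
    return turns


# Stand-in segments for illustration; real ones would come from response.segments.
demo_segments = [
    SimpleNamespace(speaker="A", text="Hello there."),
    SimpleNamespace(speaker="A", text="How are you?"),
    SimpleNamespace(speaker="B", text="Fine, thanks."),
]

for speaker, text in merge_speaker_turns(demo_segments):
    print(f"[Speaker {speaker}] {text}")
```

With a real response, pass response.segments instead of demo_segments.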

Custom Vocabulary

Improve accuracy for specific terms:

response = client.speech_to_text.transcribe(
    model="openai.whisper-1",
    request=audio_data,
    language="en",
    output_format="json",
    custom_vocabulary=["Kubernetes", "API", "Docker", "microservices"]
)

Replacement Rules

Apply custom text transformations:

from speechall import ReplacementRule, ExactRule

replacement_rules = [
    ReplacementRule(
        rule=ExactRule(find="API", replace="Application Programming Interface")
    )
]

response = client.speech_to_text.transcribe_remote(
    file_url="https://example.com/audio.mp3",
    model="openai.whisper-1",
    language="en",
    output_format="json",
    replacement_ruleset=replacement_rules
)

List Available Models

models = client.speech_to_text.list_speech_to_text_models()

for model in models:
    print(f"{model.model_identifier}: {model.display_name}")
    print(f"  Provider: {model.provider}")
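When many models are available, grouping them by provider can make the list easier to scan. This sketch assumes only the model_identifier and provider attributes printed above; models_by_provider is a hypothetical helper, and the second demo model is made up for illustration.

```python
from collections import defaultdict
from types import SimpleNamespace


def models_by_provider(models):
    """Map each provider to the list of its model identifiers."""
    grouped = defaultdict(list)
    for model in models:
        grouped[model.provider].append(model.model_identifier)
    return dict(grouped)


# Stand-in model objects; real ones would come from list_speech_to_text_models().
demo_models = [
    SimpleNamespace(model_identifier="openai.whisper-1", provider="openai"),
    SimpleNamespace(model_identifier="example.model-a", provider="example"),  # made-up entry
]

for provider, identifiers in models_by_provider(demo_models).items():
    print(f"{provider}: {', '.join(identifiers)}")
```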

Configuration

Authentication

Get your API token from speechall.com and set it as an environment variable:

export SPEECHALL_API_TOKEN="your-token-here"

Or pass it directly when initializing the client:

from speechall import SpeechallApi

client = SpeechallApi(token="your-token-here")
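Note that os.getenv returns None when the variable is unset, so a missing token may only surface later as an authentication error. One possible guard is sketched below; load_token is a hypothetical helper, not part of the SDK.

```python
import os


def load_token(env_var="SPEECHALL_API_TOKEN", environ=None):
    """Return the API token from the environment, failing early if it is missing."""
    environ = os.environ if environ is None else environ
    token = environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"{env_var} is not set; export it before creating the client")
    return token


# Intended usage (requires the speechall package and a real token):
# client = SpeechallApi(token=load_token())
```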

Output Formats

text: Plain text transcription
json: JSON with detailed information (segments, timestamps, metadata)
json_text: JSON with simplified text output
srt: SubRip subtitle format
vtt: WebVTT subtitle format
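The two subtitle formats differ in small details. One visible difference is the timestamp syntax: SubRip separates milliseconds with a comma, WebVTT with a period. The API produces these files itself, so the sketch below is only an illustration of that difference; format_timestamp is a hypothetical helper.

```python
def format_timestamp(seconds, subtitle_format):
    """Render a time offset in seconds as an SRT or VTT cue timestamp."""
    if subtitle_format not in ("srt", "vtt"):
        raise ValueError(f"unsupported subtitle format: {subtitle_format!r}")
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    # SRT uses a comma before the milliseconds, WebVTT uses a period.
    separator = "," if subtitle_format == "srt" else "."
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


print(format_timestamp(3661.5, "srt"))  # 01:01:01,500
print(format_timestamp(3661.5, "vtt"))  # 01:01:01.500
```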

Language Codes

Use ISO 639-1 language codes (e.g., en, es, fr, de) or auto for automatic detection.
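A client-side shape check can catch values such as "English" or "en-US" before a request is sent. The sketch below only checks the form (two lowercase letters, or auto); it does not know which languages a given model supports, and normalize_language is a hypothetical helper.

```python
import re

# ISO 639-1 codes are two letters; this pattern checks shape only.
_LANGUAGE_PATTERN = re.compile(r"[a-z]{2}")


def normalize_language(code):
    """Return a lowercase two-letter code or "auto", or raise ValueError."""
    normalized = code.strip().lower()
    if normalized == "auto" or _LANGUAGE_PATTERN.fullmatch(normalized):
        return normalized
    raise ValueError(f"expected a two-letter language code or 'auto', got {code!r}")


print(normalize_language(" EN "))
print(normalize_language("auto"))
```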

API Reference

Client Classes

SpeechallApi: Main client for the Speechall API
AsyncSpeechallApi: Async client for the Speechall API

Main Methods

`speech_to_text.transcribe()`

Transcribe a local audio file.

Parameters:

model (str): Model identifier (e.g., "openai.whisper-1")
request (bytes): Audio file content
language (str): Language code or "auto"
output_format (str): Output format (text, json, json_text, srt, vtt)
punctuation (bool): Enable automatic punctuation
diarization (bool): Enable speaker identification
speakers_expected (int, optional): Expected number of speakers
custom_vocabulary (list, optional): List of custom terms
initial_prompt (str, optional): Context prompt for the model
temperature (float, optional): Model temperature (0.0-1.0)
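Some of these constraints can be checked locally before a request is made. The sketch below builds a keyword-argument dictionary from the parameters listed above and validates the documented output formats and temperature range. build_transcribe_options is a hypothetical helper, its defaults are illustrative choices, and the positive speakers_expected check is a local sanity check rather than a documented API rule.

```python
OUTPUT_FORMATS = {"text", "json", "json_text", "srt", "vtt"}


def build_transcribe_options(
    model,
    language="auto",
    output_format="json",
    punctuation=True,
    diarization=False,
    speakers_expected=None,
    temperature=None,
):
    """Validate common parameters and return them as keyword arguments."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output_format: {output_format!r}")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    if speakers_expected is not None and speakers_expected < 1:
        raise ValueError("speakers_expected must be a positive integer")

    options = {
        "model": model,
        "language": language,
        "output_format": output_format,
        "punctuation": punctuation,
        "diarization": diarization,
    }
    # Optional parameters are only included when explicitly set.
    if speakers_expected is not None:
        options["speakers_expected"] = speakers_expected
    if temperature is not None:
        options["temperature"] = temperature
    return options


options = build_transcribe_options("openai.whisper-1", language="en", temperature=0.2)
print(options)
```

The result can then be unpacked into a call such as client.speech_to_text.transcribe(request=audio_data, **options).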

`speech_to_text.transcribe_remote()`

Transcribe audio from a URL.

Parameters: Same as transcribe() but with file_url instead of request

`speech_to_text.list_speech_to_text_models()`

List all available models.

Examples

Check out the examples directory for more detailed usage examples:

transcribe_local_file.py - Transcribe local audio files
transcribe_remote_file.py - Transcribe remote audio URLs

Requirements

Python 3.8+
httpx >= 0.27.0
pydantic >= 2.0.0
typing-extensions >= 4.0.0

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy .

Support

Documentation: docs.speechall.com
GitHub: github.com/speechall/speechall-python-sdk
Issues: github.com/speechall/speechall-python-sdk/issues

License

MIT License - see LICENSE file for details

Release history

0.3.0 (this version): Dec 16, 2025
0.2.0: Jun 28, 2025
0.1.0: Jun 28, 2025

Download files

Source Distribution

speechall-0.3.0.tar.gz (32.2 kB), uploaded Dec 16, 2025, Source

Built Distribution

speechall-0.3.0-py3-none-any.whl (56.4 kB), uploaded Dec 16, 2025, Python 3

File details

Details for the file speechall-0.3.0.tar.gz.

File metadata

Download URL: speechall-0.3.0.tar.gz
Upload date: Dec 16, 2025
Size: 32.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for speechall-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`67ebd024c26fa6a7bbcc0fd07437be65461e35c79e658cf8b783847f8dcd28e3`
MD5	`1686884f1ae7ce580fc5bd294407108c`
BLAKE2b-256	`36a7fb738d44fc7bea86011b0687e35e0d175f7d0f8c8129d4a7301d132b92ac`


File details

Details for the file speechall-0.3.0-py3-none-any.whl.

File metadata

Download URL: speechall-0.3.0-py3-none-any.whl
Upload date: Dec 16, 2025
Size: 56.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for speechall-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f1f46f3c221ea3b316365d04885992458c263ea213183675047e5fe13d80da23`
MD5	`0492469359202221065423cf75d3b2fc`
BLAKE2b-256	`0a0c59cde247e7c379829c149f36104efbe7e89e86eb820d473248d33df0a13e`
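A downloaded file can be checked against the published SHA256 digest with the standard library. This is a minimal sketch; sha256_of_file and matches_published_hash are hypothetical helper names, and the file path in the usage note assumes the file sits in the current directory.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("speechall-0.3.0-py3-none-any.whl", "f1f46f3c221ea3b316365d04885992458c263ea213183675047e5fe13d80da23") should return True for an intact copy of the wheel listed above.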
