Speechmatics Voice Agent Python client for Real-Time API

Speechmatics Voice SDK

Python SDK for building voice-enabled applications using Speechmatics Real-Time API. Optimized for specific use cases: conversational AI, voice agents, transcription services, and real-time captioning.

What is the Voice SDK?

The Voice SDK is a higher-level abstraction built on top of the Speechmatics Real-Time API (speechmatics-rt). While the Real-Time API provides raw transcription events (words and utterances), the Voice SDK adds:

  • Intelligent Segmentation - Groups words into meaningful speech segments per speaker
  • Turn Detection - Automatically detects when speakers finish their turns using adaptive or ML-based methods
  • Speaker Management - Focus on or ignore specific speakers in multi-speaker scenarios
  • Preset Configurations - Ready-to-use configs for common use cases (conversation, note-taking, captions)
  • Simplified Event Handling - Receive clean, structured segments instead of raw word-level events

When to Use Voice SDK vs Real-Time API

Use Voice SDK when:

  • You are building conversational AI or voice agents
  • You need automatic turn detection
  • You want speaker-focused transcription
  • You need ready-to-use presets for common scenarios

Use Real-Time API when:

  • You only need raw, word-level events
  • You are building custom segmentation / aggregation logic
  • You want fine-grained control over every event

Installation

# Standard installation
pip install speechmatics-voice

# With VAD and SMART_TURN (ML-based turn detection)
pip install speechmatics-voice[smart]

Note: Some features require additional ML dependencies (ONNX runtime, transformers). If not installed, these features will be unavailable and a warning will be shown.

Use within Docker

If you run the Voice SDK inside a Docker container and need the smart features (SMART_TURN), use the following script during the image build to make sure the ML models are included in the image rather than downloaded at runtime.

"""
Download the Voice SDK required models during the build process.
"""

from speechmatics.voice import SileroVAD, SmartTurnDetector


def load_models():
    SileroVAD.download_model()
    SmartTurnDetector.download_model()


if __name__ == "__main__":
    load_models()

Then, in your Dockerfile, include the following:

COPY ./models.py models.py
RUN uv run models.py

This copies the script and runs it as part of the build.
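
Putting it together, a minimal Dockerfile sketch. This is an assumption-laden example: it uses a plain pip install and a python:3.12-slim base image, whereas the snippet above runs the script with uv (which also works if uv is available in your image).

FROM python:3.12-slim

# Install the SDK with the ML extras so the model classes are importable at build time
RUN pip install "speechmatics-voice[smart]"

# Bake the models into the image so they are not downloaded at runtime
COPY ./models.py models.py
RUN python models.py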

Quick Start

Basic Example

A simple example that prints complete sentences as they are finalized, labelling each speaker with a different ID.

import asyncio
import os
from speechmatics.rt import Microphone
from speechmatics.voice import VoiceAgentClient, AgentServerMessageType

async def main():
    """Stream microphone audio to Speechmatics Voice Agent using 'scribe' preset"""

    # Audio configuration
    SAMPLE_RATE = 16000         # Hz
    CHUNK_SIZE = 160            # Samples per read
    PRESET = "scribe"           # Configuration preset

    # Create client with preset
    client = VoiceAgentClient(
        api_key=os.getenv("SPEECHMATICS_API_KEY"),
        preset=PRESET
    )

    # Print finalised segments of speech with speaker ID
    @client.on(AgentServerMessageType.ADD_SEGMENT)
    def on_segment(message):
        for segment in message["segments"]:
            speaker = segment["speaker_id"]
            text = segment["text"]
            print(f"{speaker}: {text}")

    # Setup microphone
    mic = Microphone(SAMPLE_RATE, CHUNK_SIZE)
    if not mic.start():
        print("Error: Microphone not available")
        return

    # Connect to the Voice Agent
    await client.connect()

    # Stream microphone audio (interrupt with Ctrl+C)
    try:
        while True:
            audio_chunk = await mic.read(CHUNK_SIZE)
            if not audio_chunk:
                break # Microphone stopped producing data
            await client.send_audio(audio_chunk)
    except KeyboardInterrupt:
        pass
    finally:
        await client.disconnect()

if __name__ == "__main__":
    asyncio.run(main())

Configuring a Voice Agent Client

When creating a VoiceAgentClient, there are several ways to configure it:

  1. Presets - optimised configurations for common use cases. These require no further configuration.
from speechmatics.voice import VoiceAgentClient, VoiceAgentConfigPreset

# Low latency preset - for fast responses (may split speech into smaller segments)
client = VoiceAgentClient(api_key=api_key, preset="fast")

# Conversation preset - for natural dialogue
client = VoiceAgentClient(api_key=api_key, preset="adaptive")

# Advanced conversation with ML turn detection
client = VoiceAgentClient(api_key=api_key, preset="smart_turn")

# External end of turn preset - endpointing handled by the client
client = VoiceAgentClient(api_key=api_key, preset="external")

# Scribe preset - for note-taking
client = VoiceAgentClient(api_key=api_key, preset="scribe")

# Captions preset - for live captioning
client = VoiceAgentClient(api_key=api_key, preset="captions")

# To view all available presets, use:
presets = VoiceAgentConfigPreset.list_presets()
  2. Custom Configuration - for more control, you can specify a custom configuration in a VoiceAgentConfig object.
from speechmatics.voice import VoiceAgentClient, VoiceAgentConfig, EndOfUtteranceMode

# Define your custom configuration
config = VoiceAgentConfig(
    language="en",
    enable_diarization=True,
    max_delay=0.7,
    end_of_utterance_mode=EndOfUtteranceMode.ADAPTIVE,
)

client = VoiceAgentClient(api_key=api_key, config=config)
  3. Custom Configuration with Overlays - you can use a preset as a starting point, then customize it with overlays.
from speechmatics.voice import VoiceAgentConfigPreset, VoiceAgentConfig

# Use preset with custom overrides
config = VoiceAgentConfigPreset.SCRIBE(
    VoiceAgentConfig(
        language="es",
        max_delay=0.8
    )
)

Note: If no config or preset is provided, the client will default to the external preset.

Configuration Serialization

It can also be useful to export and import configuration as JSON:

from speechmatics.voice import VoiceAgentConfigPreset, VoiceAgentConfig

# Export preset to JSON
config_json = VoiceAgentConfigPreset.SCRIBE().to_json()

# Load from JSON
config = VoiceAgentConfig.from_json(config_json)

# Or create from JSON string
config = VoiceAgentConfig.from_json('{"language": "en", "enable_diarization": true}')
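
For instance, a config can be persisted between sessions (a minimal sketch; the file name is arbitrary):

from pathlib import Path

from speechmatics.voice import VoiceAgentConfig, VoiceAgentConfigPreset

# Save a preset-derived config...
Path("voice_config.json").write_text(VoiceAgentConfigPreset.SCRIBE().to_json())

# ...and restore it in a later session
config = VoiceAgentConfig.from_json(Path("voice_config.json").read_text())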

Configuration

Basic Parameters

  • language (str, default "en") - Language code for transcription (e.g., "en", "es", "fr"). See supported languages.
  • operating_point (OperatingPoint, default ENHANCED) - Balance accuracy vs latency. Options: STANDARD or ENHANCED.
  • domain (str, default None) - Domain-specific model (e.g., "finance", "medical"). See supported languages and domains.
  • output_locale (str, default None) - Output locale for formatting (e.g., "en-GB", "en-US"). See supported languages and locales.
  • max_delay (float, default 0.7) - Maximum transcription delay for word emission.
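
For example, a config combining several of these parameters. The import path for OperatingPoint is an assumption here; it may instead be exported from speechmatics.rt.

from speechmatics.voice import VoiceAgentConfig, OperatingPoint  # OperatingPoint import path is an assumption

config = VoiceAgentConfig(
    language="en",
    operating_point=OperatingPoint.ENHANCED,
    output_locale="en-GB",
    max_delay=1.0,
)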

Turn Detection Parameters

  • end_of_utterance_mode (EndOfUtteranceMode, default FIXED) - Controls how turn endings are detected. Options:
      - FIXED - Uses a fixed silence threshold. Fast but may split slow speech.
      - ADAPTIVE - Adjusts the delay based on speech rate, pauses, and disfluencies. Best for natural conversation.
      - EXTERNAL - Manual control via client.finalize(). For custom turn logic.
  • end_of_utterance_silence_trigger (float, default 0.2) - Silence duration in seconds that triggers the end of a turn (also the basis of the adaptive delay).
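
For example, fixed endpointing with a slightly longer silence window (both parameters are from the table above):

from speechmatics.voice import VoiceAgentConfig, EndOfUtteranceMode

# End the turn after 0.5 s of silence instead of the 0.2 s default
config = VoiceAgentConfig(
    end_of_utterance_mode=EndOfUtteranceMode.FIXED,
    end_of_utterance_silence_trigger=0.5,
)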

Speaker Configuration

  • enable_diarization (bool, default False) - Enable speaker diarization to identify and label different speakers.
  • speaker_sensitivity (float, default 0.5) - Diarization sensitivity between 0.0 and 1.0. Higher values detect more speakers.
  • max_speakers (int, default None) - Limit the maximum number of speakers to detect.
  • prefer_current_speaker (bool, default False) - Give extra weight to the current speaker when grouping words.
  • speaker_config (SpeakerFocusConfig, default SpeakerFocusConfig()) - Configure speaker focus/ignore rules.
  • known_speakers (list[SpeakerIdentifier], default []) - Pre-enrolled speaker identifiers for speaker identification.

Usage Examples

Using speaker_config, you can focus on only specific speakers but keep words from others, or ignore specific speakers.

from speechmatics.voice import SpeakerFocusConfig, SpeakerFocusMode

# Focus only on specific speakers, but keep words from other speakers
config = VoiceAgentConfig(
    enable_diarization=True,
    speaker_config=SpeakerFocusConfig(
        focus_speakers=["S1", "S2"],
        focus_mode=SpeakerFocusMode.RETAIN
    )
)

# Ignore specific speakers
config = VoiceAgentConfig(
    enable_diarization=True,
    speaker_config=SpeakerFocusConfig(
        ignore_speakers=["S3"]
    )
)

Using known_speakers, you can use pre-enrolled speaker identifiers to identify specific speakers.

from speechmatics.voice import SpeakerIdentifier

# Use known speakers from previous session
config = VoiceAgentConfig(
    enable_diarization=True,
    known_speakers=[
        SpeakerIdentifier(label="Alice", speaker_identifiers=["XX...XX"]),
        SpeakerIdentifier(label="Bob", speaker_identifiers=["YY...YY"])
    ]
)

Language & Vocabulary

  • additional_vocab (list[AdditionalVocabEntry], default []) - Custom vocabulary for domain-specific terms.
  • punctuation_overrides (dict, default None) - Custom punctuation rules.

Usage Examples

Using additional_vocab, you can specify a dictionary of domain-specific terms.

from speechmatics.voice import AdditionalVocabEntry

config = VoiceAgentConfig(
    language="en",
    additional_vocab=[
        AdditionalVocabEntry(
            content="Speechmatics",
            sounds_like=["speech matters", "speech matics"]
        ),
        AdditionalVocabEntry(content="API"),
    ]
)
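
Using punctuation_overrides, you can adjust punctuation behaviour. The dictionary is passed through to the Real-Time API; the keys below (permitted_marks, sensitivity) follow that API's format and are shown here as an assumption:

config = VoiceAgentConfig(
    language="en",
    punctuation_overrides={
        "permitted_marks": [".", ",", "?"],  # restrict which marks may appear
        "sensitivity": 0.5,  # 0.0-1.0; higher inserts more punctuation
    },
)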

Audio Parameters

  • sample_rate (int, default 16000) - Audio sample rate in Hz.
  • audio_encoding (AudioEncoding, default PCM_S16LE) - Audio encoding format.
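
For example, matching the config to an 8 kHz telephony stream. The values are illustrative, and the import path for AudioEncoding is an assumption:

from speechmatics.voice import VoiceAgentConfig, AudioEncoding  # AudioEncoding import path is an assumption

config = VoiceAgentConfig(
    sample_rate=8000,
    audio_encoding=AudioEncoding.PCM_S16LE,
)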

Advanced Parameters

  • transcription_update_preset (TranscriptionUpdatePreset, default COMPLETE) - Controls when to emit updates: COMPLETE, COMPLETE_PLUS_TIMING, WORDS, WORDS_PLUS_TIMING, or TIMING.
  • speech_segment_config (SpeechSegmentConfig, default SpeechSegmentConfig()) - Fine-tune segment generation and post-processing.
  • smart_turn_config (SmartTurnConfig, default None) - Configure SMART_TURN behavior (buffer length, threshold).
  • include_results (bool, default False) - Include word-level timing data in segments.
  • include_partials (bool, default True) - Include interim (lower-confidence) words in emitted segments. Set to False for final-only output.
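
For example, final-only segments with word-level timing attached (both flags are from the table above):

config = VoiceAgentConfig(
    include_results=True,    # attach word-level timing data to segments
    include_partials=False,  # emit only finalized words
)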

Event Messages

The Voice SDK emits real-time, structured events as a session progresses via AgentServerMessageType.

These events fall into three main categories:

  1. Core Events - high-level session and transcription updates.
  2. Speaker Events - detected speech activity.
  3. Additional - detailed, low-level events.

To handle events, register a callback with the @client.on() decorator or the client.on() method.
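
For example, both registration styles side by side, using the handler methods listed in the SDK class reference below:

from speechmatics.voice import VoiceAgentClient, AgentServerMessageType

client = VoiceAgentClient(api_key=api_key, preset="scribe")

# Decorator style
@client.on(AgentServerMessageType.ADD_SEGMENT)
def on_segment(message):
    ...

# Method style - useful when the handler is defined elsewhere
def on_turn_end(message):
    ...

client.on(AgentServerMessageType.END_OF_TURN, on_turn_end)

# Handlers can be unregistered with off()
client.off(AgentServerMessageType.END_OF_TURN, on_turn_end)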

Note: The payloads shown below are the actual message payloads from the Voice SDK. When using the CLI example with --output-file, messages also include a ts timestamp field (e.g., "ts": "2025-11-11 23:18:35.909"), which is added by the CLI for logging purposes and is not part of the SDK payload.

High Level Overview

Core Events

  • RECOGNITION_STARTED - Fired when a transcription session starts. Contains the session ID and language pack info.
  • ADD_PARTIAL_SEGMENT - Emitted continuously during speech. Provides interim, real-time transcription text.
  • ADD_SEGMENT - Fired when a segment is finalized. Provides stable, final transcription text.
  • END_OF_TURN - Fired when a speaker's turn ends. Depends on end_of_utterance_mode; useful for turn tracking.

Speaker Events

  • SPEAKER_STARTED - Fires when voice is detected; marks the start of speech.
  • SPEAKER_ENDED - Fires when silence is detected; marks the end of speech.
  • SPEAKERS_RESULT - Fires when enrollment completes; provides speaker IDs and labels.

Additional Events

  • START_OF_TURN - Fires when a new turn begins. Optional, low-level event for turn tracking.
  • END_OF_TURN_PREDICTION - Predicts turn completion; fires before END_OF_TURN in adaptive mode.
  • END_OF_UTTERANCE - Fires when the silence threshold is reached. Low-level STT engine trigger.
  • ADD_PARTIAL_TRANSCRIPT - Word-level partial transcript. Legacy; use ADD_PARTIAL_SEGMENT instead.
  • ADD_TRANSCRIPT - Word-level final transcript. Legacy; use ADD_SEGMENT instead.
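
The additional events follow the same registration pattern as the handlers above. Since their payloads are not documented here, a handler can start by logging the raw message (a minimal sketch):

@client.on(AgentServerMessageType.END_OF_TURN_PREDICTION)
def on_eot_prediction(message):
    # Payload shape not documented above; inspect the raw message
    print("EOT prediction:", message)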

Core Events - Examples and Payloads

RECOGNITION_STARTED

@client.on(AgentServerMessageType.RECOGNITION_STARTED)
def on_started(message):
    session_id = message["id"]
    language = message["language_pack_info"]["language_description"]
    print(f"Session {session_id} started - Language: {language}")

Payload:

{
  "message": "RecognitionStarted",
  "id": "a8779b0b-a238-43de-8211-c70f5fcbe191",
  "orchestrator_version": "2025.08.29127+289170c022.HEAD",
  "language_pack_info": {
    "language_description": "English",
    "word_delimiter": " ",
    "writing_direction": "left-to-right",
    "itn": true,
    "adapted": false
  }
}

ADD_PARTIAL_SEGMENT

@client.on(AgentServerMessageType.ADD_PARTIAL_SEGMENT)
def on_partial(message):
    for segment in message["segments"]:
        print(f"[INTERIM] {segment['speaker_id']}: {segment['text']}")

Payload:

{
  "message": "AddPartialSegment",
  "segments": [
    {
      "speaker_id": "S1",
      "is_active": true,
      "timestamp": "2025-11-11T23:18:37.189+00:00",
      "language": "en",
      "text": "Welcome to",
      "metadata": {
        "start_time": 1.28,
        "end_time": 1.6
      }
    }
  ],
  "metadata": {
    "start_time": 1.28,
    "end_time": 1.6,
    "processing_time": 0.307
  }
}

Fields:

  • speaker_id - Speaker label (e.g., "S1", "S2")
  • is_active - true if speaker is in focus (based on speaker_config)
  • text - Current partial transcription text
  • metadata.start_time - Segment start time (seconds since session start)
  • metadata.end_time - Segment end time (seconds since session start)

Top-level metadata contains the same timing plus processing_time.

ADD_SEGMENT

@client.on(AgentServerMessageType.ADD_SEGMENT)
def on_segment(message):
    for segment in message["segments"]:
        speaker = segment["speaker_id"]
        text = segment["text"]
        start = message["metadata"]["start_time"]
        print(f"[{start:.2f}s] {speaker}: {text}")

Payload:

{
  "message": "AddSegment",
  "segments": [
    {
      "speaker_id": "S1",
      "is_active": true,
      "timestamp": "2025-11-11T23:18:37.189+00:00",
      "language": "en",
      "text": "Welcome to Speechmatics.",
      "metadata": {
        "start_time": 1.28,
        "end_time": 8.04
      }
    }
  ],
  "metadata": {
    "start_time": 1.28,
    "end_time": 8.04,
    "processing_time": 0.187
  }
}

END_OF_TURN

@client.on(AgentServerMessageType.END_OF_TURN)
def on_turn_end(message):
    duration = message["metadata"]["end_time"] - message["metadata"]["start_time"]
    print(f"Turn ended (duration: {duration:.2f}s)")

Payload:

{
  "message": "EndOfTurn",
  "turn_id": 0,
  "metadata": {
    "start_time": 1.28,
    "end_time": 8.04
  }
}

Speaker Events - Examples and Payloads

SPEAKER_STARTED

@client.on(AgentServerMessageType.SPEAKER_STARTED)
def on_speaker_start(message):
    speaker = message["speaker_id"]
    time = message["time"]
    print(f"{speaker} started speaking at {time}s")

Payload:

{
  "message": "SpeakerStarted",
  "is_active": true,
  "speaker_id": "S1",
  "time": 1.28
}

SPEAKER_ENDED

@client.on(AgentServerMessageType.SPEAKER_ENDED)
def on_speaker_end(message):
    speaker = message["speaker_id"]
    time = message["time"]
    print(f"{speaker} stopped speaking at {time}s")

Payload:

{
  "message": "SpeakerEnded",
  "is_active": false,
  "speaker_id": "S1",
  "time": 2.64
}

SPEAKERS_RESULT

# Listen for the result
@client.on(AgentServerMessageType.SPEAKERS_RESULT)
def on_speakers(message):
    for speaker in message["speakers"]:
        print(f"Speaker {speaker['label']}: {speaker['speaker_identifiers']}")

# Request speaker IDs at end of session
await client.send_message({"message": AgentClientMessageType.GET_SPEAKERS, "final": True})

# Request speaker IDs now
await client.send_message({"message": AgentClientMessageType.GET_SPEAKERS})

Common Usage Patterns

Simple Transcription

client = VoiceAgentClient(api_key=api_key, preset="scribe")

@client.on(AgentServerMessageType.ADD_SEGMENT)
def on_segment(message):
    for segment in message["segments"]:
        print(f"{segment['speaker_id']}: {segment['text']}")

Conversational AI with Turn Detection

config = VoiceAgentConfig(
    language="en",
    enable_diarization=True,
    end_of_utterance_mode=EndOfUtteranceMode.ADAPTIVE,
)

client = VoiceAgentClient(api_key=api_key, config=config)

@client.on(AgentServerMessageType.ADD_SEGMENT)
def on_segment(message):
    user_text = message["segments"][0]["text"]
    # Process user input

@client.on(AgentServerMessageType.END_OF_TURN)
def on_turn_end(message):
    # User finished speaking - generate AI response
    pass

Live Captions with Timestamps

client = VoiceAgentClient(api_key=api_key, preset="captions")

@client.on(AgentServerMessageType.ADD_SEGMENT)
def on_segment(message):
    start_time = message["metadata"]["start_time"]
    for segment in message["segments"]:
        print(f"[{start_time:.1f}s] {segment['text']}")

Speaker Identification

from speechmatics.voice import SpeakerIdentifier

# Use known speakers from previous session
known_speakers = [
    SpeakerIdentifier(label="Alice", speaker_identifiers=["XX...XX"]),
    SpeakerIdentifier(label="Bob", speaker_identifiers=["YY...YY"])
]

config = VoiceAgentConfig(
    enable_diarization=True,
    known_speakers=known_speakers
)

client = VoiceAgentClient(api_key=api_key, config=config)

@client.on(AgentServerMessageType.ADD_SEGMENT)
def on_segment(message):
    for segment in message["segments"]:
        # Will show "Alice" or "Bob" instead of "S1", "S2"
        print(f"{segment['speaker_id']}: {segment['text']}")

Manual Turn Control

config = VoiceAgentConfig(
    end_of_utterance_mode=EndOfUtteranceMode.EXTERNAL
)

client = VoiceAgentClient(api_key=api_key, config=config)

# Manually trigger turn end
await client.finalize(end_of_turn=True)

Focus on Specific Speaker

from speechmatics.voice import SpeakerFocusConfig, SpeakerFocusMode

config = VoiceAgentConfig(
    enable_diarization=True,
    speaker_config=SpeakerFocusConfig(
        focus_speakers=["S1"],  # Only emit S1's speech
        focus_mode=SpeakerFocusMode.RETAIN
    )
)

client = VoiceAgentClient(api_key=api_key, config=config)

@client.on(AgentServerMessageType.ADD_SEGMENT)
def on_segment(message):
    # Only S1's segments will appear here
    for segment in message["segments"]:
        if segment["is_active"]:
            print(f"{segment['text']}")

# Dynamically change focused speaker during session
await client.update_diarization_config(
    SpeakerFocusConfig(
        focus_speakers=["S2"],  # Switch focus to S2
        focus_mode=SpeakerFocusMode.RETAIN
    )
)

Environment Variables

  • SPEECHMATICS_API_KEY - Your Speechmatics API key (required)
  • SPEECHMATICS_RT_URL - Custom WebSocket endpoint (optional)
  • SMART_TURN_MODEL_PATH - Path for SMART_TURN ONNX model cache (optional)
  • SMART_TURN_HF_URL - Override SMART_TURN model download URL (optional)
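
With the first two variables set, the client picks them up automatically, so the explicit arguments below are optional and shown only for clarity:

import os

from speechmatics.voice import VoiceAgentClient

client = VoiceAgentClient(
    api_key=os.getenv("SPEECHMATICS_API_KEY"),  # default fallback anyway
    url=os.getenv("SPEECHMATICS_RT_URL"),       # optional custom endpoint
)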

Examples

See the examples/voice/ directory for complete working examples:

  • simple/ - Basic microphone transcription
  • scribe/ - Note-taking with custom vocabulary
  • cli/ - Full-featured CLI with all options

SDK Class Reference

VoiceAgentClient

class VoiceAgentClient:
    def __init__(
        self,
        auth: Optional[AuthBase] = None,
        api_key: Optional[str] = None,
        url: Optional[str] = None,
        app: Optional[str] = None,
        config: Optional[VoiceAgentConfig] = None,
        preset: Optional[str] = None
    ):
        """Create Voice Agent client.

        Args:
            auth: Authentication instance (optional)
            api_key: Speechmatics API key (defaults to SPEECHMATICS_API_KEY env var)
            url: Custom WebSocket URL (defaults to SPEECHMATICS_RT_URL env var)
            app: Optional application name for endpoint URL
            config: Voice Agent configuration (optional)
            preset: Preset name ("scribe", "fast", etc.) (optional)
        """

    async def connect(self) -> None:
        """Connect to Speechmatics service.

        Establishes WebSocket connection and starts transcription session.
        Must be called before sending audio.
        """

    async def disconnect(self) -> None:
        """Disconnect from service.

        Closes WebSocket connection and cleans up resources.
        """

    async def send_audio(self, payload: bytes) -> None:
        """Send audio data for transcription.

        Args:
            payload: Audio data as bytes
        """

    async def update_diarization_config(self, config: SpeakerFocusConfig) -> None:
        """Update diarization configuration during session.

        Args:
            config: New speaker focus configuration
        """

    async def finalize(self, end_of_turn: bool = False) -> None:
        """Finalize segments and optionally trigger end of turn.

        Args:
            end_of_turn: Whether to emit end of turn message (default: False)
        """

    async def send_message(self, message: dict) -> None:
        """Send control message to service.

        Args:
            message: Control message dictionary
        """

    def on(self, event: AgentServerMessageType, callback: Callable) -> None:
        """Register event handler.

        Args:
            event: Event type to listen for
            callback: Function to call when event occurs
        """

    def once(self, event: AgentServerMessageType, callback: Callable) -> None:
        """Register one-time event handler.

        Args:
            event: Event type to listen for
            callback: Function to call once when event occurs
        """

    def off(self, event: AgentServerMessageType, callback: Callable) -> None:
        """Unregister event handler.

        Args:
            event: Event type
            callback: Function to remove
        """

License

MIT
