AssemblyAI streaming STT integration for Vision Agents
AssemblyAI Plugin
Streaming Speech-to-Text (STT) plugin for Vision Agents using AssemblyAI's Universal-3 Pro model.
Features
- Real-time streaming transcription via async WebSocket
- Built-in punctuation-based turn detection with configurable silence thresholds
- Streaming diarization — identify speakers in real time
- Native SpeechStarted event support
- Custom prompt and keyterms boosting support
- Sub-300 ms latency to a complete transcript
- Built-in reconnection with exponential backoff
Installation
uv add "vision-agents[assemblyai]"
# or directly
uv add vision-agents-plugins-assemblyai
Usage
from vision_agents.plugins import assemblyai
stt = assemblyai.STT(
    speech_model="u3-rt-pro",  # Default model
    sample_rate=16000,
)
With streaming diarization
Enable speaker_labels to identify speakers in a mixed audio stream. Each transcript event then carries a distinct participant for each speaker, and the raw label is available in response.other["speaker_label"].
stt = assemblyai.STT(
    speaker_labels=True,
    max_speakers=2,  # optional hint (1-10)
)
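To show how the raw label might be consumed, here is a minimal sketch that groups transcript text by speaker label. `FakeResponse` and `group_by_speaker` are hypothetical stand-ins and not part of the plugin; the only detail taken from the text above is that the label lives in `response.other["speaker_label"]`. The label values ("A", "B") and the way transcript text is paired with a response are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FakeResponse:
    """Hypothetical stand-in for the plugin's response object."""

    other: dict = field(default_factory=dict)


def group_by_speaker(events):
    """Group (text, response) pairs by the raw speaker label.

    Events that carry no label are collected under "unknown".
    """
    grouped = defaultdict(list)
    for text, response in events:
        label = response.other.get("speaker_label", "unknown")
        grouped[label].append(text)
    return dict(grouped)


# Assumed example data; real labels may look different.
events = [
    ("Hello there.", FakeResponse(other={"speaker_label": "A"})),
    ("Hi, how are you?", FakeResponse(other={"speaker_label": "B"})),
    ("Doing well.", FakeResponse(other={"speaker_label": "A"})),
    ("(crosstalk)", FakeResponse()),
]
by_speaker = group_by_speaker(events)
```

Falling back to "unknown" keeps the grouping robust if an event arrives without a label, for example when diarization is disabled.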
With keyterms boosting
stt = assemblyai.STT(
    keyterms_prompt=["AssemblyAI", "Vision Agents"],
)
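The Configuration section notes that prompt and keyterms_prompt cannot be combined, and that max_speakers (1-10) requires speaker_labels=True. A caller could check these constraints before constructing the STT object. The sketch below is a hypothetical pre-flight helper, not the plugin's own validation; whether and how the plugin itself rejects these combinations is not stated here.

```python
def validate_stt_options(prompt=None, keyterms_prompt=None,
                         speaker_labels=False, max_speakers=None):
    """Hypothetical check mirroring the documented option constraints."""
    # prompt and keyterms_prompt are documented as mutually exclusive.
    if prompt is not None and keyterms_prompt is not None:
        raise ValueError("prompt and keyterms_prompt cannot be combined")
    if max_speakers is not None:
        # max_speakers is documented as requiring speaker_labels=True.
        if not speaker_labels:
            raise ValueError("max_speakers requires speaker_labels=True")
        # The documented range for the hint is 1-10.
        if not 1 <= max_speakers <= 10:
            raise ValueError("max_speakers must be between 1 and 10")


# A valid combination passes silently.
validate_stt_options(
    keyterms_prompt=["AssemblyAI", "Vision Agents"],
    speaker_labels=True,
    max_speakers=2,
)
```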
With custom turn silence thresholds
stt = assemblyai.STT(
    min_turn_silence=100,  # ms before speculative EOT check
    max_turn_silence=1200,  # ms before forcing turn end
)
Configuration
| Parameter | Description | Default |
|---|---|---|
| api_key | AssemblyAI API key (falls back to ASSEMBLYAI_API_KEY env var) | None |
| speech_model | Model identifier | "u3-rt-pro" |
| sample_rate | Audio sample rate in Hz | 16000 |
| min_turn_silence | Silence (ms) before speculative end-of-turn check | API default |
| max_turn_silence | Maximum silence (ms) before forcing turn end | API default |
| prompt | Custom transcription prompt (cannot be combined with keyterms_prompt) | None |
| keyterms_prompt | List of terms to boost recognition for (cannot be combined with prompt) | None |
| speaker_labels | Enable streaming diarization for multi-speaker identification | False |
| max_speakers | Hint for expected number of speakers, 1-10 (requires speaker_labels=True) | None |
| max_reconnect_attempts | Maximum reconnect attempts on transient failures | 3 |
| reconnect_backoff_initial_s | Initial backoff delay in seconds | 0.5 |
| reconnect_backoff_max_s | Maximum backoff delay in seconds | 4.0 |
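To get a feel for how the three reconnect parameters interact, the following sketch computes a capped exponential backoff schedule. It assumes the delay doubles after each attempt; the doubling factor and the function name `backoff_delays` are assumptions for illustration, and the plugin's actual schedule (including any jitter) may differ.

```python
def backoff_delays(max_attempts=3, initial_s=0.5, max_s=4.0, factor=2.0):
    """Return the wait before each reconnect attempt, in seconds.

    Assumes the delay is multiplied by `factor` after every attempt
    and never exceeds `max_s`.
    """
    delays = []
    delay = initial_s
    for _ in range(max_attempts):
        delays.append(min(delay, max_s))
        delay *= factor
    return delays


# Defaults taken from the table: 3 attempts, 0.5 s initial, 4.0 s cap.
schedule = backoff_delays()
# With more attempts, the cap takes effect.
longer = backoff_delays(max_attempts=5)
```

Under these assumptions the default settings wait 0.5 s, 1.0 s, and 2.0 s, so the 4.0 s cap only matters when max_reconnect_attempts is raised above 3.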
Environment Variables
Set ASSEMBLYAI_API_KEY in your environment or pass api_key to the constructor.
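The fallback described above can be illustrated with a small helper. This is a sketch of the documented behaviour (explicit argument first, then the environment variable), not the plugin's implementation; `resolve_api_key` is a hypothetical name, and raising an error when neither source is set is an assumption.

```python
import os


def resolve_api_key(api_key=None):
    """Prefer an explicit key, otherwise read ASSEMBLYAI_API_KEY."""
    if api_key:
        return api_key
    env_key = os.environ.get("ASSEMBLYAI_API_KEY")
    if not env_key:
        # Assumed behaviour: fail loudly when no key is available.
        raise RuntimeError(
            "Set ASSEMBLYAI_API_KEY or pass api_key to the constructor"
        )
    return env_key
```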
Dependencies
- aiohttp>=3.9.0
- vision-agents
Docs