Moonshine STT integration for Vision Agents

Moonshine STT Plugin

This plugin provides Speech-to-Text functionality using Moonshine, a family of speech-to-text models optimized for fast and accurate automatic speech recognition (ASR) on resource-constrained devices.

Features

Fast and Accurate: Moonshine processes 10-second audio segments 5x faster than Whisper while maintaining the same (or better) word error rate (WER)
Resource Efficient: Optimized for edge devices and resource-constrained environments
Variable Length Processing: Compute requirements scale with input audio length (unlike Whisper's fixed 30-second chunks)
Multiple Models: Support for both moonshine/tiny (~190MB) and moonshine/base (~400MB) models
Device Flexibility: ONNX runtime automatically selects optimal execution provider
Smart Sample Rate Handling: Automatic detection and high-quality resampling of WebRTC audio (48kHz → 16kHz)
WebRTC Optimized: Seamless integration with Stream video calling infrastructure
Efficient Model Loading: ONNX version loads models on-demand for optimal memory usage

Installation

From PyPI + GitHub (Required)

Since the Moonshine ONNX models are not available on PyPI, you need to install them separately from GitHub:

# 1. Install the core plugin from PyPI
pip install getstream-plugins-moonshine

# 2. Install the moonshine model dependency from GitHub
pip install "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx"

With uv

# Install both dependencies
uv add getstream-plugins-moonshine
uv add "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx"

Development Installation (uv)

If your project uses uv, add both dependencies to your pyproject.toml:

[project]
dependencies = [
    # … other deps …
    "getstream-plugins-moonshine>=0.1.0",
    "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx",
]

[tool.uv.sources]
getstream-plugins-moonshine = { path = "getstream/plugins/moonshine" }  # for local development

Then:

uv sync        # installs both dependencies

Usage

import asyncio
from typing import Any

from getstream.plugins.moonshine import MoonshineSTT
from getstream.video.rtc.track_util import PcmData


async def main(audio_bytes: bytes) -> None:
    # Initialize with default settings (base model, 16kHz):
    #     stt = MoonshineSTT()
    # Or customize the configuration
    stt = MoonshineSTT(
        model_name="moonshine/tiny",  # Use the smaller, faster model
        sample_rate=16000,            # Moonshine's native sample rate
        min_audio_length_ms=500,      # Minimum audio length for transcription
        # ONNX runtime will automatically select the best execution provider
    )

    # Set up event handlers
    @stt.on("transcript")
    async def on_transcript(text: str, user: Any, metadata: dict):
        print(f"Final transcript: {text}")
        print(f"Confidence: {metadata.get('confidence', 'N/A')}")
        print(f"Processing time: {metadata.get('processing_time_ms', 'N/A')}ms")

    @stt.on("error")
    async def on_error(error: Exception):
        print(f"STT Error: {error}")

    try:
        # Process audio data (audio_bytes holds signed 16-bit PCM at 16kHz)
        pcm_data = PcmData(samples=audio_bytes, sample_rate=16000, format="s16")
        await stt.process_audio(pcm_data)
    finally:
        # Clean up, even if processing raised an exception
        await stt.close()


# Run from synchronous code with your own audio: asyncio.run(main(audio_bytes))

Model Selection

Moonshine offers two model variants with different trade-offs:

Model	Size	Parameters	Speed	Accuracy	Use Case
`moonshine/tiny`	~190MB	27M	Faster	Good	Resource-constrained devices, real-time applications
`moonshine/base`	~400MB	61M	Fast	Better	Default choice - balanced performance and accuracy

Default Model: The plugin uses moonshine/base by default as it provides the best balance of accuracy and performance for most use cases.

Choosing a Model:

Use moonshine/tiny for maximum speed on very resource-constrained devices
Use moonshine/base for better accuracy with still excellent performance (recommended)

Model Name Validation:

Strict validation prevents silent fallbacks to wrong models
Supports both short names ("tiny", "base") and full names ("moonshine/tiny", "moonshine/base")
Clear error messages list all valid options when invalid model is specified
Canonical model names ensure consistent behavior across different input formats
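
The validation rules above can be sketched as a small lookup. This is an illustrative sketch only: the names VALID_MODELS and canonical_model_name are hypothetical and are not taken from the plugin's source, and the plugin's actual error type and message may differ.

```python
# Hypothetical helper illustrating strict model-name validation.
# VALID_MODELS and canonical_model_name are assumed names, not the plugin's API.
VALID_MODELS = {
    "tiny": "moonshine/tiny",
    "base": "moonshine/base",
    "moonshine/tiny": "moonshine/tiny",
    "moonshine/base": "moonshine/base",
}


def canonical_model_name(name: str) -> str:
    """Map a short or full model name to its canonical form, or fail loudly."""
    try:
        return VALID_MODELS[name]
    except KeyError:
        # List every accepted spelling so the caller can fix the typo.
        options = ", ".join(sorted(VALID_MODELS))
        raise ValueError(
            f"Unknown Moonshine model {name!r}; valid options: {options}"
        ) from None
```

Raising instead of falling back to a default is what keeps a typo such as "tinny" from silently loading the larger base model.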

Sample Rate Handling

The Moonshine plugin converts sample rates automatically: it detects the rate of the incoming audio and resamples WebRTC audio (48kHz) to 16kHz, Moonshine's native sample rate, before transcription.
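
To make the 48kHz → 16kHz step concrete, here is a minimal standard-library sketch of resampling mono 16-bit PCM. It is not the plugin's resampler: resample_s16 is a hypothetical name, and plain linear interpolation with no low-pass filter is used only for brevity, so it will alias on real speech where a high-quality resampler would not.

```python
from array import array


def resample_s16(samples: bytes, src_rate: int, dst_rate: int = 16000) -> bytes:
    """Naively resample mono signed 16-bit PCM by linear interpolation.

    Hypothetical illustration only; samples are in native byte order.
    """
    if src_rate == dst_rate:
        return samples
    src = array("h")
    src.frombytes(samples)
    if len(src) == 0:
        return b""
    n_out = max(1, len(src) * dst_rate // src_rate)
    out = array("h")
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # fractional index into the source
        lo = int(pos)
        hi = min(lo + 1, len(src) - 1)     # clamp at the last sample
        frac = pos - lo
        out.append(int(round(src[lo] * (1 - frac) + src[hi] * frac)))
    return out.tobytes()
```

With a 48kHz input the ratio is exactly 3, so every third source sample is kept and the output is one third of the input length.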

Events

The plugin emits the following events:

transcript: Final transcription result
- text (str): The transcribed text
- user (any): User metadata passed to process_audio()
- metadata (dict): Additional information including model name, duration, etc.
error: Error during transcription
- error (Exception): The error that occurred

Note: Unlike streaming STT services, Moonshine doesn't emit partial_transcript events as it processes complete audio chunks.
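
Because transcription runs on complete chunks, it helps to know how much audio a buffer holds relative to the min_audio_length_ms setting shown in the usage example. The sketch below is only the arithmetic for raw PCM; the function names are hypothetical, and how the plugin itself applies the threshold is an assumption here.

```python
def pcm_duration_ms(num_bytes: int, sample_rate: int = 16000,
                    bytes_per_sample: int = 2, channels: int = 1) -> float:
    """Duration in milliseconds of a raw PCM buffer (defaults: 16kHz mono s16)."""
    frames = num_bytes / (bytes_per_sample * channels)
    return 1000.0 * frames / sample_rate


def is_long_enough(num_bytes: int, min_audio_length_ms: int = 500,
                   sample_rate: int = 16000) -> bool:
    """Hypothetical check: does the buffer meet the minimum length?"""
    return pcm_duration_ms(num_bytes, sample_rate) >= min_audio_length_ms
```

At 16kHz with 16-bit mono samples, 500 ms corresponds to 8,000 samples, or 16,000 bytes.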

Release history

0.1.11 (Oct 28, 2025)
0.1.9 (Oct 22, 2025)
0.1.8 (Oct 22, 2025)
0.1.7 (Oct 21, 2025)
0.1.6 (Oct 16, 2025)
0.1.5 (Oct 9, 2025)
0.1.3 (Oct 9, 2025)
0.1.0 (Oct 9, 2025), this version
0.0.18 (Oct 8, 2025)
0.0.17 (Oct 8, 2025)

Download files

Source Distribution

vision_agents_plugins_moonshine-0.1.0.tar.gz (7.2 kB)

Uploaded Oct 9, 2025 Source

Built Distribution

vision_agents_plugins_moonshine-0.1.0-py3-none-any.whl (14.0 kB)

Uploaded Oct 9, 2025 Python 3

File details

Details for the file vision_agents_plugins_moonshine-0.1.0.tar.gz.

File metadata

Download URL: vision_agents_plugins_moonshine-0.1.0.tar.gz
Upload date: Oct 9, 2025
Size: 7.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.19

File hashes

Hashes for vision_agents_plugins_moonshine-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`6a052a5ee11db408b8e288830b2c5c5d6cfd7e566c0fbbad739d9d7f4d35d535`
MD5	`6300af380c3259a27f88cb2e8e137d0c`
BLAKE2b-256	`75192503238ca170ea66c760b5908e9ef02b3ab15053c58bf55e103f442cbf03`

File details

Details for the file vision_agents_plugins_moonshine-0.1.0-py3-none-any.whl.

File metadata

Download URL: vision_agents_plugins_moonshine-0.1.0-py3-none-any.whl
Upload date: Oct 9, 2025
Size: 14.0 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.7.19

File hashes

Hashes for vision_agents_plugins_moonshine-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a3122352434fd4cfa4b4ae52f4860750024bbbaa527bfe933318cd8e855947db`
MD5	`b185ea448ce916a859c7a27f444c76cf`
BLAKE2b-256	`50ffd7a42274809398a34e51a5363abac3eccfa504ef03d080f5960766e027eb`

vision-agents-plugins-moonshine 0.1.0
