Moonshine plugin for GetStream
Moonshine STT Plugin
This plugin provides Speech-to-Text functionality using Moonshine, a family of speech-to-text models optimized for fast and accurate automatic speech recognition (ASR) on resource-constrained devices.
Features
- Fast and Accurate: Moonshine processes 10-second audio segments 5x faster than Whisper while maintaining the same (or better!) word error rate (WER)
- Resource Efficient: Optimized for edge devices and resource-constrained environments
- Variable Length Processing: Compute requirements scale with input audio length (unlike Whisper's fixed 30-second chunks)
- Multiple Models: Support for both `moonshine/tiny` (~190MB) and `moonshine/base` (~400MB) models
- Device Flexibility: ONNX runtime automatically selects the optimal execution provider
- Smart Sample Rate Handling: Automatic detection and high-quality resampling of WebRTC audio (48kHz → 16kHz)
- WebRTC Optimized: Seamless integration with Stream video calling infrastructure
- Efficient Model Loading: ONNX version loads models on-demand for optimal memory usage
Installation
From PyPI + GitHub (Required)
Since the Moonshine ONNX models are not available on PyPI, you need to install them separately from GitHub:
```shell
# 1. Install the core plugin from PyPI
pip install getstream-plugins-moonshine

# 2. Install the moonshine model dependency from GitHub
pip install "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx"
```
With uv
```shell
# Install both dependencies
uv add getstream-plugins-moonshine
uv add "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx"
```
Development Installation (uv)
If your project uses uv, add both dependencies to your `pyproject.toml`:

```toml
[project]
dependencies = [
    # … other deps …
    "getstream-plugins-moonshine",
    "useful-moonshine-onnx @ git+https://github.com/usefulsensors/moonshine.git#subdirectory=moonshine-onnx",
]

[tool.uv.sources]
getstream-plugins-moonshine = { path = "getstream/plugins/moonshine" } # for local development
```
Then:
```shell
uv sync  # installs both dependencies
```
Usage
```python
from typing import Any

from getstream.plugins.moonshine import MoonshineSTT
from getstream.video.rtc.track_util import PcmData

# Initialize with default settings (base model, 16kHz)
stt = MoonshineSTT()

# Or customize the configuration
stt = MoonshineSTT(
    model_name="moonshine/tiny",  # Use the smaller, faster model
    sample_rate=16000,            # Moonshine's native sample rate
    min_audio_length_ms=500,      # Minimum audio length for transcription
    # ONNX runtime will automatically select the best execution provider
)

# Set up event handlers
@stt.on("transcript")
async def on_transcript(text: str, user: Any, metadata: dict):
    print(f"Final transcript: {text}")
    print(f"Confidence: {metadata.get('confidence', 'N/A')}")
    print(f"Processing time: {metadata.get('processing_time_ms', 'N/A')}ms")

@stt.on("error")
async def on_error(error: Exception):
    print(f"STT Error: {error}")

# Process audio data
pcm_data = PcmData(samples=audio_bytes, sample_rate=16000, format="s16")
await stt.process_audio(pcm_data)

# Clean up
await stt.close()
```
Model Selection
Moonshine offers two model variants with different trade-offs:
| Model | Size | Parameters | Speed | Accuracy | Use Case |
|---|---|---|---|---|---|
| `moonshine/tiny` | ~190MB | 27M | Faster | Good | Resource-constrained devices, real-time applications |
| `moonshine/base` | ~400MB | 61M | Fast | Better | Default choice: balanced performance and accuracy |
Default Model: The plugin uses `moonshine/base` by default, as it provides the best balance of accuracy and performance for most use cases.
Choosing a Model:
- Use `moonshine/tiny` for maximum speed on very resource-constrained devices
- Use `moonshine/base` for better accuracy with still excellent performance (recommended)
Model Name Validation:
- Strict validation prevents silent fallbacks to wrong models
- Supports both short names (`"tiny"`, `"base"`) and full names (`"moonshine/tiny"`, `"moonshine/base"`)
- Clear error messages list all valid options when an invalid model is specified
- Canonical model names ensure consistent behavior across different input formats
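The validation behavior described above can be sketched as a small lookup helper. This is an illustrative assumption about how short names map to canonical ones, not the plugin's actual implementation; the `canonical_model_name` function and `_VALID_MODELS` table are hypothetical.

```python
# Hypothetical sketch of strict model-name validation: short and full
# names map to one canonical form, and anything else raises a clear error
# instead of silently falling back to a default model.
_VALID_MODELS = {
    "tiny": "moonshine/tiny",
    "base": "moonshine/base",
    "moonshine/tiny": "moonshine/tiny",
    "moonshine/base": "moonshine/base",
}

def canonical_model_name(name: str) -> str:
    """Return the canonical model name, or raise ValueError listing valid options."""
    try:
        return _VALID_MODELS[name]
    except KeyError:
        valid = ", ".join(sorted(set(_VALID_MODELS)))
        raise ValueError(f"Unknown model {name!r}; valid options: {valid}") from None
```

Normalizing to a canonical name up front means the rest of the pipeline only ever sees `moonshine/tiny` or `moonshine/base`, regardless of which form the caller passed.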
Sample Rate Handling
The Moonshine plugin automatically handles sample rate conversion for optimal transcription quality:
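As a sketch of the 48kHz → 16kHz path, WebRTC audio can be downsampled with an anti-aliasing polyphase filter. The use of `scipy.signal.resample_poly` here is an assumption for illustration, not necessarily what the plugin does internally.

```python
import numpy as np
from scipy.signal import resample_poly

def downsample_48k_to_16k(samples: np.ndarray) -> np.ndarray:
    """Downsample 48 kHz s16 PCM to 16 kHz (a clean 3:1 ratio) using a
    polyphase filter, which low-pass filters before decimating."""
    resampled = resample_poly(samples.astype(np.float64), up=1, down=3)
    # Clamp back into the s16 range before converting
    return np.clip(resampled, -32768, 32767).astype(np.int16)

# One second of a 440 Hz tone captured at WebRTC's 48 kHz rate
audio_48k = (10000 * np.sin(2 * np.pi * 440 * np.arange(48000) / 48000)).astype(np.int16)
audio_16k = downsample_48k_to_16k(audio_48k)  # one second at 16 kHz
```

Because 48000 is an exact multiple of 16000, the conversion needs no interpolation between rates, which keeps resampling both fast and high quality.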
Events
The plugin emits the following events:
- `transcript`: Final transcription result
  - `text` (str): The transcribed text
  - `user` (any): User metadata passed to `process_audio()`
  - `metadata` (dict): Additional information including model name, duration, etc.
- `error`: Error during transcription
  - `error` (Exception): The error that occurred
Note: Unlike streaming STT services, Moonshine doesn't emit `partial_transcript` events, as it processes complete audio chunks.
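Since Moonshine transcribes complete chunks, callers feeding it small WebRTC frames may want to buffer audio until at least `min_audio_length_ms` worth is available. A minimal sketch, assuming s16 mono input; the `ChunkBuffer` helper is hypothetical and not part of the plugin:

```python
from typing import List, Optional

class ChunkBuffer:
    """Accumulate s16 PCM frames and release one contiguous chunk once
    at least `min_audio_length_ms` of audio has been buffered."""

    def __init__(self, sample_rate: int = 16000, min_audio_length_ms: int = 500):
        self.min_samples = sample_rate * min_audio_length_ms // 1000
        self._frames: List[bytes] = []
        self._buffered_samples = 0

    def add(self, frame: bytes) -> Optional[bytes]:
        self._frames.append(frame)
        self._buffered_samples += len(frame) // 2  # s16 = 2 bytes per sample
        if self._buffered_samples < self.min_samples:
            return None  # not enough audio to transcribe yet
        chunk = b"".join(self._frames)
        self._frames.clear()
        self._buffered_samples = 0
        return chunk
```

Each chunk returned by `add()` could then be wrapped in `PcmData` and passed to `process_audio()`, so Moonshine always sees segments long enough to transcribe.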
File details
Details for the file getstream_plugins_moonshine-0.1.0-py3-none-any.whl.
File metadata
- Download URL: getstream_plugins_moonshine-0.1.0-py3-none-any.whl
- Upload date:
- Size: 15.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `b7719780ae35817a79b0faa9a06091f7d8b25d76af592d2707cfc8b9443effb9` |
| MD5 | `136451cf23bdaa7c01728857a3ab5fb4` |
| BLAKE2b-256 | `90710f8b371fc85ed03f8afab8f9cad32251fe8c0b308e345678d57a71e7ad61` |