Moonshine ASR (speech-to-text) community integration for Pipecat
pipecat-moonshine
Moonshine ASR speech-to-text integration for Pipecat.
Moonshine is a family of small, fast automatic-speech-recognition models optimized for resource-constrained devices. The Tiny English model is roughly 26 M parameters, the Base English model roughly 58 M — both run on CPU via ONNX Runtime with no GPU required. That makes Moonshine an attractive choice for low-latency, on-device transcription in Pipecat pipelines that already handle VAD upstream.
This package provides MoonshineSTTService, a SegmentedSTTService
subclass that plugs straight into any Pipecat pipeline.
Status
Community-maintained integration. See Pipecat's Community Integrations guide for what that means — in short, the Pipecat team does not maintain or support this package; please file issues here.
Tested with Pipecat v1.2.1.
Installation
pip install pipecat-moonshine
This pulls in useful-moonshine-onnx
and pipecat-ai automatically. The first time you instantiate the service,
the chosen model weights are downloaded from Hugging Face
(UsefulSensors/moonshine) and cached locally.
For the included foundational example you also need the local-audio extras:
pip install 'pipecat-moonshine[examples]'
Usage
from pipecat.pipeline.pipeline import Pipeline
from pipecat_moonshine import MoonshineSTTService, Model

stt = MoonshineSTTService(model=Model.TINY_EN)

# `transport` and `vad_processor` are assumed to be created elsewhere in your app.
pipeline = Pipeline([
    transport.input(),
    vad_processor,  # MUST run upstream of the STT (see below)
    stt,
    # ... downstream processors
])
MoonshineSTTService subclasses SegmentedSTTService, so a VAD-driven
processor (e.g. VADProcessor with SileroVADAnalyzer) must produce
VADUserStartedSpeakingFrame / VADUserStoppedSpeakingFrame upstream of
it. Each detected speech segment is decoded into a single final
TranscriptionFrame — Moonshine does not emit interim results.
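To make the segment model concrete, here is a minimal, simplified sketch of start/stop-delimited buffering. `SegmentBuffer` and its method names are hypothetical and exist only for illustration; this is not Pipecat's or this package's actual implementation, which may handle details such as pre-speech audio differently.

```python
from dataclasses import dataclass, field


@dataclass
class SegmentBuffer:
    """Hypothetical illustration of VAD-delimited buffering (not a Pipecat class)."""

    speaking: bool = False
    chunks: list = field(default_factory=list)

    def on_speech_started(self) -> None:
        # Stands in for receiving a VADUserStartedSpeakingFrame.
        self.speaking = True
        self.chunks.clear()

    def on_audio(self, pcm: bytes) -> None:
        # In this sketch, audio is kept only while speech is in progress.
        if self.speaking:
            self.chunks.append(pcm)

    def on_speech_stopped(self) -> bytes:
        # Stands in for VADUserStoppedSpeakingFrame: return one whole segment.
        self.speaking = False
        segment = b"".join(self.chunks)
        self.chunks.clear()
        return segment


buf = SegmentBuffer()
buf.on_audio(b"\x00\x00")  # ignored: no speech detected yet
buf.on_speech_started()
buf.on_audio(b"\x01\x00")
buf.on_audio(b"\x02\x00")
segment = buf.on_speech_stopped()  # the bytes that would be decoded in one pass
```

The point of the sketch is the shape of the data flow: one complete segment goes to the model per utterance, which is why only a single final transcription comes out.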
Configuring the model at runtime
Pass an explicit model or a fully-built settings object:
# Convenience kwarg
stt = MoonshineSTTService(model=Model.BASE_EN)
# Or via Settings (e.g. when you want to update at runtime)
stt = MoonshineSTTService(
    settings=MoonshineSTTService.Settings(model="moonshine/base"),
)
Running the example
git clone https://github.com/ubopod/pipecat-moonshine
cd pipecat-moonshine
pip install -e '.[examples]'
python examples/transcription-moonshine.py
Speak into your default mic; lines like Transcription: hello world will be
printed for each detected utterance.
Audio format requirements
Moonshine expects 16 kHz, mono, 16-bit signed PCM input. Pipecat's
default LocalAudioTransport and most WebRTC transports already provide
this. If your pipeline runs at a different sample rate the service will log
a warning on the first segment and transcription quality may degrade — add
a resampler upstream if you need a different rate.
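As a rough illustration of what that input format means in practice, the sketch below checks a sample rate against 16 kHz and converts 16-bit signed mono PCM bytes to normalized floats. The helper names are hypothetical, little-endian byte order is assumed, and the float normalization is a common preprocessing convention rather than something this README states about the service's internals.

```python
import array
import sys

EXPECTED_SAMPLE_RATE = 16_000  # Hz, per the requirement stated above


def is_supported_rate(sample_rate: int) -> bool:
    """Return True when the pipeline rate matches the expected 16 kHz."""
    return sample_rate == EXPECTED_SAMPLE_RATE


def pcm16_to_floats(pcm: bytes) -> list[float]:
    """Convert little-endian 16-bit signed mono PCM to floats in [-1.0, 1.0)."""
    samples = array.array("h")  # "h" = signed 16-bit integers
    samples.frombytes(pcm)  # raises ValueError on an odd number of bytes
    if sys.byteorder == "big":
        samples.byteswap()  # input is assumed little-endian
    return [s / 32768.0 for s in samples]


# Three samples: 0, 16384 and -32768
floats = pcm16_to_floats(b"\x00\x00\x00\x40\x00\x80")
```

A check like `is_supported_rate` can be useful early in application startup, so that a mismatched rate is caught before the first segment is transcribed.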
Moonshine also enforces a per-segment duration window: speech segments
shorter than 0.1 s or longer than 64 s are dropped without producing a
transcription (the service logs at DEBUG level when this happens).
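The duration window can be reasoned about from the byte length of a segment. The following sketch assumes the 16 kHz, mono, 16-bit format described above; the function and constant names are hypothetical and are not part of this package's API.

```python
SAMPLE_RATE = 16_000  # samples per second
BYTES_PER_SAMPLE = 2  # 16-bit mono PCM
MIN_SECONDS = 0.1
MAX_SECONDS = 64.0


def segment_seconds(pcm: bytes) -> float:
    """Duration of a mono 16-bit PCM segment, in seconds."""
    return len(pcm) / (SAMPLE_RATE * BYTES_PER_SAMPLE)


def in_duration_window(pcm: bytes) -> bool:
    """True if the segment is neither shorter than 0.1 s nor longer than 64 s."""
    return MIN_SECONDS <= segment_seconds(pcm) <= MAX_SECONDS


one_second = bytes(32_000)  # 16,000 samples x 2 bytes
too_short = bytes(1_600)  # 0.05 s
too_long = bytes(65 * 32_000)  # 65 s
```

In these terms, a segment needs at least 3,200 bytes and at most 2,048,000 bytes of audio to fall inside the window.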
Supported models
| Constant | Model name | Params | Notes |
|---|---|---|---|
| Model.TINY_EN | moonshine/tiny | 26 M | English-only, MIT-licensed weights. |
| Model.BASE_EN | moonshine/base | 58 M | English-only, MIT-licensed weights. |
Multilingual models — important license note
Moonshine also publishes multilingual checkpoints (Spanish, Japanese, Arabic, Korean, Mandarin, Vietnamese, Ukrainian, …). Those weights are released under the Moonshine Community License, which is non-commercial.
For that reason they are intentionally not enumerated in the Model
enum. If you want to use one you must:
- Read and accept the upstream Moonshine Community License.
- Pass the model name as a string explicitly, e.g.:
stt = MoonshineSTTService(model="moonshine/base") # then load the appropriate language checkpoint via your own download flow
This package does not bundle, mirror, or auto-download non-commercial weights, and the maintainers make no representation that doing so complies with your use case.
Frames
| In | Out |
|---|---|
| VADUserStartedSpeakingFrame | (no output; buffers audio internally) |
| VADUserStoppedSpeakingFrame | one TranscriptionFrame per segment (final), or nothing |
| Any non-VAD audio | buffered/forwarded according to SegmentedSTTService |
Errors during transcription are pushed as ErrorFrames; the pipeline is
not torn down so other services can continue.
Metrics
can_generate_metrics() returns True. Per-segment processing time is
recorded via start_processing_metrics / stop_processing_metrics, so
enabling metrics on your PipelineTask will surface Moonshine latency
alongside the rest of your pipeline.
Maintainer
Community-maintained. Not affiliated with Moonshine AI or Daily.
License
BSD-2-Clause — see LICENSE. Note that the Moonshine model weights are governed by their own license (MIT for English models, Moonshine Community License for others) — see the section above.
Versioning and changelog
See CHANGELOG.md. This package follows semantic versioning.
Getting help
- Pipecat Discord: https://discord.gg/pipecat (#community-integrations)
- Pipecat changelog (track upstream changes that may affect this integration): https://github.com/pipecat-ai/pipecat/blob/main/CHANGELOG.md
- Issues for this integration: file them in this repo.