Moonshine Voice Python Package

A fast, accurate, on-device AI library for building interactive voice applications. Join our Discord to get help and support.

Installation

pip install moonshine-voice

Quick Start

"""Transcribes live audio from the default microphone"""
import time
from moonshine_voice import (
    MicTranscriber,
    TranscriptEventListener,
    get_model_for_language,
)

# This will download the model files and cache them.
model_path, model_arch = get_model_for_language("en")

# MicTranscriber handles connecting to the microphone, capturing
# the audio data, detecting voice activity, breaking the speech
# up into segments, transcribing the speech, and sending events
# as the results are updated over time.
mic_transcriber = MicTranscriber(
    model_path=model_path, model_arch=model_arch)

# We use an event-driven interface to respond in real time
# as speech is detected.
class TestListener(TranscriptEventListener):
    def on_line_started(self, event):
        print(f"Line started: {event.line.text}")

    def on_line_text_changed(self, event):
        print(f"Line text changed: {event.line.text}")

    def on_line_completed(self, event):
        print(f"Line completed: {event.line.text}")

listener = TestListener()
mic_transcriber.add_listener(listener)
mic_transcriber.start()
print("Listening to the microphone, press Ctrl+C to stop...")

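# Keep the main thread alive while events arrive; exit quietly on Ctrl+C
# instead of printing a traceback.
try:
    while True:
        time.sleep(0.1)
except KeyboardInterrupt:
    print("Stopped.")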
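
The listener above only prints each event. An application will usually want to keep the finished lines instead. The following is a minimal sketch, not part of the package: TranscriptCollector is a hypothetical name, and it relies only on what the Quick Start shows, namely the three callbacks and event.line.text. The ImportError fallback is there only so the sketch runs without the package installed, and the events are simulated.

```python
from types import SimpleNamespace

try:
    from moonshine_voice import TranscriptEventListener
except ImportError:
    # Fallback so this sketch can run without the package installed.
    TranscriptEventListener = object


class TranscriptCollector(TranscriptEventListener):
    """Hypothetical listener that keeps the text of every completed line."""

    def __init__(self):
        super().__init__()
        self.completed_lines = []

    def on_line_started(self, event):
        pass  # Nothing to store until the line is complete.

    def on_line_text_changed(self, event):
        pass  # Intermediate updates are ignored in this sketch.

    def on_line_completed(self, event):
        # Assumes event.line.text holds the final text, as in the Quick Start.
        self.completed_lines.append(event.line.text)

    def transcript(self):
        """Join all completed lines into one string."""
        return " ".join(self.completed_lines)


# Simulated events stand in for the ones a transcriber would deliver.
collector = TranscriptCollector()
for text in ["It was the best of times,", "it was the worst of times."]:
    collector.on_line_completed(SimpleNamespace(line=SimpleNamespace(text=text)))
print(collector.transcript())
```

With the real package, the collector would be registered the same way as TestListener, using mic_transcriber.add_listener(collector).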

Other Sources

If you capture audio from a source other than the default microphone, you can feed it directly to a transcriber through a stream.

"""Transcribes live audio from an arbitrary audio source."""
from moonshine_voice import (
    Transcriber,
    TranscriptEventListener,
    get_model_for_language,
    load_wav_file,
    get_assets_path,
)
import os
from typing import Iterator, Tuple


def audio_chunk_generator(
    wav_file_path: str, chunk_duration: float = 0.1
) -> Iterator[Tuple[list, int]]:
    """
    Example function that loads a WAV file and yields audio chunks.

    This demonstrates how you can integrate your own proprietary
    audio data capture sources. Replace this function with your own
    implementation that yields (audio_chunk, sample_rate) tuples.

    Args:
        wav_file_path: Path to the WAV file to load
        chunk_duration: Duration of each chunk in seconds

    Yields:
        Tuple of (audio_chunk, sample_rate) where:
        - audio_chunk: List of float audio samples
        - sample_rate: Sample rate in Hz
    """
    audio_data, sample_rate = load_wav_file(wav_file_path)
    chunk_size = int(chunk_duration * sample_rate)

    for i in range(0, len(audio_data), chunk_size):
        chunk = audio_data[i: i + chunk_size]
        yield (chunk, sample_rate)


model_path, model_arch = get_model_for_language("en")

transcriber = Transcriber(
    model_path=model_path, model_arch=model_arch)

stream = transcriber.create_stream(update_interval=0.5)
stream.start()


class TestListener(TranscriptEventListener):
    def on_line_started(self, event):
        print(f"{event.line.start_time:.2f}s: Line started: {event.line.text}")

    def on_line_text_changed(self, event):
        print(
            f"{event.line.start_time:.2f}s: Line text changed: {event.line.text}")

    def on_line_completed(self, event):
        print(f"{event.line.start_time:.2f}s: Line completed: {event.line.text}")


listener = TestListener()
stream.add_listener(listener)

# Feed audio chunks from the generator into the stream.
wav_file_path = os.path.join(get_assets_path(), "two_cities.wav")
for chunk, sample_rate in audio_chunk_generator(wav_file_path):
    stream.add_audio(chunk, sample_rate)

stream.stop()
stream.close()
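
The generator above relies on load_wav_file from the package. If your own source delivers 16-bit PCM rather than floats, the same (audio_chunk, sample_rate) contract can be met with the standard library alone. This is a sketch under stated assumptions: pcm16_chunk_generator is a hypothetical helper, it handles only mono 16-bit WAV data, and it assumes the stream expects float samples scaled to the range [-1.0, 1.0), which this README does not state.

```python
import array
import os
import sys
import tempfile
import wave
from typing import Iterator, List, Tuple


def pcm16_chunk_generator(
    wav_file_path: str, chunk_duration: float = 0.1
) -> Iterator[Tuple[List[float], int]]:
    """Yield (audio_chunk, sample_rate) tuples from a mono 16-bit WAV file."""
    with wave.open(wav_file_path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit PCM")
        sample_rate = wav.getframerate()
        frames_per_chunk = max(1, int(chunk_duration * sample_rate))
        while True:
            raw = wav.readframes(frames_per_chunk)
            if not raw:
                break
            samples = array.array("h", raw)
            if sys.byteorder == "big":
                samples.byteswap()  # WAV sample data is little-endian.
            # Assumption: the consumer expects floats in [-1.0, 1.0).
            yield [s / 32768.0 for s in samples], sample_rate


# Demo: write one second of silence at 16 kHz, then read it back in chunks.
demo_path = os.path.join(tempfile.mkdtemp(), "silence.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(16000)
    out.writeframes(array.array("h", [0] * 16000).tobytes())

chunks = list(pcm16_chunk_generator(demo_path))
print(len(chunks), "chunks,", sum(len(c) for c, _ in chunks), "samples")
```

Each yielded tuple can then be passed to stream.add_audio(chunk, sample_rate), exactly as in the loop above.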

Multiple Languages

The framework currently supports English, Spanish, Mandarin, Japanese, Korean, Vietnamese, Arabic, and Ukrainian. We are working on wider language support; to see which languages your installed version supports, call supported_languages(). To use a language, pass its two-letter language code to get_model_for_language(). For example, get_model_for_language("es") downloads the Spanish models and returns the model path and architecture you need to create Transcriber objects that use them.
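
A small fallback helper can guard against requesting a language that your installed version lacks. The sketch below is illustrative only: pick_language is a hypothetical helper, and it assumes supported_languages() returns an iterable of two-letter codes, which this README does not confirm. The list in the demo is a placeholder, not the package's actual return value.

```python
from typing import Iterable


def pick_language(preferred: str, supported: Iterable[str], default: str = "en") -> str:
    """Return `preferred` if it is supported, otherwise fall back to `default`."""
    codes = {code.lower() for code in supported}
    wanted = preferred.strip().lower()
    if wanted in codes:
        return wanted
    if default in codes:
        return default
    raise ValueError(f"neither {preferred!r} nor {default!r} is supported")


# Placeholder list; in practice pass the result of supported_languages().
available = ["en", "es", "ja"]
print(pick_language("ES", available))  # requested language is available
print(pick_language("fr", available))  # falls back to the default
```

The returned code would then be passed to get_model_for_language().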

Documentation

For more information, see the main Moonshine Voice documentation.

License

The code and English-language models are released under the MIT License - see the main project repository for details. The models used for other languages are released under the Moonshine Community License.

Download files

No source distribution is available for this release.

Built Distributions

moonshine_voice-0.0.28-py3-none-manylinux_2_39_x86_64.whl (75.0 MB): Python 3, manylinux (glibc 2.39+), x86-64

moonshine_voice-0.0.28-py3-none-manylinux_2_39_aarch64.whl (73.9 MB): Python 3, manylinux (glibc 2.39+), ARM64

moonshine_voice-0.0.28-py3-none-macosx_15_0_arm64.whl (66.8 MB): Python 3, macOS 15.0+, ARM64

File hashes

moonshine_voice-0.0.28-py3-none-manylinux_2_39_x86_64.whl
SHA256: c220e863194e7c6ea1f6e70245ab30c773de10257e2b5b9f7b1a8cc162bc2d40
MD5: 79cc6d96bff090577c2d687a05c649ca
BLAKE2b-256: 9b531b68526e278f3b34f4631d0d1ee3fd9d6e769a3d9f8b27f3254d7b2726ec

moonshine_voice-0.0.28-py3-none-manylinux_2_39_aarch64.whl
SHA256: cad5419ec388a2b56e45c1c1e9f44403f344a88c03bc7d61725c867838c922d3
MD5: cd2f26f8b1f9ff3ca5df80c6d064647c
BLAKE2b-256: e8695e4a9b7524d674df20dbecc4133574477789b3f6b30f0f3961914af25da7

moonshine_voice-0.0.28-py3-none-macosx_15_0_arm64.whl
SHA256: 0ed4fff3392c4f16d2322093e37ae8bd4ea8ef65a5ce817d7fdbc931b977b62c
MD5: 85d1bda816b3d5fa65076fb065a62957
BLAKE2b-256: 33c5eed0f9cc2b6c9d1cc2aee34c97582c158aa10ac8281a1bf1d5fa0dd21451
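
A downloaded wheel can be checked against the SHA256 digest listed above before it is installed. This is a generic sketch using only the standard library: sha256_of_file is a hypothetical helper name, and the demo hashes a small temporary file rather than a real wheel.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, block_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


# Demo on a temporary file. For a real check, pass the path of the downloaded
# wheel and compare the result with the published SHA256 value.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_digest = sha256_of_file(tmp.name)
os.remove(tmp.name)
print(demo_digest)
```

Reading in blocks keeps memory use flat, which matters for wheels of this size (roughly 67 to 75 MB).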
