mstt: A command-line tool and Python library for running speech-to-text against multiple models.

Modern Speech-to-Text (MSTT)

Modern Speech-to-Text (MSTT) is a Python library and command-line tool that provides a unified, extensible interface to multiple Speech-to-Text (STT) models. It simplifies integrating different STT services and models into your applications by offering a consistent API regardless of the underlying STT engine.

Features

  • Unified API: Interact with multiple STT models through a single, consistent interface.
  • Extensible Design: Easily add support for new STT models and services.
  • Local and Cloud Support: Seamlessly switch between local models and cloud-based STT APIs.
  • Plugin System: Integrate custom STT models or enhance existing functionalities via a flexible plugin system.

Installation

To install MSTT, you can use pip:

pip install mstt

To install specific STT model backends, specify them as extras. For example, to install with funasr support (the quotes keep shells such as zsh from interpreting the square brackets):

pip install "mstt[funasr]"

Usage

Basic Transcription

Here's a basic example of how to use MSTT to transcribe an audio file:

from mstt import MSTT

# Initialize MSTT with a specific model (e.g., 'funasr')
# Ensure the 'mstt-funasr' package is installed if you use 'funasr'
mstt = MSTT(model_id="funasr")

# Transcribe an audio file
audio_file_path = "path/to/your/audio.wav"
result = mstt.transcribe(audio_file_path)

print(f"Transcription: {result.text}")
print(f"Segments: {result.segments}")
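
The shape of result.segments is not specified above. Assuming each segment is a dictionary with start, end, and text keys (the shape used in the custom plugin example in this document), a small helper can turn the segments into timestamped lines. The names format_timestamp and format_segments are hypothetical and not part of MSTT; this is a sketch, not the library's own formatter.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as MM:SS.mmm."""
    total_ms = int(round(seconds * 1000))
    minutes, rem_ms = divmod(total_ms, 60_000)
    secs, ms = divmod(rem_ms, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


def format_segments(segments: list[dict]) -> list[str]:
    """Return one '[start - end] text' line per segment.

    Assumes each segment is a dict with 'start', 'end' (seconds) and 'text'.
    """
    return [
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] {s['text']}"
        for s in segments
    ]


if __name__ == "__main__":
    example = [
        {"start": 0.0, "end": 2.0, "text": "This is a custom"},
        {"start": 2.1, "end": 4.0, "text": "transcription result."},
    ]
    for line in format_segments(example):
        print(line)
```

With a real result, the call would be format_segments(result.segments), provided the segments follow the assumed dictionary shape.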

Available Models

MSTT supports various models through its plugin system. You can list available models:

from mstt import MSTT

available_models = MSTT.list_models()
print("Available STT Models:")
for model_id, description in available_models.items():
    print(f"- {model_id}: {description}")
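
Since the loop above treats the result of MSTT.list_models() as a mapping from model ID to description, one possible use is to pick the first installed model from a preference list. The helper pick_model below is a hypothetical name and not part of MSTT; it works on any plain dictionary.

```python
def pick_model(available: dict, preferred: list) -> str:
    """Return the first model ID in `preferred` that is a key of `available`.

    `available` is assumed to map model IDs to descriptions, as in the
    listing loop above. Raises LookupError if none of the IDs is present.
    """
    for model_id in preferred:
        if model_id in available:
            return model_id
    raise LookupError(
        f"None of {preferred} is installed; available: {sorted(available)}"
    )


if __name__ == "__main__":
    # Stand-in for MSTT.list_models(); the description text is made up.
    available_models = {"funasr": "example description"}
    print(pick_model(available_models, ["mycustom", "funasr"]))
```

The chosen ID could then be passed as model_id when constructing MSTT.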

Command Line Interface (CLI)

MSTT also provides a command-line interface for quick transcriptions:

mstt transcribe --model funasr --audio path/to/your/audio.wav

Run mstt --help for more CLI options.
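
If you need to call the CLI from a script, a thin subprocess wrapper is one option. The sketch below uses only the transcribe subcommand and the --model and --audio flags shown above. The function names are hypothetical, and because the CLI's output format is not documented here, the output is returned unparsed.

```python
import subprocess


def build_transcribe_command(model: str, audio_path: str) -> list[str]:
    """Build the argument list for the `mstt transcribe` command shown above."""
    return ["mstt", "transcribe", "--model", model, "--audio", audio_path]


def run_transcribe(model: str, audio_path: str) -> str:
    """Run the CLI and return whatever it writes to standard output."""
    completed = subprocess.run(
        build_transcribe_command(model, audio_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


if __name__ == "__main__":
    print(run_transcribe("funasr", "path/to/your/audio.wav"))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.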

Creating Custom Plugins

MSTT is designed to be extensible through a plugin system. You can create your own STT model plugins and register them with MSTT.

Plugin Structure

A plugin typically consists of:

  1. A Model Implementation: A Python class that inherits from mstt.models.STTModel and implements the transcribe method.
  2. A Registration Module: A Python module that registers your model with MSTT using the mstt.register_model decorator.
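
To make the two parts concrete, here is a self-contained toy version of a decorator-based registry. It is not MSTT's implementation: toy_register_model, create_model, and EchoModel are hypothetical stand-ins that only illustrate the general pattern of a decorated factory function returning a model class.

```python
from typing import Callable, Dict

# Toy registry mapping a model ID to a factory that returns a model class.
_REGISTRY: Dict[str, Callable[[], type]] = {}


def toy_register_model(model_id: str):
    """Decorator that stores the decorated factory under `model_id`."""
    def decorator(factory: Callable[[], type]) -> Callable[[], type]:
        if model_id in _REGISTRY:
            raise ValueError(f"Model ID already registered: {model_id}")
        _REGISTRY[model_id] = factory
        return factory
    return decorator


def create_model(model_id: str, **kwargs):
    """Look up the factory, obtain the class, and instantiate it."""
    model_cls = _REGISTRY[model_id]()
    return model_cls(model_id, **kwargs)


class EchoModel:
    """Stand-in model with the same constructor shape as the plugin example."""

    def __init__(self, model_id: str, device: str = "cpu"):
        self.model_id = model_id
        self.device = device

    def transcribe(self, audio_file_path: str) -> str:
        return f"transcript of {audio_file_path}"


@toy_register_model("echo")
def register_echo_model():
    return EchoModel


if __name__ == "__main__":
    model = create_model("echo")
    print(model.transcribe("path/to/your/audio.wav"))
```

Returning the class from a factory, rather than registering the class directly, lets the registry defer importing heavy model dependencies until the model is actually requested.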

Example: A Simple Custom Plugin

Let's say you want to create a plugin for a hypothetical MyCustomSTT model. You would create a Python package (e.g., mstt_mycustom):

mstt_mycustom/
├── pyproject.toml
├── src/
│   └── mstt_mycustom/
│       ├── __init__.py
│       ├── models.py
│       └── register.py

src/mstt_mycustom/models.py:

from mstt.models import STTModel, TranscriptionResult

class MyCustomSTTModel(STTModel):
    def __init__(self, model_id: str, device: str = "cpu"):
        super().__init__(model_id, device)
        # Initialize your custom model here
        print(f"Initializing MyCustomSTTModel with ID: {model_id} on device: {device}")

    def transcribe(self, audio_file_path: str) -> TranscriptionResult:
        # Implement your transcription logic here
        # This is a placeholder for demonstration
        print(f"Transcribing {audio_file_path} using MyCustomSTTModel")
        dummy_text = "This is a custom transcription result."
        dummy_segments = [
            {"start": 0.0, "end": 2.0, "text": "This is a custom"},
            {"start": 2.1, "end": 4.0, "text": "transcription result."}
        ]
        return TranscriptionResult(text=dummy_text, segments=dummy_segments)

src/mstt_mycustom/register.py:

from mstt import register_model
from .models import MyCustomSTTModel

@register_model("mycustom")
def register_mycustom_model():
    return MyCustomSTTModel

pyproject.toml (important for plugin discovery):

[project.entry-points.mstt]
mycustom = "mstt_mycustom.register"

Installing Your Plugin

After setting up your plugin package, you can install it in editable mode for development:

pip install -e /path/to/your/mstt_mycustom

Or, if you package it, install it like any other Python package:

pip install mstt-mycustom

Once installed, MSTT will automatically discover and load your mycustom model, and you can use it like any other built-in model:

from mstt import MSTT

mstt = MSTT(model_id="mycustom")
result = mstt.transcribe("path/to/your/audio.wav")
print(result.text)

Contributing

We welcome contributions to MSTT! If you'd like to contribute, please follow these steps:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/your-feature-name).
  3. Make your changes.
  4. Write and run tests (pytest).
  5. Commit your changes (git commit -am 'Add new feature').
  6. Push to the branch (git push origin feature/your-feature-name).
  7. Create a new Pull Request.

License

This project is licensed under the Apache License, Version 2.0 - see the LICENSE file for details.
