CLI for Omi Med STT v1 medical speech-to-text
Omi Med STT Runtime
Runtime CLI for Omi Med STT v1, an English medical speech-to-text model.
This repository contains runtime code only. It does not contain model weights, private benchmark data, or training data.
Runtimes
omi-med-stt supports three runtime paths:
| Runtime | Best for | Artifact |
|---|---|---|
| mlx | Apple Silicon Macs | omi-health/omi-med-stt-v1-mlx |
| cpp | Windows, Mac Intel, Linux CPU, and ggml GPU backends | omi-health/omi-med-stt-v1-gguf |
| nemo | NVIDIA CUDA servers and canonical NeMo checkpoint use | omi-health/omi-med-stt-v1 |
The source-of-truth model is the NeMo checkpoint. MLX and GGUF are runtime exports.
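As a rough illustration of how the runtime table maps onto a machine, the sketch below picks a runtime name from the platform. `pick_runtime` and its `has_cuda` flag are hypothetical names used only for this example. They are not part of omi-med-stt, and the CLI's own default selection may differ.

```python
import platform


def pick_runtime(system: str, machine: str, has_cuda: bool = False) -> str:
    """Map a platform description to one of the three runtime names.

    Hypothetical helper: it mirrors the runtime table, not the CLI's
    own default-selection logic, which is not documented here.
    """
    if system == "Darwin" and machine == "arm64":
        return "mlx"   # Apple Silicon Macs
    if system == "Linux" and has_cuda:
        return "nemo"  # NVIDIA CUDA servers
    return "cpp"       # Windows, Mac Intel, Linux CPU


if __name__ == "__main__":
    # has_cuda stays False here; detecting CUDA is out of scope for this sketch.
    print(pick_runtime(platform.system(), platform.machine()))
```

The result can be passed to the CLI through the `--runtime` flag.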
Install
From PyPI:
pip install -U omi-med-stt
From this repository:
pip install git+https://github.com/Omi-Health/omi-med-stt-runtime.git
For Apple Silicon / MLX:
pip install "omi-med-stt[mlx] @ git+https://github.com/Omi-Health/omi-med-stt-runtime.git"
For CUDA/Linux NeMo:
pip install "omi-med-stt[nemo] @ git+https://github.com/Omi-Health/omi-med-stt-runtime.git"
Basic Usage
Simple path:
omi-med-stt audio.wav
Explicit MLX:
omi-med-stt audio.wav --runtime mlx
Explicit NeMo:
omi-med-stt audio.wav --runtime nemo
Explicit parakeet.cpp / GGUF:
omi-med-stt audio.wav --runtime cpp
JSON output:
omi-med-stt audio.wav --json
Dependency/runtime check:
omi-med-stt check
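For scripting, the commands above can be driven from Python. The following is a minimal sketch that uses only the `--runtime` and `--json` flags shown in this section. `build_command` and `transcribe_json` are hypothetical helper names, and the sketch assumes the CLI writes a single JSON document to stdout. The JSON schema is not documented here, so the parsed value is returned as-is.

```python
import json
import subprocess


def build_command(audio_path, runtime=None, as_json=False):
    """Assemble the argument list for one omi-med-stt invocation."""
    cmd = ["omi-med-stt", audio_path]
    if runtime is not None:
        cmd += ["--runtime", runtime]
    if as_json:
        cmd.append("--json")
    return cmd


def transcribe_json(audio_path, runtime=None):
    """Run the CLI with --json and parse its stdout.

    Assumption: stdout holds one JSON document. No field names are
    assumed; the caller inspects whatever structure comes back.
    """
    result = subprocess.run(
        build_command(audio_path, runtime, as_json=True),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems with audio paths that contain spaces.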
parakeet.cpp / GGUF Runtime
The cpp runtime is powered by
parakeet.cpp, a C++/ggml inference
engine for NVIDIA Parakeet ASR models.
Omi Med STT v1 includes a post-Conformer medical adapter. Until this adapter
extension is upstreamed, omi-med-stt builds parakeet.cpp with the adapter
patch included in this repository and caches the resulting parakeet-cli.
Normal use:
omi-med-stt audio.wav --runtime cpp
Pre-build the C++ runtime explicitly:
omi-med-stt install-cpp
Choose a backend:
omi-med-stt install-cpp --cpp-backend cpu
omi-med-stt install-cpp --cpp-backend metal
omi-med-stt install-cpp --cpp-backend cuda
Manual override remains available for developers:
omi-med-stt audio.wav --runtime cpp --parakeet-cli /path/to/parakeet-cli
The cached runtime is built from parakeet.cpp and applies
parakeet-cpp-omi-adapter.patch. You need git and cmake available for the
first build.
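Before the first build, it can help to confirm that the two required tools are on `PATH`. The sketch below is an illustration with a hypothetical helper name, `missing_build_tools`. It is not what `omi-med-stt check` runs internally.

```python
import shutil


def missing_build_tools(tools=("git", "cmake")):
    """Return the names of the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_build_tools()
    if missing:
        print("Install before running install-cpp:", ", ".join(missing))
    else:
        print("git and cmake found")
```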
The cpp runtime downloads only the selected GGUF file. It does not download
the NeMo .nemo checkpoint or the MLX model.safetensors.
By default, the cpp runtime is pinned to a specific Hugging Face repository
revision and verifies the SHA256 checksum for the official f16/q8_0 GGUF files.
You can override this for experiments:
omi-med-stt audio.wav --runtime cpp --revision main
omi-med-stt audio.wav --runtime cpp --no-verify-checksum
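To make the checksum step concrete, here is a minimal sketch of SHA256 verification for a downloaded file. It shows the general technique only. The function names are hypothetical, the expected digest must be supplied by the caller, and the runtime's own pinned digests and verification code are not reproduced here.

```python
import hashlib


def sha256_of(path, block_size=1 << 20):
    """Hash a file in 1 MiB blocks so large GGUF files are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_checksum(path, expected_hex):
    """Return True when the file's SHA256 matches the expected hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

A mismatch usually means a truncated download or a file from a different revision than the one the digest was recorded for.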
Long Audio
Omi Med STT v1 is based on Parakeet and can handle long-form audio. Start with the simple path:
omi-med-stt consult.wav
An explicit chunked path is still available for constrained environments:
omi-med-stt transcribe-long consult.wav --chunk-seconds 25 --overlap 3
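To give a feel for what `--chunk-seconds 25 --overlap 3` implies, the sketch below computes overlapping windows under one common interpretation: each chunk starts `chunk_seconds - overlap` seconds after the previous one. This is an assumption made for illustration. How `transcribe-long` actually splits audio and merges the overlapping text is not described here, and `chunk_windows` is a hypothetical name.

```python
def chunk_windows(total_seconds, chunk_seconds=25.0, overlap=3.0):
    """Return (start, end) windows in seconds that cover the whole recording.

    Consecutive windows share `overlap` seconds; the last window is
    clipped to the end of the audio.
    """
    if chunk_seconds <= 0 or not 0 <= overlap < chunk_seconds:
        raise ValueError("need chunk_seconds > 0 and 0 <= overlap < chunk_seconds")
    stride = chunk_seconds - overlap
    windows = []
    start = 0.0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        windows.append((start, end))
        if end >= total_seconds:
            break  # the recording is fully covered
        start += stride
    return windows


if __name__ == "__main__":
    for start, end in chunk_windows(60, 25, 3):
        print(f"{start:6.1f}s to {end:6.1f}s")
```

With these settings a 60-second file yields three windows, starting at 0, 22, and 44 seconds.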
Model Access
The runtime defaults to these Hugging Face model repositories:
- omi-health/omi-med-stt-v1
- omi-health/omi-med-stt-v1-mlx
- omi-health/omi-med-stt-v1-gguf
If the model repositories are private before launch, authenticate first:
huggingface-cli login
Attribution
This runtime uses or interoperates with:
- NVIDIA NeMo / Parakeet, for the base ASR architecture.
- parakeet-mlx, for Apple Silicon MLX inference.
- parakeet.cpp, for GGUF / C++ / ggml inference.
See NOTICE.md.
License
Runtime code in this repository is MIT licensed.
Model weights are governed separately by the model repositories. Omi Med STT v1
is a derivative of nvidia/parakeet-tdt-0.6b-v2, whose model weights are
licensed under CC-BY-4.0.