CLI for Omi Med STT v1 medical speech-to-text

Omi Med STT Runtime

Command-line runtime for Omi Med STT v1, an English medical speech-to-text model built from NVIDIA Parakeet TDT 0.6B v2.

The package selects and downloads the model artifact that matches your machine, then transcribes audio locally.

Install

pip install -U omi-med-stt

Apple Silicon:

pip install -U "omi-med-stt[mlx]"

NVIDIA CUDA / NeMo:

pip install -U "omi-med-stt[nemo]"

Run

omi-med-stt audio.wav

Useful options:

omi-med-stt audio.wav --json
omi-med-stt audio.wav --runtime mlx
omi-med-stt audio.wav --runtime nemo
omi-med-stt audio.wav --runtime cpp
omi-med-stt check
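
For scripting, the options above can be driven from Python. The following is a minimal sketch, not part of the package: `build_command` and `transcribe` are hypothetical helper names. It assumes that `--json` and `--runtime` can be combined and that `--json` writes a single JSON document to stdout. The README does not document the JSON schema, so the parsed result is returned as-is without assuming any field names.

```python
import json
import subprocess

RUNTIMES = ("mlx", "nemo", "cpp")  # runtime names listed in this README


def build_command(audio_path, runtime=None, as_json=False):
    """Assemble an omi-med-stt argument list from the documented options."""
    if runtime is not None and runtime not in RUNTIMES:
        raise ValueError(f"unknown runtime: {runtime!r}")
    cmd = ["omi-med-stt", str(audio_path)]
    if as_json:
        cmd.append("--json")
    if runtime is not None:
        cmd += ["--runtime", runtime]
    return cmd


def transcribe(audio_path, runtime=None):
    """Run the CLI with --json and return the parsed output unchanged."""
    result = subprocess.run(
        build_command(audio_path, runtime=runtime, as_json=True),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    # Assumption: with --json, stdout contains a single JSON document.
    return json.loads(result.stdout)
```

Passing the arguments as a list avoids shell quoting problems with audio paths that contain spaces.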

Runtime Choices

| Platform | Default runtime | Model artifact |
| --- | --- | --- |
| Apple Silicon | mlx | omi-health/omi-med-stt-v1-mlx-q8 |
| NVIDIA CUDA | nemo | omi-health/omi-med-stt-v1 |
| Linux/Windows CPU | cpp | omi-health/omi-med-stt-v1-gguf |

The canonical model is the NeMo checkpoint. MLX and GGUF are runtime exports.
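
The platform-to-runtime mapping in the table can be expressed as a small decision function. This is an illustrative sketch only, not the package's own detection logic: `default_runtime` and `guess_runtime` are hypothetical names, and the presence of `nvidia-smi` on the PATH is assumed here as a rough proxy for a usable NVIDIA CUDA setup.

```python
import platform
import shutil


def default_runtime(system, machine, has_nvidia_gpu):
    """Map a platform description to the default runtime in the table."""
    if system == "Darwin" and machine == "arm64":
        return "mlx"   # Apple Silicon
    if has_nvidia_gpu:
        return "nemo"  # NVIDIA CUDA
    # Linux/Windows CPU; also used here as the fallback for any
    # platform the table does not list.
    return "cpp"


def guess_runtime():
    """Guess the default runtime for the current machine."""
    # Assumption: nvidia-smi on PATH is treated as a sign of CUDA support.
    has_gpu = shutil.which("nvidia-smi") is not None
    return default_runtime(platform.system(), platform.machine(), has_gpu)
```

When the guess is wrong for a given host, the documented `--runtime` option selects the runtime explicitly.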

CPU setup:

omi-med-stt install-cpp --cpp-backend cpu
omi-med-stt audio.wav --runtime cpp

The CPU path uses a patched parakeet.cpp runtime and downloads the q8_0 GGUF artifact only. It does not download the NeMo or MLX weights.

Evaluation Snapshot

Private benchmark: OmiMedSTT-Private-v1, 1,513 rows / 7.18 hours of held-out medical audio.

| Model | WER | M-WER | Drug M-WER | Medical Recall |
| --- | --- | --- | --- | --- |
| Omi Med STT v1 NeMo | 8.30% | 2.37% | 4.75% | 97.95% |
| Parakeet TDT 0.6B v2 base | 16.45% | 8.36% | 8.60% | 96.20% |
| Qwen3 ASR 1.7B | 10.72% | 3.13% | 6.11% | 97.21% |
| VibeVoice-ASR 9B | 11.10% | 1.78% | 1.36% | 98.71% |
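
To put the gap between the base model and Omi Med STT v1 in relative terms, the table values can be plugged into a simple relative-reduction formula. The sketch below is a derived calculation, not a figure reported by the project; `relative_reduction` is a hypothetical helper name.

```python
def relative_reduction(baseline, improved):
    """Relative error reduction, in percent, from baseline to improved."""
    return (baseline - improved) / baseline * 100


# WER and M-WER values (in percent) copied from the table above.
base = {"WER": 16.45, "M-WER": 8.36}  # Parakeet TDT 0.6B v2 base
omi = {"WER": 8.30, "M-WER": 2.37}    # Omi Med STT v1 NeMo

for metric in base:
    reduction = relative_reduction(base[metric], omi[metric])
    print(f"{metric}: {reduction:.1f}% relative reduction")
```

With these inputs the relative reductions come out at about 49.5% for WER and about 71.7% for M-WER.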

Accuracy differs slightly between runtime artifacts:

| Artifact | WER | M-WER | Drug M-WER | Medical Recall |
| --- | --- | --- | --- | --- |
| NeMo canonical | 8.30% | 2.37% | 4.75% | 97.95% |
| MLX full precision | 8.59% | 2.65% | 5.20% | 97.70% |
| MLX q8 | 8.61% | 2.75% | 5.20% | 97.63% |
| GGUF q8_0 | 9.12% | 3.20% | 6.33% | 97.53% |
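
The cost of each export can be read as an absolute WER gap, in percentage points, against the NeMo canonical checkpoint. The sketch below only restates the table values; `wer_gap` is a hypothetical helper name.

```python
# WER values (in percent) copied from the runtime artifact table above.
wer = {
    "NeMo canonical": 8.30,
    "MLX full precision": 8.59,
    "MLX q8": 8.61,
    "GGUF q8_0": 9.12,
}


def wer_gap(artifact, reference="NeMo canonical"):
    """Absolute WER difference in percentage points versus the reference."""
    return round(wer[artifact] - wer[reference], 2)


for name in wer:
    print(f"{name}: +{wer_gap(name):.2f} points")
```

On these numbers the MLX exports sit about 0.3 points above the canonical checkpoint and the GGUF q8_0 export about 0.8 points above it.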

A high-level evaluation summary is available in docs/evaluation-summary.md.

Model Repositories

If the model repositories are private before launch, authenticate first:

huggingface-cli login

CUDA Note

If --runtime nemo fails with a CUDA driver mismatch, install a PyTorch wheel matching your driver before installing the NeMo extra. For example, on CUDA 12.8 hosts:

pip install torch --index-url https://download.pytorch.org/whl/cu128
pip install -U "omi-med-stt[nemo]"
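
Before reinstalling, it can help to see which CUDA version the installed PyTorch wheel was built for. The following sketch uses only standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`); the helper name `torch_cuda_report` is hypothetical and the check is not part of omi-med-stt.

```python
def torch_cuda_report():
    """Summarize the installed PyTorch build and its CUDA support."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}
    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds.
        "built_for_cuda": torch.version.cuda,
        # False when this build cannot see a usable GPU and driver.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_cuda_report())
```

If `built_for_cuda` is newer than the CUDA version your driver supports (as shown by `nvidia-smi`), a mismatch of the kind described above is a likely cause.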

Development

git clone https://github.com/Omi-Health/omi-med-stt-runtime
cd omi-med-stt-runtime
pip install -e ".[dev]"
python scripts/prepublish_check.py --skip-build
python -m pytest -q tests

Safety

Omi Med STT v1 is speech-to-text only. It is not a diagnostic, triage, prescribing, or clinical decision model, and it is not clinically validated. Transcripts must be reviewed before any clinical use.

License And Attribution

Runtime code is MIT licensed.

Model weights are CC-BY-4.0 and are derived from nvidia/parakeet-tdt-0.6b-v2. Omi Med STT v1 is not an NVIDIA model.

The CPU runtime uses parakeet.cpp.
