A lightweight Python package for Automatic Speech Recognition using ONNX models

ONNX ASR

onnx-asr is a lightweight, fast, and easy-to-use pure Python package for Automatic Speech Recognition (ASR) using ONNX models. It has minimal dependencies (no need for PyTorch, Transformers, or FFmpeg):

  • numpy
  • onnxruntime
  • huggingface-hub

Key features of onnx-asr include:

  • Supports many modern ASR models
  • Runs on a wide range of devices, from small IoT/edge devices to servers with powerful GPUs (see Benchmarks)
  • Works on Windows, Linux, and macOS on x86 and Arm CPUs, with support for CUDA, TensorRT, CoreML, DirectML, ROCm, and WebGPU
  • Supports NumPy versions from 1.22 to 2.4+ and Python versions from 3.10 to 3.14
  • Loads models from Hugging Face or local directories, including quantized versions
  • Accepts WAV files or NumPy arrays, with built-in file reading and resampling
  • Supports custom models (see the Conversion Guide for instructions)
  • Supports batch processing
  • Supports long-form recognition using VAD (Voice Activity Detection)
  • Can return token-level timestamps and log probabilities
  • Provides a fully typed and well-documented Python API
  • Provides a simple command-line interface (CLI)

[!NOTE] Supports Parakeet v2 (En) / v3 (Multilingual), Canary v1/v2 (Multilingual) and GigaAM v2/v3 (Ru) models!

[!WARNING] onnxruntime 1.24.1 has known compatibility issues with onnx-asr. Please use a newer (or older) version.
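To check whether an environment has the affected release installed, a small standard-library script can help. This is only a sketch: the helper names `is_known_bad` and `check_onnxruntime` are hypothetical and not part of onnx-asr, and probing the `onnxruntime-gpu` distribution name as well is an assumption about how onnxruntime may have been installed.

```python
from importlib.metadata import PackageNotFoundError, version

# The release flagged in the warning above.
KNOWN_BAD_ONNXRUNTIME = {"1.24.1"}


def is_known_bad(installed: str) -> bool:
    """Return True if the version string matches the flagged release."""
    # Drop a local version suffix (for example "+cu12") before comparing.
    return installed.split("+", 1)[0] in KNOWN_BAD_ONNXRUNTIME


def check_onnxruntime() -> str:
    """Describe the installed onnxruntime version, if any."""
    # Assumption: onnxruntime may be installed under either distribution name.
    for dist in ("onnxruntime", "onnxruntime-gpu"):
        try:
            installed = version(dist)
        except PackageNotFoundError:
            continue
        status = "known issues, change version" if is_known_bad(installed) else "ok"
        return f"{dist} {installed}: {status}"
    return "onnxruntime is not installed"


print(check_onnxruntime())
```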

[!TIP] You can try the onnx-asr demo on HF Spaces.

Quickstart

Install onnx-asr:

pip install "onnx-asr[cpu,hub]"

Load a model and recognize a WAV file:

import onnx_asr

# Load the Parakeet TDT v3 model from Hugging Face (may take a few minutes)
model = onnx_asr.load_model("nemo-parakeet-tdt-0.6b-v3")

# Recognize speech and print result
result = model.recognize("test.wav")
print(result)

[!WARNING] The maximum audio length for most models is 20-30 seconds. For longer audio, VAD can be used.
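One way to decide up front whether a file needs VAD is to read its duration from the WAV header. The sketch below uses only the standard-library `wave` module. The 30-second threshold is an assumption taken from the upper end of the range above, and `wav_duration_seconds`, `needs_vad`, and `write_silence` are hypothetical helpers, not part of onnx-asr.

```python
import os
import tempfile
import wave

MAX_SECONDS = 30.0  # assumed threshold; the real limit depends on the model


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file from its header."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def needs_vad(path: str, max_seconds: float = MAX_SECONDS) -> bool:
    """Return True if the file is longer than the assumed model limit."""
    return wav_duration_seconds(path) > max_seconds


def write_silence(path: str, seconds: float, sample_rate: int = 8000) -> None:
    """Write a silent 16-bit mono WAV file (used only for the demo below)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))


# Demo on a generated file so the example does not depend on test.wav.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "long.wav")
    write_silence(demo_path, 35.0)
    print(wav_duration_seconds(demo_path), needs_vad(demo_path))
```

Files for which `needs_vad` returns True are candidates for the VAD-based long-form recognition described in the Usage Guide.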

For more examples, see the Usage Guide.

See the Installation Guide for detailed installation instructions.

Supported Model Architectures

The package supports the following modern ASR model architectures (see the supported model names for the full list of models and comparison with original implementations):

  • NVIDIA NeMo Conformer/FastConformer/Parakeet/Canary (with CTC, RNN-T, TDT and Transformer decoders)
  • GigaChat GigaAM v2/v3 (with CTC and RNN-T decoders, including E2E versions)
  • Kaldi Icefall Zipformer (with stateless RNN-T decoder) including Alpha Cephei Vosk 0.52+
  • T-Tech T-one (with CTC decoder, no streaming support yet)
  • OpenAI Whisper

ONNX exports of these models usually contain only the encoder and decoder. Running them also requires the matching preprocessor and decoding algorithm, so the package includes implementations of both for all supported models:

  • Log-mel spectrogram preprocessors
  • Greedy search decoding
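To illustrate what greedy search decoding means in the simplest (CTC) case, here is a minimal pure-Python sketch. It is not the package's implementation, and it does not cover the RNN-T, TDT, or Transformer decoders. The function name `ctc_greedy_decode`, the toy vocabulary, and the convention that index 0 is the blank token are assumptions made for the example.

```python
def ctc_greedy_decode(logits, vocab, blank_id=0):
    """Greedy CTC decoding: take the best token per frame,
    collapse consecutive repeats, then drop blanks."""
    tokens = []
    prev = None
    for frame in logits:
        # Index of the highest-scoring token in this frame.
        best = max(range(len(frame)), key=frame.__getitem__)
        # Emit only when the token changes and is not the blank.
        if best != blank_id and best != prev:
            tokens.append(vocab[best])
        prev = best
    return "".join(tokens)


# Toy vocabulary and per-frame scores (frames x vocabulary size).
vocab = ["<blank>", "h", "i"]
logits = [
    [0.1, 0.8, 0.1],    # h
    [0.2, 0.7, 0.1],    # h (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank
    [0.1, 0.1, 0.8],    # i
    [0.1, 0.2, 0.7],    # i (repeat, collapsed)
    [0.6, 0.2, 0.2],    # blank
]
print(ctc_greedy_decode(logits, vocab))  # prints "hi"
```

A blank between two identical tokens keeps both of them, which is how CTC represents doubled letters.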

Benchmarks

Inverse Real-Time Factor (RTFx): the ratio of audio duration to processing time. RTFx > 1 means processing faster than real-time (higher RTFx values indicate better performance).
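As a worked example of the definition, the sketch below computes RTFx from an audio duration and a measured processing time. The helpers `rtfx` and `measure_rtfx` are hypothetical names for this illustration, and `recognize` stands for any zero-argument callable that runs recognition.

```python
import time


def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Inverse real-time factor: audio duration divided by processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


def measure_rtfx(recognize, audio_seconds: float) -> float:
    """Time one call to `recognize` and return the resulting RTFx."""
    start = time.perf_counter()
    recognize()
    elapsed = time.perf_counter() - start
    return rtfx(audio_seconds, elapsed)


# 60 s of audio processed in 2 s is 30 times faster than real time.
print(rtfx(60.0, 2.0))  # prints 30.0
```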

| Model | 9800X3D CPU (RTFx) | Cortex A53 CPU (RTFx) | T4 CUDA (RTFx) | RTX 5070 Ti TensorRT (RTFx) |
|---|---|---|---|---|
| NeMo Parakeet v2/v3 | 36 | 1.0 | 57 | 320 |
| NeMo Canary v2 | 8 | N/A | 21 | 36 |
| GigaAM v3 CTC | 59 | 1.6 | 84 | 1370 |
| GigaAM v3 RNN-T | 43 | 1.5 | 40 | 130 |

See the Benchmarks page for detailed performance benchmarks.

Troubleshooting / FAQ

See the Troubleshooting Guide for common issues and solutions.

For more help, check the GitHub Issues or open a new one.

License

MIT License
