AutoModel wrapper for funasr-onnx with unified VAD→ASR→Punc pipeline

funasr-onnx-automodel

A unified AutoModel interface for funasr-onnx that automatically orchestrates the VAD → ASR → Punc speech recognition pipeline with a single generate() call.

Why this project?

funasr-onnx provides ONNX-based speech recognition models, but using them means manually loading and chaining several models (VAD, ASR, Punc), each with its own API and data format. This package wraps them in one interface, similar to HuggingFace's AutoModel, so the full pipeline runs in a few lines of code.

Features

  • One interface for all models — AutoModel automatically detects model type (Paraformer / SenseVoice)
  • Full pipeline orchestration — VAD segmentation → ASR recognition → Punctuation restoration
  • Flexible usage — Use full VAD+ASR+Punc pipeline, or ASR-only mode
  • Quantized by default — Uses INT8 quantized models for fast CPU inference
  • Bypasses funasr_onnx __init__.py — Avoids unnecessary heavy imports, faster startup
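
The VAD → ASR → Punc flow listed above can be sketched with plain callables. This is a minimal illustration, not the package's actual implementation: the function name run_pipeline, the stage signatures, and the assumption that VAD returns (start_ms, end_ms) pairs are all hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

SAMPLE_RATE = 16_000  # the pipeline expects 16 kHz audio

# Hypothetical stage signatures, chosen for illustration only.
VadFn = Callable[[Sequence[float]], List[Tuple[int, int]]]  # -> [(start_ms, end_ms), ...]
AsrFn = Callable[[Sequence[float]], str]                    # -> raw text for one chunk
PuncFn = Callable[[str], str]                               # -> punctuated text


def run_pipeline(
    samples: Sequence[float],
    asr: AsrFn,
    vad: Optional[VadFn] = None,
    punc: Optional[PuncFn] = None,
) -> str:
    """Chain optional VAD, mandatory ASR, and optional punctuation."""
    if vad is None:
        # ASR-only mode: the whole clip is a single chunk.
        chunks = [samples]
    else:
        # Convert each millisecond range to sample indices and slice the audio.
        chunks = [
            samples[start_ms * SAMPLE_RATE // 1000 : end_ms * SAMPLE_RATE // 1000]
            for start_ms, end_ms in vad(samples)
        ]

    # Recognize each chunk and drop empty results.
    pieces = [text for text in (asr(chunk) for chunk in chunks) if text]

    # Assumption: segment texts are concatenated without a separator.
    raw = "".join(pieces)
    return punc(raw) if punc is not None else raw
```

Keeping VAD and punctuation optional mirrors the two usage modes: leaving both out reduces the function to a single ASR call on the full clip.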

Installation

pip install funasr-onnx-automodel

Or with uv:

uv pip install funasr-onnx-automodel

Quick Start

Full Pipeline (VAD + ASR + Punc)

from funasr_onnx_automodel import AutoModel

model = AutoModel(
    model="iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx",
    vad_model="iic/speech_fsmn_vad_zh-cn-16k-common-onnx",
    punc_model="iic/punc_ct-transformer_cn-en-common-vocab471067-large-onnx",
)

results = model.generate("audio.wav")
print(results[0]["text"])

ASR Only (no VAD, no Punc)

from funasr_onnx_automodel import AutoModel

model = AutoModel(
    model="iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx",
)

results = model.generate("short_audio.wav")
print(results[0]["text"])

API Reference

AutoModel(model, vad_model=None, punc_model=None, quantize=True, device_id="-1", intra_op_num_threads=4, **kwargs)

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | required | Model ID or local path for ASR model. Model type (Paraformer/SenseVoice) is auto-detected from name. |
| vad_model | str \| None | None | Model ID or local path for VAD model. If None, ASR runs without VAD segmentation. |
| punc_model | str \| None | None | Model ID or local path for punctuation model. If None, no punctuation is added. |
| quantize | bool | True | Use INT8 quantized ONNX model (model_quant.onnx). |
| device_id | str \| int | "-1" | Device ID. -1 for CPU inference. |
| intra_op_num_threads | int | 4 | Number of ONNX Runtime intra-op threads. |
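
Since the model type is inferred from the model name, a name-based check could look roughly like the sketch below. The helper guess_model_type is hypothetical and assumes a simple case-insensitive substring match; the package's real detection logic may differ.

```python
def guess_model_type(model_id: str) -> str:
    """Guess the ASR model family from a model ID or local path (illustrative only)."""
    name = model_id.lower()
    # Check each known family name as a case-insensitive substring.
    for family in ("sensevoice", "paraformer"):
        if family in name:
            return family
    raise ValueError(f"Cannot infer model type from {model_id!r}")
```

If detection works this way, a local directory should keep the family name somewhere in its path, otherwise the type cannot be inferred.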

generate(input, **kwargs)

| Parameter | Type | Description |
|---|---|---|
| input | str \| list \| np.ndarray | Audio file path, list of paths, or 16kHz float32 numpy array. |
| **kwargs | | Passed through to the ASR model. |

Returns: list[dict] — Each dict contains "text" (recognized text) and "timestamp" (word-level timestamps).
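
Because generate() returns a list of dicts, downstream code usually needs to merge the "text" fields. The helper below is a small sketch for that; collect_text is a hypothetical name, and it makes no assumption about the layout of "timestamp".

```python
from typing import Any, Dict, List


def collect_text(results: List[Dict[str, Any]], sep: str = "\n") -> str:
    """Join the non-empty "text" fields of a generate()-style result list."""
    # Entries with a missing or empty "text" key are skipped rather than raising.
    return sep.join(item["text"] for item in results if item.get("text"))
```

With a list of input paths, this gives one line of transcript per file that produced text.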

Model Download

Models are automatically downloaded from ModelScope on first use. Use the iic/ prefix for ModelScope model IDs. Pre-exported ONNX models have the -onnx suffix.

Recommended Models

| Role | Model ID |
|---|---|
| VAD | iic/speech_fsmn_vad_zh-cn-16k-common-onnx |
| ASR | iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx |
| Punc | iic/punc_ct-transformer_cn-en-common-vocab471067-large-onnx |
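
To avoid repeating these long IDs, they can be kept in one place and turned into constructor arguments. This is only a convenience sketch: RECOMMENDED and automodel_kwargs are hypothetical names, while the keyword names come from the API reference above.

```python
from typing import Dict

# Recommended model IDs from the table above, keyed by AutoModel parameter name.
RECOMMENDED: Dict[str, str] = {
    "model": "iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx",
    "vad_model": "iic/speech_fsmn_vad_zh-cn-16k-common-onnx",
    "punc_model": "iic/punc_ct-transformer_cn-en-common-vocab471067-large-onnx",
}


def automodel_kwargs(full_pipeline: bool = True) -> Dict[str, str]:
    """Return keyword arguments for the full pipeline or for ASR-only mode."""
    if full_pipeline:
        return dict(RECOMMENDED)
    # ASR-only: omit vad_model and punc_model so they keep their None defaults.
    return {"model": RECOMMENDED["model"]}
```

The result would then be passed as AutoModel(**automodel_kwargs()), or AutoModel(**automodel_kwargs(full_pipeline=False)) for short clips.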

Audio Format

Input audio should be 16kHz mono WAV. Use ffmpeg to convert:

ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav
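
Before calling generate(), it can help to confirm that a file really is 16kHz mono. The sketch below uses only Python's standard wave module; check_wav is a hypothetical helper and is not part of this package.

```python
import wave
from typing import List


def check_wav(path: str, expected_rate: int = 16_000, expected_channels: int = 1) -> List[str]:
    """Return a list of format problems; an empty list means the file looks fine."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()

    problems = []
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"file has {channels} channel(s), expected {expected_channels}")
    return problems
```

A non-empty result suggests running the ffmpeg command above first. The wave module reads uncompressed PCM WAV only, so other formats should be converted before this check.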

License

MIT

Acknowledgments
