AutoModel wrapper for funasr-onnx with unified VAD→ASR→Punc pipeline
funasr-onnx-automodel
A unified AutoModel interface for funasr-onnx that automatically orchestrates the VAD → ASR → Punc speech recognition pipeline with a single generate() call.
Why this project?
funasr-onnx provides ONNX-based speech recognition models, but using them means manually loading and chaining several models (VAD, ASR, Punc), each with its own API and data format. This package wraps them in a single interface, modeled on Hugging Face's AutoModel, so the full pipeline runs in three lines of code.
Features
- One interface for all models: AutoModel automatically detects the model type (Paraformer / SenseVoice)
- Full pipeline orchestration: VAD segmentation → ASR recognition → punctuation restoration
- Flexible usage: run the full VAD+ASR+Punc pipeline, or ASR-only mode
- Quantized by default: uses INT8 quantized models for fast CPU inference
- Bypasses funasr_onnx's __init__.py: avoids unnecessary heavy imports for faster startup
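The orchestration listed above can be pictured with a small sketch. This is not the package's actual implementation: `run_pipeline` and the `asr`, `vad`, and `punc` callables are hypothetical stand-ins, and the assumption that VAD yields `(start_ms, end_ms)` pairs is made only for illustration.

```python
from typing import Callable, List, Optional, Sequence, Tuple

SAMPLE_RATE = 16000  # the README asks for 16 kHz input


def run_pipeline(
    samples: Sequence[float],
    asr: Callable[[Sequence[float]], str],
    vad: Optional[Callable[[Sequence[float]], List[Tuple[int, int]]]] = None,
    punc: Optional[Callable[[str], str]] = None,
) -> str:
    """Chain VAD -> ASR -> Punc. Every callable is a hypothetical stand-in."""
    if vad is None:
        # ASR-only mode: recognize the whole clip in one pass.
        text = asr(samples)
    else:
        pieces = []
        # Assumption: VAD returns (start_ms, end_ms) pairs.
        for start_ms, end_ms in vad(samples):
            start = start_ms * SAMPLE_RATE // 1000
            end = end_ms * SAMPLE_RATE // 1000
            pieces.append(asr(samples[start:end]))
        # Segments are joined without a separator, which suits Chinese text.
        text = "".join(pieces)

    # Punctuation restoration is optional, mirroring punc_model=None.
    return punc(text) if punc is not None else text
```

Leaving `vad` and `punc` as `None` corresponds to the ASR-only mode, while passing all three corresponds to the full pipeline.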
Installation
```bash
pip install funasr-onnx-automodel
```
Or with uv:
```bash
uv pip install funasr-onnx-automodel
```
Quick Start
Full Pipeline (VAD + ASR + Punc)
```python
from funasr_onnx_automodel import AutoModel

model = AutoModel(
    model="iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx",
    vad_model="iic/speech_fsmn_vad_zh-cn-16k-common-onnx",
    punc_model="iic/punc_ct-transformer_cn-en-common-vocab471067-large-onnx",
)
results = model.generate("audio.wav")
print(results[0]["text"])
```
ASR Only (no VAD, no Punc)
```python
from funasr_onnx_automodel import AutoModel

model = AutoModel(
    model="iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx",
)
results = model.generate("short_audio.wav")
print(results[0]["text"])
```
API Reference
AutoModel(model, vad_model=None, punc_model=None, quantize=True, device_id="-1", intra_op_num_threads=4, **kwargs)
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | str | required | Model ID or local path for the ASR model. The model type (Paraformer/SenseVoice) is auto-detected from the name. |
| vad_model | str \| None | None | Model ID or local path for the VAD model. If None, ASR runs without VAD segmentation. |
| punc_model | str \| None | None | Model ID or local path for the punctuation model. If None, no punctuation is added. |
| quantize | bool | True | Use the INT8 quantized ONNX model (model_quant.onnx). |
| device_id | str \| int | "-1" | Device ID. -1 selects CPU inference. |
| intra_op_num_threads | int | 4 | Number of ONNX Runtime intra-op threads. |
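To make the `model` and `quantize` parameters more concrete, here is a minimal sketch of name-based detection and ONNX file selection. Both helpers (`detect_model_type`, `onnx_filename`) are hypothetical and not taken from the package. The substring heuristic and the `model.onnx` name for the unquantized case are assumptions; only `model_quant.onnx` is stated in this README.

```python
def detect_model_type(model: str) -> str:
    """Guess the ASR family from a model ID or path (hypothetical heuristic)."""
    name = model.lower()
    if "sensevoice" in name:
        return "sensevoice"
    if "paraformer" in name:
        return "paraformer"
    raise ValueError(f"Cannot infer model type from {model!r}")


def onnx_filename(quantize: bool = True) -> str:
    """Pick the ONNX file to load inside the model directory."""
    # model_quant.onnx is named in the README; model.onnx is an assumption.
    return "model_quant.onnx" if quantize else "model.onnx"
```

A name that matches neither family raises an error here, so that a typo in the model ID fails early rather than loading the wrong decoder.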
generate(input, **kwargs)
| Parameter | Type | Description |
|---|---|---|
| input | str \| list \| np.ndarray | Audio file path, list of paths, or 16 kHz float32 NumPy array. |
| **kwargs | | Passed through to the ASR model. |
Returns: list[dict] — Each dict contains "text" (recognized text) and "timestamp" (word-level timestamps).
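The return value can be consumed along these lines. `format_results` is a hypothetical helper, and the sample data is invented for illustration. The inner layout of `"timestamp"` is not specified in this README, so the sketch only counts its entries instead of interpreting them.

```python
from typing import Any, Dict, List


def format_results(results: List[Dict[str, Any]]) -> List[str]:
    """Render one summary line per result dict (hypothetical helper)."""
    lines = []
    for index, item in enumerate(results):
        text = item.get("text", "")
        # "timestamp" may be missing or empty; treat both as no entries.
        stamps = item.get("timestamp") or []
        lines.append(f"[{index}] {text} ({len(stamps)} timestamp entries)")
    return lines


# Invented sample shaped like the documented return value; the
# [start, end] pairs are an assumption about the timestamp layout.
sample = [{"text": "hello world", "timestamp": [[0, 480], [480, 960]]}]
for line in format_results(sample):
    print(line)
```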
Model Download
Models are automatically downloaded from ModelScope on first use. Use the iic/ prefix for ModelScope model IDs. Pre-exported ONNX models have the -onnx suffix.
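One way to picture "model ID or local path" resolution is sketched below. `resolve_model_dir` is a hypothetical helper, and whether the package resolves models this way internally is an assumption. The sketch relies on ModelScope's `snapshot_download`, imported lazily so the local-path branch works without ModelScope installed.

```python
from pathlib import Path


def resolve_model_dir(model: str) -> str:
    """Return a local directory for `model` (hypothetical helper)."""
    path = Path(model).expanduser()
    if path.is_dir():
        # An existing directory is used as-is; nothing is downloaded.
        return str(path)

    # Otherwise treat the string as a ModelScope ID such as "iic/...-onnx".
    from modelscope.hub.snapshot_download import snapshot_download

    return snapshot_download(model)
```

Checking for a local directory first means a pre-downloaded model can be used offline by passing its path instead of its ID.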
Recommended Models
| Role | Model ID |
|---|---|
| VAD | iic/speech_fsmn_vad_zh-cn-16k-common-onnx |
| ASR | iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx |
| Punc | iic/punc_ct-transformer_cn-en-common-vocab471067-large-onnx |
Audio Format
Input audio should be 16kHz mono WAV. Use ffmpeg to convert:
```bash
ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav
```
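Before passing a file to `generate()`, it can help to confirm that it already meets the 16 kHz mono requirement. The following is a minimal sketch using only the standard-library `wave` module. `check_wav` is a hypothetical helper and not part of this package.

```python
import wave
from typing import List


def check_wav(path: str, expected_rate: int = 16000) -> List[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != 1:
        problems.append(f"{channels} channels found, expected mono")
    return problems
```

A non-empty result suggests running the ffmpeg conversion first. The `wave` module reads only uncompressed PCM WAV files, so it raises an error for other formats such as MP3.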
License
Acknowledgments
- FunASR by Alibaba DAMO Academy / ModelScope
- funasr-onnx for ONNX model runtime