
SenseVoice-python: An enterprise-grade, open-source, multilingual ASR system from FunASR, running on onnxruntime


SenseVoice-python with onnx


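SenseVoice is an audio foundation model with audio understanding capabilities, including automatic speech recognition (ASR), spoken language identification (LID), speech emotion recognition (SER), and acoustic event classification (AEC) or acoustic event detection (AED).

SenseVoice-small currently supports multilingual speech recognition for Mandarin Chinese, Cantonese, English, Japanese, and Korean, along with emotion recognition and event detection, with very low inference latency. This project provides the ONNX environment setup and the inference method for the SenseVoice model in Python.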

Usage

Installation

pip install sensevoice-onnx

# or install from GitHub with pip
pip install git+https://github.com/lovemefan/SenseVoice-python.git

Running

sensevoice --audio sensevoice/resource/asr_example_zh.wav

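On first use, the model is downloaded automatically from Hugging Face. If the download fails, you can use a Hugging Face mirror:

  • Linux:
export HF_ENDPOINT=https://hf-mirror.com
  • Windows PowerShell:
$env:HF_ENDPOINT = "https://hf-mirror.com"

Alternatively, set the environment variable for a single command only, without changing the shell environment: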

HF_ENDPOINT=https://hf-mirror.com sensevoice --audio sensevoice/resource/asr_example_zh.wav
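
The same per-command approach can be sketched from Python: pass a modified copy of the environment to one subprocess call instead of exporting the variable. This is a minimal sketch that assumes the `sensevoice` command is on `PATH`; the helper names `mirror_env` and `run_with_mirror` are hypothetical and not part of the package.

```python
import os
import subprocess

# Mirror URL taken from the instructions above.
HF_MIRROR = "https://hf-mirror.com"


def mirror_env(endpoint: str = HF_MIRROR) -> dict:
    """Return a copy of the current environment with HF_ENDPOINT set."""
    env = dict(os.environ)  # copy, so os.environ itself is left untouched
    env["HF_ENDPOINT"] = endpoint
    return env


def run_with_mirror(audio_path: str) -> int:
    """Run the sensevoice CLI once with the mirror enabled (hypothetical helper).

    Returns the exit code of the command.
    """
    result = subprocess.run(
        ["sensevoice", "--audio", audio_path],
        env=mirror_env(),
    )
    return result.returncode
```

Calling `run_with_mirror("sensevoice/resource/asr_example_zh.wav")` would then behave like the one-line shell command above, and the parent process keeps its original environment.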

SenseVoice script arguments

optional arguments:
  -h, --help            show this help message and exit
  -a , --audio_file     Path to the audio file
  -dp , --download_path 
                        Custom model download path, default `sensevoice/resource`
  -d , --device         Device: `-1` for CPU; when using a GPU (requires onnxruntime-gpu), the GPU index. Default `-1`
  -n , --num_threads    Number of threads, default `4`
  -l , --language {auto,zh,en,yue,ja,ko,nospeech}
                        Language code, default `auto`
  --use_itn             Whether to use ITN (inverse text normalization)
  --use_int8            Whether to use the int8-quantized ONNX model
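
To illustrate how these options combine, here is a small sketch that assembles a `sensevoice` command line from Python. The flag names and defaults come from the help text above; the function `build_command` is hypothetical, and treating `--use_itn` and `--use_int8` as value-less switches is an assumption based on how they appear in the help output.

```python
from typing import List

# Language codes listed in the help text above.
LANGUAGES = ("auto", "zh", "en", "yue", "ja", "ko", "nospeech")


def build_command(
    audio_file: str,
    language: str = "auto",
    device: int = -1,       # -1 = CPU, otherwise a GPU index
    num_threads: int = 4,
    use_itn: bool = False,
    use_int8: bool = False,
) -> List[str]:
    """Build an argument list for the sensevoice CLI (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    cmd = [
        "sensevoice",
        "--audio_file", audio_file,
        "--language", language,
        "--device", str(device),
        "--num_threads", str(num_threads),
    ]
    # Assumption: these two options are boolean switches that take no value.
    if use_itn:
        cmd.append("--use_itn")
    if use_int8:
        cmd.append("--use_int8")
    return cmd
```

The resulting list can be passed to `subprocess.run`. For example, `build_command("sensevoice/resource/asr_example_zh.wav", language="zh", use_int8=True)` requests Chinese recognition on the CPU with the quantized model.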

Result

2024-07-19 07:22:40,643 INFO [sense_voice_ort_session.py:130] Loading model from /home/runner/work/SenseVoice-python/SenseVoice-python/sensevoice/resource/embedding.npy
2024-07-19 07:22:40,647 INFO [sense_voice_ort_session.py:133] Loading model /home/runner/work/SenseVoice-python/SenseVoice-python/sensevoice/resource/sense-voice-encoder.onnx
2024-07-19 07:22:42,755 INFO [sense_voice_ort_session.py:140] Loading /home/runner/work/SenseVoice-python/SenseVoice-python/sensevoice/resource/sense-voice-encoder.onnx takes 2.11 seconds
2024-07-19 07:22:42,786 INFO [sense_voice.py:76] Audio sensevoice/resource/asr_example_zh.wav is 5.58 seconds
2024-07-19 07:22:43,102 INFO [sense_voice.py:81] [0.61s - 5.53s] <|zh|><|NEUTRAL|><|Speech|><|woitn|>欢迎大家来体验达摩院推出的语音识别模型
2024-07-19 07:22:43,102 INFO [sense_voice.py:83] Decoder audio takes 0.31638407707214355 seconds
2024-07-19 07:22:43,103 INFO [sense_voice.py:84] The RTF is 0.05669965538927304.
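
The result line packs several labels in front of the transcript, and the last two lines report timing. The sketch below splits such a line into fields and recomputes the real-time factor (RTF). The field names (`language`, `emotion`, `event`, `itn`) are an interpretation of the tag order in the sample line, not names defined by the package, and reading `woitn` as "without ITN" is an assumption. The logged numbers are consistent with RTF being the decoding time divided by the audio duration.

```python
import re

# Matches "[start - end] <|lang|><|emotion|><|event|><|itn|>text".
# The order of the four tags is assumed from the sample log line.
RESULT_RE = re.compile(
    r"\[(?P<start>[\d.]+)s - (?P<end>[\d.]+)s\] "
    r"<\|(?P<language>[^|]*)\|><\|(?P<emotion>[^|]*)\|>"
    r"<\|(?P<event>[^|]*)\|><\|(?P<itn>[^|]*)\|>(?P<text>.*)"
)


def parse_result(line: str):
    """Split one result log line into a dict, or return None if it does not match."""
    match = RESULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["start"] = float(fields["start"])
    fields["end"] = float(fields["end"])
    return fields


def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration; below 1 means faster than real time."""
    return decode_seconds / audio_seconds


sample = (
    "2024-07-19 07:22:43,102 INFO [sense_voice.py:81] [0.61s - 5.53s] "
    "<|zh|><|NEUTRAL|><|Speech|><|woitn|>欢迎大家来体验达摩院推出的语音识别模型"
)
parsed = parse_result(sample)
print(parsed["language"], parsed["emotion"], parsed["event"], parsed["text"])

# Values copied from the log above: decoding time and audio length in seconds.
rtf = real_time_factor(0.31638407707214355, 5.58)
print(f"RTF = {rtf:.4f}")
```

With the values from the log, the recomputed RTF is about 0.0567, which agrees with the logged figure.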
