
NVIDIA MarbleNet VAD model for fasr


fasr-vad-marblenet

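A VAD plugin that wraps the NVIDIA MarbleNet ONNX inference script and provides offline voice activity detection for fasr. The plugin bundles model.onnx, so it can be loaded directly by default.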

Installation

pip install fasr-vad-marblenet

Registered models

Registered name  Class            Description
marblenet        MarbleNetForVAD  NVIDIA MarbleNet non-streaming VAD, inference via ONNX Runtime

Usage

Using the model in a pipeline

from fasr import AudioPipeline

pipeline = (
    AudioPipeline()
    .add_pipe("detector", model="marblenet", checkpoint_dir="/path/to/onnx_dir")
    .add_pipe("recognizer", model="paraformer")
)

Using the model standalone

from fasr.config import registry
from fasr.data import Waveform

model = registry.vad_models.get("marblenet")()
model.from_checkpoint()  # loads the ONNX model bundled with the plugin by default

waveform = Waveform.from_file("example.wav")
segments = model.detect(waveform)
for seg in segments:
    print(f"{seg.start_ms}ms - {seg.end_ms}ms")
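
The segments report their boundaries in milliseconds. To cut the matching audio out of a raw sample array, those boundaries need to be converted to sample indices. The following is a minimal sketch, assuming 16 kHz audio; `Segment` here is a hypothetical stand-in that only mirrors the `start_ms` / `end_ms` attributes used above, not the fasr class, and both helper functions are illustrative names.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical stand-in exposing only start_ms / end_ms (not the fasr type)."""
    start_ms: float
    end_ms: float


def segment_to_sample_range(seg: Segment, sample_rate: int = 16000) -> tuple[int, int]:
    """Convert millisecond boundaries to [start, end) sample indices."""
    start = int(round(seg.start_ms * sample_rate / 1000))
    end = int(round(seg.end_ms * sample_rate / 1000))
    return start, end


def slice_samples(samples: list[float], seg: Segment, sample_rate: int = 16000) -> list[float]:
    """Return the samples covered by a segment, clamped to the audio length."""
    start, end = segment_to_sample_range(seg, sample_rate)
    start = max(0, min(start, len(samples)))
    end = max(start, min(end, len(samples)))
    return samples[start:end]
```

With a different sample rate, pass it explicitly as `sample_rate`; the conversion is otherwise unchanged.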

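from_checkpoint parameters

Parameter       Type                Default                   Description
checkpoint_dir  str | Path | None   bundled model directory   Model directory or path to a .onnx file; for a directory, the first .onnx file found is selected
model_path      str | Path | None   None                      Direct path to a .onnx file; takes precedence over checkpoint_dir
providers       list[str] | None    ["CPUExecutionProvider"]  List of ONNX Runtime execution providers
num_threads     int                 2                         ONNX Runtime intra_op_num_threads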
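
The lookup order described in the table (model_path first, then checkpoint_dir as either a file or a directory) can be sketched as below. This is an illustration of the documented rule, not the plugin's actual code: `resolve_onnx_path` and `default_dir` are hypothetical names, and picking the "first" file by sorted name is an assumption, since the ordering is not specified.

```python
from pathlib import Path


def resolve_onnx_path(checkpoint_dir=None, model_path=None, default_dir="."):
    """Hypothetical helper illustrating the documented lookup order."""
    # 1. An explicit model_path takes precedence over checkpoint_dir.
    if model_path is not None:
        return Path(model_path)

    # 2. Without checkpoint_dir, fall back to a default directory
    #    (standing in for the plugin's bundled model directory).
    target = Path(checkpoint_dir) if checkpoint_dir is not None else Path(default_dir)

    # 3. checkpoint_dir may point directly at a .onnx file.
    if target.suffix == ".onnx":
        return target

    # 4. Otherwise pick the first .onnx file in the directory.
    #    Sorting by name is an assumption made for determinism.
    candidates = sorted(target.glob("*.onnx"))
    if not candidates:
        raise FileNotFoundError(f"no .onnx file found in {target}")
    return candidates[0]
```

If a directory holds several .onnx files, passing model_path avoids depending on whichever file happens to be selected first.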

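VAD parameters

Parameter            Default  Description
speaking_score       0.5      Speech activation threshold
silence_score        0.5      Speech end threshold
fusion_threshold     0.1      Threshold for merging adjacent segments (seconds)
min_speech_duration  0.05     Minimum speech segment duration (seconds)
output_frame_length  320      Samples per frame; the default corresponds to 20 ms at 16 kHz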
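
To make the role of these parameters concrete, here is a generic sketch of how per-frame speech scores are commonly turned into segments: threshold with separate onset and offset values, merge segments separated by short gaps, then drop very short segments. The plugin's actual post-processing is not documented here, so the function name `frames_to_segments`, the order of the merge and filter steps, and the `sample_rate` argument are all assumptions.

```python
def frames_to_segments(
    scores,
    speaking_score=0.5,
    silence_score=0.5,
    fusion_threshold=0.1,
    min_speech_duration=0.05,
    output_frame_length=320,
    sample_rate=16000,
):
    """Turn per-frame speech scores into (start_s, end_s) segments (illustrative)."""
    frame_s = output_frame_length / sample_rate  # 320 / 16000 = 0.02 s per frame

    # Step 1: speech starts when the score reaches speaking_score
    # and ends when it drops below silence_score.
    raw, start = [], None
    for i, score in enumerate(scores):
        if start is None and score >= speaking_score:
            start = i
        elif start is not None and score < silence_score:
            raw.append((start * frame_s, i * frame_s))
            start = None
    if start is not None:  # speech continues to the end of the audio
        raw.append((start * frame_s, len(scores) * frame_s))

    # Step 2: merge neighbours whose gap is at most fusion_threshold seconds.
    merged = []
    for seg_start, seg_end in raw:
        if merged and seg_start - merged[-1][1] <= fusion_threshold:
            merged[-1] = (merged[-1][0], seg_end)
        else:
            merged.append((seg_start, seg_end))

    # Step 3: discard segments shorter than min_speech_duration seconds.
    return [(s, e) for s, e in merged if e - s >= min_speech_duration]
```

With the defaults, a gap of up to 5 frames (0.1 s) between two speech runs is bridged, and a run shorter than 0.05 s is discarded, so an isolated single 20 ms frame does not survive. Setting silence_score below speaking_score would add hysteresis, which tends to reduce flicker near the threshold.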

Dependencies

  • fasr
  • numpy >= 1.24
  • onnxruntime >= 1.16.0
  • Python 3.10–3.12
