Omni SenseVoice 🚀

The Ultimate Speech Recognition Solution

Built on SenseVoice, Omni SenseVoice is optimized for lightning-fast inference and precise timestamps—giving you a smarter, faster way to handle audio transcription!

Install

pip3 install OmniSenseVoice

Usage

omnisense transcribe [OPTIONS] AUDIO_PATH

Key Options:

  • --language: Language of the audio: auto to detect it automatically, or one of zh, en, yue, ja, ko.
  • --textnorm: Whether to apply inverse text normalization: withitn to apply it, woitn for raw output.
  • --device-id: Run on a specific GPU (default: -1 for CPU).
  • --quantize: Use a quantized model for faster processing.
  • --help: Display detailed help information.
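
As a minimal sketch of how the options above combine, the snippet below assembles one transcribe command and prints it before running it. The file name sample.wav is a hypothetical placeholder, not a file shipped with the package. --device-id is left out, so the documented default of -1 (CPU) applies.

```shell
# Hypothetical input file; replace with the path to your own recording.
AUDIO_PATH="sample.wav"

# Assemble the command from the options listed above:
# auto language detection, inverse text normalization enabled.
set -- omnisense transcribe --language auto --textnorm withitn "$AUDIO_PATH"

# Print the command so it can be reviewed first.
CMD="$*"
echo "$CMD"

# Run it only when the CLI is installed and the audio file exists.
if command -v omnisense >/dev/null 2>&1 && [ -f "$AUDIO_PATH" ]; then
  "$@"
fi
```

To run on the first GPU instead, add --device-id 0, as the benchmark commands in this document do.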

Benchmark

omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

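| Optimize       | test set        | GPU           | WER ⬇️ | RTF ⬇️ | Speed Up 🔥 |
|----------------|-----------------|---------------|--------|--------|-------------|
| onnx           | dev-clean[:100] | NVIDIA L4 GPU | 4.47%  | 0.1200 | 1x          |
| torch          | dev-clean[:100] | NVIDIA L4 GPU | 5.02%  | 0.0022 | 50x         |
| onnx fix cudnn | dev-clean[all]  | NVIDIA L4 GPU | 5.60%  | 0.0027 | 50x         |
| torch          | dev-clean[all]  | NVIDIA L4 GPU | 6.39%  | 0.0019 | 50x         |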
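  • fix cudnn: the ONNX run sets cudnn_conv_algo_search: DEFAULT (an ONNX Runtime CUDA execution provider option).
  • With Omni SenseVoice, processing is about 50x faster (RTF 0.1200 to 0.0022 on dev-clean[:100]), with WER moving from 4.47% to 5.02%.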
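
The "Speed Up" column can be checked against the RTF column with a little arithmetic. This sketch assumes RTF is the usual real-time factor (processing time divided by audio duration) and that every row is compared against the onnx dev-clean[:100] baseline; neither is stated explicitly in the table. The speedup helper is a hypothetical name introduced here for illustration.

```shell
# RTF values copied from the benchmark table.
BASELINE_RTF=0.1200   # onnx, dev-clean[:100]

# speedup BASE RTF: prints BASE / RTF rounded to one decimal place.
speedup() {
  LC_ALL=C awk -v base="$1" -v rtf="$2" 'BEGIN { printf "%.1f", base / rtf }'
}

TORCH_100=$(speedup "$BASELINE_RTF" 0.0022)      # torch, dev-clean[:100]
ONNX_FIX_ALL=$(speedup "$BASELINE_RTF" 0.0027)   # onnx fix cudnn, dev-clean[all]
TORCH_ALL=$(speedup "$BASELINE_RTF" 0.0019)      # torch, dev-clean[all]

echo "torch dev-clean[:100]:         ${TORCH_100}x"
echo "onnx fix cudnn dev-clean[all]: ${ONNX_FIX_ALL}x"
echo "torch dev-clean[all]:          ${TORCH_ALL}x"
```

Under these assumptions the ratios come out between roughly 44x and 63x, which the table reports as a rounded 50x.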
# LibriTTS
DIR=benchmark/data
lhotse download libritts -p dev-clean benchmark/data/LibriTTS
lhotse prepare libritts -p dev-clean benchmark/data/LibriTTS/LibriTTS benchmark/data/manifests/libritts

lhotse cut simple --force-eager -r benchmark/data/manifests/libritts/libritts_recordings_dev-clean.jsonl.gz \
    -s benchmark/data/manifests/libritts/libritts_supervisions_dev-clean.jsonl.gz \
    benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

omnisense benchmark -s --num-workers 4 --device-id 0 --batch-size 16 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

Contributing 🙌

Step 1: Code Formatting

Set up pre-commit hooks:

pip install pre-commit==4.2.0
pre-commit install

Step 2: Pull Request

Submit your awesome improvements through a PR. 😊

