Omni SenseVoice 🚀

The Ultimate Speech Recognition Solution

Built on SenseVoice, Omni SenseVoice is optimized for lightning-fast inference and precise timestamps—giving you a smarter, faster way to handle audio transcription!

Install

pip3 install OmniSenseVoice

Usage

omnisense transcribe [OPTIONS] AUDIO_PATH

Key Options:

  • --language: Language of the audio. Use auto to detect it automatically, or specify one of zh, en, yue, ja, ko.
  • --textnorm: Choose whether to apply inverse text normalization (withitn for inverse normalized or woitn for raw).
  • --device-id: Run on a specific GPU (default: -1 for CPU).
  • --quantize: Use a quantized model for faster processing.
  • --help: Display detailed help information.
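
The options above can also be assembled from a script. The following is a minimal sketch, not part of OmniSenseVoice itself: build_transcribe_cmd is a hypothetical helper, its defaults for --language and --textnorm are choices made for this example, and it assumes --quantize is a plain on/off flag that takes no value. Only the flags listed above are used.

```python
import shlex
import subprocess
from typing import List

# Values taken from the option list above.
LANGUAGES = ("auto", "zh", "en", "yue", "ja", "ko")
TEXTNORMS = ("withitn", "woitn")


def build_transcribe_cmd(
    audio_path: str,
    language: str = "auto",      # example default, not confirmed by the CLI
    textnorm: str = "withitn",   # example default, not confirmed by the CLI
    device_id: int = -1,         # -1 means CPU, per the option list
    quantize: bool = False,      # assumed to be a boolean flag
) -> List[str]:
    """Build the argv list for `omnisense transcribe` (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if textnorm not in TEXTNORMS:
        raise ValueError(f"unsupported textnorm: {textnorm!r}")
    cmd = [
        "omnisense", "transcribe",
        "--language", language,
        "--textnorm", textnorm,
        "--device-id", str(device_id),
    ]
    if quantize:
        cmd.append("--quantize")
    cmd.append(audio_path)
    return cmd


if __name__ == "__main__":
    cmd = build_transcribe_cmd("sample.wav", language="en", textnorm="woitn", device_id=0)
    # Show the command as it would be typed in a shell.
    print(shlex.join(cmd))
    # Uncomment to actually run it (requires OmniSenseVoice to be installed):
    # subprocess.run(cmd, check=True)
```

Building the command as a list and passing it to subprocess.run avoids shell quoting problems with audio paths that contain spaces.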

Benchmark

omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

Optimize         Test set          GPU             WER ⬇️   RTF ⬇️    Speed Up 🔥
onnx             dev-clean[:100]   NVIDIA L4 GPU   4.47%    0.1200    1x
torch            dev-clean[:100]   NVIDIA L4 GPU   5.02%    0.0022    50x
onnx fix cudnn   dev-clean[all]    NVIDIA L4 GPU   5.60%    0.0027    50x
torch            dev-clean[all]    NVIDIA L4 GPU   6.39%    0.0019    50x
  • fix cudnn: cudnn_conv_algo_search: DEFAULT
  • With Omni SenseVoice, processing is up to 50x faster (RTF from 0.1200 to 0.0022 on dev-clean[:100]), at a WER of 5.02% versus 4.47% for the onnx baseline on the same subset.
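
To make the RTF and Speed Up columns concrete, here is a small sketch using the usual definition of real-time factor (processing time divided by audio duration) and taking speed-up as the ratio of the baseline RTF to the measured RTF. Whether omnisense benchmark computes its figures in exactly this way is an assumption; the function names are illustrative, and the RTF values are copied from the table above.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent transcribing / duration of the audio transcribed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speed_up(baseline_rtf: float, rtf: float) -> float:
    """How many times faster a run is than the baseline (ratio of RTFs)."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return baseline_rtf / rtf


# RTF values copied from the benchmark table; the onnx row is the 1x baseline.
BASELINE_RTF = 0.1200
TABLE_RTF = {
    "torch, dev-clean[:100]": 0.0022,
    "onnx fix cudnn, dev-clean[all]": 0.0027,
    "torch, dev-clean[all]": 0.0019,
}

if __name__ == "__main__":
    for name, rtf in TABLE_RTF.items():
        print(f"{name:32s} RTF={rtf:.4f}  speed-up={speed_up(BASELINE_RTF, rtf):.1f}x")
```

Under these assumptions the ratios come out between roughly 44x and 63x, so the 50x entries in the table read as rounded figures. The dev-clean[all] rows are also measured on a different test set than the baseline, which makes their ratios approximate.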
# LibriTTS
DIR=benchmark/data
lhotse download libritts -p dev-clean benchmark/data/LibriTTS
lhotse prepare libritts -p dev-clean benchmark/data/LibriTTS/LibriTTS benchmark/data/manifests/libritts

lhotse cut simple --force-eager -r benchmark/data/manifests/libritts/libritts_recordings_dev-clean.jsonl.gz \
    -s benchmark/data/manifests/libritts/libritts_supervisions_dev-clean.jsonl.gz \
    benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

omnisense benchmark -s --num-workers 4 --device-id 0 --batch-size 16 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl

Contributing 🙌

Step 1: Code Formatting

Set up pre-commit hooks:

pip install pre-commit==3.6.0
pre-commit install

Step 2: Pull Request

Submit your awesome improvements through a PR. 😊
