OmniSenseVoice
Omni SenseVoice 🚀
The Ultimate Speech Recognition Solution
Built on SenseVoice, Omni SenseVoice is optimized for lightning-fast inference and precise timestamps—giving you a smarter, faster way to handle audio transcription!
Install
pip3 install OmniSenseVoice
Usage
omnisense transcribe [OPTIONS] AUDIO_PATH
Key Options:
- --language: Automatically detect the language or specify one (auto, zh, en, yue, ja, ko).
- --textnorm: Choose whether to apply inverse text normalization (withitn for inverse normalized or woitn for raw).
- --device-id: Run on a specific GPU (default: -1 for CPU).
- --quantize: Use a quantized model for faster processing.
- --help: Display detailed help information.
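If you call the CLI from a script, a small wrapper can validate the option values before launching it. The sketch below is only an illustration: `build_transcribe_command` and `sample.wav` are hypothetical names, the defaults for `language` and `textnorm` are choices made here rather than documented CLI defaults, and it assumes `--quantize` is a boolean flag that takes no value.

```python
import shlex

# Values listed under Key Options above.
LANGUAGES = {"auto", "zh", "en", "yue", "ja", "ko"}
TEXTNORMS = {"withitn", "woitn"}


def build_transcribe_command(audio_path, language="auto", textnorm="withitn",
                             device_id=-1, quantize=False):
    """Return the argument list for `omnisense transcribe` (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if textnorm not in TEXTNORMS:
        raise ValueError(f"unsupported textnorm: {textnorm!r}")

    cmd = [
        "omnisense", "transcribe",
        "--language", language,
        "--textnorm", textnorm,
        "--device-id", str(device_id),  # -1 selects the CPU
    ]
    if quantize:
        # Assumption: --quantize is a flag without a value.
        cmd.append("--quantize")
    cmd.append(str(audio_path))
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # even where OmniSenseVoice is not installed.
    print(shlex.join(build_transcribe_command("sample.wav", language="en", device_id=0)))
```

To actually run the transcription, pass the returned list to `subprocess.run(cmd, check=True)`.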
Benchmark
omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
| Optimize | test set | GPU | WER ⬇️ | RTF ⬇️ | Speed Up 🔥 |
|---|---|---|---|---|---|
| onnx | dev-clean[:100] | NVIDIA L4 GPU | 4.47% | 0.1200 | 1x |
| torch | dev-clean[:100] | NVIDIA L4 GPU | 5.02% | 0.0022 | 50x |
| onnx fix cudnn | dev-clean[all] | NVIDIA L4 GPU | 5.60% | 0.0027 | 50x |
| torch | dev-clean[all] | NVIDIA L4 GPU | 6.39% | 0.0019 | 50x |
- fix cudnn: cudnn_conv_algo_search: DEFAULT
- With Omni SenseVoice, experience up to 50x faster processing without sacrificing accuracy.
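RTF (real-time factor) is processing time divided by audio duration, so lower is faster. The sketch below recomputes the speed-up column from the RTF values in the table. It assumes every row is compared against the first onnx row (RTF 0.1200), which the table does not state explicitly; under that assumption the ratios come out at roughly 44x to 63x, so the 50x figure reads as a rounded value. The function names are illustrative only.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = time spent transcribing / duration of the audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speed_up(baseline_rtf, rtf):
    """How many times faster a run is than the baseline run."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return baseline_rtf / rtf


# (optimize, test set, RTF) copied from the benchmark table.
ROWS = [
    ("onnx", "dev-clean[:100]", 0.1200),
    ("torch", "dev-clean[:100]", 0.0022),
    ("onnx fix cudnn", "dev-clean[all]", 0.0027),
    ("torch", "dev-clean[all]", 0.0019),
]

if __name__ == "__main__":
    baseline = ROWS[0][2]  # assumption: the first onnx row is the baseline
    for name, test_set, rtf in ROWS:
        print(f"{name:15s} {test_set:16s} {speed_up(baseline, rtf):5.1f}x")
```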
# LibriTTS
DIR=benchmark/data
lhotse download libritts -p dev-clean benchmark/data/LibriTTS
lhotse prepare libritts -p dev-clean benchmark/data/LibriTTS/LibriTTS benchmark/data/manifests/libritts
lhotse cut simple --force-eager -r benchmark/data/manifests/libritts/libritts_recordings_dev-clean.jsonl.gz \
-s benchmark/data/manifests/libritts/libritts_supervisions_dev-clean.jsonl.gz \
benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
omnisense benchmark -s --num-workers 4 --device-id 0 --batch-size 16 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
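To sanity-check an RTF number yourself, you need the total audio duration of the manifest. A minimal sketch, assuming each line of the cuts file is a JSON object with a numeric `duration` field in seconds (which is how lhotse serializes cuts, as far as we know); `total_audio_seconds` is a hypothetical helper, not part of OmniSenseVoice or lhotse.

```python
import gzip
import json
from pathlib import Path


def total_audio_seconds(manifest_path):
    """Sum the `duration` field over all cuts in a .jsonl or .jsonl.gz manifest."""
    path = Path(manifest_path)
    # Manifests may be gzip-compressed; pick the opener from the suffix.
    opener = gzip.open if path.suffix == ".gz" else open
    total = 0.0
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            total += float(json.loads(line)["duration"])
    return total
```

Dividing the wall-clock time of a benchmark run by `total_audio_seconds("benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl")` gives an RTF estimate to compare against the table.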
Contributing 🙌
Step 1: Code Formatting
Set up pre-commit hooks:
pip install pre-commit==4.2.0
pre-commit install
Step 2: Pull Request
Submit your awesome improvements through a PR. 😊
Download files
Source Distribution
File details
Details for the file omnisensevoice-0.4.2.tar.gz.
File metadata
- Download URL: omnisensevoice-0.4.2.tar.gz
- Upload date:
- Size: 21.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 172bc7195f44a0f12153d62fd3f9b2bb6b2340111b41d3d433d3e696e590dcb2 |
| MD5 | 94f9b4541fcb69ba2ebdf43acd94b325 |
| BLAKE2b-256 | ef8554b2a3d397eff7672bf8722bc96f42c4f06269d14c562198ae59d0d57dae |
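If you download the source distribution manually, you can compare its SHA256 digest with the value listed above. A minimal sketch using only the standard library; the function names are illustrative, and the local file path is whatever location you saved the archive to.

```python
import hashlib

# SHA256 digest published above for omnisensevoice-0.4.2.tar.gz.
EXPECTED_SHA256 = "172bc7195f44a0f12153d62fd3f9b2bb6b2340111b41d3d433d3e696e590dcb2"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash the file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 equals the expected hex digest."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_published_digest("omnisensevoice-0.4.2.tar.gz")` should return `True` for an unmodified download.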
Provenance
The following attestation bundles were made for omnisensevoice-0.4.2.tar.gz:
Publisher: publish-wheels.yml on lifeiteng/OmniSenseVoice
- Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: omnisensevoice-0.4.2.tar.gz
- Subject digest: 172bc7195f44a0f12153d62fd3f9b2bb6b2340111b41d3d433d3e696e590dcb2
- Sigstore transparency entry: 756649445
- Sigstore integration time:
- Permalink: lifeiteng/OmniSenseVoice@c0a9516b8509aee378e7c9f94ba40ca00823c956
- Branch / Tag: refs/heads/main
- Owner: https://github.com/lifeiteng
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish-wheels.yml@c0a9516b8509aee378e7c9f94ba40ca00823c956
- Trigger Event: workflow_dispatch
-
Statement type: