OmniSenseVoice
Omni SenseVoice 🚀
The Ultimate Speech Recognition Solution
Built on SenseVoice, Omni SenseVoice is optimized for lightning-fast inference and precise timestamps—giving you a smarter, faster way to handle audio transcription!
Install
pip3 install OmniSenseVoice
Usage
omnisense transcribe [OPTIONS] AUDIO_PATH
Key Options:
- --language: Automatically detect the language or specify one (auto, zh, en, yue, ja, ko).
- --textnorm: Choose whether to apply inverse text normalization (withitn for inverse normalized or woitn for raw).
- --device-id: Run on a specific GPU (default: -1 for CPU).
- --quantize: Use a quantized model for faster processing.
- --help: Display detailed help information.
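The options above can also be assembled programmatically, for example when transcribing many files from a script. The following is a minimal sketch, not part of the package: `build_transcribe_command` is a hypothetical helper, and it assumes `--quantize` is a boolean flag that takes no value.

```python
import shlex
from typing import List

# Values listed in the "Key Options" section above.
LANGUAGES = {"auto", "zh", "en", "yue", "ja", "ko"}
TEXTNORMS = {"withitn", "woitn"}


def build_transcribe_command(
    audio_path: str,
    language: str = "auto",
    textnorm: str = "woitn",
    device_id: int = -1,
    quantize: bool = False,
) -> List[str]:
    """Build the argv list for `omnisense transcribe` (hypothetical helper)."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if textnorm not in TEXTNORMS:
        raise ValueError(f"unsupported textnorm: {textnorm!r}")
    cmd = [
        "omnisense", "transcribe",
        "--language", language,
        "--textnorm", textnorm,
        "--device-id", str(device_id),
    ]
    if quantize:
        # Assumption: --quantize is a plain on/off flag.
        cmd.append("--quantize")
    cmd.append(audio_path)  # AUDIO_PATH comes after the options
    return cmd


# "sample.wav" is a placeholder path.
command = build_transcribe_command("sample.wav", language="en", device_id=0)
print(shlex.join(command))
# To actually run it: import subprocess; subprocess.run(command, check=True)
```

Passing the command as a list rather than a single string avoids shell quoting problems with paths that contain spaces.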
Benchmark
omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
Optimize | test set | GPU | WER ⬇️ | RTF ⬇️ | Speed Up 🔥 |
---|---|---|---|---|---|
onnx | dev-clean[:100] | NVIDIA L4 GPU | 4.47% | 0.1200 | 1x |
torch | dev-clean[:100] | NVIDIA L4 GPU | 5.02% | 0.0022 | 50x |
onnx fix cudnn | dev-clean[all] | NVIDIA L4 GPU | 5.60% | 0.0027 | 50x |
torch | dev-clean[all] | NVIDIA L4 GPU | 6.39% | 0.0019 | 50x |
- fix cudnn: cudnn_conv_algo_search: DEFAULT
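For context, `cudnn_conv_algo_search` is an option of ONNX Runtime's CUDA execution provider. The sketch below shows one way such a setting is usually passed to a session; it is an illustration under assumptions, not the project's own code. `cuda_providers` is a hypothetical helper and `model.onnx` is a placeholder path.

```python
from typing import Any, Dict, List, Tuple, Union

Provider = Union[str, Tuple[str, Dict[str, Any]]]

# Values accepted by the CUDA execution provider for this option.
CONV_ALGO_SEARCH = {"EXHAUSTIVE", "HEURISTIC", "DEFAULT"}


def cuda_providers(device_id: int = 0, conv_algo_search: str = "DEFAULT") -> List[Provider]:
    """Build a providers list for onnxruntime.InferenceSession (hypothetical helper)."""
    if conv_algo_search not in CONV_ALGO_SEARCH:
        raise ValueError(f"unknown cudnn_conv_algo_search: {conv_algo_search!r}")
    cuda_options = {
        "device_id": device_id,
        # DEFAULT skips the exhaustive cuDNN algorithm search.
        "cudnn_conv_algo_search": conv_algo_search,
    }
    # CPU is listed last as a fallback when CUDA is unavailable.
    return [("CUDAExecutionProvider", cuda_options), "CPUExecutionProvider"]


providers = cuda_providers(device_id=0)
print(providers)
# Usage with ONNX Runtime (requires onnxruntime-gpu and a real model file):
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx", providers=providers)
```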
- With Omni SenseVoice, experience up to 50x faster processing (RTF 0.1200 for onnx vs. 0.0022 for torch on dev-clean[:100]) with only a small change in WER (4.47% vs. 5.02%).
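To make the RTF and Speed Up columns concrete: the real-time factor is conventionally processing time divided by audio duration, and a speed-up can be read as the ratio of two RTFs. The sketch below recomputes ratios from the RTF values in the table. It assumes the onnx row on dev-clean[:100] is the 1x baseline for every row, which the table does not state for the dev-clean[all] rows, and the function names are illustrative.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent transcribing / duration of the audio (lower is better)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def speed_up(baseline_rtf: float, rtf: float) -> float:
    """How many times faster a run is than the baseline run."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return baseline_rtf / rtf


# RTF values copied from the benchmark table.
BASELINE_RTF = 0.1200  # onnx, dev-clean[:100]
table_rtf = {
    "torch, dev-clean[:100]": 0.0022,
    "onnx fix cudnn, dev-clean[all]": 0.0027,
    "torch, dev-clean[all]": 0.0019,
}

for name, rtf in table_rtf.items():
    print(f"{name}: {speed_up(BASELINE_RTF, rtf):.1f}x")
```

Under that assumption the ratios fall in the range of roughly 44x to 63x, which is consistent with the rounded 50x figure in the table.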
# LibriTTS
DIR=benchmark/data
lhotse download libritts -p dev-clean benchmark/data/LibriTTS
lhotse prepare libritts -p dev-clean benchmark/data/LibriTTS/LibriTTS benchmark/data/manifests/libritts
lhotse cut simple --force-eager -r benchmark/data/manifests/libritts/libritts_recordings_dev-clean.jsonl.gz \
-s benchmark/data/manifests/libritts/libritts_supervisions_dev-clean.jsonl.gz \
benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
omnisense benchmark -s -d --num-workers 2 --device-id 0 --batch-size 10 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
omnisense benchmark -s --num-workers 4 --device-id 0 --batch-size 16 --textnorm woitn --language en benchmark/data/manifests/libritts/libritts_cuts_dev-clean.jsonl
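The benchmark reports word error rate (WER). As a reference for reading those numbers, here is a minimal sketch of the standard definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. This is not necessarily how `omnisense benchmark` computes it; any text normalization the tool applies before scoring is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


# One word dropped out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.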
Contributing 🙌
Step 1: Code Formatting
Set up pre-commit hooks:
pip install pre-commit==3.6.0
pre-commit install
Step 2: Pull Request
Submit your awesome improvements through a PR. 😊