whisper-streaming
An easy-to-use and extensible STT (Speech-To-Text) implementation for Whisper-like ASR (Automatic Speech Recognition) models.
[!WARNING]
This project is currently in an alpha state. It may not remain stable over time and can produce unexpected errors. You can help move this project out of the alpha and beta stages by testing and reviewing it. Thanks!
Appreciation
This project is a rework of the ideas and prototype implementation by Dominik Macháček, Raj Dabre, and Ondřej Bojar (Original Repository). It is neither official nor fully API- and function-compatible with the original implementation.
Please have a look at their publication in the ACL Anthology. BibTeX citation:
@inproceedings{machacek-etal-2023-turning,
title = "Turning Whisper into Real-Time Transcription System",
author = "Mach{\'a}{\v{c}}ek, Dominik and
Dabre, Raj and
Bojar, Ond{\v{r}}ej",
editor = "Saha, Sriparna and
Sujaini, Herry",
booktitle = "Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: System Demonstrations",
month = nov,
year = "2023",
address = "Bali, Indonesia",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.ijcnlp-demo.3",
pages = "17--24",
}
By using this project, you agree to the license terms found in the License chapter. This project is not affiliated with the paper, the original implementation, or their authors. It reimplements their ideas in a more modern way that is easier to use and adapt, while respecting the license agreement.
Components
Backend
The following backends are currently implemented:
- Faster-Whisper
Receiver
Receivers are mechanisms that feed input data into the ASR model. Out-of-the-box support is provided for:
- ALSA (Advanced Linux Sound Architecture)
- File - Audio file
Sender
Senders are mechanisms that pass output data out of the ASR model. Out-of-the-box support is provided for:
- Print - Simple console output via "print"
- WebSocket (Client) - Output via network protocol
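Conceptually, a receiver feeds audio chunks into the ASR model and a sender forwards the resulting text. The sketch below illustrates that flow in plain Python. All names in it (`Receiver`, `Sender`, `ListReceiver`, `PrintSender`, `run_pipeline`, `fake_transcribe`) are hypothetical and chosen for illustration only; they are not this library's actual API.

```python
from abc import ABC, abstractmethod
from typing import Callable, Iterable, Iterator, List


class Receiver(ABC):
    """Hypothetical input side: yields raw audio chunks."""

    @abstractmethod
    def chunks(self) -> Iterator[bytes]:
        ...


class Sender(ABC):
    """Hypothetical output side: forwards transcribed text."""

    @abstractmethod
    def send(self, text: str) -> None:
        ...


class ListReceiver(Receiver):
    """Stands in for a file or ALSA receiver by replaying in-memory chunks."""

    def __init__(self, data: Iterable[bytes]) -> None:
        self._data = list(data)

    def chunks(self) -> Iterator[bytes]:
        yield from self._data


class PrintSender(Sender):
    """Writes each transcript to the console and remembers what was sent."""

    def __init__(self) -> None:
        self.sent: List[str] = []

    def send(self, text: str) -> None:
        self.sent.append(text)
        print(text)


def run_pipeline(
    receiver: Receiver,
    transcribe: Callable[[bytes], str],
    sender: Sender,
) -> None:
    """Pass every received chunk through the model and on to the sender."""
    for chunk in receiver.chunks():
        sender.send(transcribe(chunk))


def fake_transcribe(chunk: bytes) -> str:
    """Placeholder for a real ASR backend; it only reports the chunk size."""
    return f"<{len(chunk)} bytes of audio>"


sender = PrintSender()
run_pipeline(ListReceiver([b"\x00" * 4, b"\x00" * 8]), fake_transcribe, sender)
```

Keeping input and output behind small interfaces like these is one way to swap, for example, a file source for a live microphone without touching the transcription step.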
Installation
This library can be easily installed with pip:
pip install whisper-streaming
Backend integrations are installed via the following extras:
- [all] - installs all backends
- [faster-whisper] - Faster-Whisper
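Because backends are optional extras, code that depends on one may want to check that it is installed before using it. A minimal sketch, assuming the `faster-whisper` extra provides the `faster_whisper` import package; the helper name `backend_available` is hypothetical and not part of this library:

```python
import importlib.util


def backend_available(module_name: str) -> bool:
    """Return True if the given top-level module can be found for import."""
    return importlib.util.find_spec(module_name) is not None


if not backend_available("faster_whisper"):
    # Extras are requested with pip's bracket syntax.
    print('Backend missing; install it with: pip install "whisper-streaming[faster-whisper]"')
```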
Development
Installation of Prerequisites
- Python 3 (latest) (https://www.python.org/downloads)
- Python venv (optional, recommended); see the Python venv section in the Appendix
- Installation of development requirements:
pip install -r requirements/dev/requirements.txt
- Installation of requirements:
pip install -r requirements/library/requirements.txt
- Installation of backend requirements:
pip install -r requirements/library/requirements_faster_whisper.txt
Build executables
python3 -m build
Documentation
Installation of Prerequisites
- Python 3 (latest) (https://www.python.org/downloads)
- Python venv (optional, recommended); see the Python venv section in the Appendix
- Installation of requirements:
pip install -r requirements/docs/requirements.txt
Build
rm -rf docs/_build
sphinx-build -M html docs/ docs/_build
To build the developer documentation, use the following commands instead:
rm -rf docs/_build
sphinx-build -M html docs/ docs/_build -t Internal
License
This project is published under the Apache License, Version 2.0. Please comply with it if you use, modify, or distribute this project. The license can be found in "LICENSE". The original implementation is published under the MIT License, which is noted in the places in this project where it still applies. That license can be found in "LICENSE-MIT". You must distribute at least both of these licenses in addition to your own compliant license.
Appendix
Python venv
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip setuptools
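After activating the environment, it can be useful to confirm that the interpreter really runs inside it. A small sketch using only the standard library; the helper name `in_virtualenv` is ours:

```python
import sys


def in_virtualenv() -> bool:
    """True inside a venv, where sys.prefix differs from sys.base_prefix."""
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```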