Providing easy-to-use and extensible STT (Speech-To-Text) implementation with Whisper-like ASR (Automatic Speech Recognition) models.

whisper-streaming

Warning: This project is currently in alpha. It may not remain stable over time and can produce unexpected errors. You can help move it out of alpha and beta by testing and reviewing it. Thanks!

Appreciation

This project is the result of a rework of the ideas and prototype implementation created by Dominik Macháček, Raj Dabre, Ondřej Bojar (Original Repository). It is neither official nor fully API- and function-compatible with the original implementation.

Please have a look at their publication (ACL Anthology). BibTeX citation:

@inproceedings{machacek-etal-2023-turning,
    title = "Turning Whisper into Real-Time Transcription System",
    author = "Mach{\'a}{\v{c}}ek, Dominik  and
      Dabre, Raj  and
      Bojar, Ond{\v{r}}ej",
    editor = "Saha, Sriparna  and
      Sujaini, Herry",
    booktitle = "Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: System Demonstrations",
    month = nov,
    year = "2023",
    address = "Bali, Indonesia",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.ijcnlp-demo.3",
    pages = "17--24",
}

By using this project, you agree to the license terms found in the License chapter. This project is not affiliated with the paper, the original implementation, or their authors. It reimplements their ideas in a more modern way that is easier to use and adapt, while respecting the license agreement.

Components

Backend

Currently, the following backends are implemented:

Receiver

Receivers are mechanisms that feed input data into the ASR model. Out-of-the-box support exists for:

  • ALSA (Advanced Linux Sound Architecture)
  • File - Audio file
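
To make the receiver idea concrete, here is a minimal sketch of what a file-based receiver does conceptually: it reads an audio file and yields fixed-length chunks that a streaming ASR loop could consume one at a time. It uses only Python's standard `wave` module. The function name `iter_wav_chunks` and the `chunk_seconds` parameter are hypothetical and are not part of this library's API.

```python
import wave
from typing import Iterator


def iter_wav_chunks(path: str, chunk_seconds: float = 1.0) -> Iterator[bytes]:
    """Yield raw PCM data from a WAV file in chunks of about `chunk_seconds`.

    Hypothetical helper for illustration only; not the library's receiver API.
    """
    with wave.open(path, "rb") as wav:
        # Number of audio frames that make up one chunk at this sample rate.
        frames_per_chunk = max(1, int(wav.getframerate() * chunk_seconds))
        while True:
            data = wav.readframes(frames_per_chunk)
            if not data:
                # readframes() returns empty bytes at end of file.
                break
            yield data
```

Each yielded chunk could then be appended to an audio buffer and handed to the ASR model. A live receiver such as ALSA would follow the same pattern, with the chunks coming from a capture device instead of a file.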

Sender

Senders are mechanisms that deliver output data from the ASR model. Out-of-the-box support exists for:

  • Print - Simple console output via "print"
  • WebSocket (Client) - Output via network protocol
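
The sender concept can be sketched as a small interface with interchangeable implementations. The example below is an assumption about how such a design might look, not the library's actual classes: `Sender`, `PrintSender`, `CollectingSender`, and `deliver` are hypothetical names chosen for illustration.

```python
from typing import Protocol


class Sender(Protocol):
    """Anything that can deliver a piece of transcribed text (hypothetical interface)."""

    def send(self, text: str) -> None: ...


class PrintSender:
    """Writes each transcript segment to the console via print()."""

    def send(self, text: str) -> None:
        print(text)


class CollectingSender:
    """Keeps segments in memory, which is handy for tests."""

    def __init__(self) -> None:
        self.segments: list[str] = []

    def send(self, text: str) -> None:
        self.segments.append(text)


def deliver(segments: list[str], senders: list[Sender]) -> None:
    """Fan each transcript segment out to every configured sender."""
    for segment in segments:
        for sender in senders:
            sender.send(segment)
```

A network sender, such as a WebSocket client, would implement the same `send` method and forward the text over its connection, so the transcription loop does not need to know where the output goes.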

Installation

This library can be easily installed with pip:

pip install whisper-streaming
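
After installing, one way to confirm that the package is visible to the current interpreter is to query its distribution metadata with the standard library. This is a generic check that works for any pip-installed distribution; the helper name `installed_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string if installed, otherwise None.
    print(installed_version("whisper-streaming"))
```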

Integrations for the different backends are installed via the following extras:

Development

Installation of Prerequisites

  • Development requirements:
pip install -r requirements/dev/requirements.txt
  • Library requirements:
pip install -r requirements/library/requirements.txt
  • Backend requirements:
pip install -r requirements/library/requirements_faster_whisper.txt

Build executables

python3 -m build

Documentation

Installation of Prerequisites

pip install -r requirements/docs/requirements.txt

Build

rm -rf docs/_build
sphinx-build -M html docs/ docs/_build

To build the developer documentation instead, use the following commands:

rm -rf docs/_build
sphinx-build -M html docs/ docs/_build -t Internal

License

This project is published under the Apache License, Version 2.0. Please comply with it if you use, modify, or distribute this project. The license text can be found in "LICENSE". The original implementation is published under the MIT License, which is noted in the places in this project where it still applies; its text can be found in "LICENSE-MIT". You must distribute at least both of these licenses, in addition to your own compliant license.

Appendix

Python venv

python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip setuptools
