Simple audio transcription tool using Whisper

Project description

* wscribe
** Getting started
*** Installation
Currently only tested on Linux. If you face any installation issues, please feel free to [[https://github.com/geekodour/wscribe/issues][create an issue]].
**** Set the required environment variables
- Set ~WSCRIBE_MODELS_DIR~: the directory the whisper models should be downloaded into
#+begin_src bash
export WSCRIBE_MODELS_DIR=$XDG_DATA_HOME/whisper-models # example
#+end_src
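If you want the setting to persist across shells, one option (illustrative only; adjust the path and shell profile to your setup) is to append it to your profile and create the directory up front:
#+begin_src bash
# Illustrative sketch: persist the variable and make sure the directory exists.
echo 'export WSCRIBE_MODELS_DIR=$XDG_DATA_HOME/whisper-models' >> ~/.bashrc
mkdir -p "$XDG_DATA_HOME/whisper-models"
#+end_src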
**** Download the models
- You can download the models directly [[https://huggingface.co/guillaumekln][from here]] using ~git lfs~; make sure you download/copy them into ~WSCRIBE_MODELS_DIR~ (see the sketch after this list).
- Alternatively, use the helper script at [[https://github.com/geekodour/wscribe/blob/main/scripts/fw_dw_hf_wo_lfs.sh][scripts/fw_dw_hf_wo_lfs.sh]]: download it and run it as per its instructions.
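A rough sketch of the ~git lfs~ route, assuming the model repositories follow the naming on the linked Hugging Face account and that one sub-directory per model size is the expected layout (verify both before relying on this):
#+begin_src bash
# Sketch only: clone the converted "medium" model into WSCRIBE_MODELS_DIR.
git lfs install
mkdir -p "$WSCRIBE_MODELS_DIR"
git clone https://huggingface.co/guillaumekln/faster-whisper-medium \
    "$WSCRIBE_MODELS_DIR/medium"
#+end_src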
**** Install package
Assuming you already have a working Python setup:
#+begin_src shell
pip install wscribe
#+end_src
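A quick sanity check after installation (assuming the ~wscribe~ entry point ended up on your ~PATH~):
#+begin_src shell
wscribe --help             # top-level help
wscribe transcribe --help  # prints the usage shown below
#+end_src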
** Usage
#+begin_src
Usage: wscribe transcribe [OPTIONS] SOURCE DESTINATION

  Transcribes SOURCE to DESTINATION, where SOURCE can be a local path to an
  audio file and DESTINATION needs to be a local path to a non-existing file.

Options:
  -f, --format [json]             destination file format, currently only json
                                  is supported  [default: json]
  -m, --model [small|medium|large-v2]
                                  model should already be downloaded
                                  [default: medium]
  -g, --gpu                       enable gpu, disabled by default
  -d, --debug                     show debug logs
  --help                          Show this message and exit.
#+end_src
#+begin_src shell
wscribe transcribe audio.mp3 transcription.json # cpu
wscribe transcribe video.mp4 transcription.json --gpu # use gpu
#+end_src
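Options can be combined as usual. For instance, a run that selects the model explicitly, keeps the default JSON format, and prints debug logs (the JSON schema itself isn't documented here) might look like:
#+begin_src shell
wscribe transcribe -m small -f json -d audio.mp3 transcription.json
jq . transcription.json  # pretty-print the result (requires jq)
#+end_src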
** Contributing
All contributions happen through PRs and are greatly appreciated: bugfixes, features, tests, suggestions, and criticism are all welcome.
** Roadmap
- [-] Backends/Features
  - [X] faster-whisper
  - [ ] whisper.cpp
  - [ ] Add support for [[https://github.com/guillaumekln/faster-whisper/issues/303][diarization]]
  - [ ] Add VAD/other de-noising steps etc.
  - [ ] GPU backends other than CUDA?
- [-] Inference UI
  - [X] CLI
    - [ ] statistics summary? time taken, playback speed vs transcription speed etc.
  - [ ] REST API
  - [ ] Streamlit UI
    - [ ] Would be nice to compare output of multiple models next to each other
  - [ ] Editor UI
    - [ ] Web-based offline editor
    - [ ] SRT and JSON editor
    - [ ] Play audio (2x/3x) and it'll highlight the current text, which can be edited
    - [ ] With the wscribe JSON export, you'd also have the confidence score for each word, color coded
- [-] Audio Source
  - [X] Local files
  - [ ] YouTube link
  - [ ] Google Drive link
- [-] Distribution
  - [X] Python packaging
  - [ ] Windows support(?)
  - [ ] Package for Nix
  - [ ] Package for Arch (AUR)

Download files

Download the file for your platform.

Source Distribution

wscribe-0.1.0.tar.gz (6.0 kB)

Uploaded Source

Built Distribution

wscribe-0.1.0-py3-none-any.whl (7.7 kB)

Uploaded Python 3

File details

Details for the file wscribe-0.1.0.tar.gz.

File metadata

  • Download URL: wscribe-0.1.0.tar.gz
  • Upload date:
  • Size: 6.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.2 CPython/3.10.11 Linux/6.1.35

File hashes

Hashes for wscribe-0.1.0.tar.gz

  • SHA256: 515bf65c7f7cb7e56057618be5d8c12589ccfa7f62035c0118ed5e64e9139e48
  • MD5: f54bae127ffc65236ce4232aed69a229
  • BLAKE2b-256: b2fa88bc11326a4c101c4604af22d7890247d0da7e345873a5f5e6e061c0ea3f
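To check a downloaded file against these digests, a standard checksum tool is sufficient; for example, with GNU coreutils:
#+begin_src shell
echo "515bf65c7f7cb7e56057618be5d8c12589ccfa7f62035c0118ed5e64e9139e48  wscribe-0.1.0.tar.gz" | sha256sum --check
#+end_src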


File details

Details for the file wscribe-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: wscribe-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 7.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.4.2 CPython/3.10.11 Linux/6.1.35

File hashes

Hashes for wscribe-0.1.0-py3-none-any.whl

  • SHA256: 795fbeb0536e5a8251e99388c2a7e1d387b4ea81cc3bf160c207c0cc6491e1f6
  • MD5: e6b8759605b65af7acb49f9ecd123bd5
  • BLAKE2b-256: e10988133cd2f2d14af96edc3532250563205e87a1915b78ebf5a96e78d875c6

