Whisper-dictation

This is an app that uses OpenAI's Whisper for dictation on KDE Wayland.

The project is designed as a dictation server that runs in the background (to avoid reloading the model each time dictation starts) and a client that toggles whether the server is recording. You can assign a keyboard shortcut to the client command to start and stop recording (whisper_dictation say --language [language_code]).
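The server/client split described above can be sketched as follows. This is a minimal illustration and not the project's actual code: the names DictationState, ToggleHandler, make_server and send_toggle are hypothetical, and the one-line TCP protocol is an assumption made for the example.

```python
import socket
import socketserver
import threading


class DictationState:
    """Tracks whether the (hypothetical) daemon is currently recording."""

    def __init__(self):
        self.recording = False
        self.language = None
        self._lock = threading.Lock()

    def toggle(self, language):
        """Flip the recording flag and return the new status as text."""
        with self._lock:
            self.recording = not self.recording
            if self.recording:
                self.language = language
                return f"recording started ({language})"
            return "recording stopped"


class ToggleHandler(socketserver.StreamRequestHandler):
    """Each client connection sends one line: the language code."""

    def handle(self):
        language = self.rfile.readline().decode().strip() or "en"
        reply = self.server.state.toggle(language)
        self.wfile.write((reply + "\n").encode())


def make_server(port=0):
    """Create the server; port 0 lets the OS pick a free port."""
    server = socketserver.TCPServer(("127.0.0.1", port), ToggleHandler)
    server.state = DictationState()  # lives as long as the daemon does
    return server


def send_toggle(port, language="en"):
    """Client side: connect, send the language code, read the reply."""
    with socket.create_connection(("127.0.0.1", port)) as conn:
        conn.sendall((language + "\n").encode())
        return conn.makefile().readline().strip()
```

Because the state object lives in the long-running server process, anything expensive attached to it (such as a loaded model) is paid for once, and each shortcut press only costs a short client connection.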

Whenever dictation is stopped, the transcribed text is sent to your clipboard and a notification is displayed.
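One way this hand-off could look is sketched below. It assumes wl-copy reads the text from standard input and that kdialog is called with its --passivepopup option; the function names, popup title and timeout are made up for the example and may differ from what the project actually does.

```python
import subprocess


def notification_argv(text, timeout_s=5):
    """Build a kdialog call that shows a passive popup for timeout_s seconds."""
    return [
        "kdialog",
        "--title", "Whisper dictation",  # hypothetical title
        "--passivepopup", text, str(timeout_s),
    ]


def deliver(text, run=subprocess.run):
    """Copy text to the Wayland clipboard, then try to notify the user.

    Returns True if the notification command could be launched and False
    if kdialog is not installed (for example on a non-KDE desktop).
    """
    # wl-copy takes the clipboard content from stdin when given no arguments.
    run(["wl-copy"], input=text.encode("utf-8"), check=True)
    try:
        run(notification_argv(text), check=False)
    except FileNotFoundError:
        return False
    return True
```

The run parameter exists only so the function can be exercised without touching the real clipboard; in normal use deliver("some text") is enough.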

The project depends on the kdialog package and wl-copy (from wl-clipboard package).

This project is designed to work on KDE Wayland. Other Wayland platforms might work as well, but without the ability to send a notification.

Installation

Install using pip:

pip install whisper_dictation

Note that installing into the user directory (not globally) is recommended, since the provided systemd service is written only for an executable at ~/.local/bin.

Usage

To run the project manually, use two terminals. In the first, start the server:

whisper_dictation daemon [--port 9000] [--model_name base]

See whisper_dictation daemon --help for all the available models.
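For orientation, a command line with this shape could be declared with argparse roughly as follows. This is a sketch only: the subcommand and flag names come from the commands shown in this README, while the default values (9000, base, en) are assumptions taken from the bracketed examples, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser with a 'daemon' and a 'say' subcommand."""
    parser = argparse.ArgumentParser(prog="whisper_dictation")
    sub = parser.add_subparsers(dest="command", required=True)

    daemon = sub.add_parser("daemon", help="run the background server")
    daemon.add_argument("--port", type=int, default=9000)   # assumed default
    daemon.add_argument("--model_name", default="base")     # assumed default

    say = sub.add_parser("say", help="toggle recording on the server")
    say.add_argument("--language", default="en")            # assumed default
    return parser
```

For example, build_parser().parse_args(["daemon", "--port", "9001"]) yields a namespace with command set to "daemon" and port set to 9001.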

In the second terminal, run the following command to toggle the daemon, or assign a shortcut to it. Run it once to start recording and a second time to stop.

whisper_dictation say [--language en]

You should specify a language code; it can improve performance, especially with a small model.
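To illustrate why the language code matters, here is a minimal sketch built on the openai-whisper Python package, where transcribe() accepts an optional language argument and otherwise detects the language itself. Whether this project calls Whisper in exactly this way is an assumption, and transcribe_file is a hypothetical helper.

```python
def transcribe_file(model, audio_path, language=None):
    """Transcribe one audio file and return the recognized text.

    model is expected to behave like the object returned by
    whisper.load_model(), i.e. to offer transcribe(path, **options)
    returning a dict with a "text" key.
    """
    options = {}
    if language:
        # With an explicit language, Whisper skips its own language
        # detection step, which small models tend to get wrong more often.
        options["language"] = language
    result = model.transcribe(audio_path, **options)
    return result["text"].strip()
```

With openai-whisper installed, this could be used as model = whisper.load_model("base") followed by transcribe_file(model, "recording.wav", language="en"), where recording.wav is a placeholder file name.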

Alternatively, you can use the systemd service unit provided in this repo to keep the daemon running in the background. Place it in ~/.config/systemd/user/, then enable and start it:

systemctl --user enable whisper_dictation
systemctl --user start whisper_dictation

TODO

  • add system integration for a shortcut to start/stop dictation
  • output the dictation to where the cursor is (planned as fcitx addon).
  • optional: a system tray
  • package it for the AUR

Requirements

  1. wl-clipboard
  2. kdialog
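A quick way to confirm that both requirements are available on PATH is sketched below. The helper name missing_packages and the tool-to-package mapping are written for this example and are not part of the project.

```python
import shutil

# Executable name -> package that provides it.
REQUIRED_TOOLS = {"wl-copy": "wl-clipboard", "kdialog": "kdialog"}


def missing_packages(which=shutil.which):
    """Return the sorted package names whose executables are not on PATH."""
    return sorted(
        package
        for tool, package in REQUIRED_TOOLS.items()
        if which(tool) is None
    )
```

An empty list from missing_packages() means both tools were found; otherwise the returned names are the packages to install.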
