Yet another voice memo tool

push-to-whisper

A smart voice memo tool aka push-to-stt-to-md-to-llm-to-clipboard-or-whatever.

What you can do with push-to-whisper:

  • Record audio while holding a global key combination.
  • Save the recording as a .wav file (e.g., directly into your Obsidian vault).
  • Transcode it into .ogg or other formats for efficiency (via ffmpeg).
  • Transcribe it into Markdown using Whisper (currently supports the whisper.cpp server).
  • Refine the text using LLM APIs like OpenAI, Gemini, or Ollama (via LiteLLM).
    • Auto tagging, auto summarization, etc.
  • Copy the result to your clipboard automatically.
  • Send a success notification, or forward the result to services such as Slack, Discord, or Ntfy (via Apprise).
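
For orientation, the transcription step can be pictured as a single HTTP upload to a whisper.cpp server. The sketch below is not taken from push-to-whisper's source. It assumes the server listens on http://127.0.0.1:8080 and exposes the /inference endpoint with a multipart "file" field and a "response_format" field, as in the whisper.cpp server example; the function names encode_multipart and transcribe are hypothetical.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(fields: dict[str, str], filename: str, audio: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body with text fields plus one audio file part."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode()
        )
    # The audio goes into a part named "file" (assumed whisper.cpp server convention).
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: audio/wav\r\n\r\n'.encode() + audio + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(wav_path: Path, server: str = "http://127.0.0.1:8080") -> str:
    """Upload a .wav file and return the transcript text (server address is an assumption)."""
    body, content_type = encode_multipart(
        {"response_format": "json"}, wav_path.name, wav_path.read_bytes()
    )
    request = urllib.request.Request(
        f"{server}/inference",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)["text"].strip()
```

Calling transcribe(Path("memo.wav")) needs a running server; encode_multipart can be checked offline.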

Every step above is modular. You can combine them to build your own custom workflow in a simple YAML configuration file.
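
To make the modular idea concrete, here is a minimal sketch of how named steps could be chained into a workflow. It is an illustration only, not the tool's actual implementation or configuration schema: the Memo class, the STEPS registry, and the step names are all hypothetical, and the step bodies are placeholders.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Memo:
    """State handed from one step to the next (hypothetical structure)."""
    audio_path: str
    text: str = ""
    log: list[str] = field(default_factory=list)


Step = Callable[[Memo], Memo]


def transcribe(memo: Memo) -> Memo:
    # Placeholder: a real step would send memo.audio_path to a Whisper server.
    memo.text = f"(transcript of {memo.audio_path})"
    memo.log.append("transcribe")
    return memo


def copy_to_clipboard(memo: Memo) -> Memo:
    # Placeholder: a real step would hand memo.text to the system clipboard.
    memo.log.append("clipboard")
    return memo


def notify(memo: Memo) -> Memo:
    # Placeholder: a real step would send a desktop or chat notification.
    memo.log.append("notify")
    return memo


# Hypothetical registry mapping names (as they might appear in a config file) to steps.
STEPS: dict[str, Step] = {
    "transcribe": transcribe,
    "clipboard": copy_to_clipboard,
    "notify": notify,
}


def run_pipeline(step_names: list[str], memo: Memo) -> Memo:
    """Run the named steps in order, passing the memo from one to the next."""
    for name in step_names:
        memo = STEPS[name](memo)
    return memo


result = run_pipeline(["transcribe", "clipboard", "notify"], Memo("memo.wav"))
print(result.log)
```

Because each step takes and returns the same object, reordering or dropping a step only changes the list of names, which is the property a YAML-defined workflow relies on.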

Installation

  1. Install system dependencies:

    # Debian/Ubuntu
    sudo apt install libgirepository1.0-dev libcairo2-dev python3-dev ffmpeg
    
  2. Install the package using uv:

    uv tool install push-to-whisper
    
  3. Install the systemd user service and generate a default config:

    push-to-whisper install-daemon
    

Configuration

The configuration file is located at ~/.config/push-to-whisper/config.yaml. You can customize the Whisper endpoint, LLM API keys (LiteLLM), and processing pipelines.

To re-initialize or export the default configuration:

    push-to-whisper init --bare -o ~/.config/push-to-whisper/config.yaml
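
If you script around the tool, a quick check that the configuration file exists at the documented location can save a confusing failure later. This is a small sketch using only the path given above; the helper name describe_config is hypothetical and not part of push-to-whisper.

```python
from pathlib import Path

# Location documented above; expands "~" to the current user's home directory.
CONFIG_PATH = Path.home() / ".config" / "push-to-whisper" / "config.yaml"


def describe_config(path: Path = CONFIG_PATH) -> str:
    """Return a one-line status for the configuration file."""
    if not path.is_file():
        return f"missing: {path}"
    return f"found: {path} ({path.stat().st_size} bytes)"


print(describe_config())
```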

Usage

Once the daemon is installed via install-daemon, it will start automatically on login.

Default Shortcuts

On Linux (KDE/GNOME), shortcuts are managed by the system. After running install-daemon, you can assign keys to the following actions in your system settings:

  • Transcription to Markdown: (Recommended: ALT+SHIFT+x) - Transcribe -> Transcode -> Save Audio -> Save Markdown -> Notify.
  • Transcription to Clipboard: (Recommended: ALT+SHIFT+c) - Transcribe -> Transcode -> Copy to Clipboard -> Notify.

Note: push-to-whisper is currently tested and supported only on Linux (Fedora) with KDE Plasma (Wayland). Native support for Windows and macOS is planned for future releases.
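
The Transcode step in both pipelines relies on ffmpeg. As a rough sketch of what such a step might run, the snippet below converts a .wav recording to Opus audio in an .ogg container. The codec, the 32k bitrate, and the function names are assumptions chosen for illustration; the tool's real command line may differ.

```python
import shutil
import subprocess
from pathlib import Path


def build_transcode_command(wav_path: Path, ogg_path: Path, bitrate: str = "32k") -> list[str]:
    """Return an ffmpeg argument list that encodes the input as Opus in an Ogg file."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the output file if it already exists
        "-i", str(wav_path),  # input recording
        "-c:a", "libopus",    # audio codec (assumed choice)
        "-b:a", bitrate,      # target audio bitrate (assumed value)
        str(ogg_path),
    ]


def transcode(wav_path: Path) -> Path:
    """Run ffmpeg and return the path of the .ogg file written next to the input."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    ogg_path = wav_path.with_suffix(".ogg")
    subprocess.run(
        build_transcode_command(wav_path, ogg_path),
        check=True,
        capture_output=True,
    )
    return ogg_path


print(" ".join(build_transcode_command(Path("memo.wav"), Path("memo.ogg"))))
```

Keeping the command construction separate from the subprocess call makes the step easy to inspect or test without ffmpeg installed.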

Development

  • Formatting: uv run ruff format .
  • Linting: uv run ruff check . --fix
  • Testing: uv run pytest

License

MIT
