Yet another voice memo tool

push-to-whisper

A smart voice memo tool aka push-to-stt-to-md-to-llm-to-clipboard-or-whatever.

What you can do with push-to-whisper:

  • Record audio while holding a global key combination.
  • Save the recording as a .wav file (e.g., directly into your Obsidian vault).
  • Transcode it into .ogg or other formats for efficiency (via ffmpeg).
  • Transcribe it into Markdown using Whisper (currently supports the whisper.cpp server).
  • Refine the text using LLM APIs like OpenAI, Gemini, or Ollama (via any-llm).
    • Auto tagging, auto summarization, etc.
  • Copy the result to your clipboard automatically.
  • Send a success notification, or forward results to services like Slack, Discord, or ntfy (via Apprise).

Every step above is modular. You can combine them to build your own custom workflow in a simple YAML configuration file.
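
To illustrate what "modular" can mean here, the following is a minimal sketch of a pipeline built from named steps, the way a list of step names in a config file might be resolved at runtime. It is not push-to-whisper's actual implementation or config schema: the step names, the `STEPS` registry, and `run_pipeline` are all hypothetical, and the transcription step is a placeholder that does not call Whisper.

```python
from typing import Callable, Dict, List

# A step receives the shared context and returns an updated copy of it.
Step = Callable[[dict], dict]


def transcribe(ctx: dict) -> dict:
    # Placeholder: a real step would send ctx["audio"] to a Whisper server.
    return {**ctx, "text": f"(transcript of {ctx['audio']})"}


def to_markdown(ctx: dict) -> dict:
    # Wrap the transcript in a small Markdown note.
    return {**ctx, "markdown": f"# Voice memo\n\n{ctx['text']}\n"}


def notify(ctx: dict) -> dict:
    # Placeholder: a real step would hand the result to a notification service.
    return {**ctx, "notified": True}


# Hypothetical registry mapping step names (as they might appear in a
# config file) to the functions that implement them.
STEPS: Dict[str, Step] = {
    "transcribe": transcribe,
    "to_markdown": to_markdown,
    "notify": notify,
}


def run_pipeline(step_names: List[str], ctx: dict) -> dict:
    # Apply each named step in order, threading the context through.
    for name in step_names:
        ctx = STEPS[name](ctx)
    return ctx


result = run_pipeline(["transcribe", "to_markdown", "notify"], {"audio": "memo.wav"})
print(result["markdown"])
```

Because each step only reads from and adds to the context, reordering or dropping steps is a matter of editing the list of names, which is the property a config-driven workflow depends on.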

Installation

  1. Install system dependencies:

    # Debian/Ubuntu
    sudo apt install libgirepository1.0-dev libcairo2-dev python3-dev ffmpeg
    
  2. Install the package using uv:

    uv tool install push-to-whisper
    
  3. Install the systemd user service and generate a default config:

    push-to-whisper install-daemon
    

Configuration

The configuration file is located at ~/.config/push-to-whisper/config.yaml. You can customize the Whisper endpoint, LLM API keys (any-llm), and processing pipelines.

To re-initialize or export the default configuration:

push-to-whisper init --bare -o ~/.config/push-to-whisper/config.yaml
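
Before editing the configuration, it can help to confirm that the file exists at the documented location. The sketch below only builds the path stated above and checks for the file; the helper name `default_config_path` is hypothetical, and the sketch assumes the path is fixed relative to the home directory, without checking whether the tool honors other locations.

```python
from pathlib import Path
from typing import Optional


def default_config_path(home: Optional[Path] = None) -> Path:
    # Build ~/.config/push-to-whisper/config.yaml relative to the given
    # home directory (defaults to the current user's home).
    base = home if home is not None else Path.home()
    return base / ".config" / "push-to-whisper" / "config.yaml"


path = default_config_path()
if path.is_file():
    print(f"Config found: {path}")
else:
    print(f"No config at {path}; generate one with the init command shown above")
```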

Usage

Once the daemon is installed via install-daemon, it will start automatically on login.

Default Shortcuts

On Linux (KDE/GNOME), shortcuts are managed by the system. After running install-daemon, you can assign keys to the following actions in your system settings:

  • Transcription to Markdown (recommended: ALT+SHIFT+x): Transcribe -> Transcode -> Save Audio -> Save Markdown -> Notify.
  • Transcription to Clipboard (recommended: ALT+SHIFT+c): Transcribe -> Transcode -> Copy to Clipboard -> Notify.

Note: Currently tested and supported only on Linux (Fedora) with KDE Plasma (Wayland). Native support for Windows and macOS is planned for future releases.
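
The Transcode step in both pipelines relies on ffmpeg. As a rough illustration of what converting a .wav recording to .ogg can look like, the sketch below builds an ffmpeg command line. The function name, the Opus codec, and the 32k bitrate are assumptions made for this example, not the tool's actual defaults, and the command assumes an ffmpeg build that includes libopus.

```python
import shlex
from pathlib import Path
from typing import List


def build_transcode_command(wav_path: str, bitrate: str = "32k") -> List[str]:
    # Derive the output name by swapping the extension for .ogg.
    src = Path(wav_path)
    dst = src.with_suffix(".ogg")
    return [
        "ffmpeg",
        "-y",               # overwrite the output file if it already exists
        "-i", str(src),     # input recording
        "-c:a", "libopus",  # encode audio as Opus (assumed codec choice)
        "-b:a", bitrate,    # target audio bitrate (assumed value)
        str(dst),
    ]


cmd = build_transcode_command("memo.wav")
print(shlex.join(cmd))
# To run it for real (requires ffmpeg on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Returning the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.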

Development

  • Formatting: uv run ruff format .
  • Linting: uv run ruff check . --fix
  • Testing: uv run pytest

Related Projects

License

MIT
