Yet another voice memo tool

push-to-whisper

A smart voice memo tool, a.k.a. push-to-stt-to-md-to-llm-to-clipboard-or-whatever.

What you can do with push-to-whisper:

  • Record audio while holding a global key combination.
  • Save the recording as a .wav file (e.g., directly into your Obsidian vault).
  • Transcode it into .ogg or other formats for efficiency (via ffmpeg).
  • Transcribe it into Markdown using Whisper (currently supports the whisper.cpp server).
  • Refine the text using LLM APIs like OpenAI, Gemini, or Ollama (via any-llm).
    • Auto tagging, auto summarization, etc.
  • Copy the result to your clipboard automatically.
  • Send a success notification, or forward results to notification services like Slack, Discord, or Ntfy (via Apprise).

Every step above is modular. You can combine them to build your own custom workflow in a simple YAML configuration file.
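
To illustrate what "modular" means here, the following is a minimal Python sketch of a pipeline built from named steps. It is not push-to-whisper's actual code or configuration schema: the registry, the step names, and the `run_pipeline` helper are all hypothetical, and each step is a stand-in that only manipulates a dictionary. The list of step names plays the role that a list in the YAML file might play.

```python
from typing import Callable, Dict, List

# Hypothetical step registry: these names are illustrative only,
# not push-to-whisper's real API or config keys.
Step = Callable[[dict], dict]
STEPS: Dict[str, Step] = {}


def step(name: str) -> Callable[[Step], Step]:
    """Register a function under a name that a pipeline list can refer to."""
    def register(func: Step) -> Step:
        STEPS[name] = func
        return func
    return register


@step("transcribe")
def transcribe(ctx: dict) -> dict:
    # Stand-in for a speech-to-text call; it just fakes a transcript.
    return {**ctx, "text": f"transcript of {ctx['audio']}"}


@step("refine")
def refine(ctx: dict) -> dict:
    # Stand-in for an LLM clean-up pass; it only trims and capitalizes.
    return {**ctx, "text": ctx["text"].strip().capitalize()}


@step("clipboard")
def clipboard(ctx: dict) -> dict:
    # Stand-in for a clipboard write; the text is stored in the context.
    return {**ctx, "clipboard": ctx["text"]}


def run_pipeline(step_names: List[str], ctx: dict) -> dict:
    """Run the named steps in order, feeding each step's output to the next."""
    for name in step_names:
        if name not in STEPS:
            raise KeyError(f"unknown step: {name}")
        ctx = STEPS[name](ctx)
    return ctx


result = run_pipeline(["transcribe", "refine", "clipboard"], {"audio": "memo.wav"})
print(result["clipboard"])
```

Because every step takes and returns the same kind of context, reordering or dropping entries in the list changes the workflow without touching the steps themselves.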

Installation

  1. Install system dependencies:

    # Debian/Ubuntu
    sudo apt install libgirepository1.0-dev libcairo2-dev python3-dev ffmpeg
    
  2. Install the package using uv:

    uv tool install push-to-whisper
    
  3. Install the systemd user service and generate a default config:

    push-to-whisper install-daemon
    

Configuration

The configuration file is located at ~/.config/push-to-whisper/config.yaml. You can customize the Whisper endpoint, LLM API keys (any-llm), and processing pipelines.

To re-initialize or export the default configuration:

    push-to-whisper init --bare -o ~/.config/push-to-whisper/config.yaml
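
For context on what the Whisper endpoint setting points at, here is a hedged Python sketch of a request to a whisper.cpp example server. It assumes the server is running locally and exposes the `/inference` route with a multipart `file` field, as the whisper.cpp server example describes; the URL, port, and helper names below are assumptions for illustration, and this is not how push-to-whisper itself is implemented.

```python
import json
import urllib.request
import uuid


def build_inference_request(
    url: str, audio: bytes, filename: str = "memo.wav"
) -> urllib.request.Request:
    """Build a multipart/form-data POST with one audio file and a response format."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode()
    tail = (
        f"\r\n--{boundary}\r\n"
        'Content-Disposition: form-data; name="response_format"\r\n\r\n'
        "json"
        f"\r\n--{boundary}--\r\n"
    ).encode()
    return urllib.request.Request(
        url,
        data=head + audio + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(url: str, audio: bytes) -> str:
    """Send the audio and return the "text" field of the JSON reply (assumed shape)."""
    with urllib.request.urlopen(build_inference_request(url, audio), timeout=60) as resp:
        return json.load(resp).get("text", "")


# Assumed local address; adjust to wherever your server listens.
req = build_inference_request("http://127.0.0.1:8080/inference", b"RIFF....")
print(req.get_method(), req.full_url)
```

Only the request is built at import time; calling `transcribe(...)` requires a reachable server and real WAV bytes.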

Usage

Once the daemon is installed via install-daemon, it will start automatically on login.

Default Shortcuts

On Linux (KDE/GNOME), shortcuts are managed by the system. After running install-daemon, you can assign keys to the following actions in your system settings:

  • Transcription to Markdown (recommended: ALT+SHIFT+x): Transcribe -> Transcode -> Save Audio -> Save Markdown -> Notify.
  • Transcription to Clipboard (recommended: ALT+SHIFT+c): Transcribe -> Transcode -> Copy to Clipboard -> Notify.

Note: Currently tested and supported only on Linux (Fedora) with KDE Plasma (Wayland). Native support for Windows and macOS is planned for future releases.

Development

  • Formatting: uv run ruff format .
  • Linting: uv run ruff check . --fix
  • Testing: uv run pytest

License

MIT
