Maintainers

asmith26

speech2caret

Use your speech to write to the current caret position!

Goals

✅ Simple: A minimalist tool that does one thing well.
✅ Local: Runs entirely on your machine (uses Hugging Face models for speech recognition).
✅ Efficient: Optimised for low CPU and memory usage, thanks to an event-driven architecture that responds instantly to key presses without wasting resources.

Note: Tested only on Linux (Ubuntu). Other operating systems are currently unsupported.

Demo video (turn volume on).

Installation

1. System Dependencies

First, install the required system libraries:

sudo apt update
sudo apt install libportaudio2 ffmpeg

2. Grant Permissions

To read keyboard events and simulate key presses, evdev needs access to your keyboard input device. Add your user to the input group to grant the necessary permissions:

sudo usermod -aG input $USER
newgrp input  # or log out and back in
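
To confirm that the group change has taken effect in the current shell, a small check like the following can help. This is a minimal sketch using only the Python standard library; the function name is hypothetical and is not part of speech2caret.

```python
import grp
import os


def current_process_in_group(group_name: str) -> bool:
    """Return True if the current process belongs to the named group."""
    try:
        group_id = grp.getgrnam(group_name).gr_gid
    except KeyError:
        # The group does not exist on this system.
        return False
    # os.getgroups() lists supplementary groups; os.getegid() is the effective primary group.
    return group_id in os.getgroups() or group_id == os.getegid()


if __name__ == "__main__":
    if current_process_in_group("input"):
        print("This shell is in the 'input' group.")
    else:
        print("Not in the 'input' group yet; try 'newgrp input' or log out and back in.")
```

Because group membership is fixed when a session starts, this reports the state of the shell it runs in, not what /etc/group says.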

3. Install and Run

You can install and run speech2caret using pip or uv:

# Install the package
uv add speech2caret  # or pip install speech2caret

# Run the application
speech2caret

Alternatively, you can run it directly without installing it by using uvx (the --index pytorch-cpu=... flag ensures only CPU packages are downloaded, avoiding GPU-related dependencies):

uvx --index pytorch-cpu=https://download.pytorch.org/whl/cpu --from speech2caret speech2caret

Configuration

The first time you run speech2caret, it creates a config file at ~/.config/speech2caret/config.ini.

You’ll need to manually edit it with the following values:

`keyboard_device_path`

This is the path to your keyboard input device. You can find it by running the command below and looking for an entry that ends with -event-kbd.

ls /dev/input/by-path/
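
The same lookup can be scripted. The sketch below filters a directory for entries ending in -event-kbd; the function name is hypothetical, and it assumes the standard /dev/input/by-path layout shown above.

```python
from pathlib import Path


def find_keyboard_event_paths(by_path_dir: str = "/dev/input/by-path") -> list[str]:
    """Return sorted paths of entries whose names end with '-event-kbd'."""
    directory = Path(by_path_dir)
    if not directory.is_dir():
        # No such directory (for example, on a non-Linux system).
        return []
    return sorted(
        str(entry) for entry in directory.iterdir() if entry.name.endswith("-event-kbd")
    )


if __name__ == "__main__":
    for path in find_keyboard_event_paths():
        print(path)
```

If more than one path is printed (for example, a laptop keyboard and an external one), choose the device you intend to press the control keys on.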

`start_stop_key` and `resume_pause_key`

These are the keys you'll use to control the app.

To find the correct name for a key, you can use the provided Python script below. First, ensure you have your keyboard_device_path from the step above, then run this command:

uvx --from evdev python -c '
keyboard_device_path = "PASTE_YOUR_KEYBOARD_DEVICE_PATH_HERE"

from evdev import InputDevice, categorize, ecodes, KeyEvent
dev = InputDevice(keyboard_device_path)
print(f"Listening for key presses on {dev.name}...")
for event in dev.read_loop():
    if event.type == ecodes.EV_KEY:
        key_event = categorize(event)
        if key_event.keystate == KeyEvent.key_down:
            print(f" {key_event.keycode}")
'

Press the keys you wish to use, and their names (for example, KEY_F12) will be printed to the terminal. The available key names are the Linux input event codes that evdev exposes through evdev.ecodes.

Additional (Optional) Configuration

You can configure audio cues to notify when recording has started, stopped, paused, or resumed. To do this, update the start_recording_audio_path, stop_recording_audio_path, resume_recording_audio_path, and pause_recording_audio_path config variables in ~/.config/speech2caret/config.ini with the absolute paths to your choice of audio files.
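
Since these values must be absolute paths, it can be worth checking them before starting the app. The following is a minimal sketch, not part of speech2caret: the function name is hypothetical, it takes the option values as a plain dictionary, and it assumes that an unset option simply means no cue is played.

```python
from pathlib import Path

# Option names as listed in the paragraph above.
AUDIO_CUE_OPTIONS = (
    "start_recording_audio_path",
    "stop_recording_audio_path",
    "resume_recording_audio_path",
    "pause_recording_audio_path",
)


def check_audio_cue_paths(options: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means every set path looks usable."""
    problems = []
    for name in AUDIO_CUE_OPTIONS:
        value = options.get(name, "").strip()
        if not value:
            continue  # Assumption: an unset option means "no audio cue".
        path = Path(value)
        if not path.is_absolute():
            problems.append(f"{name}: '{value}' is not an absolute path")
        elif not path.is_file():
            problems.append(f"{name}: '{value}' does not exist or is not a file")
    return problems


if __name__ == "__main__":
    for problem in check_audio_cue_paths({"start_recording_audio_path": "sounds/start.wav"}):
        print(problem)
```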

Word Replacement

You can define custom word or phrase replacements in the [word_replacement] section of the ~/.config/speech2caret/config.ini file. This allows you to automatically substitute specific spoken words with the text you want.

For example, to replace "new line" with a newline character or " underscore " with _, you can configure it as follows:

[word_replacement]
"new line" = "\n"
" underscore " = "_"

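To see how such a section might behave, here is a minimal sketch that reads it with Python's configparser and applies the replacements to a transcript. It assumes the quoted keys and values can be treated as Python-style string literals, so that "\n" becomes a real newline; speech2caret's own parsing may differ, and both function names are hypothetical.

```python
import ast
import configparser

EXAMPLE_CONFIG = r'''
[word_replacement]
"new line" = "\n"
" underscore " = "_"
'''


def load_replacements(config_text: str) -> dict[str, str]:
    """Parse the [word_replacement] section into a {spoken: written} mapping."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # Keep keys exactly as written (no lowercasing).
    parser.read_string(config_text)
    replacements = {}
    for raw_key, raw_value in parser["word_replacement"].items():
        # Assumption: quoted entries are decoded like Python string literals.
        replacements[ast.literal_eval(raw_key)] = ast.literal_eval(raw_value)
    return replacements


def apply_replacements(text: str, replacements: dict[str, str]) -> str:
    """Apply each replacement in the order it appears in the config."""
    for spoken, written in replacements.items():
        text = text.replace(spoken, written)
    return text


replacements = load_replacements(EXAMPLE_CONFIG)
print(repr(apply_replacements("first new line my underscore variable", replacements)))
```

Note that the surrounding spaces in " underscore " matter: they are consumed by the replacement, which is what joins the neighbouring words.
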
How to Use

1. Run the speech2caret command in your terminal.
2. Press your configured start_stop_key to begin recording.
3. Press the resume_pause_key to toggle between pausing and resuming.
4. When you are finished, press the start_stop_key again.
5. The recorded audio will be transcribed and typed at your current caret position.
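
The key handling described above can be pictured as a small state machine. The sketch below is illustrative only and is not speech2caret's implementation: the state names and the next_state function are hypothetical, and it assumes that pressing start_stop_key while paused also ends the session and that resume_pause_key is ignored while idle.

```python
from enum import Enum


class RecorderState(Enum):
    IDLE = "idle"
    RECORDING = "recording"
    PAUSED = "paused"


def next_state(state: RecorderState, key: str) -> tuple[RecorderState, str]:
    """Return (new_state, action) for a key press; the action is only a label."""
    if key == "start_stop_key":
        if state is RecorderState.IDLE:
            return RecorderState.RECORDING, "start recording"
        # Assumption: stopping from RECORDING or PAUSED ends the session.
        return RecorderState.IDLE, "stop, transcribe, and type at caret"
    if key == "resume_pause_key":
        if state is RecorderState.RECORDING:
            return RecorderState.PAUSED, "pause recording"
        if state is RecorderState.PAUSED:
            return RecorderState.RECORDING, "resume recording"
    # Any other key, or resume_pause_key while idle, changes nothing.
    return state, "ignore"


state = RecorderState.IDLE
for key in ["start_stop_key", "resume_pause_key", "resume_pause_key", "start_stop_key"]:
    state, action = next_state(state, key)
    print(f"{key}: {action} -> {state.value}")
```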

Release history

0.3.0 (this version): Nov 14, 2025
0.3.0rc2 (pre-release): Nov 14, 2025
0.3.0rc1 (pre-release): Nov 14, 2025
0.2.0: Sep 28, 2025
0.1.2: Sep 9, 2025
0.1.1: Aug 3, 2025
0.1.0: Jul 6, 2025
0.1.0a7 (pre-release): Jul 6, 2025
0.1.0a6 (pre-release): Jul 6, 2025
0.1.0a5 (pre-release): Jul 6, 2025
0.1.0a1 (pre-release): Jul 5, 2025
0.0.1: Jul 2, 2025

Download files

Source Distribution

speech2caret-0.3.0.tar.gz (7.3 kB), uploaded Nov 14, 2025, Source

Built Distribution

speech2caret-0.3.0-py3-none-any.whl (9.8 kB), uploaded Nov 14, 2025, Python 3

File details

Details for the file speech2caret-0.3.0.tar.gz.

File metadata

Download URL: speech2caret-0.3.0.tar.gz
Upload date: Nov 14, 2025
Size: 7.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for speech2caret-0.3.0.tar.gz
Algorithm	Hash digest
SHA256	`92457870e7d26c4e99fc38ae097a380df32902a658333d84e1dc6d8a754fa021`
MD5	`039ba127270cdcb38d85627c9304c08e`
BLAKE2b-256	`4626f365493f6177587880d1ffe7d5c74fd2c6f1cf917acc5ccb10bfd51c2a83`

Provenance

The following attestation bundles were made for speech2caret-0.3.0.tar.gz:

Publisher: publish.yaml on asmith26/speech2caret

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: speech2caret-0.3.0.tar.gz
- Subject digest: 92457870e7d26c4e99fc38ae097a380df32902a658333d84e1dc6d8a754fa021
- Sigstore transparency entry: 701427801
- Sigstore integration time: Nov 14, 2025
Source repository:
- Permalink: asmith26/speech2caret@0653039f3a8dcdebe8916e2fa1f63a34a948c729
- Branch / Tag: refs/heads/main
- Owner: https://github.com/asmith26
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@0653039f3a8dcdebe8916e2fa1f63a34a948c729
- Trigger Event: workflow_dispatch

File details

Details for the file speech2caret-0.3.0-py3-none-any.whl.

File metadata

Download URL: speech2caret-0.3.0-py3-none-any.whl
Upload date: Nov 14, 2025
Size: 9.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for speech2caret-0.3.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e797f364bb04a491a27bae32f26efa2f54fa986d44416d7c694f033058768224`
MD5	`dc46b54da2e195962ef0c3bb42eaa2d2`
BLAKE2b-256	`fef8f285e31edcbf0365337da4cffdcffe7f729133c701fe17637af09bbaa810`

Provenance

The following attestation bundles were made for speech2caret-0.3.0-py3-none-any.whl:

Publisher: publish.yaml on asmith26/speech2caret

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: speech2caret-0.3.0-py3-none-any.whl
- Subject digest: e797f364bb04a491a27bae32f26efa2f54fa986d44416d7c694f033058768224
- Sigstore transparency entry: 701427815
- Sigstore integration time: Nov 14, 2025
Source repository:
- Permalink: asmith26/speech2caret@0653039f3a8dcdebe8916e2fa1f63a34a948c729
- Branch / Tag: refs/heads/main
- Owner: https://github.com/asmith26
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@0653039f3a8dcdebe8916e2fa1f63a34a948c729
- Trigger Event: workflow_dispatch
