Offline speech to text using VOSK. Based on nerd-dictation.

Pytater

Offline Speech to Text for Linux.

Important: This project is a fork of ideasman42's Nerd Dictation. Where the original is a script meant for easy hacking, this is a full-fledged Python package, meant to provide vastly simpler setup and a Python API on top of the original CLI.

See demo video (from ideasman42).

This is a utility that provides simple access to speech to text on Linux without being tied to a desktop environment, using the excellent VOSK-API.

Simple to set up
   Pytater can be installed with a single command (installation from PyPI is coming soon).
Configurable
   Configure pytater using config files, environment variables, or the Python API (partially complete).
Zero Overhead
   As pytater is activated manually, there are no background processes.

Usage

It is suggested to bind begin/end/cancel to shortcut keys.

pytater begin

pytater end

For details on how this can be used, see: pytater --help and pytater begin --help.

Features

Specific features include:

Numbers as Digits

Optional conversion from numbers to digits.

So Three million five hundred and sixty second becomes 3,000,562nd.

A series of numbers (such as reciting a phone number) is also supported.

So Two four six eight becomes 2,468.
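As a rough illustration of the two conversions described above, here is a minimal sketch in Python. It is not pytater's actual implementation: the names `words_to_int` and `digit_series_to_int` are hypothetical, only cardinal numbers are handled (ordinals such as "second" are left out), and leading zeros in a recited series are not preserved.

```python
# Illustrative only: not pytater's actual implementation.
UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000, "billion": 1_000_000_000}


def words_to_int(text: str) -> int:
    """Parse a spoken cardinal such as 'three million five hundred and sixty two'."""
    total = 0    # sum of completed groups (millions, thousands, ...)
    current = 0  # value of the group being built (always below 1000)
    for word in text.lower().split():
        if word == "and":
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word in SCALES:
            total += current * SCALES[word]
            current = 0
        else:
            raise ValueError(f"not a number word: {word!r}")
    return total + current


def digit_series_to_int(text: str) -> int:
    """Parse a recited series of single digits such as 'two four six eight'."""
    digits = [UNITS[word] for word in text.lower().split()]
    if any(d > 9 for d in digits):
        raise ValueError("expected single digits only")
    return int("".join(str(d) for d in digits))


print(f"{words_to_int('three million five hundred and sixty two'):,}")  # 3,000,562
print(f"{digit_series_to_int('two four six eight'):,}")  # 2,468
```

The `:,` format specifier adds the thousands separators shown in the examples above.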

Time Out

Optionally end speech to text early when no speech is detected for a given number of seconds, without the explicit call to end that is otherwise required.
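The idea behind a silence timeout can be sketched as follows. This is an assumption about how such a check might be structured, not pytater's code: the class name `SilenceTimeout` and its `update` method are hypothetical, and the clock is injectable so the example runs without waiting.

```python
import time
from typing import Callable


class SilenceTimeout:
    """Signal that dictation should end after `timeout` seconds without speech."""

    def __init__(self, timeout: float, clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.last_speech = clock()

    def update(self, heard_speech: bool) -> bool:
        """Record one recognition step; return True once the timeout has elapsed."""
        now = self.clock()
        if heard_speech:
            self.last_speech = now
            return False
        return (now - self.last_speech) >= self.timeout


# Usage with a fake clock so the example runs instantly.
fake_now = [0.0]
watcher = SilenceTimeout(timeout=1.0, clock=lambda: fake_now[0])

fake_now[0] = 0.5
print(watcher.update(heard_speech=True))   # False: speech resets the timer
fake_now[0] = 1.2
print(watcher.update(heard_speech=False))  # False: only about 0.7 s of silence
fake_now[0] = 1.6
print(watcher.update(heard_speech=False))  # True: more than 1.0 s of silence
```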

Output Type

Output can simulate keystroke events (default) or simply print to the standard output.

User configuration

TODO: fill in this section

Suspend/Resume

Initial load time can be an issue for users on slower systems or with some of the larger language models. In this case, suspend/resume can be useful. While suspended, the process is stopped and all data is kept in memory. Audio recording is stopped on suspend and restarted on resume.

See pytater begin --help for details on how to access these options.

Dependencies

Python 3.6.2 (or newer).
An audio recording utility (parec by default).
An input simulation utility (xdotool by default). (This is not necessary if all you're doing is printing dictated words to the terminal.)

Audio Recording Utilities

You may select one of the following tools.

parec command for recording from PulseAudio.
sox command as an alternative; see the guide: Using sox with pytater.
pw-cat command for recording from PipeWire.

Input Simulation Utilities

You may select one of the following input simulation utilities.

xdotool command to simulate input in X11.
ydotool command to simulate input anywhere (X11/Wayland/TTYs). See the setup guide: Using ydotool with pytater.
dotool command to simulate input anywhere (X11/Wayland/TTYs).
wtype command to simulate input in Wayland.
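Before installing, it can help to check which of the utilities listed above are already on your PATH. The following is a small sketch using only the Python standard library; the helper name `find_available` is hypothetical, and the tool names are taken from the lists above.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Tool names taken from the lists above; the first entry in each is the default.
AUDIO_TOOLS = ["parec", "sox", "pw-cat"]
INPUT_TOOLS = ["xdotool", "ydotool", "dotool", "wtype"]


def find_available(
    tools: List[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Dict[str, str]:
    """Return {tool: path} for each tool found on PATH."""
    found = {}
    for tool in tools:
        path = which(tool)
        if path is not None:
            found[tool] = path
    return found


if __name__ == "__main__":
    for label, tools in (("audio recording", AUDIO_TOOLS), ("input simulation", INPUT_TOOLS)):
        available = find_available(tools)
        if available:
            print(f"{label}: {', '.join(available)}")
        else:
            print(f"{label}: none found, install one of {', '.join(tools)}")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.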

Install

With pip (not recommended, as this will install it globally):

pip3 install git+https://github.com/paul-c-hartman/pytater.git

Or alternatively, using uv or pipx:

uv tool install --from git+https://github.com/paul-c-hartman/pytater.git pytater
# or:
pipx install git+https://github.com/paul-c-hartman/pytater.git
# This will add a `pytater` command to your PATH

Then download a model. The complete list of models is available here. To download one:

pytater download # to download the default model, or:
pytater download --model large
# Or by URL:
pytater download --model "https://alphacephei.com/path/to/model"

To test dictation:

pytater begin &
# Start speaking.
pytater end

Remember that it's up to you to bind begin/end/cancel to actions you can easily access (typically keyboard shortcuts).

Details

Typed results never include an Enter/Return key press.
Recording and speech to text are performed in parallel.
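To illustrate what running recording and recognition in parallel can look like, here is a generic producer/consumer sketch using a thread and a queue. It is not taken from pytater: the function names are hypothetical, the "audio" is a list of byte strings, and the recognizer is replaced by a stand-in that simply decodes them.

```python
import queue
import threading
from typing import Iterable, List


def record(chunks: Iterable[bytes], audio_queue: "queue.Queue") -> None:
    """Producer: push audio chunks as they arrive, then a None sentinel."""
    for chunk in chunks:
        audio_queue.put(chunk)
    audio_queue.put(None)


def recognize(audio_queue: "queue.Queue", results: List[str]) -> None:
    """Consumer: process chunks until the sentinel is seen."""
    while True:
        chunk = audio_queue.get()
        if chunk is None:
            break
        # Stand-in for a real recognizer call; here we just decode the bytes.
        results.append(chunk.decode("ascii"))


def run_pipeline(chunks: Iterable[bytes]) -> List[str]:
    audio_queue: "queue.Queue" = queue.Queue()
    results: List[str] = []
    consumer = threading.Thread(target=recognize, args=(audio_queue, results))
    consumer.start()
    record(chunks, audio_queue)  # recording runs while the consumer works
    consumer.join()
    return results


print(run_pipeline([b"hello", b"world"]))  # ['hello', 'world']
```

Because the consumer starts before any audio is produced, chunks are processed as soon as they are queued rather than after recording finishes.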

Examples

Store the result of speech to text as a variable in the shell:

SPEECH="$(pytater begin --timeout=1.0 --output=STDOUT)"
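The same capture can be done from Python by running the CLI as a subprocess. This sketch relies only on the `--timeout` and `--output=STDOUT` options shown in the shell example above; the helper names `build_command` and `dictate` are hypothetical and are not part of pytater's Python API.

```python
import subprocess
from typing import List


def build_command(timeout: float = 1.0) -> List[str]:
    """Build the same command line as the shell example above."""
    return ["pytater", "begin", f"--timeout={timeout}", "--output=STDOUT"]


def dictate(timeout: float = 1.0) -> str:
    """Run pytater and return whatever it printed to standard output."""
    completed = subprocess.run(
        build_command(timeout),
        check=True,               # raise CalledProcessError on a non-zero exit
        stdout=subprocess.PIPE,   # collect stdout instead of printing it
        universal_newlines=True,  # decode bytes to str (works on Python 3.6)
    )
    return completed.stdout.strip()


# Example call (requires pytater, a model, and a microphone):
# speech = dictate(timeout=1.0)
```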

Limitations

Text from VOSK is all lower-case. While the user configuration can be used to set the case of common words like I, this isn't very convenient.
For some users the delay in start-up may be noticeable on systems with slower hard disks, especially when running for the first time (a cold start). This is a consequence of the choice not to use a service that runs in the background. To mitigate this problem, recording begins before any of the speech-to-text components are loaded.
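As an illustration of the lower-case limitation, a word-level replacement like the one below is the kind of fix a user might apply to dictated text. This is only a sketch: the function name `fix_case` and the word list are hypothetical, and how such a function would be hooked into pytater's user configuration is not specified here.

```python
# Hypothetical word list; extend with names and proper nouns you dictate often.
CAPITALIZE = {"i": "I", "i'm": "I'm", "i'll": "I'll", "linux": "Linux"}


def fix_case(text: str) -> str:
    """Capitalize known words and the first letter of the text."""
    words = [CAPITALIZE.get(word, word) for word in text.split(" ")]
    result = " ".join(words)
    return result[:1].upper() + result[1:]


print(fix_case("today i installed it on linux"))  # Today I installed it on Linux
```

A fixed word list does not scale to arbitrary proper nouns, which is why a general solution appears on the roadmap.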

Roadmap

Complete and documented API (partially complete)
Proper extension support using entry points
Reimplement certain features as post-processors
- General solution to capitalize words (proper nouns for example)
Proper logging system
Processing of audio files in addition to live audio
Support Windows & macOS

Project details

Version 0.1.0, released Mar 31, 2026.

Download files

Source distribution: pytater-0.1.0.tar.gz (40.9 kB, tag: Source), uploaded Mar 31, 2026 via twine/6.2.0 CPython/3.14.3, without Trusted Publishing.

Algorithm	Hash digest
SHA256	`da9a6f92f921286f1377d1b47f995fd98376ee5655e555bff031c1da1e296e64`
MD5	`f1679d7e30b3bdca3a0b563585ddcede`
BLAKE2b-256	`df1aad4411ae0eb6eed70448c7c1f68e3a9beecc3bc7982e907959884be39256`

Built distribution: pytater-0.1.0-py3-none-any.whl (43.3 kB, tag: Python 3), uploaded Mar 31, 2026 via twine/6.2.0 CPython/3.14.3, without Trusted Publishing.

Algorithm	Hash digest
SHA256	`98c2cef02e33fa6b2889e5e31acea9274a2ac2573d33e0dd59909edb66eca379`
MD5	`ba208153fe3e139c104e94615e70339a`
BLAKE2b-256	`b99d30587d5cdfb7c30448b25100a1ae00b9f96232936b013002f67914856dde`
