Transcribe long audio files with ASR or use the streaming interface

Project description

Listen: STT Services

This program is composed of two parts:

A server meant to run as a background service, serving ASR models over a socket.
A client that queries the models to transcribe audio from files or directly from a live microphone stream.

The recorded WAV files can be saved for later use (see the -w/--save_wav option).

You can then use the data.helper script to verify the transcription of every wav file and update the CSV training register before you start training a model.

Requirements

python-pyaudio
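
Before installing, it can help to confirm that PyAudio is importable from the interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name has_module is illustrative and is not part of listen.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # The distribution package is often called python-pyaudio; the import name is pyaudio.
    if has_module("pyaudio"):
        print("pyaudio is importable")
    else:
        print("pyaudio not found; install it before installing stt-listen")
```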

Installation

Once you have a working PyAudio for your version of Python, install listen.

pip install stt-listen
# Or from source
pip install git+https://gitlab.com/waser-technologies/technologies/listen.git

Usage

❯ listen --help
usage: listen [-h] [-f FILE] [--aggressive {0,1,2,3}] [-d MIC_DEVICE]
                   [-w SAVE_WAV]

Transcribe long audio files using webRTC VAD or use the streaming interface
from a microphone

options:
  -h, --help            show this help message and exit
  -f FILE, --file FILE  Path to the audio file to run (WAV format)
  --aggressive {0,1,2,3}
                        Determines how aggressive filtering out non-speech is.
                        (Integer between 0-3)
  -d MIC_DEVICE, --mic_device MIC_DEVICE
                        Device input index (Int) as listed by
                        pyaudio.PyAudio.get_device_info_by_index(). If not
                        provided, falls back to PyAudio.get_default_device().
  -w SAVE_WAV, --save_wav SAVE_WAV
                        Path to directory where to save recorded sentences
  --debug               Show debug info
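
To find a value for -d/--mic_device, you can enumerate the devices PyAudio reports and keep those with input channels. The sketch below is an illustration, not part of listen: the function names are hypothetical, and it assumes the usual PyAudio device-info keys (index, name, maxInputChannels). In PyAudio itself, the default input device is reported by get_default_input_device_info().

```python
from typing import Iterable, List, Mapping, Tuple


def input_devices(infos: Iterable[Mapping]) -> List[Tuple[int, str]]:
    """Keep (index, name) for devices that expose at least one input channel."""
    return [
        (int(info["index"]), str(info["name"]))
        for info in infos
        if int(info.get("maxInputChannels", 0)) > 0
    ]


def query_pyaudio() -> List[Mapping]:
    """Collect device-info dicts from PyAudio; return [] if it is unavailable."""
    try:
        import pyaudio
    except ImportError:
        return []
    try:
        pa = pyaudio.PyAudio()
    except OSError:
        # PortAudio could not be initialised (for example, no audio backend).
        return []
    try:
        return [pa.get_device_info_by_index(i) for i in range(pa.get_device_count())]
    finally:
        pa.terminate()


if __name__ == "__main__":
    for index, name in input_devices(query_pyaudio()):
        print(f"{index}: {name}")
```

An index printed by this script is the kind of value that -d expects.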

Start the server

To use listen, you need a socket with STT models at the ready.

For example, to enable it as a systemd user service:

cp ./listen.service.example /usr/lib/systemd/user/listen.service
systemctl --user enable --now listen.service

Models for faster-whisper will be downloaded the first time you run the server.

Or start the server manually with uvicorn.

uvicorn listen.Whisper.as_service:app --reload --port 5063
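
A quick way to check that something is listening on the port used above is to open a TCP connection to it. This is a minimal sketch: the port 5063 comes from the command above, while the host 127.0.0.1 and the function name server_is_up are assumptions. It only checks that the port accepts connections, not that the models are loaded.

```python
import socket


def server_is_up(host: str = "127.0.0.1", port: int = 5063, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("server reachable" if server_is_up() else "server not reachable")
```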

Get authorization to listen

You need to authorize the system to listen first. Change the service configuration as follows.

# ~/.assistant/stt.toml
...
[stt]
is_allowed = true
...

Then start the server and use listen to start transcribing audio.

Use the client

Transcribe a file

You can quickly transcribe a wav file.

❯ listen -f audio.wav
Filename                       Duration(s)         
audio.wav                      3.580               

❯ cat audio.txt
───────┬───────────────────────────────────────────────────────────────────────────────────────────────────────────────
       │ File: audio.txt
───────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1   │ Bonjour.
───────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────
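
The Duration(s) column can be cross-checked with the standard wave module, since the duration of a PCM WAV file is its frame count divided by its frame rate. The sketch below is independent of listen and may not match how listen computes the value; write_silence exists only to produce a file to measure.

```python
import os
import tempfile
import wave


def wav_duration_seconds(path: str) -> float:
    """Duration of a PCM WAV file: number of frames divided by the frame rate."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def write_silence(path: str, seconds: float, rate: int = 16000) -> None:
    """Write mono 16-bit silence; used here only to have a file to measure."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * round(seconds * rate))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo = os.path.join(tmp, "demo.wav")
        write_silence(demo, 0.5)
        print(f"{wav_duration_seconds(demo):.3f}")
```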

Transcribe from a live microphone stream

You can also query the models in real time from a microphone.

❯ listen
You can speak now.
Bonjour.
^C
Stopped listening.

Supported languages

By default, the server uses the system's language according to the environment variable $LANG.

You can manually specify a supported language for the server to use.

LANG="en_US.UTF-8" uvicorn listen.Whisper.as_service:app --reload --port 5063

The server will then look for a suitable model for this language.
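
For illustration, a language code can be derived from a POSIX locale string such as the value of $LANG by dropping the encoding, modifier, and territory. This is only a sketch: how the server actually maps $LANG to a model is not documented here, and both the function name and the "en" fallback are assumptions.

```python
import os


def language_from_lang(value: str, default: str = "en") -> str:
    """Extract a lowercase language code from a POSIX locale string."""
    # Drop the encoding (".UTF-8") and the modifier ("@euro"), then the territory ("_US").
    locale_name = value.split(".", 1)[0].split("@", 1)[0]
    language = locale_name.split("_", 1)[0].lower()
    # "C" and "POSIX" name the default locale and carry no language.
    if not language or language in {"c", "posix"}:
        return default
    return language


if __name__ == "__main__":
    print(language_from_lang(os.environ.get("LANG", "")))
```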

You can also directly specify a model to load.

ASR_MODEL_ID="bofenghuang/whisper-large-v3-french" uvicorn listen.Whisper.as_service:app --reload --port 5063

Project details

Release history

Version	Released
4.0.1b2 pre-release (this version)	Apr 27, 2026
4.0.1b1 pre-release	Jul 2, 2024
4.0.0b1 pre-release	Jul 2, 2024
3.1.1a1 pre-release	Jun 7, 2024
3.1.0a1 pre-release	Jan 25, 2024
3.0.3a4 pre-release	Jan 24, 2024
3.0.3a3 pre-release	Jan 24, 2024
3.0.3a2 pre-release	Jan 24, 2024
3.0.3a1 pre-release	Jan 24, 2024
3.0.2a6 pre-release	Jan 24, 2024
3.0.2a5 pre-release	Jan 24, 2024
3.0.2a4 pre-release	Jan 24, 2024
3.0.2a3 pre-release	Jan 24, 2024
3.0.2a2 pre-release	Jan 24, 2024
3.0.2a1 pre-release	Jan 24, 2024
3.0.1a2 pre-release	Dec 26, 2023
3.0.1a1 pre-release	Dec 26, 2023
3.0.1a0 pre-release	Dec 1, 2023
2.4.2	Nov 24, 2022
2.4.1	Nov 24, 2022
2.3.42	Nov 2, 2022
2.3.41	Nov 2, 2022
2.3.4	Nov 2, 2022
2.3.3	Nov 2, 2022
2.3.2	Nov 2, 2022
2.3.1	Nov 1, 2022
2.3.0	Nov 1, 2022
2.2.0	Oct 23, 2022

Download files

Source Distribution

stt_listen-4.0.1b2.tar.gz (39.7 kB)

Uploaded Apr 27, 2026 Source

Built Distribution

stt_listen-4.0.1b2-py3-none-any.whl (37.6 kB)

Uploaded Apr 27, 2026 Python 3

File details

Details for the file stt_listen-4.0.1b2.tar.gz.

File metadata

Download URL: stt_listen-4.0.1b2.tar.gz
Upload date: Apr 27, 2026
Size: 39.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for stt_listen-4.0.1b2.tar.gz
Algorithm	Hash digest
SHA256	`b9c7fe60ea205dc5ff29ad12c18746146c90825291d73ba67a2e84e643bee8cc`
MD5	`49e0794a0c55af476527c6f6b677496e`
BLAKE2b-256	`b222cbe7fc2769cd0f4a32dbaf25e37a3988bbef44df1fbc005b7cc6a8bb56eb`
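
A downloaded file can be checked against the SHA256 digest listed above with the standard hashlib module. This is a minimal sketch; the function names are illustrative, and the file is assumed to sit in the current directory under its original name.

```python
import hashlib

# SHA256 digest listed above for stt_listen-4.0.1b2.tar.gz.
EXPECTED_SHA256 = "b9c7fe60ea205dc5ff29ad12c18746146c90825291d73ba67a2e84e643bee8cc"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected one, ignoring letter case."""
    return sha256_of_file(path) == expected.lower()
```

For example, matches_expected("stt_listen-4.0.1b2.tar.gz") should return True for an intact download of the source distribution.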

File details

Details for the file stt_listen-4.0.1b2-py3-none-any.whl.

File metadata

Download URL: stt_listen-4.0.1b2-py3-none-any.whl
Upload date: Apr 27, 2026
Size: 37.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.3

File hashes

Hashes for stt_listen-4.0.1b2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1585a5b67722fb44aeb42e21de994dd39eba53d6cc1cf3b35f6a34739fb0f692`
MD5	`4ac3ab4f7f476c85d7b463ad7ecdde60`
BLAKE2b-256	`44c31e7fea07d2e7afb9af7fa46d28a72cfaa3d26bdef0916a7786597d6a60a5`

