
VAD-Enhanced ASR with Word-Level Timestamps


Praasper


Praasper is an Automatic Speech Recognition (ASR) framework designed to help researchers transcribe audio files into word-level text with accurate transcriptions and timestamps.

Overview

In Praasper, we adopt a simple and straightforward pipeline, combining Whisper and Praditor, to extract word-level information from audio files.

Praasper currently supports Mandarin (zh). In the near future we plan to add support for Cantonese (yue) and English (en).

For languages that are not yet supported, you can still obtain word-level annotations with accurate external (utterance) boundaries, although the inner boundaries may be inaccurate due to Whisper's limitations.

How to use

The default model is large-v3-turbo.

I personally recommend using the SOTA model, since processing time is rarely a concern for offline work.

Here is a minimal example:

import praasper

model = praasper.init_model(model_name="large-v3-turbo")  
model.annote(input_path="data")  # The folder where you store .wav
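Since annote expects a folder of .wav files, here is a quick sanity check (plain Python, not a Praasper API) that your input folder is set up; the folder name "data" is just the one from the example above:

```python
from pathlib import Path

# annote() reads .wav files from a folder, like "data" in the example above.
data_dir = Path("data")
data_dir.mkdir(exist_ok=True)  # create the folder if it does not exist yet
wavs = sorted(data_dir.glob("*.wav"))
print(f"{len(wavs)} .wav file(s) found in {data_dir}/")
```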

Here are all the parameters you can pass to the annote method:

model.annote(
    input_path="data",
    sr=12000,  # I use 12000 as default. sr=None will use audio's original sample rate
    language=None,  # "zh" for Mandarin, "yue" for Cantonese, "en" for English, None for automatic language detection
    seg_dur=15.,  # Segment large audio into pieces, 15 seconds as default.
    merge_words=True,  # Merge adjacent words into a single interval
)
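To illustrate what seg_dur controls, here is a minimal sketch of splitting a long recording into fixed-length pieces. This is not Praasper's internal code; the function segment_bounds is made up for this example:

```python
def segment_bounds(total_dur: float, seg_dur: float = 15.0) -> list[tuple[float, float]]:
    """Split [0, total_dur) into consecutive pieces of at most seg_dur seconds."""
    bounds = []
    start = 0.0
    while start < total_dur:
        end = min(start + seg_dur, total_dur)
        bounds.append((start, end))
        start = end
    return bounds

# A 40-second file with the default 15-second segments:
print(segment_bounds(40.0))  # → [(0.0, 15.0), (15.0, 30.0), (30.0, 40.0)]
```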

If you want to see which other models are available (though I suggest using the largest anyway):

import whisper
print(whisper.available_models())

Mechanism

Whisper is used to transcribe the audio file into word-level text. At this stage, speech onsets and offsets exhibit timing deviations on the order of seconds.

Praditor is then applied to perform Voice Activity Detection (VAD), trimming the existing word/character-level timestamps to millisecond precision. It is a Speech Onset Detection (SOD) algorithm we developed for language researchers.

The in-utterance word timestamps are first generated from Whisper's results (i.e., word_timestamps=True) and then recalibrated using neighboring acoustic cues, including drifted frequency peaks, power valleys, and intensity valleys.
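As a rough illustration of the recalibration idea (a toy sketch only, not Praditor's actual algorithm), a coarse boundary can be snapped to the lowest-energy frame within a small search window; refine_boundary and the frame-energy input are hypothetical:

```python
def refine_boundary(energy: list[float], coarse_idx: int, window: int = 3) -> int:
    """Snap a coarse frame index to the lowest-energy frame within ±window frames."""
    lo = max(0, coarse_idx - window)
    hi = min(len(energy), coarse_idx + window + 1)
    return min(range(lo, hi), key=lambda i: energy[i])

# Toy frame-energy contour: the silence (energy valley) sits at index 4,
# but the coarse timestamp landed at index 2.
energy = [0.9, 0.8, 0.7, 0.3, 0.05, 0.4, 0.9]
print(refine_boundary(energy, coarse_idx=2))  # → 4
```

In the real pipeline, the cues are richer (frequency peaks, power valleys, intensity valleys), but the idea is the same: move each coarse Whisper boundary to a nearby acoustically motivated landmark.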

Setup

pip installation

pip install -U praasper

If the installation succeeds and you don't need GPU acceleration, you can stop here.

GPU Acceleration (Windows/Linux)

Whisper automatically detects the best available device, but you still need to install a GPU-enabled build of torch to enable CUDA acceleration.

  • For macOS users, Whisper only supports CPU as the processing device.
  • For Windows/Linux users, the priority order should be: CUDA -> CPU.
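The priority above can be captured in a tiny helper. This is assumed logic that mirrors the rule stated here, not a call into Whisper's own device selection:

```python
def pick_device(cuda_available: bool) -> str:
    # Windows/Linux priority: CUDA -> CPU (macOS is CPU-only for Whisper).
    return "cuda" if cuda_available else "cpu"

print(pick_device(True))   # → cuda
print(pick_device(False))  # → cpu
```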

If you have no experience in installing CUDA, follow the steps below:

First, go to the command line and check the latest CUDA version your system supports:

nvidia-smi

The output should look like this (it means this device supports CUDA up to version 12.9):

| NVIDIA-SMI 576.80                 Driver Version: 576.80         CUDA Version: 12.9     |

Next, go to the NVIDIA CUDA Toolkit page and download the latest version, or whichever version fits your system/needs.

Lastly, install a torch build that matches your CUDA version. Find the correct pip command in this link.

Here is an example for CUDA 12.9:

pip install --force-reinstall torch --index-url https://download.pytorch.org/whl/cu129
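After installing, you can check whether CUDA is actually visible to torch. This check is my suggestion, not part of Praasper, and it degrades gracefully if torch is missing:

```python
def cuda_status() -> str:
    """Report which device torch would use, or note that torch is absent."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    return "cuda" if torch.cuda.is_available() else "cpu"

print(cuda_status())
```

If this prints "cpu" on a CUDA machine, the wheel you installed was likely the CPU-only build.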

(Advanced) uv installation

uv is also highly recommended for much FASTER installation. First, make sure uv is installed in your default environment:

pip install uv

Then, create a virtual environment (e.g., .venv):

uv venv .venv

You should now see a new .venv folder in your project directory. (You might also want to restart the terminal.)

Lastly, install praasper (by prefixing the pip command with uv):

uv pip install -U praasper

For CUDA support:

uv pip install --reinstall torch --index-url https://download.pytorch.org/whl/cu129
# Or whichever version that matches your CUDA version
