An implementation of Nvidia's Parakeet models for Apple Silicon using MLX.

Parakeet MLX

An implementation of the Parakeet models, Nvidia's ASR (Automatic Speech Recognition) models, for Apple Silicon using MLX.

Installation

[!NOTE] Make sure ffmpeg is installed on your system first; otherwise the CLI won't work properly.

Using uv (recommended):

uv add parakeet-mlx -U

Or, for the CLI:

uv tool install parakeet-mlx -U

Using pip:

pip install parakeet-mlx -U

CLI Quick Start

parakeet-mlx <audio_files> [OPTIONS]

Arguments

audio_files: One or more audio files to transcribe (WAV, MP3, etc.)

Options

--model (default: mlx-community/parakeet-tdt-0.6b-v2)
- Hugging Face repository of the model to use
--output-dir (default: current directory)
- Directory to save transcription outputs
--output-format (default: srt)
- Output format (txt/srt/vtt/json/all)
--output-template (default: {filename})
- Template for output filenames; {filename}, {index}, and {date} are supported.
--highlight-words (default: False)
- Enable word-level timestamps in SRT/VTT outputs
--verbose / -v (default: False)
- Print detailed progress information
--chunk-duration (default: 120 seconds)
- Chunk duration in seconds for long audio; set to 0 to disable chunking
--overlap-duration (default: 15 seconds)
- Overlap duration in seconds if using chunking
--fp32 / --bf16 (default: bf16)
- Determine the precision to use
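
To illustrate how an --output-template value could expand, here is a minimal Python sketch. It is not the CLI's actual implementation: the helper name render_output_name is hypothetical, and the meaning of {index} (position of the file in the argument list), the stem-only {filename}, and the date format are all assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path


def render_output_name(template: str, audio_path: str, index: int, now: datetime) -> str:
    """Expand an --output-template style string (illustrative only)."""
    return template.format(
        filename=Path(audio_path).stem,  # assumed: file name without its extension
        index=index,                     # assumed: position of the file in the argument list
        date=now.strftime("%Y%m%d"),     # assumed date format; the CLI's may differ
    )


name = render_output_name("{date}_{index}_{filename}", "talks/intro.mp3", 1, datetime(2025, 5, 8))
print(name)  # 20250508_1_intro
```

Adding {index} or {date} to the template is one way to keep outputs distinct when several input files share the same base name.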

Examples

# Basic transcription
parakeet-mlx audio.mp3

# Multiple files with word-level timestamps in VTT subtitles
parakeet-mlx *.mp3 --output-format vtt --highlight-words

# Generate all output formats
parakeet-mlx audio.mp3 --output-format all

Python API Quick Start

Transcribe a file:

from parakeet_mlx import from_pretrained

model = from_pretrained("mlx-community/parakeet-tdt-0.6b-v2")

result = model.transcribe("audio_file.wav")

print(result.text)

Check timestamps:

from parakeet_mlx import from_pretrained

model = from_pretrained("mlx-community/parakeet-tdt-0.6b-v2")

result = model.transcribe("audio_file.wav")

print(result.sentences)
# [AlignedSentence(text="Hello World.", start=1.01, end=2.04, duration=1.03, tokens=[...])]

Do chunking:

from parakeet_mlx import from_pretrained

model = from_pretrained("mlx-community/parakeet-tdt-0.6b-v2")

result = model.transcribe("audio_file.wav", chunk_duration=60 * 2.0, overlap_duration=15.0)

print(result.sentences)
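
To make the relationship between chunk_duration and overlap_duration concrete, the following sketch computes the time windows that overlapping chunking would cover. It is an assumption-laden illustration, not the library's code: chunk_windows is a hypothetical helper, and the real implementation may place boundaries differently and has its own way of merging tokens in the overlapped regions.

```python
def chunk_windows(total: float, chunk_duration: float = 120.0, overlap_duration: float = 15.0):
    """Return (start, end) windows in seconds covering `total` seconds of audio."""
    if chunk_duration <= 0 or total <= chunk_duration:
        return [(0.0, total)]  # chunking disabled, or the audio fits in one chunk
    if overlap_duration >= chunk_duration:
        raise ValueError("overlap_duration must be smaller than chunk_duration")
    step = chunk_duration - overlap_duration  # each new chunk starts this much later
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk_duration, total)
        windows.append((start, end))
        if end >= total:
            break
        start += step
    return windows


print(chunk_windows(300.0))
# [(0.0, 120.0), (105.0, 225.0), (210.0, 300.0)]
```

Each window shares its first 15 seconds with the end of the previous one, which gives the transcriber context across the cut at the cost of processing that audio twice.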

Timestamp Result

AlignedResult: Top-level result containing the full text and sentences
- text: Full transcribed text
- sentences: List of AlignedSentence
AlignedSentence: Sentence-level alignments with start/end times
- text: Sentence text
- start: Start time in seconds
- end: End time in seconds
- duration: Duration in seconds (end minus start)
- tokens: List of AlignedToken
AlignedToken: Word/token-level alignments with precise timestamps
- text: Token text
- start: Start time in seconds
- end: End time in seconds
- duration: Duration in seconds (end minus start)
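
As an example of consuming these fields, the sketch below turns sentence-level alignments into SRT text using only text, start, and end. The Sentence dataclass is a hypothetical stand-in that mirrors those three AlignedSentence fields so the snippet runs without a model; srt_timestamp and to_srt are illustrative helpers, not part of parakeet_mlx.

```python
from dataclasses import dataclass


@dataclass
class Sentence:  # hypothetical stand-in mirroring AlignedSentence's text/start/end
    text: str
    start: float
    end: float


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp style used by SRT."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(sentences) -> str:
    """Build SRT cues: a counter, a time range, then the sentence text."""
    blocks = []
    for i, s in enumerate(sentences, start=1):
        blocks.append(f"{i}\n{srt_timestamp(s.start)} --> {srt_timestamp(s.end)}\n{s.text.strip()}\n")
    return "\n".join(blocks)


print(to_srt([Sentence("Hello World.", 1.01, 2.04)]))
```

With a real transcription, passing result.sentences to to_srt should work the same way, since the helper only reads the text, start, and end attributes described above.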

Low-Level API

To transcribe a log-mel spectrogram directly, you can do the following:

import mlx.core as mx
from parakeet_mlx import from_pretrained
from parakeet_mlx.audio import get_logmel, load_audio

# Load the model; its preprocessor config supplies the sample rate and mel settings
model = from_pretrained("mlx-community/parakeet-tdt-0.6b-v2")

# Load and preprocess audio manually
audio = load_audio("audio.wav", model.preprocessor_config.sample_rate)
mel = get_logmel(audio, model.preprocessor_config)

# Generate transcription with alignments
# Accepts both [batch, sequence, feat] and [sequence, feat]
# `alignments` is a list of AlignedResult, whether or not you passed a batch dimension
alignments = model.generate(mel)

Todo

Add CLI for better usability
Add support for other Parakeet variants
Streaming input (RTFx is currently MUCH higher than 1, so the current speed should already be sufficient for streaming)
Option to enhance chosen words' accuracy
Chunking with continuous context (this might be achievable by preserving decoder state, though that is just speculation)
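
For readers unfamiliar with the RTFx figure mentioned in the streaming item: it is the inverse real-time factor, the duration of the audio divided by the time taken to process it, so values above 1 mean faster than real time. A minimal sketch of the arithmetic follows; the function name and the numbers are hypothetical and are not measurements of parakeet-mlx.

```python
def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Inverse real-time factor: seconds of audio handled per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Hypothetical numbers: 10 minutes of audio transcribed in 12 seconds
print(rtfx(600.0, 12.0))  # 50.0
```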

Acknowledgments

Thanks to Nvidia for training these awesome models, writing cool papers, and providing a nice implementation.
Thanks to the MLX project for providing the framework that made this implementation possible.
Thanks to audiofile, audresample, numpy, and librosa for audio processing.
Thanks to dacite for config management.

License

Apache 2.0
