
Fast local transcription for large lectures with NVIDIA Parakeet ONNX


fast-transcript

fast-transcript is a local lecture transcription CLI built to beat the usual Apple Silicon tradeoff: either fast but flaky, or accurate but painfully slow.

On the development machine, this project handled roughly 30 minutes of lecture audio in about 2 minutes* while staying around 2.51 GB peak RSS on the long run. On the same local test set, it beat mlx-whisper, insanely-fast-whisper, and parakeet-mlx.

* Benchmark run on a MacBook Pro M1. The exact long-run measurement was 29m47s of Portuguese lecture audio transcribed in about 2m14s (13.38x real-time).
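
For context, the "x real-time" figures here are audio duration divided by wall-clock transcription time: 29m47s is 1787 seconds, so 13.38x real-time corresponds to roughly 134 seconds (about 2m14s) of processing.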

The CLI binary is called fscript:

fscript lecture.mp3

That is the whole point of this project. One command. Large audio. No babysitting.

Why this exists

I wanted a tool for transcribing long classes and lectures quickly on a laptop while still using the computer for normal work.

The existing options I tested had clear problems for this use case:

  • insanely-fast-whisper was far too slow on this Mac once it fell back to CPU
  • mlx-whisper was solid, but slower than I wanted for long lecture workflows
  • parakeet-mlx had excellent memory numbers, but drifted into English on longer Portuguese segments unless heavily tuned

fast-transcript packages the ONNX Parakeet path that held up best in practice.

What it does

  • downloads the default Parakeet TDT 0.6B v3 int8 model automatically if it is missing
  • stores the extracted model in a persistent per-user application data directory
  • keeps the downloaded tarball in the user cache directory
  • accepts mp3, wav, and other audio formats supported by ffmpeg
  • accepts remote http(s) video/audio URLs supported by yt-dlp
  • prefers platform-provided manual subtitles for remote URLs when available
  • falls back to downloading remote audio and transcribing locally when only auto-captions exist or no captions exist
  • auto-converts unsupported audio to 16 kHz mono PCM16 WAV
  • uses 120s chunks with 2s overlap by default (see the chunking sketch after this list)
  • writes <audio>.transcript.json next to the input unless you choose a different output path
  • stays quiet by default: concise progress in the terminal, transcript JSON on disk
  • shows a spinner and chunk progress bar on interactive terminals
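
As a rough illustration of the default chunking strategy (120 s windows, 2 s overlap), the sketch below shows one way the chunk boundaries could be computed. It is not the actual implementation; the function and variable names are made up for this example.

// Illustrative sketch only: fixed-size chunks with a small overlap,
// matching the default 120 s / 2 s configuration described above.
fn chunk_bounds(total_secs: f64, chunk_secs: f64, overlap_secs: f64) -> Vec<(f64, f64)> {
    if chunk_secs <= 0.0 {
        // --chunk-seconds 0 presumably disables chunking: one chunk for the whole file.
        return vec![(0.0, total_secs)];
    }
    // With the defaults, each new chunk starts 118 s after the previous one.
    let step = (chunk_secs - overlap_secs).max(1.0);
    let mut bounds = Vec::new();
    let mut start = 0.0;
    loop {
        let end = (start + chunk_secs).min(total_secs);
        bounds.push((start, end));
        if end >= total_secs {
            break;
        }
        start += step;
    }
    bounds
}

fn main() {
    // A 29m47s lecture (1787 s) with the defaults yields 16 overlapping chunks.
    let bounds = chunk_bounds(1787.0, 120.0, 2.0);
    println!("{} chunks, last = {:?}", bounds.len(), bounds.last().unwrap());
}

The overlap presumably exists so that speech falling on a chunk boundary is fully covered by at least one chunk before the per-chunk transcripts are merged.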

Install

Requirements

  • ffmpeg
  • ffprobe
  • yt-dlp for remote URLs (installed directly, or available via uvx yt-dlp)

Install with Homebrew

brew install brenorb/fast-transcript/fast-transcript

On Apple Silicon macOS, Homebrew now installs fast-transcript from a proper bottle. On Linux x86_64, the formula still installs from the published release binary.

PyPI / uv

The PyPI package name for this project is fscript, so the target UX is:

uvx fscript lecture.mp3
uv tool install fscript

The repo already includes platform wheel builds for:

  • macOS arm64
  • Linux x86_64

PyPI publishing is currently enabled for:

  • macOS arm64

See docs/pypi-publishing.md for the release workflow details.

Install a prebuilt binary directly

Download the archive for your platform from the GitHub Releases page, then put fscript on your PATH.

Build from source

cargo install --git https://github.com/brenorb/fast-transcript

Or from a local clone:

cargo install --path .

Quick start

fscript lecture.mp3
fscript https://www.youtube.com/watch?v=QSdh8Gj0mEg

This will:

  1. ensure the default model exists
  2. normalize the audio if needed (sketched after this list)
  3. transcribe with the default chunking strategy
  4. write lecture.transcript.json
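
Step 2 is the 16 kHz mono PCM16 WAV conversion mentioned under "What it does". A minimal sketch of what that normalization amounts to, shelling out to ffmpeg, is shown below; the exact arguments fscript uses may differ, and the function name is made up.

use std::process::Command;

// Illustrative sketch only: convert any ffmpeg-readable input into the
// 16 kHz mono PCM16 WAV format the transcription model consumes.
fn normalize_to_wav(input: &str, output: &str) -> std::io::Result<()> {
    let status = Command::new("ffmpeg")
        .args([
            "-y",                // overwrite an existing output file
            "-i", input,         // any container/codec ffmpeg understands
            "-ac", "1",          // downmix to mono
            "-ar", "16000",      // resample to 16 kHz
            "-c:a", "pcm_s16le", // 16-bit signed little-endian PCM
            output,
        ])
        .status()?;
    if !status.success() {
        return Err(std::io::Error::new(
            std::io::ErrorKind::Other,
            "ffmpeg exited with an error",
        ));
    }
    Ok(())
}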

For remote URLs, the default flow is (a rough sketch follows the list):

  1. inspect the URL with yt-dlp
  2. use manual subtitles directly when the platform provides them
  3. otherwise download the remote audio and run the normal local transcription pipeline
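
The sketch below illustrates the manual-subtitle check in that flow. It assumes yt-dlp's JSON metadata output (yt-dlp -J), where manually authored subtitles appear under "subtitles" and auto-generated ones under "automatic_captions", plus the serde_json crate for parsing; the real logic in fscript may differ, and the function name is made up.

use std::process::Command;

// Illustrative sketch only: ask yt-dlp for the video's metadata and check
// whether the platform provides manually authored subtitles.
fn remote_has_manual_subtitles(url: &str) -> Result<bool, Box<dyn std::error::Error>> {
    let output = Command::new("yt-dlp").args(["-J", url]).output()?;
    if !output.status.success() {
        return Err("yt-dlp failed to inspect the URL".into());
    }
    let info: serde_json::Value = serde_json::from_slice(&output.stdout)?;
    // "subtitles" holds manual tracks; "automatic_captions" holds auto-generated ones.
    let has_manual = info
        .get("subtitles")
        .and_then(|s| s.as_object())
        .map(|tracks| !tracks.is_empty())
        .unwrap_or(false);
    Ok(has_manual)
}

When no manual subtitles are found, downloading the audio (the equivalent of yt-dlp -x) and feeding it to the normal local pipeline is the fallback, matching step 3.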

Usage

fscript <audio-or-url> [output.json]
fscript <audio-or-url> --stdout
fscript <audio-or-url> -
fscript --version

Optional overrides:

fscript lecture.wav custom-output.json
fscript lecture.wav --stdout
fscript lecture.wav --chunk-seconds 180 --chunk-overlap-seconds 3
fscript lecture.wav --chunk-seconds 0
fscript lecture.wav --model-dir ./models/parakeet/custom-copy
fscript lecture.wav --model-package ./models/parakeet-v3-int8.tar.gz
fscript lecture.wav --model-url https://example.com/parakeet-v3-int8.tar.gz
fscript https://www.youtube.com/watch?v=QSdh8Gj0mEg
fscript https://www.youtube.com/watch?v=QSdh8Gj0mEg --prefer-local-for-remote

Environment overrides:

  • FSCRIPT_MODEL_DIR
  • FSCRIPT_MODEL_PACKAGE
  • FSCRIPT_MODEL_URL

Defaults

  • model dir:
    • macOS: ~/Library/Application Support/fast-transcript/models/parakeet-tdt-0.6b-v3-int8
    • Linux: ~/.local/share/fast-transcript/models/parakeet-tdt-0.6b-v3-int8
  • model package cache:
    • macOS: ~/Library/Caches/fast-transcript/parakeet-v3-int8.tar.gz
    • Linux: ~/.cache/fast-transcript/parakeet-v3-int8.tar.gz
  • model URL: https://huggingface.co/brenorb/parakeet-tdt-0.6b-v3-int8-onnx-bundle/resolve/main/parakeet-v3-int8.tar.gz?download=1
  • chunk seconds: 120
  • chunk overlap seconds: 2
  • output path: <audio>.transcript.json
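
The per-platform defaults above are the standard user data and cache directories. The sketch below shows how such paths can be resolved, assuming the dirs crate and assuming that FSCRIPT_MODEL_DIR simply takes precedence over the platform default; the actual resolution logic may differ.

use std::path::PathBuf;

// Illustrative sketch only. dirs::data_dir() is ~/Library/Application Support
// on macOS and ~/.local/share on Linux; dirs::cache_dir() is ~/Library/Caches
// and ~/.cache respectively, matching the defaults listed above.
fn default_model_dir() -> Option<PathBuf> {
    // Assumption: an explicit FSCRIPT_MODEL_DIR wins over the platform default.
    if let Ok(dir) = std::env::var("FSCRIPT_MODEL_DIR") {
        return Some(PathBuf::from(dir));
    }
    dirs::data_dir().map(|d| {
        d.join("fast-transcript")
            .join("models")
            .join("parakeet-tdt-0.6b-v3-int8")
    })
}

fn default_package_cache() -> Option<PathBuf> {
    dirs::cache_dir().map(|d| d.join("fast-transcript").join("parakeet-v3-int8.tar.gz"))
}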

Benchmarks

These are local development benchmarks, not universal claims. They were run on the same Apple Silicon Mac used during development, using a Portuguese lecture clip as part of the same broader workflow comparison.

2-minute lecture clip

Engine | Setup | Speed (x real-time) | Peak RSS | Notes
fast-transcript | Parakeet ONNX | 13.06x | 2.25 GB | Best balance of speed and reliability
mlx-whisper | whisper-large-v3-turbo | 5.25x | 1.70 GB | Good quality, slower
parakeet-mlx | tuned for quality | 4.92x | 1.29 GB | Needed substantial tuning
parakeet-mlx | raw greedy | 10.16x | 0.57 GB | Faster on short audio, drifted into English on longer PT-BR
insanely-fast-whisper | whisper-large-v3 (CPU) | 0.30x | 6.18 GB | Accurate, but too slow here
insanely-fast-whisper | MPS + fallback | 0.31x | 3.04 GB | Small gain, same general problem

Long lecture run

Engine | Audio | Speed (x real-time) | Peak RSS | Notes
fast-transcript | 29m47s lecture | 13.38x | 2.51 GB | Stable long run with default chunking

Practical reading

  • fast-transcript was not the absolute fastest thing we saw in every synthetic case
  • it was the best result once long Portuguese lecture audio, transcript quality, and unattended runs all mattered at the same time
  • that is the target workload for this repo

Output format

The output is JSON and includes (see the illustrative sketch after this list):

  • merged transcript text
  • model path
  • original input path
  • prepared WAV path
  • whether a remote URL used manual subtitles or the local model
  • whether ffmpeg normalization was used
  • load time
  • transcribe time
  • chunk configuration
  • per-chunk timing
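
For a rough sense of the shape, a hypothetical serde definition covering those fields is sketched below. The field names are invented for this example and are not the actual schema; inspect a generated .transcript.json for the real keys.

use serde::Serialize;

// Hypothetical shape only; the real field names may differ.
#[derive(Serialize)]
struct TranscriptOutput {
    text: String,                // merged transcript text
    model_path: String,          // model directory used for inference
    input_path: String,          // original input (file path or URL)
    prepared_wav_path: String,   // normalized 16 kHz mono WAV
    used_remote_subtitles: bool, // manual platform subtitles vs. local model
    ffmpeg_normalized: bool,     // whether ffmpeg normalization ran
    load_seconds: f64,           // model load time
    transcribe_seconds: f64,     // total transcription time
    chunk_seconds: f64,          // chunk configuration
    chunk_overlap_seconds: f64,
    chunks: Vec<ChunkTiming>,    // per-chunk timing
}

#[derive(Serialize)]
struct ChunkTiming {
    start_seconds: f64,
    end_seconds: f64,
    transcribe_seconds: f64,
}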

Motivation

This project is optimized for large lectures and classes, including files in the 30-minute to 2-hour range, where:

  • startup friction matters
  • background CPU usage matters
  • memory spikes matter
  • brittle hand-tuned command lines become a tax

The design goal is not “highest benchmark on a cherry-picked GPU server”. The goal is “transcribe big local lecture audio fast enough that you actually keep using it”.

Inspiration

This project was heavily informed by prior open-source work. In particular, the ONNX Parakeet path here was shaped by the packaging and implementation ideas used in Handy and GLaDOS.

Default model bundle

The default auto-download bundle is published in our own Hugging Face model repository, brenorb/parakeet-tdt-0.6b-v3-int8-onnx-bundle (the same repository the default model URL above points to).

This keeps the default install path tied to the exact validated tarball instead of an app-specific blob host.

License

MIT



Download files


Source Distributions

No source distribution files are available for this release.

Built Distribution


fscript-0.2.8-py3-none-macosx_10_13_universal2.whl (9.7 MB)

Uploaded for Python 3, macOS 10.13+ universal2 (ARM64, x86-64)

File details

Hashes for fscript-0.2.8-py3-none-macosx_10_13_universal2.whl
Algorithm | Hash digest
SHA256 | 8fe421e8df26f8ddf0b01b72784763b7541fed72a062ac007a98d936a6d97e34
MD5 | fbe354dc295e1da6f1c8760f7cf72fa6
BLAKE2b-256 | fc2776cbf8bd201c9a450d4fe4acaf896c5abc5e7fcd7e40c3b0db0a86f275ee


Provenance

The following attestation bundles were made for fscript-0.2.8-py3-none-macosx_10_13_universal2.whl:

Publisher: release.yml on brenorb/fast-transcript

