REST API for audio transcription using parakeet-mlx
Paratran
CLI, REST API, and MCP server for audio transcription on Apple Silicon, powered by parakeet-mlx.
The default model (parakeet-tdt-0.6b-v3) achieves 6.34% average WER across 8 English benchmarks and supports 25 languages. Runs ~30x faster than Whisper on Apple Silicon via MLX.
Requirements
- macOS with Apple Silicon (M1/M2/M3/M4)
- Python 3.11+
- ~2 GB memory for the default model
Quick Start
Transcribe audio files directly:
uvx paratran recording.wav
Or start the REST API server and transcribe via client mode (no model reload per file):
uvx paratran serve
uvx paratran -s http://localhost:8000 recording.wav
Install
uv (recommended)
uv tool install paratran
pip
pip install paratran
From source
git clone https://github.com/briansunter/paratran.git
cd paratran
uv sync
uv run paratran
CLI Usage
# Transcribe a single file
paratran recording.wav
# Transcribe multiple files with verbose output
paratran -v file1.wav file2.mp3 file3.m4a
# Output as SRT subtitles
paratran --output-format srt recording.wav
# Output all formats (txt, json, srt, vtt)
paratran --output-format all --output-dir ./output recording.wav
# Use beam search decoding
paratran --decoding beam recording.wav
# Custom model and cache directory
paratran --model mlx-community/parakeet-tdt-1.1b-v2 --cache-dir /Volumes/Storage/models recording.wav
Client Mode
Use --server / -s to send files to a running paratran server instead of transcribing locally. The server loads the model once at startup, so each invocation skips the model loading time.
# Start the server (loads model once)
paratran serve
# Transcribe via the server
paratran -s http://localhost:8000 recording.wav
# All the same options work
paratran -s http://localhost:8000 --output-format all --output-dir ./output -v recording.wav
# Set the server URL via environment variable
export PARATRAN_SERVER=http://localhost:8000
paratran recording.wav # automatically uses the server
CLI Options
| Flag | Default | Description |
|---|---|---|
| -s, --server | | URL of a running paratran server |
| --model | mlx-community/parakeet-tdt-0.6b-v3 | HF model ID or local path |
| --cache-dir | HuggingFace default | Model cache directory |
| --output-dir | . | Output directory |
| --output-format | txt | txt, json, srt, vtt, or all |
| --decoding | greedy | greedy or beam |
| --chunk-duration | 120 | Chunk duration in seconds (0 to disable) |
| --overlap-duration | 15 | Overlap between chunks (seconds) |
| --beam-size | 5 | Beam size (beam decoding) |
| --length-penalty | 0.013 | Length penalty (beam decoding) |
| --patience | 3.5 | Patience (beam decoding) |
| --duration-reward | 0.67 | Duration reward (beam decoding) |
| --max-words | | Max words per sentence |
| --silence-gap | | Split at silence gaps (seconds) |
| --max-duration | | Max sentence duration (seconds) |
| --fp32 | | Use FP32 precision instead of BF16 |
| -v | | Verbose output |
Environment variables: PARATRAN_MODEL, PARATRAN_MODEL_DIR, PARATRAN_SERVER.
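As a rough illustration of how --chunk-duration and --overlap-duration interact, the sketch below lays out the windows a long recording would be cut into, assuming each window starts (chunk minus overlap) seconds after the previous one. chunk_windows is a hypothetical helper, not part of paratran, and how parakeet-mlx aligns and merges the overlapping regions is not covered here.

```python
def chunk_windows(total, chunk=120.0, overlap=15.0):
    """Return (start, end) windows in seconds covering `total` seconds of audio.

    Illustrative only: assumes a fixed stride of chunk - overlap.
    """
    total = float(total)
    # A chunk duration of 0 disables chunking; short audio needs one window.
    if chunk <= 0 or total <= chunk:
        return [(0.0, total)]
    if overlap >= chunk:
        raise ValueError("overlap must be smaller than chunk")

    step = chunk - overlap
    windows = []
    start = 0.0
    while True:
        end = min(start + chunk, total)
        windows.append((start, end))
        if end >= total:
            break
        start += step
    return windows


if __name__ == "__main__":
    # A 5-minute recording with the CLI defaults (120 s chunks, 15 s overlap).
    for start, end in chunk_windows(300):
        print(f"{start:6.1f} -> {end:6.1f}")
```

With the defaults, consecutive windows share 15 seconds of audio, which gives the decoder context on both sides of each boundary.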
REST API Server
# Start server with default settings
paratran serve
# Custom host, port, and model cache
paratran serve --host 127.0.0.1 --port 9000 --cache-dir /Volumes/Storage/models
API
GET /health
curl http://localhost:8000/health
{
  "status": "ok",
  "model": "mlx-community/parakeet-tdt-0.6b-v3",
  "model_dir": "/Volumes/Storage/models"
}
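A script that sends files to the server may want to confirm it is up first. The following is a minimal sketch using only the Python standard library; check_health and is_healthy are hypothetical helper names, and the only assumption about the payload is the "status": "ok" field shown above.

```python
import json
import urllib.error
import urllib.request


def is_healthy(payload):
    """True if a decoded /health payload reports status "ok"."""
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(base_url, timeout=2.0):
    """Query GET /health and return True only for a well-formed "ok" reply."""
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return False


if __name__ == "__main__":
    print(check_health("http://localhost:8000"))
```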
POST /transcribe
Upload an audio file (wav, mp3, flac, m4a, ogg, webm):
curl -X POST http://localhost:8000/transcribe -F "file=@recording.m4a"
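The same upload can be made from Python. The sketch below uses only the standard library and builds the multipart/form-data body by hand; the form field name "file" matches the curl command above, and keyword arguments are forwarded as query parameters. build_transcribe_request and transcribe are hypothetical helpers, and sending the part as application/octet-stream is an assumption about what the server accepts.

```python
import json
import urllib.parse
import urllib.request
import uuid
from pathlib import Path


def build_transcribe_request(base_url, filename, data, **params):
    """Build a POST /transcribe request with `data` as the "file" form field."""
    query = urllib.parse.urlencode(
        {k: v for k, v in params.items() if v is not None}
    )
    url = f"{base_url.rstrip('/')}/transcribe"
    if query:
        url += f"?{query}"

    boundary = uuid.uuid4().hex
    disposition = (
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
    )
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        disposition.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        data,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def transcribe(base_url, path, **params):
    """Upload an audio file and return the decoded JSON response."""
    p = Path(path)
    req = build_transcribe_request(base_url, p.name, p.read_bytes(), **params)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    result = transcribe("http://localhost:8000", "recording.m4a", decoding="beam")
    print(result["text"])
```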
Optional query parameters:
| Parameter | Default | Description |
|---|---|---|
| decoding | greedy | greedy or beam |
| beam_size | 5 | Beam size (beam decoding) |
| length_penalty | 1.0 | Length penalty (beam decoding) |
| patience | 1.0 | Patience (beam decoding) |
| duration_reward | 0.7 | Duration reward (beam decoding) |
| max_words | | Max words per sentence |
| silence_gap | | Split at silence gaps (seconds) |
| max_duration | | Max sentence duration (seconds) |
| chunk_duration | | Chunk duration for long audio (seconds) |
| overlap_duration | 15.0 | Overlap between chunks (seconds) |
| fp32 | false | Use FP32 instead of BF16 |
Response:
{
  "text": "Hello world, this is a test.",
  "duration": 3.52,
  "processing_time": 0.176,
  "sentences": [
    {
      "text": "Hello world, this is a test.",
      "start": 0.0,
      "end": 3.52,
      "tokens": [
        { "text": "Hello", "start": 0.0, "end": 0.48 },
        { "text": " world", "start": 0.48, "end": 0.8 }
      ]
    }
  ]
}
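The sentence-level timestamps in this response are enough to build subtitles on the client side. The sketch below turns a decoded response into SRT text. It is not paratran's own SRT writer (the CLI covers that with --output-format srt), and to_srt and srt_timestamp are hypothetical helper names.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(response):
    """Render one numbered SRT cue per entry in response["sentences"]."""
    cues = []
    for index, sentence in enumerate(response["sentences"], start=1):
        start = srt_timestamp(sentence["start"])
        end = srt_timestamp(sentence["end"])
        cues.append(f"{index}\n{start} --> {end}\n{sentence['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(cues)


if __name__ == "__main__":
    example = {
        "sentences": [
            {"text": "Hello world, this is a test.", "start": 0.0, "end": 3.52}
        ]
    }
    print(to_srt(example))
```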
Interactive API docs are available at http://localhost:8000/docs.
MCP Server
Paratran includes an MCP server so Claude Code, Claude Desktop, or any MCP client can transcribe audio files directly. Supports both stdio and streamable HTTP transports.
Claude Code (stdio)
Add to .claude/settings.json:
{
  "mcpServers": {
    "paratran": {
      "command": "uvx",
      "args": ["--from", "paratran", "paratran-mcp"]
    }
  }
}
Claude Desktop (stdio)
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "paratran": {
      "command": "uvx",
      "args": ["--from", "paratran", "paratran-mcp"]
    }
  }
}
Optionally set PARATRAN_MODEL_DIR in the env block to customize the model cache location.
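To show where the env block sits, here is a small sketch that generates the stdio configuration above with PARATRAN_MODEL_DIR set. mcp_config is a hypothetical helper, and the path is only an example borrowed from earlier in this README.

```python
import json


def mcp_config(model_dir=None):
    """Build the mcpServers entry for paratran, optionally with an env block."""
    server = {
        "command": "uvx",
        "args": ["--from", "paratran", "paratran-mcp"],
    }
    if model_dir:
        # The env block sits alongside "command" and "args".
        server["env"] = {"PARATRAN_MODEL_DIR": model_dir}
    return {"mcpServers": {"paratran": server}}


if __name__ == "__main__":
    print(json.dumps(mcp_config("/Volumes/Storage/models"), indent=2))
```

The printed JSON can be pasted into either configuration file in place of the plain entry.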
Streamable HTTP
Run the MCP server over HTTP for remote or multi-client access:
paratran-mcp --transport streamable-http --host 0.0.0.0 --port 8000
The MCP endpoint is available at http://localhost:8000/mcp.
MCP Tool
The transcribe tool accepts a file path and all the same options as the REST API (decoding, beam search, sentence splitting, chunking, precision).
License
MIT