# vcon-mac-wtf

MLX Whisper transcription server for Apple Silicon with WTF/vCon support.
Runs OpenAI's Whisper speech-to-text model locally on Apple Silicon (M1/M2/M3/M4) using the MLX framework for GPU-accelerated inference. Outputs transcriptions in World Transcription Format (WTF) and can enrich vCon documents with transcription analysis.
## Features
- GPU-accelerated Whisper inference on Apple Silicon via MLX
- OpenAI-compatible API (`POST /v1/audio/transcriptions`) — drop-in replacement
- vCon-native API (`POST /transcribe`) — accepts a vCon and returns it enriched with a WTF transcription
- Word-level timestamps
- Multiple model sizes (tiny through large-v3)
- Integrates with wtf-server as the `mlx-whisper` provider
## Prerequisites
- Apple Silicon Mac (M1/M2/M3/M4)
- Python 3.12+
- uv (recommended) or pip
- ffmpeg (`brew install ffmpeg`)
## Quickstart

```bash
# Install (choose one)
pip install vcon-mac-wtf                       # PyPI
brew install your-username/vcon/vcon-mac-wtf   # Homebrew tap

# Or from source
uv sync --all-extras

# Start the server (downloads model on first run)
vcon-mac-wtf
# or: make run
# or: uv run uvicorn vcon_mac_wtf.main:app --host 0.0.0.0 --port 8000
```
## API Endpoints

### Health

```bash
curl http://localhost:8000/health
curl http://localhost:8000/health/ready
```
### Transcribe Audio (OpenAI-compatible)

```bash
curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -F "file=@recording.wav" \
  -F "model=mlx-community/whisper-turbo" \
  -F "response_format=verbose_json"
```
Response formats: `json`, `text`, `verbose_json`, `wtf`
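With `response_format=verbose_json`, word-level timestamps arrive nested inside segments. Here is a minimal sketch of flattening them into `(word, start, end)` triples, assuming an OpenAI-style payload shape; the `sample` dict below is hand-written for illustration, not a captured response:

```python
# Hand-written sample of a verbose_json-style response (assumed shape).
sample = {
    "text": "hello world",
    "segments": [
        {
            "start": 0.0,
            "end": 1.2,
            "text": "hello world",
            "words": [
                {"word": "hello", "start": 0.0, "end": 0.5},
                {"word": "world", "start": 0.6, "end": 1.2},
            ],
        }
    ],
}

def word_timestamps(response: dict) -> list[tuple[str, float, float]]:
    """Flatten (word, start, end) triples across all segments."""
    return [
        (w["word"], w["start"], w["end"])
        for seg in response.get("segments", [])
        for w in seg.get("words", [])
    ]

print(word_timestamps(sample))
# → [('hello', 0.0, 0.5), ('world', 0.6, 1.2)]
```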
### Transcribe vCon

```bash
curl -X POST http://localhost:8000/transcribe \
  -H "Content-Type: application/json" \
  -d @my_vcon.json
```
Returns the vCon with WTF transcription analysis appended.
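Conceptually, the enrichment appends an entry to the vCon's `analysis` array. The sketch below is hypothetical: the field names follow the vCon draft spec's general conventions (`type`, `dialog`, `vendor`, `schema`, `body`, `encoding`), and may not match this server's exact output:

```python
import copy

def attach_transcription(vcon: dict, transcript: dict, dialog_index: int = 0) -> dict:
    """Return a copy of the vCon with a WTF transcript analysis appended.

    Hypothetical sketch; field names are assumptions, not this server's schema.
    """
    enriched = copy.deepcopy(vcon)  # leave the caller's vCon untouched
    enriched.setdefault("analysis", []).append({
        "type": "transcript",
        "dialog": dialog_index,        # index into the vCon's dialog array
        "vendor": "mlx-whisper",       # assumed vendor label
        "schema": "wtf",               # World Transcription Format
        "body": transcript,
        "encoding": "json",
    })
    return enriched

vcon = {"vcon": "0.0.1", "dialog": [{"type": "recording"}]}
out = attach_transcription(vcon, {"text": "hello"})
print(len(out["analysis"]))  # → 1
```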
### List Models

```bash
curl http://localhost:8000/v1/models
```
## Configuration

| Variable | Default | Description |
|---|---|---|
| `HOST` | `0.0.0.0` | Bind address |
| `PORT` | `8000` | Server port |
| `LOG_LEVEL` | `info` | Logging level |
| `MLX_MODEL` | `mlx-community/whisper-turbo` | Default Whisper model |
| `PRELOAD_MODEL` | `true` | Load model at startup |
| `MAX_AUDIO_SIZE_MB` | `100` | Max upload size |
| `HF_TOKEN` | - | HuggingFace token for faster model downloads (optional) |
Copy `.env.example` to `.env` to customize. Add `HF_TOKEN=hf_xxx` to `.env` before the first run to speed up model downloads and avoid rate limits.
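As an illustration of how the table above might map to settings, here is a sketch that reads each documented variable with its default. `load_settings` is a hypothetical helper written for this README, not the server's actual configuration code:

```python
import os

def load_settings(env=os.environ) -> dict:
    """Read the documented environment variables, applying defaults."""
    return {
        "host": env.get("HOST", "0.0.0.0"),
        "port": int(env.get("PORT", "8000")),
        "log_level": env.get("LOG_LEVEL", "info"),
        "model": env.get("MLX_MODEL", "mlx-community/whisper-turbo"),
        # Booleans arrive as strings from the environment
        "preload_model": env.get("PRELOAD_MODEL", "true").lower() == "true",
        "max_audio_size_mb": int(env.get("MAX_AUDIO_SIZE_MB", "100")),
        "hf_token": env.get("HF_TOKEN"),  # optional, no default
    }

print(load_settings({})["port"])  # → 8000
```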
## Available Models

| Short Name | Model ID | Parameters |
|---|---|---|
| `tiny` | `mlx-community/whisper-tiny` | 39M |
| `base` | `mlx-community/whisper-base` | 74M |
| `small` | `mlx-community/whisper-small` | 244M |
| `medium` | `mlx-community/whisper-medium` | 769M |
| `large-v3` | `mlx-community/whisper-large-v3` | 1.55B |
| `turbo` | `mlx-community/whisper-turbo` | 809M |
## Integration with wtf-server

This server works as a provider for the existing TypeScript wtf-server. Set the following in the wtf-server `.env`:

```bash
ASR_PROVIDER=mlx-whisper
MLX_WHISPER_URL=http://localhost:8000
MLX_WHISPER_MODEL=mlx-community/whisper-turbo
```
## Publishing

PyPI:

```bash
pip install build twine
python -m build && twine upload dist/*
```

Homebrew: see `homebrew/README.md` for tap setup. After publishing to PyPI, update the formula's `url` and `sha256`, then add it to your tap.
## Development

```bash
make install    # Install all deps, including dev
make dev        # Run with auto-reload
make test       # Run unit tests
make test-all   # Run all tests, including integration
make lint       # Lint with ruff
make format     # Format with ruff
```
## Testing

```bash
# Unit tests (no MLX required; uses mocks)
make test

# Integration tests (requires Apple Silicon + MLX)
make test-all

# Coverage
make test-cov
```