
vcon-mac-wtf

MLX Whisper transcription server for Apple Silicon with WTF/vCon support.

Runs OpenAI's Whisper speech-to-text model locally on Apple Silicon (M1/M2/M3/M4) using the MLX framework for GPU-accelerated inference. Outputs transcriptions in World Transcription Format (WTF) and can enrich vCon documents with transcription analysis.

Features

  • GPU-accelerated Whisper inference on Apple Silicon via MLX
  • OpenAI-compatible API (POST /v1/audio/transcriptions) — drop-in replacement
  • vCon-native API (POST /transcribe) — accepts a vCon, returns enriched vCon with WTF transcription
  • Word-level timestamps
  • Multiple model sizes (tiny through large-v3)
  • Integrates with wtf-server as the mlx-whisper provider

Prerequisites

  • Apple Silicon Mac (M1/M2/M3/M4)
  • Python 3.12+
  • uv (recommended) or pip
  • ffmpeg: brew install ffmpeg

Quickstart

# Install (choose one)
pip install vcon-mac-wtf          # PyPI
brew install your-username/vcon/vcon-mac-wtf  # Homebrew tap

# Or from source
uv sync --all-extras

# Start the server (downloads model on first run)
vcon-mac-wtf
# or: make run
# or: uv run uvicorn vcon_mac_wtf.main:app --host 0.0.0.0 --port 8000

API Endpoints

Health

curl http://localhost:8000/health
curl http://localhost:8000/health/ready

Transcribe Audio (OpenAI-compatible)

curl -X POST http://localhost:8000/v1/audio/transcriptions \
  -F "file=@recording.wav" \
  -F "model=mlx-community/whisper-turbo" \
  -F "response_format=verbose_json"

Response formats: json, text, verbose_json, wtf
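The same multipart request can be assembled programmatically with only the Python standard library. This is a sketch mirroring the curl call above: the audio bytes and filename are placeholders, and the `build_transcription_request` helper is illustrative, not part of this package.

```python
import urllib.request
import uuid


def build_transcription_request(audio: bytes, filename: str = "recording.wav"):
    """Build the multipart/form-data POST for /v1/audio/transcriptions."""
    boundary = uuid.uuid4().hex
    fields = {
        "model": "mlx-community/whisper-turbo",
        "response_format": "verbose_json",
    }
    body = b""
    # Plain form fields (model, response_format).
    for name, value in fields.items():
        body += (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'
        ).encode()
    # The file part, with its filename and content type.
    body += (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: audio/wav\r\n\r\n'
    ).encode() + audio + b"\r\n"
    body += f"--{boundary}--\r\n".encode()
    return urllib.request.Request(
        "http://localhost:8000/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


req = build_transcription_request(b"<audio bytes>")
# With the server running, urllib.request.urlopen(req) returns the transcription.
```

In practice you would read the audio with `open("recording.wav", "rb").read()` and send the request; a client library such as requests would shorten the multipart handling considerably.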

Transcribe vCon

curl -X POST http://localhost:8000/transcribe \
  -H "Content-Type: application/json" \
  -d @my_vcon.json

Returns the vCon with WTF transcription analysis appended.
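If you are building the vCon in code rather than posting a file, the request looks like this. The vCon body below is an illustrative minimal skeleton; the actual fields your vCons carry come from the vCon specification and your pipeline, and a real dialog entry would reference actual audio.

```python
import json
import urllib.request

# Illustrative minimal vCon skeleton (uuid and parties are placeholders).
vcon = {
    "vcon": "0.0.1",
    "uuid": "00000000-0000-0000-0000-000000000000",
    "parties": [{"name": "Alice"}, {"name": "Bob"}],
    "dialog": [],
}

req = urllib.request.Request(
    "http://localhost:8000/transcribe",
    data=json.dumps(vcon).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# With the server running, urllib.request.urlopen(req)
# returns the same vCon enriched with a WTF analysis entry.
```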

List Models

curl http://localhost:8000/v1/models

Configuration

| Variable | Default | Description |
|---|---|---|
| `HOST` | `0.0.0.0` | Bind address |
| `PORT` | `8000` | Server port |
| `LOG_LEVEL` | `info` | Logging level |
| `MLX_MODEL` | `mlx-community/whisper-turbo` | Default Whisper model |
| `PRELOAD_MODEL` | `true` | Load model at startup |
| `MAX_AUDIO_SIZE_MB` | `100` | Max upload size |
| `HF_TOKEN` | - | HuggingFace token for faster model downloads (optional) |

Copy .env.example to .env to customize. Add HF_TOKEN=hf_xxx to .env before first run to speed up model downloads and avoid rate limits.
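A customized .env might look like the following; the values are illustrative (they restate the defaults above), and `hf_xxx` is a placeholder for a real token:

```shell
# .env — copied from .env.example and customized
HOST=0.0.0.0
PORT=8000
LOG_LEVEL=info
MLX_MODEL=mlx-community/whisper-turbo
PRELOAD_MODEL=true
MAX_AUDIO_SIZE_MB=100
# Optional: speeds up model downloads and avoids HuggingFace rate limits
HF_TOKEN=hf_xxx
```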

Available Models

| Short Name | Model ID | Parameters |
|---|---|---|
| `tiny` | `mlx-community/whisper-tiny` | 39M |
| `base` | `mlx-community/whisper-base` | 74M |
| `small` | `mlx-community/whisper-small` | 244M |
| `medium` | `mlx-community/whisper-medium` | 769M |
| `large-v3` | `mlx-community/whisper-large-v3` | 1.55B |
| `turbo` | `mlx-community/whisper-turbo` | 809M |

Integration with wtf-server

This server can act as a transcription provider for the existing TypeScript wtf-server. In the wtf-server's .env, set:

ASR_PROVIDER=mlx-whisper
MLX_WHISPER_URL=http://localhost:8000
MLX_WHISPER_MODEL=mlx-community/whisper-turbo

Publishing

PyPI:

pip install build twine
python -m build && twine upload dist/*

Homebrew: See homebrew/README.md for tap setup. After publishing to PyPI, update the formula's url and sha256 fields, then add the formula to your tap.

Development

make install    # Install all deps including dev
make dev        # Run with auto-reload
make test       # Run unit tests
make test-all   # Run all tests including integration
make lint       # Lint with ruff
make format     # Format with ruff

Testing

# Unit tests (no MLX required, uses mocks)
make test

# Integration tests (requires Apple Silicon + MLX)
make test-all

# Coverage
make test-cov

