Local MCP server for MLX Whisper transcription

MLX Whisper MCP Server

A simple Model Context Protocol (MCP) server that provides audio transcription capabilities using MLX Whisper on Apple Silicon Macs.

Features

  • Transcribe audio files directly from disk
  • Transcribe audio from base64-encoded data
  • Download and transcribe YouTube videos
  • Uses the high-quality mlx-community/whisper-large-v3-turbo model
  • Self-contained script with automatic dependency management via uv run
  • Rich console output for easy debugging
  • Saves transcription text files alongside audio files

Requirements

  • Python 3.12 or higher
  • Apple Silicon Mac (M-series)
  • uv installed (pip install uv or curl -LsSf https://astral.sh/uv/install.sh | sh)

Quick Start

Run directly with uv run:

uv run mlx_whisper_mcp.py

That's it! The script installs its own dependencies and starts the MCP server. Note: The first run may take longer to start because it downloads the Whisper model (approx. 1.6 GB). Subsequent runs will be faster.
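uv run can resolve dependencies for a single file because the file declares them in an inline metadata header (PEP 723). A header of roughly the following shape is what uv looks for at the top of a script. The package names listed here are assumptions for illustration and are not copied from mlx_whisper_mcp.py; check the script itself for the real list.

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "mcp",          # assumed: the MCP Python SDK
#     "mlx-whisper",  # assumed: the MLX Whisper package
# ]
# ///

# The rest of the script follows the header as ordinary Python.
# uv reads the block above, builds an isolated environment, and then runs the file.
```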

Using with Claude Desktop

There are two main ways to integrate this server with Claude Desktop:

Option 1: Using uv (Recommended)

  1. Navigate to the directory where you've cloned or saved mlx_whisper_mcp.py.
  2. Run the following command:
    uv tool run fastmcp install mlx_whisper_mcp.py
    
  3. Restart Claude Desktop if it was running. fastmcp sets up the configuration needed to launch the server, including handling its dependencies via uv run.

Option 2: Manual Configuration

If you prefer to configure Claude Desktop manually:

  1. Edit your Claude Desktop configuration file:

    # On macOS:
    code ~/Library/Application\ Support/Claude/claude_desktop_config.json
    
    # On Windows:
    code %APPDATA%\Claude\claude_desktop_config.json
    
  2. Add the MLX Whisper MCP server configuration. Important: Replace /absolute/path/to/mlx_whisper_mcp/ in the cwd field below with the actual absolute path to the directory containing mlx_whisper_mcp.py on your system.

    {
      "mcpServers": {
        "mlx-whisper": {
          "command": "uv",
          "args": [
            "run",
            "mlx_whisper_mcp.py"
          ],
          "cwd": "/absolute/path/to/mlx_whisper_mcp/"
        }
      }
    }
    

    This configuration tells Claude Desktop to execute mlx_whisper_mcp.py using uv run, with the current working directory (cwd) set to the script's location. uv run will handle installing the dependencies defined for the script.

  3. Restart Claude Desktop.
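If you would rather script the manual edit than open the file by hand, the merge can be sketched as below. This is a minimal sketch, not part of the project: the helper name add_mlx_whisper_server is hypothetical, and it assumes the config file is plain JSON with the mcpServers layout shown above. Existing server entries are preserved.

```python
import json
from pathlib import Path


def add_mlx_whisper_server(config_path: Path, script_dir: Path) -> dict:
    """Merge the mlx-whisper entry into a Claude Desktop config file.

    Hypothetical helper: it mirrors the JSON shown in Option 2 and keeps
    any other servers that are already configured.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)

    servers = config.setdefault("mcpServers", {})
    servers["mlx-whisper"] = {
        "command": "uv",
        "args": ["run", "mlx_whisper_mcp.py"],
        # cwd must be an absolute path to the directory holding the script.
        "cwd": str(script_dir.expanduser().resolve()),
    }

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS you would call it with Path.home() / "Library/Application Support/Claude/claude_desktop_config.json" as config_path and the directory containing mlx_whisper_mcp.py as script_dir, then restart Claude Desktop.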

Available Tools

The server provides the following tools:

1. transcribe_file

Transcribes an audio file from a path on disk.

Parameters:

  • file_path: Path to the audio file
  • language: (Optional) Language code to force a specific language
  • task: "transcribe" or "translate" (translates to English)
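Claude Desktop builds the tool call for you, but it can help to see roughly what an MCP client sends. The sketch below assembles a JSON-RPC tools/call request for transcribe_file using the parameters listed above. The helper name and the default of "transcribe" for task are assumptions made for this example, not documented behavior of the server.

```python
import json
from pathlib import Path
from typing import Optional


def build_transcribe_file_call(
    file_path: str,
    language: Optional[str] = None,
    task: str = "transcribe",  # assumed default for this sketch
    request_id: int = 1,
) -> str:
    """Return a JSON-RPC 2.0 'tools/call' request for transcribe_file."""
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")

    arguments = {
        # Resolve to an absolute path so the server's working directory
        # does not affect which file is opened.
        "file_path": str(Path(file_path).expanduser().resolve()),
        "task": task,
    }
    if language is not None:
        arguments["language"] = language

    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "transcribe_file", "arguments": arguments},
    }
    return json.dumps(request)
```

Leaving language unset omits the field from the request, which matches its description as optional.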

2. transcribe_audio

Transcribes audio from base64-encoded data.

Parameters:

  • audio_data: Base64-encoded audio data
  • language: (Optional) Language code to force a specific language
  • file_format: Audio file format (wav, mp3, etc.)
  • task: "transcribe" or "translate" (translates to English)
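A client that holds audio in memory needs to base64-encode it before calling transcribe_audio. The following is a minimal sketch of preparing the arguments from a file on disk; the helper name is hypothetical, and taking file_format from the file extension is an assumption of this example.

```python
import base64
from pathlib import Path


def encode_audio_for_tool(path: Path, task: str = "transcribe") -> dict:
    """Build transcribe_audio arguments from an audio file (illustrative helper)."""
    audio_bytes = path.read_bytes()
    return {
        # Base64 output is pure ASCII, so it can travel inside a JSON string.
        "audio_data": base64.b64encode(audio_bytes).decode("ascii"),
        # Assumption: the extension names the format, e.g. "wav" or "mp3".
        "file_format": path.suffix.lstrip(".").lower(),
        "task": task,
    }
```

Base64 inflates the payload by about a third, so for large recordings transcribe_file with a path on disk is usually the lighter option.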

3. download_youtube

Downloads a YouTube video.

Parameters:

  • url: YouTube video URL
  • keep_file: If True, keeps the downloaded file (default: True)

4. transcribe_youtube

Downloads and transcribes a YouTube video.

Parameters:

  • url: YouTube video URL
  • language: (Optional) Language code to force a specific language
  • task: "transcribe" or "translate" (translates to English)
  • keep_file: If True, keeps the downloaded file (default: True)

Example Prompts for Claude Desktop

How It Works

This server uses the MCP Python SDK to expose MLX Whisper's transcription capabilities to clients like Claude. When a transcription is requested:

  1. The audio data is received (as a file path, base64-encoded data, or a YouTube URL)
  2. For YouTube URLs, the video is downloaded to ~/.mlx-whisper-mcp/downloads
  3. For base64 data, a temporary file is created
  4. MLX Whisper is used to perform the transcription
  5. The transcription text is saved to a .txt file alongside the audio file
  6. The transcription text is returned to the client
  7. Temporary files are cleaned up (unless keep_file=True)
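The base64 branch of the steps above can be sketched as follows. This is an illustration under stated assumptions, not the server's actual code: the function name is hypothetical, the transcriber is passed in as a callable so the flow can be read (and run) without MLX installed, and removing the whole temporary directory, transcript included, is a choice made for this sketch.

```python
import base64
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional, Tuple


def transcribe_base64(
    audio_data: str,
    file_format: str,
    transcribe_fn: Callable[[Path], str],
    keep_file: bool = False,
) -> Tuple[str, Optional[Path]]:
    """Decode audio to a temp file, transcribe it, save a .txt, then clean up."""
    work_dir = Path(tempfile.mkdtemp(prefix="mlx-whisper-"))
    audio_path = work_dir / f"audio.{file_format}"
    try:
        # Step 3: materialize the base64 payload as a temporary file.
        audio_path.write_bytes(base64.b64decode(audio_data))
        # Step 4: run the transcriber (on a Mac this could wrap MLX Whisper).
        text = transcribe_fn(audio_path)
        # Step 5: save the transcript next to the audio file.
        audio_path.with_suffix(".txt").write_text(text, encoding="utf-8")
    except Exception:
        shutil.rmtree(work_dir, ignore_errors=True)
        raise

    # Steps 6 and 7: return the text; remove temp files unless asked to keep them.
    if keep_file:
        return text, audio_path
    shutil.rmtree(work_dir, ignore_errors=True)
    return text, None
```

Injecting transcribe_fn keeps the file handling testable on any machine; only the callable itself needs Apple Silicon.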

Troubleshooting

  • Import Error: If you see an error about MLX Whisper not being found, make sure you're running on an Apple Silicon Mac
  • File Not Found: Make sure you're using absolute paths when referencing audio files
  • Memory Issues: Very long audio files may cause memory pressure with the large model
  • YouTube Download Errors: Some videos may be restricted or require authentication
  • JSON Errors: If you see "not valid JSON" errors in logs, make sure server logging output is properly directed to stderr
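The JSON errors arise because an MCP server on the stdio transport uses stdout for protocol messages, so any log line printed there corrupts the stream. A minimal sketch of routing logs to stderr with the standard library is shown below; the logger name and format string are assumptions for this example.

```python
import logging
import sys


def configure_stderr_logging(level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes only to stderr, leaving stdout for MCP traffic."""
    logger = logging.getLogger("mlx_whisper_mcp")  # hypothetical logger name
    logger.setLevel(level)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    # Do not bubble up to the root logger, which may be configured differently.
    logger.propagate = False
    return logger
```

The same rule applies to print calls and to Rich output: a Rich Console created with stderr=True writes to stderr instead of stdout.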

License

Apache License 2.0. See LICENSE for details.
