clipwright-transcribe

MCP tool that transcribes audio/video files with a whisper.cpp binary and generates SRT/VTT captions and an OTIO timeline.

External Binaries / Files

This tool requires the following external binaries/files to exist in the execution environment. They are not installed via pip, so obtain them separately.

whisper.cpp Binary

Used for transcription.

  • Place whisper-cli (or the binary name appropriate for your environment) on PATH, or specify the full path in the CLIPWRIGHT_WHISPER environment variable.
  • Obtain: Build from https://github.com/ggerganov/whisper.cpp, or use release binaries.
export CLIPWRIGHT_WHISPER=/path/to/whisper-cli

ggml Model File

Speech recognition model (.bin file) used by whisper.cpp.

  • Specify the full path to the model file in the CLIPWRIGHT_WHISPER_MODEL environment variable. The model_path parameter, when passed at tool invocation, overrides it.
  • Obtain: Download from https://huggingface.co/ggerganov/whisper.cpp or another source of ggml models.
export CLIPWRIGHT_WHISPER_MODEL=/path/to/ggml-base.bin

ffmpeg

Required to convert input audio to 16 kHz mono WAV, the input format whisper.cpp expects.

  • Place ffmpeg on PATH, or specify the full path in the CLIPWRIGHT_FFMPEG environment variable.
export CLIPWRIGHT_FFMPEG=/path/to/ffmpeg
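
The conversion step can be illustrated with a short sketch. This is an assumption about what such a conversion typically looks like, not the package's actual code: the function name ffmpeg_wav_command is hypothetical, and the flags are standard ffmpeg options (-vn drops video, -ac 1 downmixes to mono, -ar 16000 resamples to 16 kHz, -c:a pcm_s16le writes 16-bit PCM).

```python
from typing import List


def ffmpeg_wav_command(ffmpeg: str, src: str, dst: str) -> List[str]:
    """Build an ffmpeg argument list that writes a 16 kHz mono 16-bit WAV.

    Hypothetical helper for illustration; clipwright-transcribe may use
    different flags internally.
    """
    return [
        ffmpeg,
        "-y",               # overwrite dst if it already exists
        "-i", src,          # input audio or video file
        "-vn",              # ignore any video stream
        "-ac", "1",         # one audio channel (mono)
        "-ar", "16000",     # 16 kHz sample rate
        "-c:a", "pcm_s16le",  # signed 16-bit little-endian PCM
        dst,
    ]


if __name__ == "__main__":
    # Placeholder paths; pass the list to subprocess.run to execute it.
    print(" ".join(ffmpeg_wav_command("ffmpeg", "talk.mp4", "talk.wav")))
```

Building the command as a list rather than a shell string avoids quoting problems with file names that contain spaces.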

Environment Variables Summary

| Environment Variable | Purpose | Required |
| --- | --- | --- |
| CLIPWRIGHT_WHISPER | Path to whisper.cpp binary (required if not on PATH) | Conditional |
| CLIPWRIGHT_WHISPER_MODEL | Path to ggml model file (model_path parameter takes precedence) | Conditional |
| CLIPWRIGHT_FFMPEG | Path to ffmpeg binary (required if not on PATH) | Conditional |
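
The lookup order described above (environment variable first, then PATH; model_path parameter before CLIPWRIGHT_WHISPER_MODEL) could be sketched as follows. The helper names resolve_binary and resolve_model are hypothetical and only illustrate the documented precedence; the package's own resolution logic may differ in detail.

```python
import os
import shutil
from typing import Mapping, Optional


def resolve_binary(env_var: str, default_name: str,
                   env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the path named by env_var if set, else search PATH for default_name."""
    env = os.environ if env is None else env
    explicit = env.get(env_var)
    if explicit:
        return explicit
    # shutil.which returns None when the executable is not found.
    return shutil.which(default_name, path=env.get("PATH"))


def resolve_model(model_path: Optional[str] = None,
                  env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Prefer the per-call model_path, then CLIPWRIGHT_WHISPER_MODEL."""
    env = os.environ if env is None else env
    return model_path or env.get("CLIPWRIGHT_WHISPER_MODEL")


if __name__ == "__main__":
    print(resolve_binary("CLIPWRIGHT_WHISPER", "whisper-cli"))
    print(resolve_binary("CLIPWRIGHT_FFMPEG", "ffmpeg"))
    print(resolve_model())
```

A None result corresponds to the "Conditional" rows in the table: the variable must be set when the binary is not on PATH or no model_path is passed.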

GPU / CUDA Acceleration

clipwright-transcribe supports GPU-accelerated transcription transparently: simply point CLIPWRIGHT_WHISPER at a CUDA or Metal build of whisper.cpp — no code or parameter changes are required.

Obtaining a CUDA / Metal Binary

| Platform | How to obtain |
| --- | --- |
| Windows (CUDA) | Download whisper-cublas-*-bin-x64.zip from whisper.cpp Releases. Extract and set CLIPWRIGHT_WHISPER to the full path of whisper-cli.exe. |
| Linux (CUDA) | Build from source with -DGGML_CUDA=ON: cmake -B build -DGGML_CUDA=ON && cmake --build build -j --config Release. Binary is at build/bin/whisper-cli. |
| macOS (Metal) | brew install whisper-cpp installs a Metal-accelerated build automatically. |
# Windows CUDA example (POSIX-style shell such as Git Bash;
# in PowerShell, assign $env:CLIPWRIGHT_WHISPER instead)
export CLIPWRIGHT_WHISPER=/path/to/whisper-cublas/whisper-cli.exe

# macOS Metal example (after brew install whisper-cpp)
export CLIPWRIGHT_WHISPER=/opt/homebrew/bin/whisper-cli

Confirming GPU / Backend Usage

The tool envelope includes data.backend and data.realtime_factor so you can verify the device actually used at runtime:

{
  "data": {
    "backend": {
      "device": "cuda",
      "detail": "CUDA"
    },
    "realtime_factor": 12.5,
    "whisper_wall_seconds": 14.2
  }
}
  • data.backend.device: one of cuda, metal, cpu, or unknown.
  • data.backend.detail: a fixed, sanitized device label; raw stderr and the model path are never exposed (CWE-209). Values: "CUDA" (cuda), "Metal" (metal), "cpu" (cpu), "" (unknown).
  • data.realtime_factor: audio_duration_sec / whisper_wall_seconds. Values above 1.0 mean faster than realtime (e.g. 12.5 means transcription ran 12.5× as fast as realtime); a GPU build typically yields values well above 1.0, while a slow CPU build may fall below 1.0.
  • data.whisper_wall_seconds: raw wall-clock seconds spent in the whisper subprocess.
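
As a rough illustration of how a client might consume these fields, the sketch below recomputes the realtime factor and checks whether a GPU backend was reported. The function names are hypothetical; only the field names and the formula come from the description above.

```python
def realtime_factor(audio_duration_sec: float, whisper_wall_seconds: float) -> float:
    """audio_duration_sec / whisper_wall_seconds, as defined above."""
    if whisper_wall_seconds <= 0:
        raise ValueError("whisper_wall_seconds must be positive")
    return audio_duration_sec / whisper_wall_seconds


def used_gpu(envelope: dict) -> bool:
    """True when data.backend.device reports cuda or metal."""
    device = envelope.get("data", {}).get("backend", {}).get("device", "unknown")
    return device in {"cuda", "metal"}


if __name__ == "__main__":
    envelope = {
        "data": {
            "backend": {"device": "cuda", "detail": "CUDA"},
            "realtime_factor": 12.5,
            "whisper_wall_seconds": 14.2,
        }
    }
    print("GPU used:", used_gpu(envelope))
    # 177.5 s of audio in 14.2 s of wall time gives a factor of about 12.5.
    print("factor:", round(realtime_factor(177.5, 14.2), 2))
```

If used_gpu returns False while CLIPWRIGHT_WHISPER points at a GPU build, the binary may have fallen back to CPU at runtime.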

The summary field also reports the backend used (e.g. " Backend: cuda (12.5x realtime)."), so the information is visible in the one-line MCP response without unpacking data.

Note on Python GPU Libraries

clipwright-transcribe does not import faster-whisper, CTranslate2, or any CUDA Python library. Transcription is always invoked as an external subprocess (CLIPWRIGHT_WHISPER), keeping GPU acceleration completely separate from the package install and preserving license independence. Any whisper-cli-compatible binary — CPU, CUDA, Metal, ROCm — can be used by updating the environment variable alone.
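
The subprocess approach can be sketched as below. This is an assumption, not the package's real invocation: whisper_command and run_whisper are hypothetical names, and the flags shown (-m model, -f input file, -osrt, -ovtt, -of output prefix) are whisper.cpp CLI options that clipwright-transcribe may or may not use in this form.

```python
import subprocess
from typing import List


def whisper_command(whisper: str, model: str, wav: str, output_prefix: str) -> List[str]:
    """Build a whisper-cli argument list that writes SRT and VTT captions."""
    return [
        whisper,
        "-m", model,          # ggml model file
        "-f", wav,            # 16 kHz mono WAV input
        "-osrt",              # write <output_prefix>.srt
        "-ovtt",              # write <output_prefix>.vtt
        "-of", output_prefix,  # output path without extension
    ]


def run_whisper(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run the external binary; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


if __name__ == "__main__":
    # Placeholder paths; nothing is executed here.
    print(whisper_command("whisper-cli", "ggml-base.bin", "talk.wav", "talk"))
```

Because the binary is chosen at runtime, swapping a CPU build for a CUDA or Metal build changes only the first element of the list.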

MCP Tool

clipwright_transcribe(media, output, options?) — Transcribe audio/video file and generate output.otio / output.srt / output.vtt.
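
For orientation, an MCP client calls a tool by sending a JSON-RPC tools/call request. The sketch below builds such a request for this tool. It assumes media and output are passed as string paths and omits options, whose fields are not documented here; build_tool_call and the example paths are hypothetical.

```python
import json


def build_tool_call(request_id: int, media: str, output: str) -> dict:
    """Build a JSON-RPC 2.0 tools/call request for clipwright_transcribe."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "clipwright_transcribe",
            # Argument names follow the signature above; string types are assumed.
            "arguments": {"media": media, "output": output},
        },
    }


if __name__ == "__main__":
    request = build_tool_call(1, "/path/to/talk.mp4", "/path/to/talk")
    print(json.dumps(request, indent=2))
```

In practice an MCP client library usually builds this message for you; the raw form is shown only to make the tool name and arguments explicit.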

Source Distribution

clipwright_transcribe-0.5.0.tar.gz (18.2 kB)

Uploaded Source

Built Distribution

clipwright_transcribe-0.5.0-py3-none-any.whl (20.9 kB)

Uploaded Python 3

File details

Details for the file clipwright_transcribe-0.5.0.tar.gz.

File metadata

  • Download URL: clipwright_transcribe-0.5.0.tar.gz
  • Upload date:
  • Size: 18.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for clipwright_transcribe-0.5.0.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 377f517ff57995e35bd2c70da717ecdeb68636a642e69e9f19cdd76569f67fa6 |
| MD5 | 19566f58db237cce5147513981bf19ae |
| BLAKE2b-256 | 7dfc8a128651296c940609246120c6263a32ab6adc590586df90d2207985cee6 |

Provenance

The following attestation bundles were made for clipwright_transcribe-0.5.0.tar.gz:

Publisher: publish.yml on satoh-y-0323/clipwright

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file clipwright_transcribe-0.5.0-py3-none-any.whl.

File hashes

Hashes for clipwright_transcribe-0.5.0-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | c4ffa45d597242f82bf6ce1a02d224d5d70dbb747cea0eec57cddd34b5454e25 |
| MD5 | 597a35fe93c3a4729024725cf896f7ea |
| BLAKE2b-256 | 4b615ed8b6405d8f4a3cf2f5afe60dc6910c0135869267949a7410b38be0dd24 |

Provenance

The following attestation bundles were made for clipwright_transcribe-0.5.0-py3-none-any.whl:

Publisher: publish.yml on satoh-y-0323/clipwright
