MCP tool that transcribes audio/video files with a whisper.cpp binary and generates SRT/VTT captions and an OTIO timeline.

clipwright-transcribe

MCP tool that transcribes audio/video files and generates SRT/VTT captions and an OTIO timeline.

External Binaries / Files

This tool requires the following external binaries/files to exist in the execution environment. They are not installed via pip, so obtain them separately.

whisper.cpp Binary

Used for transcription.

  • Place whisper-cli (or the binary name appropriate for your environment) on PATH, or specify the full path in the CLIPWRIGHT_WHISPER environment variable.
  • Obtain: Build from https://github.com/ggerganov/whisper.cpp, or use release binaries.
export CLIPWRIGHT_WHISPER=/path/to/whisper-cli

ggml Model File

Speech recognition model (.bin file) used by whisper.cpp.

  • Specify the full path to the model file in the CLIPWRIGHT_WHISPER_MODEL environment variable. The model_path parameter, when passed at tool invocation, overrides it.
  • Obtain: Download from https://huggingface.co/ggerganov/whisper.cpp etc.
export CLIPWRIGHT_WHISPER_MODEL=/path/to/ggml-base.bin
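The precedence described above (model_path first, then the environment variable) can be sketched as follows. This is only an illustration, not the package's actual code: the function name resolve_model_path and its error message are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_model_path(
    model_path: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Path:
    """Pick the ggml model file (hypothetical helper, not the package API).

    An explicit model_path wins; otherwise CLIPWRIGHT_WHISPER_MODEL is used.
    """
    candidate = model_path or env.get("CLIPWRIGHT_WHISPER_MODEL")
    if not candidate:
        raise FileNotFoundError(
            "No model given: pass model_path or set CLIPWRIGHT_WHISPER_MODEL"
        )
    return Path(candidate)
```

Passing env as a parameter keeps the lookup easy to test without touching the real process environment.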

ffmpeg

Required to convert audio to 16 kHz mono WAV, the input format whisper.cpp expects.

  • Place ffmpeg on PATH, or specify the full path in the CLIPWRIGHT_FFMPEG environment variable.
export CLIPWRIGHT_FFMPEG=/path/to/ffmpeg
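A 16 kHz mono WAV conversion of this kind is commonly done with the ffmpeg flags shown below. The exact command clipwright-transcribe runs is not documented here, so treat this as a sketch under that assumption; build_ffmpeg_command is a hypothetical helper name.

```python
from typing import List


def build_ffmpeg_command(ffmpeg: str, media: str, wav_out: str) -> List[str]:
    """Build an ffmpeg argv that writes 16 kHz mono 16-bit PCM WAV.

    Assumption: this flag set is a typical choice, not necessarily the one
    the package uses internally.
    """
    return [
        ffmpeg,
        "-y",                 # overwrite the output file if it exists
        "-i", media,          # input audio or video file
        "-vn",                # ignore any video stream
        "-ac", "1",           # one audio channel (mono)
        "-ar", "16000",       # 16 kHz sample rate
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        wav_out,
    ]
```

The list form can be handed directly to subprocess.run, which avoids shell quoting problems with file names that contain spaces.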

Environment Variables Summary

| Environment Variable | Purpose | Required |
| --- | --- | --- |
| CLIPWRIGHT_WHISPER | Path to whisper.cpp binary (required if not on PATH) | Conditional |
| CLIPWRIGHT_WHISPER_MODEL | Path to ggml model file (model_path parameter takes precedence) | Conditional |
| CLIPWRIGHT_FFMPEG | Path to ffmpeg binary (required if not on PATH) | Conditional |
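The "environment variable, else PATH" rule for the two binaries can be sketched with the standard library. The helper name resolve_binary is hypothetical, and the package's real lookup may differ in detail.

```python
import os
import shutil
from typing import Mapping, Optional


def resolve_binary(
    env_var: str,
    default_name: str,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the binary path from env_var if set, else search PATH.

    Hypothetical helper: returns None when neither source yields a binary.
    """
    override = env.get(env_var)
    if override:
        return override
    # shutil.which returns None when the name is not found on PATH.
    return shutil.which(default_name)
```

For example, resolve_binary("CLIPWRIGHT_FFMPEG", "ffmpeg") and resolve_binary("CLIPWRIGHT_WHISPER", "whisper-cli") would cover the two binaries listed in the table.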

GPU / CUDA Acceleration

clipwright-transcribe supports GPU-accelerated transcription transparently: simply point CLIPWRIGHT_WHISPER at a CUDA or Metal build of whisper.cpp — no code or parameter changes are required.

Obtaining a CUDA / Metal Binary

| Platform | How to obtain |
| --- | --- |
| Windows (CUDA) | Download whisper-cublas-*-bin-x64.zip from whisper.cpp Releases. Extract and set CLIPWRIGHT_WHISPER to the full path of whisper-cli.exe. |
| Linux (CUDA) | Build from source with -DGGML_CUDA=ON: cmake -B build -DGGML_CUDA=ON && cmake --build build -j --config Release. The binary is at build/bin/whisper-cli. |
| macOS (Metal) | brew install whisper-cpp installs a Metal-accelerated build automatically. |
# Windows CUDA example (Git Bash or a similar POSIX shell; in PowerShell, set $env:CLIPWRIGHT_WHISPER instead)
export CLIPWRIGHT_WHISPER=/path/to/whisper-cublas/whisper-cli.exe

# macOS Metal example (after brew install whisper-cpp)
export CLIPWRIGHT_WHISPER=/opt/homebrew/bin/whisper-cli

Confirming GPU / Backend Usage

The tool envelope includes data.backend and data.realtime_factor so you can verify the device actually used at runtime:

{
  "data": {
    "backend": {
      "device": "cuda",
      "detail": "CUDA"
    },
    "realtime_factor": 12.5,
    "whisper_wall_seconds": 14.2
  }
}
  • data.backend.device: one of cuda, metal, cpu, or unknown.
  • data.backend.detail: a fixed, sanitized device label that never contains raw stderr or the model path (avoiding CWE-209 information exposure). Values: "CUDA" (cuda), "Metal" (metal), "cpu" (cpu), "" (unknown).
  • data.realtime_factor: audio_duration_sec / whisper_wall_seconds. Values above 1.0 mean faster than realtime (e.g. 12.5 means the audio was transcribed 12.5× faster than realtime); a GPU build typically yields values well above 1.0, while a slow CPU build may fall below 1.0.
  • data.whisper_wall_seconds: wall-clock seconds spent in the whisper subprocess.

The summary field also reports the backend used (e.g. " Backend: cuda (12.5x realtime)."), so the information is visible in the one-line MCP response without unpacking data.
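A client can check these fields programmatically. The sketch below assumes the envelope has already been parsed into a dict shaped like the JSON example above; the helper names realtime_factor, used_gpu, and describe_backend are hypothetical.

```python
from typing import Any, Dict


def realtime_factor(audio_duration_sec: float, whisper_wall_seconds: float) -> float:
    """Audio seconds transcribed per wall-clock second, as defined above."""
    if whisper_wall_seconds <= 0:
        raise ValueError("whisper_wall_seconds must be positive")
    return audio_duration_sec / whisper_wall_seconds


def used_gpu(envelope: Dict[str, Any]) -> bool:
    """True when data.backend.device reports cuda or metal."""
    device = envelope.get("data", {}).get("backend", {}).get("device", "unknown")
    return device in ("cuda", "metal")


def describe_backend(envelope: Dict[str, Any]) -> str:
    """Format the device and speed in the style of the summary line."""
    data = envelope.get("data", {})
    device = data.get("backend", {}).get("device", "unknown")
    factor = data.get("realtime_factor")
    if factor is None:
        return f"{device} (speed unknown)"
    return f"{device} ({factor:.1f}x realtime)"
```

With the example envelope, 14.2 wall-clock seconds at a factor of 12.5 corresponds to roughly 177.5 seconds of audio.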

Note on Python GPU Libraries

clipwright-transcribe does not import faster-whisper, CTranslate2, or any CUDA Python library. Transcription is always invoked as an external subprocess (CLIPWRIGHT_WHISPER), keeping GPU acceleration completely separate from the package install and preserving license independence. Any whisper-cli-compatible binary — CPU, CUDA, Metal, ROCm — can be used by updating the environment variable alone.
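The subprocess approach can be illustrated with a minimal sketch. The flags -m, -f, -osrt, -ovtt, and -of are standard whisper-cli options, but which flags clipwright-transcribe actually passes is an assumption here, and both helper names are hypothetical.

```python
import subprocess
import time
from typing import List, Tuple


def build_whisper_command(
    whisper_bin: str, model: str, wav: str, out_prefix: str
) -> List[str]:
    """Build a whisper-cli argv (assumed flag set, not the package's own)."""
    return [
        whisper_bin,
        "-m", model,       # ggml model file
        "-f", wav,         # 16 kHz mono WAV input
        "-osrt",           # write <out_prefix>.srt
        "-ovtt",           # write <out_prefix>.vtt
        "-of", out_prefix, # output path without extension
    ]


def run_timed(cmd: List[str]) -> Tuple[int, float]:
    """Run cmd as a subprocess; return (exit code, wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return completed.returncode, time.perf_counter() - start
```

Because only the argv's first element changes between CPU, CUDA, Metal, and ROCm builds, swapping backends needs no change to code like this.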

MCP Tool

clipwright_transcribe(media, output, options?) — Transcribe audio/video file and generate output.otio / output.srt / output.vtt.
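For clients that speak MCP directly, a call to this tool is a JSON-RPC tools/call request. The sketch below builds such a request; the argument names media, output, and options come from the signature above, while the file paths and the assumption that output is a path prefix are illustrative only.

```python
import json
from typing import Any, Dict, Optional


def build_tool_call(
    media: str,
    output: str,
    options: Optional[Dict[str, Any]] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build an MCP tools/call request for clipwright_transcribe."""
    arguments: Dict[str, Any] = {"media": media, "output": output}
    if options:
        # options is optional in the tool signature; omit it when empty.
        arguments["options"] = options
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "clipwright_transcribe", "arguments": arguments},
    }


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(json.dumps(build_tool_call("talk.mp4", "out/talk"), indent=2))
```

Most MCP client libraries build this request for you; the raw form is shown only to make the argument layout explicit.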

Download files

Source Distribution

clipwright_transcribe-0.3.1.tar.gz (14.9 kB view details)

Uploaded Source

Built Distribution

clipwright_transcribe-0.3.1-py3-none-any.whl (17.4 kB view details)

Uploaded Python 3

File details

Details for the file clipwright_transcribe-0.3.1.tar.gz.

File metadata

  • Download URL: clipwright_transcribe-0.3.1.tar.gz
  • Size: 14.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for clipwright_transcribe-0.3.1.tar.gz
Algorithm Hash digest
SHA256 712f461c2e6d26dd850b7a11be5db3d23ae5ea93dbf0371798b5011b20c9797f
MD5 219d5993233bc6923d8090a5654f127b
BLAKE2b-256 8c14a1787defc6fdff794df9275f59c9218ed7b18cd4ace7d0f57c18d1421076

Provenance

The following attestation bundles were made for clipwright_transcribe-0.3.1.tar.gz:

Publisher: publish.yml on satoh-y-0323/clipwright

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file clipwright_transcribe-0.3.1-py3-none-any.whl.

File hashes

Hashes for clipwright_transcribe-0.3.1-py3-none-any.whl
Algorithm Hash digest
SHA256 ca725daffb6b9351eaf3d5f49c83376b2839edb679175d8b14bdff862bfe3aaa
MD5 4cb242437e20e743315874d57d1c00f5
BLAKE2b-256 30e1c8e62563281e028c47cc494268fa57930b3e6b644043b83bfd25666124e0

Provenance

The following attestation bundles were made for clipwright_transcribe-0.3.1-py3-none-any.whl:

Publisher: publish.yml on satoh-y-0323/clipwright
