MCP tool that transcribes audio/video files with a whisper.cpp binary and generates SRT/VTT captions and an OTIO timeline.

clipwright-transcribe

MCP tool that transcribes audio/video files and generates SRT/VTT captions and an OTIO timeline.

External Binaries / Files

This tool requires the following external binaries/files to exist in the execution environment. They are not installed via pip, so obtain them separately.

whisper.cpp Binary

Used for transcription.

  • Place whisper-cli (or the binary name appropriate for your environment) on PATH, or specify the full path in the CLIPWRIGHT_WHISPER environment variable.
  • Obtain: Build from https://github.com/ggerganov/whisper.cpp, or use release binaries.
export CLIPWRIGHT_WHISPER=/path/to/whisper-cli
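
The lookup described above (environment variable or PATH) could be sketched as follows. This is a minimal illustration, not the package's actual code: the helper name resolve_binary is hypothetical, and checking the environment variable before PATH is an assumption.

```python
import os
import shutil


def resolve_binary(env_var, default_name):
    """Return the path named by env_var if set, else search PATH for default_name.

    Hypothetical helper; the precedence (env var first) is an assumption.
    """
    override = os.environ.get(env_var)
    if override:
        return override
    found = shutil.which(default_name)  # None when the name is not on PATH
    if found is None:
        raise FileNotFoundError(
            f"{default_name} not found on PATH; set {env_var} to its full path"
        )
    return found
```

Calling resolve_binary("CLIPWRIGHT_WHISPER", "whisper-cli") would then return either the configured path or the first whisper-cli found on PATH.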

ggml Model File

Speech recognition model (.bin file) used by whisper.cpp.

  • Specify the full path to the model file in the CLIPWRIGHT_WHISPER_MODEL environment variable. The model_path parameter, when passed at tool invocation, overrides it.
  • Obtain: Download from, for example, https://huggingface.co/ggerganov/whisper.cpp.
export CLIPWRIGHT_WHISPER_MODEL=/path/to/ggml-base.bin
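
The precedence between the model_path parameter and CLIPWRIGHT_WHISPER_MODEL can be illustrated with a short sketch. The function name resolve_model and its error message are hypothetical; only the precedence rule comes from the description above.

```python
import os


def resolve_model(model_path=None, env=None):
    """Pick the ggml model file: an explicit model_path wins over the env var.

    Hypothetical helper illustrating the documented precedence.
    """
    env = os.environ if env is None else env
    chosen = model_path or env.get("CLIPWRIGHT_WHISPER_MODEL")
    if not chosen:
        raise ValueError(
            "no model: pass model_path or set CLIPWRIGHT_WHISPER_MODEL"
        )
    return chosen
```

Passing the environment as a plain mapping keeps the rule easy to check without touching the real process environment.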

ffmpeg

Required to convert input audio to 16 kHz mono WAV, the input format whisper.cpp expects.

  • Place ffmpeg on PATH, or specify the full path in the CLIPWRIGHT_FFMPEG environment variable.
export CLIPWRIGHT_FFMPEG=/path/to/ffmpeg
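
A conversion to 16 kHz mono WAV might look like the command built below. This is a sketch using standard ffmpeg options; the exact arguments the tool passes are not documented here, and the function name and the choice of 16-bit PCM are assumptions.

```python
def build_ffmpeg_command(ffmpeg, source, wav_out):
    """Build an ffmpeg argument list that writes 16 kHz mono 16-bit PCM WAV.

    Hypothetical helper; the tool's real command line may differ.
    """
    return [
        ffmpeg,
        "-y",                  # overwrite wav_out if it already exists
        "-i", source,          # input media (audio or video)
        "-vn",                 # ignore any video stream
        "-ar", "16000",        # resample to 16 kHz
        "-ac", "1",            # downmix to a single (mono) channel
        "-c:a", "pcm_s16le",   # 16-bit little-endian PCM
        wav_out,
    ]
```

The list can be passed directly to subprocess.run(cmd, check=True), which avoids shell quoting issues with paths that contain spaces.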

Environment Variables Summary

| Environment Variable | Purpose | Required |
| --- | --- | --- |
| CLIPWRIGHT_WHISPER | Path to whisper.cpp binary (required if not on PATH) | Conditional |
| CLIPWRIGHT_WHISPER_MODEL | Path to ggml model file (model_path parameter takes precedence) | Conditional |
| CLIPWRIGHT_FFMPEG | Path to ffmpeg binary (required if not on PATH) | Conditional |
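
Because all three settings are conditional, a quick preflight check before starting the MCP server can save a failed run. The sketch below is illustrative only: the function name preflight, the default binary names, and the wording of the messages are assumptions, not part of the package.

```python
import os
import shutil


def preflight(env=None, which=shutil.which, model_path=None):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical check mirroring the three conditional settings.
    """
    env = os.environ if env is None else env
    problems = []
    binaries = (
        ("CLIPWRIGHT_WHISPER", "whisper-cli"),
        ("CLIPWRIGHT_FFMPEG", "ffmpeg"),
    )
    for var, name in binaries:
        # A binary is satisfied by either the env var or a PATH hit.
        if not env.get(var) and which(name) is None:
            problems.append(f"{name} is not on PATH and {var} is unset")
    # The model is satisfied by either model_path or the env var.
    if not model_path and not env.get("CLIPWRIGHT_WHISPER_MODEL"):
        problems.append("no model: set CLIPWRIGHT_WHISPER_MODEL or pass model_path")
    return problems
```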

GPU / CUDA Acceleration

clipwright-transcribe supports GPU-accelerated transcription transparently: simply point CLIPWRIGHT_WHISPER at a CUDA or Metal build of whisper.cpp — no code or parameter changes are required.

Obtaining a CUDA / Metal Binary

| Platform | How to obtain |
| --- | --- |
| Windows (CUDA) | Download whisper-cublas-*-bin-x64.zip from whisper.cpp Releases. Extract and set CLIPWRIGHT_WHISPER to the full path of whisper-cli.exe. |
| Linux (CUDA) | Build from source with -DGGML_CUDA=ON: cmake -B build -DGGML_CUDA=ON && cmake --build build -j --config Release. Binary is at build/bin/whisper-cli. |
| macOS (Metal) | brew install whisper-cpp installs a Metal-accelerated build automatically. |
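# Windows CUDA example (POSIX-style shell such as Git Bash; in PowerShell, set $env:CLIPWRIGHT_WHISPER instead)
export CLIPWRIGHT_WHISPER=/path/to/whisper-cublas/whisper-cli.exe

# macOS Metal example (after brew install whisper-cpp; /opt/homebrew is the Apple Silicon prefix, Intel Macs use /usr/local)
export CLIPWRIGHT_WHISPER=/opt/homebrew/bin/whisper-cli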

Confirming GPU / Backend Usage

The tool envelope includes data.backend and data.realtime_factor so you can verify the device actually used at runtime:

{
  "data": {
    "backend": {
      "device": "cuda",
      "detail": "CUDA"
    },
    "realtime_factor": 12.5,
    "whisper_wall_seconds": 14.2
  }
}
  • data.backend.device: one of cuda, metal, cpu, or unknown.
  • data.backend.detail: a sanitized, fixed device label. Raw stderr and the model path are never included, so error output cannot leak sensitive information (CWE-209). Values: "CUDA" (cuda), "Metal" (metal), "cpu" (cpu), "" (unknown).
  • data.realtime_factor: audio_duration_sec / whisper_wall_seconds. Values above 1.0 mean faster than realtime (e.g. 12.5 means 12.5× faster than realtime); a GPU build typically yields values well above 1.0, while a slow CPU build may fall below 1.0.
  • data.whisper_wall_seconds: raw wall-clock seconds spent in the whisper subprocess.

The summary field also reports the backend used (e.g. " Backend: cuda (12.5x realtime)."), so the information is visible in the one-line MCP response without unpacking data.
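
As a worked illustration of these fields, the sketch below recomputes the realtime factor from its definition and inspects an envelope shaped like the example above. The function names are hypothetical, and the 177.5-second audio duration is an assumed value chosen to match the sample figures (177.5 / 14.2 is approximately 12.5).

```python
def realtime_factor(audio_duration_sec, whisper_wall_seconds):
    """audio_duration_sec / whisper_wall_seconds, as defined above."""
    if whisper_wall_seconds <= 0:
        raise ValueError("whisper_wall_seconds must be positive")
    return audio_duration_sec / whisper_wall_seconds


def summarize_backend(envelope):
    """Extract the device and two yes/no checks from a tool envelope.

    Hypothetical helper; it only reads fields shown in the sample envelope.
    """
    data = envelope["data"]
    device = data["backend"]["device"]
    return {
        "device": device,
        "gpu": device in ("cuda", "metal"),
        "faster_than_realtime": data["realtime_factor"] > 1.0,
    }
```

A client that expects GPU acceleration could fail fast when summarize_backend(envelope)["gpu"] is false, rather than discovering a CPU fallback from slow runs.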

Note on Python GPU Libraries

clipwright-transcribe does not import faster-whisper, CTranslate2, or any CUDA Python library. Transcription is always invoked as an external subprocess (CLIPWRIGHT_WHISPER), keeping GPU acceleration completely separate from the package install and preserving license independence. Any whisper-cli-compatible binary — CPU, CUDA, Metal, ROCm — can be used by updating the environment variable alone.
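
The subprocess approach can be sketched as below. The flags -m, -f, -osrt, -ovtt, and -of are standard whisper-cli options, but which flags the tool actually passes is not stated here, so the command is an assumption; both function names are hypothetical.

```python
import subprocess
import time


def build_whisper_command(whisper, model, wav, out_prefix):
    """Build a whisper-cli argument list that writes SRT and VTT captions.

    Hypothetical helper; the tool's real invocation may use other flags.
    """
    return [
        whisper,
        "-m", model,        # ggml model file
        "-f", wav,          # 16 kHz mono WAV input
        "-osrt",            # write <out_prefix>.srt
        "-ovtt",            # write <out_prefix>.vtt
        "-of", out_prefix,  # output path without extension
    ]


def run_timed(cmd):
    """Run cmd as a subprocess and return (returncode, wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return completed.returncode, time.perf_counter() - start
```

Because the binary is only ever reached through its command line, swapping a CPU build for a CUDA or Metal build changes the first list element and nothing else.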

MCP Tool

clipwright_transcribe(media, output, options?): Transcribes an audio/video file and generates output.otio / output.srt / output.vtt.
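
For clients that speak MCP directly, a call to this tool is a JSON-RPC tools/call request. The sketch below builds one; the helper name and the file paths are placeholders, and the contents of options are left to the caller because their shape is not documented in this section.

```python
import json


def build_tool_call(request_id, media, output, options=None):
    """Build an MCP tools/call request for clipwright_transcribe.

    Hypothetical helper; media and output values are placeholders.
    """
    arguments = {"media": media, "output": output}
    if options is not None:
        arguments["options"] = options  # shape not documented here
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "clipwright_transcribe", "arguments": arguments},
    }


def to_wire(request):
    """Serialize the request as a single line of JSON."""
    return json.dumps(request, separators=(",", ":"))
```

Most users will not build this by hand, since an MCP client library or host application produces the request from the tool name and arguments.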
