MCP tool to transcribe audio/video files with whisper.cpp binary and generate SRT/VTT captions and OTIO timeline.

clipwright-transcribe

MCP tool to transcribe audio/video files and generate SRT/VTT captions and OTIO timeline.

External Binaries / Files

This tool requires the following external binaries/files to exist in the execution environment. They are not installed via pip, so obtain them separately.

whisper.cpp Binary

Used for transcription.

  • Place whisper-cli (or the binary name appropriate for your environment) on PATH, or specify the full path in the CLIPWRIGHT_WHISPER environment variable.
  • Obtain: Build from https://github.com/ggerganov/whisper.cpp, or use release binaries.
export CLIPWRIGHT_WHISPER=/path/to/whisper-cli

ggml Model File

Speech recognition model (.bin file) used by whisper.cpp.

  • Specify the full path to the model file in the CLIPWRIGHT_WHISPER_MODEL environment variable. The model_path parameter, when passed at tool invocation, overrides this variable.
  • Obtain: Download from a model repository such as https://huggingface.co/ggerganov/whisper.cpp.
export CLIPWRIGHT_WHISPER_MODEL=/path/to/ggml-base.bin

ffmpeg

Required to convert the input audio to 16 kHz mono WAV, the input format whisper.cpp expects.

  • Place ffmpeg on PATH, or specify the full path in the CLIPWRIGHT_FFMPEG environment variable.
export CLIPWRIGHT_FFMPEG=/path/to/ffmpeg
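
The conversion step can be pictured with the sketch below. It only assembles an ffmpeg command line for 16 kHz, mono, 16-bit PCM WAV output. The helper name build_ffmpeg_command is hypothetical, and the exact arguments clipwright-transcribe passes to ffmpeg are not documented here, so treat this as an illustration rather than the package's implementation.

```python
from typing import List


def build_ffmpeg_command(ffmpeg: str, source: str, wav_out: str) -> List[str]:
    """Assemble an ffmpeg call that writes 16 kHz mono 16-bit PCM WAV.

    Hypothetical helper: the flags are standard ffmpeg options, but the
    command the package itself runs may differ.
    """
    return [
        ffmpeg,
        "-y",                # overwrite the output file if it exists
        "-i", source,        # input audio or video file
        "-vn",               # drop any video stream
        "-ar", "16000",      # resample to 16 kHz
        "-ac", "1",          # downmix to mono
        "-c:a", "pcm_s16le", # 16-bit little-endian PCM
        wav_out,
    ]
```

Passing the list to subprocess.run(cmd, check=True) would perform the conversion without going through a shell, which avoids quoting problems with paths that contain spaces.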

Environment Variables Summary

| Environment Variable | Purpose | Required |
| --- | --- | --- |
| CLIPWRIGHT_WHISPER | Path to whisper.cpp binary (required if not on PATH) | Conditional |
| CLIPWRIGHT_WHISPER_MODEL | Path to ggml model file (model_path parameter takes precedence) | Conditional |
| CLIPWRIGHT_FFMPEG | Path to ffmpeg binary (required if not on PATH) | Conditional |
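
The lookup rules above (environment variable first, then PATH; model_path before CLIPWRIGHT_WHISPER_MODEL) can be sketched as follows. The function names resolve_binary and resolve_model are hypothetical and the package's own resolution code may differ; this is only a minimal illustration of the documented precedence.

```python
import os
import shutil
from typing import Optional


def resolve_binary(env_var: str, default_name: str) -> str:
    """Prefer the path in env_var; otherwise search PATH for default_name."""
    explicit = os.environ.get(env_var)
    if explicit:
        return explicit
    found = shutil.which(default_name)
    if found is None:
        raise FileNotFoundError(
            f"{default_name} not found on PATH; set {env_var} to its full path"
        )
    return found


def resolve_model(model_path: Optional[str] = None) -> str:
    """An explicit model_path wins over CLIPWRIGHT_WHISPER_MODEL."""
    chosen = model_path or os.environ.get("CLIPWRIGHT_WHISPER_MODEL")
    if not chosen:
        raise FileNotFoundError(
            "no model given: pass model_path or set CLIPWRIGHT_WHISPER_MODEL"
        )
    return chosen
```

For example, resolve_binary("CLIPWRIGHT_FFMPEG", "ffmpeg") returns the configured path when the variable is set and otherwise falls back to whatever ffmpeg is found on PATH.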

GPU / CUDA Acceleration

clipwright-transcribe supports GPU-accelerated transcription transparently: simply point CLIPWRIGHT_WHISPER at a CUDA or Metal build of whisper.cpp — no code or parameter changes are required.

Obtaining a CUDA / Metal Binary

| Platform | How to obtain |
| --- | --- |
| Windows (CUDA) | Download whisper-cublas-*-bin-x64.zip from whisper.cpp Releases. Extract and set CLIPWRIGHT_WHISPER to the full path of whisper-cli.exe. |
| Linux (CUDA) | Build from source with -DGGML_CUDA=ON: cmake -B build -DGGML_CUDA=ON && cmake --build build -j --config Release. Binary is at build/bin/whisper-cli. |
| macOS (Metal) | brew install whisper-cpp installs a Metal-accelerated build automatically. |

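# Windows CUDA example (POSIX-style shell such as Git Bash)
# PowerShell equivalent: $env:CLIPWRIGHT_WHISPER = "C:\path\to\whisper-cublas\whisper-cli.exe"
export CLIPWRIGHT_WHISPER=/path/to/whisper-cublas/whisper-cli.exe

# macOS Metal example (after brew install whisper-cpp)
# /opt/homebrew is the Homebrew prefix on Apple Silicon; Intel Macs use /usr/local
export CLIPWRIGHT_WHISPER=/opt/homebrew/bin/whisper-cli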

Confirming GPU / Backend Usage

The tool envelope includes data.backend and data.realtime_factor so you can verify the device actually used at runtime:

{
  "data": {
    "backend": {
      "device": "cuda",
      "detail": "CUDA"
    },
    "realtime_factor": 12.5,
    "whisper_wall_seconds": 14.2
  }
}
  • data.backend.device: one of cuda, metal, cpu, or unknown.
  • data.backend.detail: a sanitized, fixed device label. Raw stderr and the model path are never included, which avoids leaking sensitive information through error output (CWE-209). Values: "CUDA" (cuda), "Metal" (metal), "cpu" (cpu), "" (unknown).
  • data.realtime_factor: audio_duration_sec / whisper_wall_seconds. Values above 1.0 mean faster than realtime (e.g. 12.5 means the audio was processed 12.5× faster than its playback duration); a GPU build typically yields values well above 1.0, while a slow CPU build may fall below 1.0.
  • data.whisper_wall_seconds: raw wall-clock seconds spent in the whisper subprocess.

summary also reports the backend used (e.g. " Backend: cuda (12.5x realtime)."), so the information is visible in the one-line MCP response without unpacking data.
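
As a rough illustration of how a client might use these fields, the sketch below parses the example envelope shown above, checks whether a GPU backend was used, and recovers the audio duration from realtime_factor and whisper_wall_seconds. The function name inspect_backend and the returned dictionary keys are hypothetical; only the data.* field names come from this document.

```python
import json

# The example envelope from this section, reduced to the documented fields.
ENVELOPE_JSON = """
{
  "data": {
    "backend": {"device": "cuda", "detail": "CUDA"},
    "realtime_factor": 12.5,
    "whisper_wall_seconds": 14.2
  }
}
"""


def inspect_backend(envelope: dict) -> dict:
    """Summarize which device ran and how long the audio was."""
    data = envelope["data"]
    device = data["backend"]["device"]
    factor = data["realtime_factor"]
    wall = data["whisper_wall_seconds"]
    return {
        "device": device,
        # cuda and metal are the GPU values listed for data.backend.device
        "gpu": device in ("cuda", "metal"),
        "faster_than_realtime": factor > 1.0,
        # realtime_factor = audio_duration_sec / whisper_wall_seconds,
        # so the audio duration is the product of the two fields
        "audio_duration_sec": factor * wall,
    }


report = inspect_backend(json.loads(ENVELOPE_JSON))
```

With the example values the recovered audio duration is about 177.5 seconds, that is, roughly three minutes of audio transcribed in 14.2 seconds.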

Note on Python GPU Libraries

clipwright-transcribe does not import faster-whisper, CTranslate2, or any CUDA Python library. Transcription is always invoked as an external subprocess (CLIPWRIGHT_WHISPER), keeping GPU acceleration completely separate from the package install and preserving license independence. Any whisper-cli-compatible binary — CPU, CUDA, Metal, ROCm — can be used by updating the environment variable alone.
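
The subprocess approach described above can be sketched as follows. The flags -m, -f, -osrt, -ovtt and -of are standard whisper-cli options, but the exact command line clipwright-transcribe builds is not documented here, and the helper names build_whisper_command, run_timed and realtime_factor are hypothetical.

```python
import subprocess
import time
from typing import List, Tuple


def build_whisper_command(binary: str, model: str, wav: str, out_base: str) -> List[str]:
    """Assemble a whisper-cli call that writes out_base.srt and out_base.vtt."""
    return [
        binary,
        "-m", model,      # ggml model file
        "-f", wav,        # 16 kHz mono WAV input
        "-osrt",          # write SRT captions
        "-ovtt",          # write VTT captions
        "-of", out_base,  # output path without extension
    ]


def run_timed(cmd: List[str]) -> Tuple[int, float]:
    """Run cmd as a subprocess and return (exit code, wall-clock seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return completed.returncode, time.perf_counter() - start


def realtime_factor(audio_duration_sec: float, wall_seconds: float) -> float:
    """Audio duration divided by processing time; above 1.0 is faster than realtime."""
    return audio_duration_sec / wall_seconds
```

Because the binary is just the first element of the command list, switching between CPU, CUDA, Metal and ROCm builds changes nothing in the calling code.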

MCP Tool

clipwright_transcribe(media, output, options?) — Transcribe audio/video file and generate output.otio / output.srt / output.vtt.
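
For orientation, an MCP client invokes a tool with a JSON-RPC tools/call request. The sketch below builds such a request for clipwright_transcribe. The argument names media, output and options come from the signature above; the helper build_tool_call, the example paths, and the assumption that output is a base path without extension are illustrative only, and the keys accepted inside options are not documented here.

```python
import json
from typing import Optional


def build_tool_call(
    media: str,
    output: str,
    options: Optional[dict] = None,
    request_id: int = 1,
) -> dict:
    """Build a JSON-RPC 2.0 tools/call request for clipwright_transcribe."""
    arguments = {"media": media, "output": output}
    if options is not None:
        # options is optional in the tool signature, so omit it when unset
        arguments["options"] = options
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "clipwright_transcribe", "arguments": arguments},
    }


# Hypothetical paths, for illustration only.
request = build_tool_call("/media/interview.mp4", "/media/out/interview")
payload = json.dumps(request, indent=2)
```

In practice an MCP client library normally constructs this message itself; the sketch only shows the shape of the call.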

Download files

Source Distribution

clipwright_transcribe-0.5.1.tar.gz (18.3 kB)

Uploaded Source

Built Distribution

clipwright_transcribe-0.5.1-py3-none-any.whl (21.0 kB)

Uploaded Python 3

File details

Details for the file clipwright_transcribe-0.5.1.tar.gz.

File metadata

  • Download URL: clipwright_transcribe-0.5.1.tar.gz
  • Upload date:
  • Size: 18.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for clipwright_transcribe-0.5.1.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 2ca0efb4bf6e111b4e38f8ed44102406e0f84f8fc994c384188b8c181666130f |
| MD5 | fa8edb8f894d025c87f8c991408784cf |
| BLAKE2b-256 | 2403901ebead013f2d76b8d37d14ba5534630ab68f7cc322e61e6e2df657ad0e |

Provenance

The following attestation bundles were made for clipwright_transcribe-0.5.1.tar.gz:

Publisher: publish.yml on satoh-y-0323/clipwright

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file clipwright_transcribe-0.5.1-py3-none-any.whl.

File hashes

Hashes for clipwright_transcribe-0.5.1-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | e9a6e52e6a6d2ce546f854a7dda7b44136dd4ae89901fd243547d3a1f65c8177 |
| MD5 | 262a11c8b68a2f8c34367f8e7cf51ff5 |
| BLAKE2b-256 | ae7b6d7a6d6cca89e525d5c24cdb6658013ab0f279b361f3cd4ef4c34b8ab3ba |

Provenance

The following attestation bundles were made for clipwright_transcribe-0.5.1-py3-none-any.whl:

Publisher: publish.yml on satoh-y-0323/clipwright
