MCP tool that transcribes audio/video files with a whisper.cpp binary and generates SRT/VTT captions and an OTIO timeline.
clipwright-transcribe
MCP tool that transcribes audio/video files and generates SRT/VTT captions and an OTIO timeline.
External Binaries / Files
This tool requires the following external binaries/files to exist in the execution environment. They are not installed via pip, so obtain them separately.
whisper.cpp Binary
Used for transcription.
- Place `whisper-cli` (or the binary name appropriate for your environment) on PATH, or specify the full path in the `CLIPWRIGHT_WHISPER` environment variable.
- Obtain: Build from https://github.com/ggerganov/whisper.cpp, or use release binaries.
export CLIPWRIGHT_WHISPER=/path/to/whisper-cli
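The lookup described above can be sketched in a few lines of Python. This is an illustration only, not the package's actual code: the helper name `resolve_binary` is hypothetical, and checking the environment variable before PATH is an assumed order.

```python
import os
import shutil


def resolve_binary(env_var, default_name, env=None):
    """Return the path from env_var if set, else look default_name up on PATH.

    Hypothetical helper; the precedence (variable first, PATH second) is an
    assumption, not taken from the clipwright-transcribe source.
    """
    env = os.environ if env is None else env
    override = env.get(env_var)
    if override:
        return override
    # shutil.which returns None when the name is not found on the given PATH.
    return shutil.which(default_name, path=env.get("PATH"))
```

For example, `resolve_binary("CLIPWRIGHT_WHISPER", "whisper-cli")` returns the configured path, the PATH match, or `None` if neither exists.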
ggml Model File
Speech recognition model (.bin file) used by whisper.cpp.
- Specify the full path to the model file in the `CLIPWRIGHT_WHISPER_MODEL` environment variable. It can be overridden by the `model_path` parameter at tool invocation.
- Obtain: Download from https://huggingface.co/ggerganov/whisper.cpp or a similar source.
export CLIPWRIGHT_WHISPER_MODEL=/path/to/ggml-base.bin
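The precedence rule (the `model_path` parameter wins over `CLIPWRIGHT_WHISPER_MODEL`) might look like the following. The function name `resolve_model` and the error message are hypothetical; only the precedence itself comes from the text above.

```python
import os


def resolve_model(model_path=None, env=None):
    """Pick the model file: explicit parameter first, then the environment.

    Hypothetical helper illustrating the documented precedence only.
    """
    env = os.environ if env is None else env
    chosen = model_path or env.get("CLIPWRIGHT_WHISPER_MODEL")
    if not chosen:
        raise ValueError(
            "no model: pass model_path or set CLIPWRIGHT_WHISPER_MODEL"
        )
    return chosen
```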
ffmpeg
Required to convert input audio to 16 kHz mono WAV (the input format whisper.cpp expects).
- Place `ffmpeg` on PATH, or specify the full path in the `CLIPWRIGHT_FFMPEG` environment variable.
export CLIPWRIGHT_FFMPEG=/path/to/ffmpeg
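As a rough illustration of the conversion step, the sketch below builds an ffmpeg command that produces 16 kHz mono 16-bit PCM WAV. The flags are standard ffmpeg options, but the exact command clipwright-transcribe runs is not documented here, so treat the function name and the flag selection as assumptions.

```python
def build_ffmpeg_command(ffmpeg, source, wav_out):
    """Build an argument list that converts source to 16 kHz mono WAV.

    Hypothetical helper; the tool's real invocation may differ.
    """
    return [
        ffmpeg,
        "-y",                 # overwrite wav_out if it already exists
        "-i", source,         # input audio or video file
        "-vn",                # ignore any video stream
        "-ac", "1",           # one audio channel (mono)
        "-ar", "16000",       # 16 kHz sample rate
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        wav_out,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems with paths that contain spaces.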
Environment Variables Summary
| Environment Variable | Purpose | Required |
|---|---|---|
| `CLIPWRIGHT_WHISPER` | Path to whisper.cpp binary (required if not on PATH) | Conditional |
| `CLIPWRIGHT_WHISPER_MODEL` | Path to ggml model file (`model_path` parameter takes precedence) | Conditional |
| `CLIPWRIGHT_FFMPEG` | Path to ffmpeg binary (required if not on PATH) | Conditional |
GPU / CUDA Acceleration
clipwright-transcribe supports GPU-accelerated transcription transparently: simply point
CLIPWRIGHT_WHISPER at a CUDA or Metal build of whisper.cpp — no code or parameter changes
are required.
Obtaining a CUDA / Metal Binary
| Platform | How to obtain |
|---|---|
| Windows (CUDA) | Download whisper-cublas-*-bin-x64.zip from whisper.cpp Releases. Extract and set CLIPWRIGHT_WHISPER to the full path of whisper-cli.exe. |
| Linux (CUDA) | Build from source with -DGGML_CUDA=ON: cmake -B build -DGGML_CUDA=ON && cmake --build build -j --config Release. Binary is at build/bin/whisper-cli. |
| macOS (Metal) | brew install whisper-cpp installs a Metal-accelerated build automatically. |
# Windows CUDA example
export CLIPWRIGHT_WHISPER=/path/to/whisper-cublas/whisper-cli.exe
# macOS Metal example (after brew install whisper-cpp)
export CLIPWRIGHT_WHISPER=/opt/homebrew/bin/whisper-cli
Confirming GPU / Backend Usage
The tool envelope includes data.backend and data.realtime_factor so you can verify the
device actually used at runtime:
{
  "data": {
    "backend": {
      "device": "cuda",
      "detail": "CUDA"
    },
    "realtime_factor": 12.5,
    "whisper_wall_seconds": 14.2
  }
}
- `data.backend.device`: one of `cuda`, `metal`, `cpu`, or `unknown`.
- `data.backend.detail`: sanitized fixed device label (CWE-209: no raw stderr / model path). Values: `"CUDA"` (cuda), `"Metal"` (metal), `"cpu"` (cpu), `""` (unknown).
- `data.realtime_factor`: `audio_duration_sec / whisper_wall_seconds`. Values above `1.0` mean faster than realtime (e.g. `12.5` means 12.5× faster than realtime); a GPU build typically yields values well above `1.0`, while a slow CPU build may fall below `1.0`.
- `data.whisper_wall_seconds`: raw wall-clock seconds spent in the whisper subprocess.
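To make the `realtime_factor` arithmetic concrete, here is a minimal sketch that recomputes the ratio and summarizes an envelope shaped like the example above. The function names are hypothetical, and the 177.5-second audio duration is an illustrative value chosen so that 177.5 / 14.2 gives the 12.5 shown in the example.

```python
def realtime_factor(audio_duration_sec, whisper_wall_seconds):
    """audio_duration_sec / whisper_wall_seconds, as defined above."""
    if whisper_wall_seconds <= 0:
        raise ValueError("whisper_wall_seconds must be positive")
    return audio_duration_sec / whisper_wall_seconds


def describe_backend(envelope):
    """Summarize which device ran and whether it beat realtime.

    Hypothetical helper; it reads only the fields documented above.
    """
    data = envelope["data"]
    device = data["backend"]["device"]
    factor = data["realtime_factor"]
    speed = "faster" if factor > 1.0 else "not faster"
    return f"{device}: {factor}x realtime ({speed} than realtime)"


envelope = {
    "data": {
        "backend": {"device": "cuda", "detail": "CUDA"},
        "realtime_factor": 12.5,
        "whisper_wall_seconds": 14.2,
    }
}
report = describe_backend(envelope)
```

A check like this can be useful in automation, for example to flag a run where `device` unexpectedly comes back as `cpu` after a GPU build was configured.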
The summary also reports the backend used (e.g. " Backend: cuda (12.5x realtime)."), so
the information is visible in the one-line MCP response without unpacking data.
Note on Python GPU Libraries
clipwright-transcribe does not import faster-whisper, CTranslate2, or any CUDA
Python library. Transcription is always invoked as an external subprocess
(CLIPWRIGHT_WHISPER), keeping GPU acceleration completely separate from the package
install and preserving license independence. Any whisper-cli-compatible binary — CPU,
CUDA, Metal, ROCm — can be used by updating the environment variable alone.
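The subprocess approach can be sketched as follows. The flags shown (`-m`, `-f`, `-osrt`, `-ovtt`, `-of`) are whisper.cpp command-line options, but which flags clipwright-transcribe actually passes is not stated here, so both function names and the flag selection are assumptions made for illustration.

```python
import subprocess


def build_whisper_command(whisper, model, wav_path, output_prefix):
    """Build a whisper-cli argument list (hypothetical helper)."""
    return [
        whisper,
        "-m", model,           # ggml model file
        "-f", wav_path,        # 16 kHz mono WAV input
        "-osrt",               # write <output_prefix>.srt
        "-ovtt",               # write <output_prefix>.vtt
        "-of", output_prefix,  # output path without extension
    ]


def run_whisper(cmd):
    """Run the binary as a child process; raises CalledProcessError on failure."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True)
```

Because the binary is addressed only by path, swapping a CPU build for a CUDA or Metal build changes the first list element and nothing else.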
MCP Tool
clipwright_transcribe(media, output, options?): Transcribes an audio/video file and generates output.otio / output.srt / output.vtt.
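For orientation, an MCP client invokes a tool by sending a JSON-RPC `tools/call` request. The sketch below builds such a request for `clipwright_transcribe`. The helper name is hypothetical, the argument types (paths as strings, `options` as an object) are assumptions, and no `options` keys are shown because none are documented in this section.

```python
import json


def build_tool_call(media, output, options=None, request_id=1):
    """Build a JSON-RPC 2.0 tools/call request for clipwright_transcribe.

    Hypothetical helper; argument value types are assumed, not documented.
    """
    arguments = {"media": media, "output": output}
    if options is not None:
        arguments["options"] = options
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "clipwright_transcribe", "arguments": arguments},
    }


request = build_tool_call("/path/to/input.mp4", "/path/to/output")
payload = json.dumps(request)
```

In practice an MCP client library normally constructs this message for you; the raw form is shown only to make the parameter names visible.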