TubeScribe (ytx) — YouTube Transcriber (Whisper / Metal via whisper.cpp)
CLI to transcribe YouTube audio via Whisper (local) or Gemini (cloud). It downloads YouTube audio and produces transcripts and captions using:
- Local Whisper (faster-whisper / CTranslate2)
- Whisper.cpp (Metal acceleration on Apple Silicon)
Repository: https://github.com/prateekjain24/TubeScribe
Managed with venv+pip (recommended) or uv, using the src layout.
Features
- One command: URL → audio → normalized WAV → transcript JSON + SRT captions
- Engines: `whisper` (faster-whisper) and `whispercpp` (Metal via whisper.cpp)
- Rich progress for download + transcription
- Deterministic JSON (orjson) and SRT line wrapping
Requirements
- Python >= 3.11
- FFmpeg installed and on PATH
- Check: `ffmpeg -version`
- macOS: `brew install ffmpeg`
- Ubuntu/Debian: `sudo apt-get update && sudo apt-get install -y ffmpeg`
- Fedora: `sudo dnf install -y ffmpeg`
- Arch: `sudo pacman -S ffmpeg`
- Windows: `winget install Gyan.FFmpeg` or `choco install ffmpeg`
Install (dev)
- Option A: venv + pip (recommended)
cd ytx && python3.11 -m venv .venv && source .venv/bin/activate
python -m pip install -U pip setuptools wheel
python -m pip install -e .
ytx --help
- Option B: uv
cd ytx && uv sync
uv run ytx --help
Running locally without installing
- From repo root:
export PYTHONPATH="$(pwd)/ytx/src"
cd ytx && python3 -m ytx.cli --help
- Example:
python3 -m ytx.cli summarize-file 0jpcFxY_38k.json --write
Note: Avoid running the ytx console script from inside the ytx/ folder; Python may shadow the installed package. Use the module form or run from repo root.
Usage (CLI)
- Whisper (CPU by default):
ytx transcribe <url> --engine whisper --model small
- Whisper (larger model):
ytx transcribe <url> --engine whisper --model large-v3-turbo
- Gemini (best‑effort timestamps):
ytx transcribe <url> --engine gemini --timestamps chunked --fallback
- Chapters + summaries:
ytx transcribe <url> --by-chapter --parallel-chapters --chapter-overlap 2.0 --summarize-chapters --summarize
- Engine options and timestamp policy:
ytx transcribe <url> --engine-opts '{"utterances":true}' --timestamps native
- Output dir:
ytx transcribe <url> --output-dir ./artifacts
- Verbose logging:
ytx --verbose transcribe <url> --engine whisper
- Health check (ffmpeg, API key presence, network):
ytx health
- Summarize an existing transcript JSON:
ytx summarize-file /path/to/<video_id>.json --write
Metal (Apple Silicon) via whisper.cpp
- Build whisper.cpp with Metal: `make -j METAL=1`
- Download a GGUF/GGML model (e.g., large-v3-turbo)
- Run with the whisper.cpp engine by passing a model file path:
uv run ytx transcribe <url> --engine whispercpp --model /path/to/gguf-large-v3-turbo.bin
- Auto-prefer whisper.cpp when `device=metal` (if the `whisper.cpp` binary is available): set env `YTX_WHISPERCPP_BIN` to the `main` binary path, and provide a model path as above
- Tuning (env or .env): `YTX_WHISPERCPP_NGL` (GPU layers, default 35), `YTX_WHISPERCPP_THREADS` (CPU threads)
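Putting the pieces together, a typical Metal run might look like the sketch below. The env variable names come from the tuning notes above; all paths are placeholders to adjust for your checkout and model locations:

```shell
# Sketch — example paths only; adjust to your whisper.cpp checkout and model dir
export YTX_WHISPERCPP_BIN="$HOME/src/whisper.cpp/main"   # the whisper.cpp "main" binary
export YTX_WHISPERCPP_NGL=35                             # GPU layers offloaded to Metal
export YTX_WHISPERCPP_THREADS=8                          # CPU threads
ytx transcribe <url> --engine whispercpp --model "$HOME/models/gguf-large-v3-turbo.bin"
```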
Outputs
- JSON (`<video_id>.json`): TranscriptDoc
  - keys: `video_id, source_url, title, duration, language, engine, model, created_at, segments[], chapters?, summary?`
  - segment: `{id, start, end, text, confidence?}` (times in seconds)
- SRT (`<video_id>.srt`): line-wrapped captions (2 lines max)
- Cache artifacts (under the XDG cache root): `meta.json`, `summary.json`, transcript and captions
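The JSON schema above is easy to consume from downstream scripts. A minimal sketch, using a hand-made sample in place of a real `<video_id>.json` and assuming only the documented keys (this is not ytx's own exporter code):

```python
import json

def fmt_srt_time(seconds: float) -> str:
    """Format a segment time (seconds) as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

# Hand-made sample matching the documented TranscriptDoc shape
doc = json.loads('''{
  "video_id": "abc123", "engine": "whisper", "model": "small",
  "segments": [{"id": 0, "start": 0.0, "end": 2.5, "text": "Hello world"}]
}''')

for seg in doc["segments"]:
    # SRT cues are 1-indexed; segment ids in the JSON start at 0
    print(f'{seg["id"] + 1}')
    print(f'{fmt_srt_time(seg["start"])} --> {fmt_srt_time(seg["end"])}')
    print(seg["text"])
```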
Configuration (.env)
- Copy `.env.example` → `.env`, then adjust:
  - `GEMINI_API_KEY` (for Gemini)
  - `YTX_ENGINE` (default `whisper`), `WHISPER_MODEL` (e.g., `large-v3-turbo`)
  - `YTX_WHISPERCPP_BIN` and `YTX_WHISPERCPP_MODEL_PATH` for whisper.cpp
- Optional: `YTX_CACHE_DIR`, `YTX_OUTPUT_DIR`, `YTX_ENGINE_OPTS` (JSON), and timeouts (`YTX_NETWORK_TIMEOUT`, etc.)
Restricted videos & cookies
- Some videos are age/region restricted or private. The downloader supports cookies, but CLI flags are not yet wired.
- Workarounds: run yt-dlp manually, or use the Python API (pass `cookies_from_browser`/`cookies_file` to the downloader).
- Error messages suggest cookie usage when restrictions are detected.
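Until the CLI flags are wired, a manual yt-dlp run can fetch restricted audio. The browser name and file paths below are examples; the flags are standard yt-dlp options:

```shell
# Use cookies from a logged-in browser profile (example: chrome)
yt-dlp --cookies-from-browser chrome -x --audio-format wav -o "audio.%(ext)s" <url>
# Or point at an exported Netscape-format cookies file
yt-dlp --cookies /path/to/cookies.txt -x --audio-format wav -o "audio.%(ext)s" <url>
```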
Performance Tips
- faster-whisper: `compute_type=auto` resolves to `int8` on CPU, `float16` on CUDA.
- Model sizing: start with `small`/`medium`; use `large-v3(-turbo)` for best quality.
- Metal (whisper.cpp): tune `-ngl` (30–40 typical on M-series) and threads to maximize throughput.
Development
- Structure: code in `src/ytx/`, CLI in `src/ytx/cli.py`, engines in `src/ytx/engines/`, exporters in `src/ytx/exporters/`.
- Tests: `pytest -q` (add tests under `ytx/tests/`).
- Lint/format (if configured): `ruff check .` / `ruff format .`
Roadmap
- Add VTT/TXT exporters, format selection (`--formats json,srt,vtt,txt`)
- OpenAI/Deepgram/ElevenLabs engines via a shared cloud base
- More resilient chunking/alignment; diarization options where supported
- CI + tests; docs polish; performance tuning