
Fetch YouTube video transcripts with automatic Whisper fallback


ytranscript

Fetch YouTube video transcripts from the command line. ytranscript uses YouTube's caption API when captions are available and falls back to local Whisper transcription for videos without captions.

Installation

pip install ytranscript

System requirement: ffmpeg must be installed for the Whisper fallback.

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

Usage

# Basic — auto-detect language
ytranscript 'https://www.youtube.com/watch?v=VIDEO_ID'

# Markdown output with metadata
ytranscript URL -f md

# JSON output (for pipelines)
ytranscript URL -f json

# Include timestamps
ytranscript URL --timestamps

# Force Whisper transcription
ytranscript URL --whisper

# Specify language
ytranscript URL --lang zh

# Choose Whisper model (tiny/base/small/medium/large)
ytranscript URL --model small
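
The --timestamps flag prefixes each line with [MM:SS]. The sketch below shows one way such a prefix could be produced, assuming each transcript segment carries a start time in seconds. The names format_timestamp and prefix_lines and the (start, text) segment shape are illustrative and not part of ytranscript. How the tool renders videos longer than an hour is not stated here; this sketch simply lets the minutes run past 59.

```python
def format_timestamp(seconds: float) -> str:
    """Render a start time in seconds as an [MM:SS] prefix."""
    total = int(seconds)  # drop fractional seconds
    minutes, secs = divmod(total, 60)
    return f"[{minutes:02d}:{secs:02d}]"


def prefix_lines(segments):
    """Prefix each (start_seconds, text) pair with its timestamp.

    The (start, text) tuple shape is an assumption made for this sketch.
    """
    return [f"{format_timestamp(start)} {text}" for start, text in segments]


if __name__ == "__main__":
    for line in prefix_lines([(0, "Welcome back."), (75.9, "Let's begin.")]):
        print(line)
```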

Batch processing

# Multiple URLs at once
ytranscript URL1 URL2 URL3 --output-dir ./transcripts/

# From a file (one URL per line)
ytranscript --batch urls.txt --output-dir ./transcripts/

# Entire playlist or channel
ytranscript 'https://www.youtube.com/playlist?list=PLAYLIST_ID' --output-dir ./transcripts/
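
A file passed to --batch holds one URL per line. The sketch below shows how such a file might be read from Python, for example to check a list before a long run. The function name read_url_file is hypothetical, and skipping blank lines and lines starting with # is an assumption of this sketch, not documented ytranscript behavior.

```python
from pathlib import Path


def read_url_file(path):
    """Return the URLs listed in a text file, one per line."""
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        urls.append(line)
    return urls
```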

Options

Flag                Description
-f plain|json|md    Output format (default: plain)
--lang CODE         Language code, e.g. en, zh, ja (default: auto-detect)
--timestamps        Prefix each line with [MM:SS]
--whisper           Force local Whisper transcription
--model SIZE        Whisper model: tiny, base, small, medium, large (default: base)
--no-metadata       Skip title/channel/duration fetch
--no-cache          Bypass disk cache
--batch FILE        Read URLs from a file
--output-dir DIR    Write each transcript to DIR/{video_id}.{ext}
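
With --output-dir, each transcript is written to DIR/{video_id}.{ext}. The sketch below shows how that path could be derived from a URL, which helps when a script needs to locate the output afterwards. The helper names are hypothetical, the mapping of the plain format to a .txt extension is an assumption, and only watch?v= and youtu.be URL shapes are handled.

```python
from pathlib import Path
from urllib.parse import parse_qs, urlparse

# Assumed format-to-extension mapping; "txt" for plain is a guess.
EXTENSIONS = {"plain": "txt", "json": "json", "md": "md"}


def extract_video_id(url: str) -> str:
    """Pull the video ID out of a watch?v= or youtu.be URL."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        video_id = parsed.path.lstrip("/").split("/")[0]
    else:
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    if not video_id:
        raise ValueError(f"no video ID found in {url!r}")
    return video_id


def output_path(output_dir: str, url: str, fmt: str = "plain") -> Path:
    """Build DIR/{video_id}.{ext} for the given URL and output format."""
    return Path(output_dir) / f"{extract_video_id(url)}.{EXTENSIONS[fmt]}"
```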

Caching

Transcripts are cached at ~/.cache/ytranscript/ by default. Subsequent calls for the same video are served from the cache without refetching. Use --no-cache to force a fresh fetch.
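
The caching behavior can be sketched as a read-through disk cache keyed by video ID. This is a minimal illustration, not ytranscript's actual code: the one-JSON-file-per-video layout, the cached_fetch name, and the fetch callable are all assumptions. The sketch also assumes that bypassing the cache still refreshes the stored copy, which the tool may or may not do.

```python
import json
from pathlib import Path

DEFAULT_CACHE_DIR = Path.home() / ".cache" / "ytranscript"


def cached_fetch(video_id, fetch, cache_dir=DEFAULT_CACHE_DIR, use_cache=True):
    """Return a transcript, reading from disk when a cached copy exists.

    `fetch` is any callable that takes a video ID and returns
    JSON-serializable data (a stand-in for the real fetch step).
    """
    cache_file = Path(cache_dir) / f"{video_id}.json"  # assumed layout
    if use_cache and cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    transcript = fetch(video_id)
    # Assumption: a bypassed cache is still refreshed with the new result.
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(transcript), encoding="utf-8")
    return transcript
```

Passing use_cache=False corresponds to the --no-cache case: the fetch callable always runs.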

Environment variables

Variable               Description
YT_TRANSCRIPT_MODEL    Default Whisper model size (e.g. small)
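
One plausible way the model size could be resolved is sketched below. The precedence shown (an explicit --model value first, then YT_TRANSCRIPT_MODEL, then the documented default of base) is an assumption, as is the resolve_model name; the source only states that the variable sets the default model size.

```python
import os

VALID_MODELS = ("tiny", "base", "small", "medium", "large")


def resolve_model(cli_value=None, environ=None):
    """Pick the Whisper model size from the flag, the environment, or the default."""
    env = os.environ if environ is None else environ
    # Assumed precedence: CLI flag > environment variable > built-in default.
    model = cli_value or env.get("YT_TRANSCRIPT_MODEL") or "base"
    if model not in VALID_MODELS:
        raise ValueError(f"unknown Whisper model size: {model!r}")
    return model
```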

How it works

  1. Tries youtube-transcript-api — fast, no download needed
  2. If no captions exist, downloads audio via yt-dlp and transcribes locally with faster-whisper

Between the two steps, nearly every video yields a transcript, including those without any captions.
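
The two-step flow above can be sketched as a try-then-fall-back function. To avoid guessing at library signatures, the caption fetch and the Whisper transcription are passed in as callables; in the real tool these roles are played by youtube-transcript-api and by yt-dlp plus faster-whisper. NoCaptionsError, get_transcript, and the returned source label are hypothetical names for this sketch only.

```python
class NoCaptionsError(Exception):
    """Raised by the caption fetcher when a video has no captions (hypothetical)."""


def get_transcript(video_id, fetch_captions, transcribe_audio, force_whisper=False):
    """Return (transcript, source), trying captions before local transcription.

    fetch_captions and transcribe_audio are injected callables that take a
    video ID; force_whisper mirrors the --whisper flag by skipping step 1.
    """
    if not force_whisper:
        try:
            return fetch_captions(video_id), "captions"
        except NoCaptionsError:
            pass  # no captions: fall through to local transcription
    return transcribe_audio(video_id), "whisper"
```

Injecting the two steps keeps the fallback logic testable without network access or a Whisper model on disk.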

License

MIT
