Download YouTube video transcripts using yt-dlp
yt-transcript-dl
Download YouTube video transcripts in multiple formats using yt-dlp.
Features
- Multiple output formats: TXT, SRT, VTT, JSON
- Download from videos, channels, and playlists
- Incremental sync with state tracking
- Configuration file support (TOML)
- Batch processing
- Custom filename patterns
- Video metadata inclusion
- Language selection
- Retry logic and rate limiting
Installation
pip install yt-transcript-dl
Or install from source:
git clone https://github.com/rk/yt-transcript-dl.git
cd yt-transcript-dl
pip install -e .
Quick Start
Download a single video transcript
yt-transcript-dl "https://youtube.com/watch?v=VIDEO_ID"
Download in SRT format
yt-transcript-dl "https://youtube.com/watch?v=VIDEO_ID" --format srt
Download entire channel
yt-transcript-dl https://youtube.com/@channelname -o ./transcripts
Usage
yt-transcript-dl [OPTIONS] [URL]
Output Formats
Use --format (or -f) to specify output format:
- txt - Plain text (default)
- srt - SubRip subtitle format
- vtt - WebVTT subtitle format
- json - JSON with segments and metadata
- all - Generate all formats
# SRT format (for video players)
yt-transcript-dl URL --format srt
# All formats at once
yt-transcript-dl URL --format all
Incremental Sync
Skip already downloaded videos using sync state:
# First download
yt-transcript-dl https://youtube.com/@channel -o ./channel
# Later: only download new videos
yt-transcript-dl https://youtube.com/@channel -o ./channel --sync
Sync options:
- --sync - Only download videos newer than last sync
- --overwrite - Force re-download existing files
- --force-full - Ignore sync state and download all
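The sync behavior above can be pictured with a small sketch. The state file name and its JSON layout below are hypothetical, and tracking seen video IDs (rather than comparing against the time of the last sync) is a simplifying assumption; the tool's actual state format is not documented here.

```python
import json
from pathlib import Path

# Hypothetical state file name; not taken from the tool itself.
STATE_FILE = ".sync_state.json"


def load_seen_ids(output_dir: Path) -> set[str]:
    """Return the video IDs recorded by earlier runs (empty on the first run)."""
    state_path = output_dir / STATE_FILE
    if not state_path.exists():
        return set()
    return set(json.loads(state_path.read_text(encoding="utf-8"))["seen_ids"])


def save_seen_ids(output_dir: Path, seen_ids: set[str]) -> None:
    """Persist the seen IDs so the next run can skip them."""
    output_dir.mkdir(parents=True, exist_ok=True)
    payload = {"seen_ids": sorted(seen_ids)}
    (output_dir / STATE_FILE).write_text(json.dumps(payload, indent=2), encoding="utf-8")


def select_pending(video_ids: list[str], seen_ids: set[str], force_full: bool = False) -> list[str]:
    """Pick what to download: everything with force_full, otherwise only unseen IDs."""
    if force_full:
        return list(video_ids)
    return [vid for vid in video_ids if vid not in seen_ids]
```

In this sketch, a first run starts with an empty state and downloads everything; a later run with the same output directory only sees the IDs that were not recorded before.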
Configuration Files
Create a configuration file to set defaults:
# Generate sample config (global)
yt-transcript-dl --init-config ~/.config/yt-transcript-dl/config.toml
# Or create project-specific config
yt-transcript-dl --init-config .yt-transcript-dl.toml
Configuration locations (checked in order):
- ./.yt-transcript-dl.toml (project-specific, highest priority)
- ~/.config/yt-transcript-dl/config.toml (global user config)
CLI flags override config file settings.
Example config:
lang = "en"
format = "srt"
output_dir = "./transcripts"
include_metadata = true
embed_description = true
filename_pattern = "{channel}_{date}_{title}"
retry = 5
delay = 1.0
See CONFIG_EXAMPLES.md for comprehensive configuration examples and use cases.
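The precedence rules (project file over global file, CLI flags over both) can be sketched as a layered merge. This is an illustration only: the key-by-key merge, the merge_settings helper, and the treatment of None as "flag not given" are assumptions, not the tool's confirmed behavior. The built-in defaults are the ones listed in the Options section.

```python
try:
    import tomllib  # standard library on Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport for Python 3.10

# Defaults as documented in the Options section.
BUILTIN_DEFAULTS = {"lang": "en", "format": "txt", "retry": 3, "delay": 0.0}


def merge_settings(global_cfg: dict, project_cfg: dict, cli_flags: dict) -> dict:
    """Layer settings from lowest to highest priority (hypothetical helper)."""
    merged = dict(BUILTIN_DEFAULTS)
    merged.update(global_cfg)
    merged.update(project_cfg)
    # Only flags the user actually passed (non-None) override the files.
    merged.update({key: value for key, value in cli_flags.items() if value is not None})
    return merged


# A project config parsed from TOML text, as it might appear in .yt-transcript-dl.toml.
project_cfg = tomllib.loads('format = "srt"\nretry = 5\ndelay = 1.0\n')
settings = merge_settings(
    global_cfg={"lang": "es", "format": "vtt"},
    project_cfg=project_cfg,
    cli_flags={"format": None, "delay": 2.0},
)
print(settings)
```

Here the global file supplies the language, the project file wins on format and retry, and the CLI value wins on delay.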
Options:
- --init-config PATH - Create sample configuration file at specified path
- --no-config - Ignore all configuration files
Options
Basic Options
-l, --lang TEXT          Language code for transcript (default: en)
-o, --output-dir PATH    Output directory (default: current directory)
-m, --include-metadata   Include video metadata in output file
-d, --description        Save video description to separate file
--embed-description      Include video description in transcript file (txt/json only)
-p, --filename-pattern   Custom filename pattern (tokens: {title}, {channel}, {date}, {id})
Batch Processing
-i, --input-file PATH    File containing list of URLs (one per line)
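When assembling a URL list for --input-file, it can help to clean it up first. The sketch below is a hypothetical pre-processing helper, not part of the tool: it keeps one URL per line, and the handling of blank lines, "#" comment lines, and duplicates is an assumption about what a tidy input file should look like.

```python
def parse_url_lines(text: str) -> list[str]:
    """Return the URLs in text, one per line, in their original order."""
    urls: list[str] = []
    seen: set[str] = set()
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Assumption: blank lines and '#' comments carry no URL.
        if not line or line.startswith("#"):
            continue
        if line not in seen:
            seen.add(line)
            urls.append(line)
    return urls


sample = """
# talks
https://youtube.com/watch?v=aaa
https://youtube.com/watch?v=bbb

https://youtube.com/watch?v=aaa
"""
print(parse_url_lines(sample))
```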
Output Formats
-f, --format [txt|srt|vtt|json|all]
                         Output format (default: txt)
Sync Options
--overwrite              Force re-download of existing files
--sync                   Only download videos newer than last sync
--force-full             Ignore sync state and download all videos
Advanced Options
-v, --verbose            Enable verbose logging
--log-file PATH          Save logs to file
--retry INTEGER          Number of retry attempts for failed downloads (default: 3)
--delay FLOAT            Delay in seconds between requests (default: 0)
Configuration
--init-config PATH       Create sample configuration file
--no-config              Ignore configuration files
Utility
-V, --version            Show version and exit
--help                   Show help message and exit
Examples
See examples/EXAMPLES.md for comprehensive examples.
Basic Examples
# Download with Spanish subtitles
yt-transcript-dl "https://youtube.com/watch?v=xxxxx" --lang es
# Save to specific directory with metadata
yt-transcript-dl "https://youtube.com/watch?v=xxxxx" -o ./transcripts -m
# Download playlist in SRT format
yt-transcript-dl "https://youtube.com/playlist?list=PLxxx" --format srt
# Batch process URLs with custom naming
yt-transcript-dl --input-file urls.txt \
  --filename-pattern "{channel}_{date}_{title}" \
  -o ./batch
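To make the filename pattern tokens concrete, here is a minimal sketch of how a pattern such as "{channel}_{date}_{title}" might be expanded. The render_filename helper, the character sanitization rule, and the YYYYMMDD date format are all assumptions for illustration; the tool's own expansion logic may differ.

```python
import re

# Tokens documented for --filename-pattern.
ALLOWED_TOKENS = {"title", "channel", "date", "id"}


def render_filename(pattern: str, **fields: str) -> str:
    """Expand {token} placeholders and replace characters that are unsafe in filenames."""

    def substitute(match: re.Match) -> str:
        token = match.group(1)
        if token not in ALLOWED_TOKENS:
            raise ValueError(f"unknown token: {token}")
        # Assumption: runs of \ / : * ? " < > | are collapsed to a single underscore.
        return re.sub(r'[\\/:*?"<>|]+', "_", fields[token]).strip()

    return re.sub(r"\{(\w+)\}", substitute, pattern)


name = render_filename(
    "{channel}_{date}_{title}",
    channel="Some Channel",
    date="20240131",
    title="Intro: Part 1/2",
    id="abc123",
)
print(name)  # Some Channel_20240131_Intro_ Part 1_2
```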
Advanced Examples
# Archive channel with all formats and metadata
yt-transcript-dl https://youtube.com/@channel \
  --format all \
  --include-metadata \
  --description \
  --delay 1 \
  -o ./archive
# Incremental channel sync
yt-transcript-dl https://youtube.com/@channel -o ./channel --sync
Output
Plain Text (TXT)
Clean transcript text, optionally with metadata header.
SubRip (SRT)
Standard subtitle format with timing:
1
00:00:00,000 --> 00:00:05,000
First subtitle segment

2
00:00:05,000 --> 00:00:10,000
Second subtitle segment
WebVTT (VTT)
Web Video Text Tracks format:
WEBVTT

00:00:00.000 --> 00:00:05.000
First subtitle segment

00:00:05.000 --> 00:00:10.000
Second subtitle segment
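The two subtitle formats differ mainly in the millisecond separator (comma for SRT, dot for WebVTT), the cue numbering, and the WEBVTT header. A minimal sketch of rendering both, assuming segments are dicts with start, end, and text keys as in the JSON output; the function names are illustrative, not the tool's own formatters:

```python
def format_timestamp(seconds: float, separator: str) -> str:
    """Format seconds as HH:MM:SS<separator>mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{millis:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}")
    return "\n\n".join(blocks) + "\n"


def to_vtt(segments: list[dict]) -> str:
    """Render WebVTT cues after the mandatory WEBVTT header line."""
    cues = []
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        cues.append(f"{start} --> {end}\n{seg['text']}")
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"


segments = [
    {"start": 0.0, "end": 5.0, "text": "First subtitle segment"},
    {"start": 5.0, "end": 10.0, "text": "Second subtitle segment"},
]
print(to_srt(segments))
print(to_vtt(segments))
```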
JSON
Structured format with segments and metadata:
{
  "segments": [
    {
      "start": 0.0,
      "end": 5.0,
      "text": "First subtitle segment"
    }
  ],
  "metadata": {
    "title": "Video Title",
    "channel": "Channel Name",
    "url": "https://youtube.com/watch?v=...",
    "language": "en",
    "is_auto_generated": false
  }
}
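The JSON output is convenient for post-processing. As a sketch, assuming a file with exactly the structure shown above, the following hypothetical helper collapses a transcript into a short summary (title, language, end time of the last segment, and the joined text):

```python
import json


def summarize_transcript(json_text: str) -> dict:
    """Collapse a JSON transcript into a few summary fields (illustrative helper)."""
    data = json.loads(json_text)
    segments = data["segments"]
    metadata = data["metadata"]
    return {
        "title": metadata["title"],
        "language": metadata["language"],
        # End time of the last segment, or 0.0 for an empty transcript.
        "duration_seconds": max((seg["end"] for seg in segments), default=0.0),
        "text": " ".join(seg["text"].strip() for seg in segments),
    }


example = json.dumps({
    "segments": [
        {"start": 0.0, "end": 5.0, "text": "First subtitle segment"},
        {"start": 5.0, "end": 10.0, "text": "Second subtitle segment"},
    ],
    "metadata": {
        "title": "Video Title",
        "channel": "Channel Name",
        "language": "en",
        "is_auto_generated": False,
    },
})
print(summarize_transcript(example))
```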
Requirements
- Python 3.10+
- yt-dlp
- click
- tomli (Python <3.11 only)
Troubleshooting
No subtitles available
Some videos don't have captions. Try:
- Using --lang auto for auto-generated subtitles (planned for a future release, not yet available)
- Checking if the video has captions on YouTube
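One way to check caption availability outside the tool is to inspect yt-dlp's metadata directly. The sketch below relies on the "subtitles" and "automatic_captions" fields of the info dict returned by yt-dlp's extract_info; the helper names are hypothetical, and this is not necessarily how yt-transcript-dl performs the check internally.

```python
def caption_languages(info: dict) -> dict:
    """List manual and auto-generated caption languages from a yt-dlp info dict."""
    return {
        "manual": sorted(info.get("subtitles") or {}),
        "automatic": sorted(info.get("automatic_captions") or {}),
    }


def fetch_info(url: str) -> dict:
    """Fetch video metadata without downloading media (needs yt-dlp and network access)."""
    import yt_dlp  # imported lazily so caption_languages works without yt-dlp installed

    with yt_dlp.YoutubeDL({"skip_download": True, "quiet": True}) as ydl:
        return ydl.extract_info(url, download=False)


# Offline example with a hand-written info dict:
fake_info = {"subtitles": {"en": [], "es": []}, "automatic_captions": {"en": []}}
print(caption_languages(fake_info))
```

For a real video, caption_languages(fetch_info(url)) would show whether the requested --lang value is available before starting a download.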
Rate limiting
If downloading many videos, use --delay:
yt-transcript-dl --input-file urls.txt --delay 2
Failed downloads
Increase retry attempts:
yt-transcript-dl URL --retry 5
Enable verbose logging to see detailed errors:
yt-transcript-dl URL --verbose
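The interplay of --retry and --delay can be illustrated with a generic retry loop. This is a sketch under assumptions: it treats the retry count as attempts made after the first failure, and it reuses the delay as the pause between attempts, which may not match the tool's exact semantics. The with_retries helper is hypothetical.

```python
import time


def with_retries(action, retries: int = 3, delay: float = 0.0, sleep=time.sleep):
    """Call action(); on failure, retry up to `retries` more times, pausing `delay` seconds."""
    for attempt in range(retries + 1):
        try:
            return action()
        except Exception:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            sleep(delay)


# Example: an action that fails twice before succeeding.
calls = {"count": 0}


def flaky_download() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


# sleep is stubbed out here so the example runs instantly.
print(with_retries(flaky_download, retries=3, delay=2.0, sleep=lambda seconds: None))
```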
Development
Running Tests
pip install -e ".[dev]"
pytest
Project Structure
yt_transcript_dl/
├── cli.py          # Command-line interface
├── downloader.py   # Core download logic
├── formatters.py   # Output format handlers
├── sync_state.py   # Incremental sync tracking
├── config.py       # Configuration file support
└── utils.py        # Utility functions
License
MIT License - see LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Changelog
See CHANGELOG.md for version history.
Related Projects
- yt-dlp - YouTube downloader (used internally)
- youtube-transcript-api - Alternative transcript API
File details
Details for the file ytdl_transcript-0.1.0.tar.gz.
File metadata
- Download URL: ytdl_transcript-0.1.0.tar.gz
- Upload date:
- Size: 28.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 95030016a38c529b83174fe73d9faf5036f7474e2c7c585cfff034d0fb907910 |
| MD5 | 408fd9a279904f50469c1e2b8f314557 |
| BLAKE2b-256 | 4cd7598281313fcf3376826ba4af0a40ba5cb21aed0a2eb67d2843f333ef8f6e |
File details
Details for the file ytdl_transcript-0.1.0-py3-none-any.whl.
File metadata
- Download URL: ytdl_transcript-0.1.0-py3-none-any.whl
- Upload date:
- Size: 21.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.13.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8864b4331286787f61eb79be5600629f03228d4a0496cc274df1c9108e7e63cd |
| MD5 | 89e04b2dd3410f14b335216d7f27c74a |
| BLAKE2b-256 | afd13ac2c7f48d88d3b1a36387c918cb2e67d2e91b656f1d8e5e048e779d91c3 |