A tool to fetch YouTube video transcripts via Web UI and CLI

YouTube Transcript Fetcher

🚀 Try Live Demo | ⭐ Star on GitHub | 💻 CLI Guide | 📖 Docs

Fetch YouTube video transcripts through a Web UI or a CLI, with proxy support to work around YouTube rate limiting.

Features

  • Web UI: Browser-based interface for fetching transcripts
  • CLI: Command-line interface for automation and scripting
  • Smart Proxy Support: Automatic proxy configuration to bypass YouTube rate limiting
  • Multiple Languages: Fetch transcripts in different languages
  • Multiple Formats: Output as plain text or JSON
  • Smart Caching: Database-backed caching to avoid redundant API calls

Quick Start 🚀

Option 1: Web UI (Easiest - No Installation) 🌐

🚀 Try Live Demo

Works instantly in your browser - no installation required!

Perfect for: Quick transcripts, testing, non-technical users


Option 2: CLI (Install Locally) 💻

Fetch transcripts from the command line:

# Install from source
git clone https://github.com/nilukush/youtube-transcript.git
cd youtube-transcript
pip install -e .

# Fetch transcript
ytt fetch "https://youtu.be/dQw4w9WgXcQ"

Coming soon to PyPI: pip install youtube-transcript-tools

Perfect for: Automation, scripting, power users


Option 3: Self-Hosted (Deploy Yourself) 🔧

Deploy your own instance:

📖 Deployment Guide

Perfect for: Production use, custom configuration, full control


Features

CLI

# Fetch transcript by URL
ytt fetch "https://youtu.be/dQw4w9WgXcQ"

# Fetch by video ID
ytt fetch dQw4w9WgXcQ

# Save to file
ytt fetch dQw4w9WgXcQ -o transcript.txt

# Output as JSON
ytt fetch dQw4w9WgXcQ --json

Installation

From PyPI (Coming Soon)

pip install youtube-transcript-tools

From Source

git clone https://github.com/nilukush/youtube-transcript.git
cd youtube-transcript
pip install -e .

Development Installation

pip install -e ".[dev]"

Usage

Web UI

The Web UI is the simplest way to fetch transcripts. To run it locally, start the server and open it in a browser.

Starting the server:

uvicorn youtube_transcript.api.app:create_app --reload --host localhost --port 8888

Then open http://localhost:8888 in your browser.

Supported URL formats:

  • https://youtu.be/dQw4w9WgXcQ (shortened)
  • https://www.youtube.com/watch?v=dQw4w9WgXcQ (full URL)
  • dQw4w9WgXcQ (video ID only)
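
As an illustration of how the three input forms above can be reduced to a single video ID, here is a minimal sketch. It is not the project's own parser (that lives under `utils/` and is not shown here); the function name `extract_video_id`, the accepted host list, and the 11-character ID pattern are assumptions made for this example.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters drawn from this alphabet.
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Return the video ID for a short URL, a full watch URL, or a bare ID."""
    value = url_or_id.strip()
    if _VIDEO_ID.fullmatch(value):
        return value  # bare video ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short form: the ID is the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in {"youtube.com", "www.youtube.com", "m.youtube.com"}:
        # Full form: the ID is the "v" query parameter.
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    else:
        return None
    return candidate if _VIDEO_ID.fullmatch(candidate) else None
```

Returning `None` instead of raising keeps the sketch easy to use as a validation check before a fetch is attempted.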

CLI

The CLI uses a fetch command to retrieve transcripts.

Basic usage:

ytt fetch "https://youtu.be/dQw4w9WgXcQ"

Advanced options:

# Language preference
ytt fetch dQw4w9WgXcQ --lang en

# Multiple languages
ytt fetch dQw4w9WgXcQ --lang en,es,fr

# Save to file
ytt fetch dQw4w9WgXcQ -o transcript.txt

# JSON output
ytt fetch dQw4w9WgXcQ --json

# Verbose mode
ytt fetch dQw4w9WgXcQ --verbose

All options:

Usage: ytt fetch [OPTIONS] URL_OR_ID

Options:
  --lang, -l      TEXT  Preferred language codes (comma-separated)
  --output, -o    TEXT  Output file path
  --json                Output in JSON format
  --verbose             Show detailed information
  --help, -h            Show this message
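
For scripting, the options listed above can be assembled programmatically. The following is a sketch that shells out to the installed `ytt` command; the helper names `build_fetch_command` and `fetch_transcript` are hypothetical and not part of the package. It uses only the flags documented above and assumes the transcript is written to standard output when `--output` is not given.

```python
import subprocess
from typing import List, Optional, Sequence


def build_fetch_command(
    url_or_id: str,
    langs: Sequence[str] = (),
    output: Optional[str] = None,
    as_json: bool = False,
    verbose: bool = False,
) -> List[str]:
    """Build the argument list for `ytt fetch` from the documented options."""
    cmd = ["ytt", "fetch"]
    if langs:
        cmd += ["--lang", ",".join(langs)]  # comma-separated preference list
    if output:
        cmd += ["--output", output]
    if as_json:
        cmd.append("--json")
    if verbose:
        cmd.append("--verbose")
    cmd.append(url_or_id)
    return cmd


def fetch_transcript(url_or_id: str, **options) -> str:
    """Run `ytt fetch` and return its standard output (assumed to be the transcript)."""
    result = subprocess.run(
        build_fetch_command(url_or_id, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the command as a list avoids shell quoting problems with URLs that contain `?` or `&`.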

Troubleshooting

"No such command" Error

Wrong:

ytt "https://youtu.be/dQw4w9WgXcQ"

Correct:

ytt fetch "https://youtu.be/dQw4w9WgXcQ"

"Transcript Not Found" Error

This error means one of the following:

  • The video doesn't have captions/subtitles enabled
  • The transcript is disabled by the uploader
  • The video ID is incorrect

Verification: Check if the video has captions on YouTube:

  1. Open the video on YouTube
  2. Click the "..." (more) button
  3. Look for "Show transcript" option

Rate Limiting (HTTP 429)

If you experience rate limiting:

  1. The application automatically uses proxy configuration (if set by the service provider)
  2. Try again later - rate limits reset over time
  3. Some videos may have stricter rate limits than others
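
The "try again later" advice can be automated in client scripts with exponential backoff. This is a generic sketch, not something the application is documented to do internally: `RateLimited` is a hypothetical exception standing in for however your code detects an HTTP 429, and the attempt count and delays are arbitrary starting values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical marker raised by the caller's code on an HTTP 429."""


def retry_with_backoff(
    call: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call` until it succeeds, doubling the wait after each RateLimited."""
    for attempt in range(attempts):
        try:
            return call()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * (2 ** attempt)
            # Jitter spreads out retries from clients that started together.
            sleep(delay + random.uniform(0, delay / 2))
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so the function can be exercised in tests without actually waiting.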

CLI Not Found

If the ytt command is not found:

# Reinstall the package
pip install -e .

# Or use Python module directly
python -m youtube_transcript.cli fetch "https://youtu.be/dQw4w9WgXcQ"

Development

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=src/youtube_transcript --cov-report=html

# Run specific test file
pytest tests/test_fetcher.py -v

Code Quality

# Format code
black src/ tests/

# Lint code
ruff check src/ tests/

# Type check
mypy src/

Project Structure

youtube-transcript/
├── src/youtube_transcript/
│   ├── api/              # FastAPI endpoints and web routes
│   ├── cache/            # Redis caching layer
│   ├── config/           # Configuration management
│   ├── models/           # SQLModel database models
│   ├── repository/       # Database repository layer
│   ├── services/         # Business logic (fetcher, orchestrator)
│   ├── static/           # CSS and static assets
│   ├── templates/        # Jinja2 HTML templates
│   ├── utils/            # URL parsing utilities
│   └── cli.py            # CLI entry point
├── tests/                # Pytest tests
└── pyproject.toml        # Project configuration

API Endpoints

The web server exposes the following endpoints:

  • GET / - Web UI homepage
  • GET /transcript?url=URL - Fetch transcript via GET
  • GET /transcript/{video_id} - Fetch transcript by video ID
  • GET /htmx/transcript?url=URL - HTMX endpoint for dynamic updates
  • GET /docs - Interactive API documentation (FastAPI auto-docs)
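
A small client for the two transcript endpoints might look like the sketch below. It assumes a server running locally on port 8888, as in the Usage section; the helper names are hypothetical, and the response format is not specified here, so the body is returned as raw text decoded as UTF-8 (an assumption) rather than parsed.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumption: a local server started on port 8888 as in the Usage section.
BASE_URL = "http://localhost:8888"


def transcript_url_for(video_url: str, base_url: str = BASE_URL) -> str:
    """Build the GET /transcript?url=URL request, percent-encoding the video URL."""
    return f"{base_url}/transcript?{urlencode({'url': video_url})}"


def transcript_url_for_id(video_id: str, base_url: str = BASE_URL) -> str:
    """Build the GET /transcript/{video_id} request."""
    return f"{base_url}/transcript/{quote(video_id, safe='')}"


def get_text(url: str, timeout: float = 30.0) -> str:
    """Fetch a URL and return the response body as text (assumed UTF-8)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the server running, `get_text(transcript_url_for_id("dQw4w9WgXcQ"))` would request the transcript by video ID. The exact response schema is listed in the interactive documentation at `/docs`.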

Performance

Metric             Target       Status
Cached Response    p95 < 500ms  ✅ Met
Uncached Response  p95 < 10s    ✅ Met
Test Coverage      > 80%        ✅ Met (100%)
URL Parse Success  > 99.5%      ✅ Met

Contributing

Contributions are welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Write tests for your changes
  4. Ensure all tests pass
  5. Submit a pull request

For Application Owners

If you're deploying this application as a service, see DEPLOYMENT.md for:

  • Proxy configuration
  • Environment variables
  • Production deployment
  • Scaling considerations

License

MIT License - see LICENSE file for details.
