claudetube

Let Claude watch YouTube videos - transcripts + on-demand frame extraction

These details have not been verified by PyPI

Project description

Let AI watch and understand online videos.

claudetube downloads online videos, transcribes them with faster-whisper, and lets AI "see" specific moments by extracting frames on-demand. Built for Claude Code but works as a standalone Python library with any AI tool.

Supports 1,500+ video sites via yt-dlp including YouTube, Vimeo, Dailymotion, Twitch, TikTok, Twitter/X, Instagram, Reddit, and many more.

Why This Exists

Claude doesn't have native video input. When you share a YouTube link with Claude, it sees nothing—just a URL string.

Google's Gemini can process video natively: pass a URL, ask a question, get an answer. One API call. Claude can't do this (yet), so claudetube exists to bridge that gap.

I (Dan) built claudetube because I was using Claude to help me make a game, and I kept finding YouTube tutorials that explained exactly what I needed. The problem? I couldn't just show Claude the video.

Every other YouTube MCP tool just dumps the transcript and calls it a day. But when a tutorial says "look at this code here" or "notice how the sprite moves", the transcript alone is useless. I needed Claude to actually see what I was seeing—to look at the code on screen, read the diagrams, understand the visual context.

Read more about the vision

Honest Assessment: claudetube vs Native Video AI

Aspect	Gemini (native)	claudetube
UX	URL + question → answer	process_video → get_frames → synthesize
Sites	YouTube only (public)	1,500+ sites via yt-dlp
Caching	Reprocesses each time	Instant on second query
Cost	1fps × full duration	Extract only what you need
Precision	1fps sampling	Exact timestamps, HQ for code
Offline	No	Yes (cached content)

Where claudetube is worse: More complex. Requires multi-step orchestration. 40 tools to learn.

Where claudetube wins: Works on more sites, cheaper for repeated queries, finer control, works offline.
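
To make the cost row of the comparison concrete, here is a small arithmetic sketch. It assumes, purely for illustration, that on-demand windows are sampled at the same 1 fps rate as native processing; the function names are hypothetical and the rate claudetube actually uses when extracting frames is not stated here.

```python
def frames_native(duration_s: float, fps: float = 1.0) -> int:
    """Frames sampled when the whole video is processed at a fixed rate."""
    return int(duration_s * fps)


def frames_on_demand(windows: list[tuple[float, float]], fps: float = 1.0) -> int:
    """Frames sampled when only the given (start, duration) windows are extracted."""
    return sum(int(duration * fps) for _start, duration in windows)


# A 30-minute tutorial where only two 10-second moments need visual context.
full = frames_native(30 * 60)
targeted = frames_on_demand([(120, 10), (845, 10)])
print(full, targeted)  # 1800 20
```

Under these assumptions the targeted approach looks at about 1% of the frames, which is where the "extract only what you need" saving comes from.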

The goal: Close the UX gap with a streamlined single-call interface while preserving the power-user capabilities. See the roadmap.

Quick Start

Prerequisites

Python 3.10+

ffmpeg (system package)

# macOS
brew install ffmpeg

# Ubuntu/Debian
sudo apt install ffmpeg

deno (recommended for YouTube) -- Since yt-dlp 2026.01.29, deno is required for full YouTube support (JS challenge solving). Without it, only limited YouTube clients are available.
```
# macOS
brew install deno

# Linux
curl -fsSL https://deno.land/install.sh | sh
```
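
Before installing, it can help to confirm that both system tools are on your PATH. The following is a minimal sketch using only the Python standard library; `missing_tools` is a hypothetical helper and is not part of claudetube.

```python
import shutil


def missing_tools(tools: tuple[str, ...] = ("ffmpeg", "deno")) -> list[str]:
    """Return the names of tools that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg and deno are both available")
```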

Install

git clone https://github.com/thoughtpunch/claudetube
cd claudetube
./install.sh

Or via pip:

pip install "claudetube[mcp]"

Install as MCP Server (Claude Code)

Add claudetube directly to Claude Code as an MCP server:

# Install the package first
pip install "claudetube[mcp]"

# Register with Claude Code
claude mcp add --transport stdio claudetube -- claudetube-mcp

Or add to your .mcp.json / ~/.claude.json:

{
  "mcpServers": {
    "claudetube": {
      "type": "stdio",
      "command": "claudetube-mcp"
    }
  }
}

Then restart Claude Code. All 40 MCP tools will be available automatically.
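
If your .mcp.json already lists other servers, the claudetube entry has to be merged in rather than pasted over the file. A minimal sketch of that merge, assuming the config has already been loaded into a dict; the helper name and the "other" server are hypothetical.

```python
import json


def add_claudetube_server(config: dict) -> dict:
    """Return a copy of an MCP config dict with the claudetube entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["claudetube"] = {"type": "stdio", "command": "claudetube-mcp"}
    merged["mcpServers"] = servers
    return merged


# "other" is a made-up existing server, shown only to demonstrate the merge.
existing = {"mcpServers": {"other": {"type": "stdio", "command": "other-mcp"}}}
print(json.dumps(add_claudetube_server(existing), indent=2))
```

The function copies the input instead of mutating it, so the original config stays available if you want to diff before writing the file back.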

Traditional Install

The installer does three things:

1. Creates a Python venv at ~/.claudetube/venv/
2. Installs the claudetube package + dependencies (yt-dlp, faster-whisper)
3. Copies slash commands to ~/.claude/commands/ (global to all Claude Code sessions)

Restart Claude Code after installing.

Works from any Claude Code session

The installer puts slash commands in ~/.claude/commands/, which is the global commands directory. Every Claude Code instance on your machine will have /yt available -- no per-project setup needed.

Why not a pre-built binary?

claudetube depends on faster-whisper (C++ transcription engine) and ffmpeg (system media tool). These have platform-specific native code that can't be bundled into a single static binary. The install script handles all of this automatically.

Usage with Claude Code

/yt https://youtube.com/watch?v=abc123 how did they make the sprites?
/yt https://vimeo.com/123456789 summarize the key points
/yt https://twitter.com/user/status/123 what is this video about?

Claude will:

1. Download and transcribe the video (~60s first time, cached after)
2. Read the transcript
3. If needed, extract frames to "see" specific moments
4. Answer your question

Commands

Command	Purpose
`/yt <url> [question]`	Analyze a video
`/yt:ask <url> <question>`	Simplest way - auto-processes and answers
`/yt:see <id> <timestamp>`	Quick frames (general visuals)
`/yt:hq <id> <timestamp>`	HQ frames (code, text, diagrams)
`/yt:transcribe <id> [model]`	Transcribe with Whisper (or return cached)
`/yt:transcript <id>`	Read cached transcript
`/yt:scenes <id>`	Get scene structure and boundaries
`/yt:find <id> <query>`	Find moments matching a query
`/yt:watch <id> <question>`	Actively watch and reason about a video
`/yt:deep <id>`	Deep analysis (OCR, entities, code detection)
`/yt:focus <id> <start> <end>`	Exhaustive frame-by-frame analysis of a section
`/yt:list`	List all cached videos
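
How `/yt:find` matches a query is not described here. As a rough stand-in, the sketch below runs a naive keyword search over SRT-formatted transcript text and returns the start time of each matching cue. The function name and sample transcript are hypothetical, and this is not claudetube's implementation.

```python
import re

# Matches an SRT timing line such as "00:02:05,500 --> 00:02:09,000".
TIMING = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->")


def find_moments(srt_text: str, keyword: str) -> list[tuple[float, str]]:
    """Return (start_seconds, text) for each cue whose text contains keyword."""
    hits = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = TIMING.match(line.strip())
            if match:
                h, m, s, ms = (int(g) for g in match.groups())
                text = " ".join(lines[i + 1:]).strip()
                if keyword.lower() in text.lower():
                    hits.append((h * 3600 + m * 60 + s + ms / 1000, text))
                break  # only one timing line per cue
    return hits


sample = """1
00:00:01,000 --> 00:00:04,000
Welcome to the tutorial.

2
00:02:05,500 --> 00:02:09,000
Notice how the sprite moves here.
"""
print(find_moments(sample, "sprite"))
```

The returned start times are plain seconds, so they could be handed to a frame-extraction call to look at the matching moment.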

Python API

from claudetube import process_video, transcribe_video, get_frames_at, get_hq_frames_at

# Process a video (downloads, transcribes, caches)
result = process_video("https://youtube.com/watch?v=VIDEO_ID")
print(result.transcript_txt.read_text())

# Standalone Whisper transcription (cache-first, no full processing)
result = transcribe_video("VIDEO_ID", whisper_model="small")
print(result["source"])  # "cached" or "whisper"

# Extract frames at a specific timestamp
frames = get_frames_at("VIDEO_ID", start_time=120, duration=10)

# Extract HQ frames for reading code/text
hq_frames = get_hq_frames_at("VIDEO_ID", start_time=120, duration=5)
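
The examples above pass `start_time` as a plain number, which appears to be seconds. If you are reading a timestamp off a video player, a small converter is handy. This is a hypothetical helper, not part of the claudetube API, and it assumes timestamps are written as SS, MM:SS, or HH:MM:SS.

```python
def to_seconds(timestamp: str) -> int:
    """Convert "SS", "MM:SS", or "HH:MM:SS" into a whole number of seconds."""
    parts = [int(p) for p in timestamp.split(":")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unrecognized timestamp: {timestamp!r}")
    seconds = 0
    for part in parts:
        seconds = seconds * 60 + part
    return seconds


print(to_seconds("2:00"))     # 120
print(to_seconds("1:02:03"))  # 3723
```

With that in place, `get_frames_at("VIDEO_ID", start_time=to_seconds("2:00"), duration=10)` would request the same window as the example above.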

How It Works

1. Download -- Fetches lowest quality video (144p) for speed
2. Transcribe -- Uses faster-whisper with batched inference
3. Cache -- Stores everything at ~/.claudetube/cache/{VIDEO_ID}/
4. Drill-in -- Extract frames on-demand when visual context is needed
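
The caching step is what makes the second query on a video fast. A minimal sketch of the cache-first pattern, assuming one directory per video ID with a plain-text transcript named audio.txt; the function names and the stand-in transcriber are hypothetical, not claudetube's own code.

```python
import tempfile
from pathlib import Path
from typing import Callable


def cached_or_compute(cache_root: Path, video_id: str,
                      compute: Callable[[str], str]) -> tuple[str, str]:
    """Return (transcript, source), where source is "cached" or "computed"."""
    transcript_path = cache_root / video_id / "audio.txt"
    if transcript_path.exists():
        return transcript_path.read_text(encoding="utf-8"), "cached"
    transcript = compute(video_id)  # the slow download + transcribe step
    transcript_path.parent.mkdir(parents=True, exist_ok=True)
    transcript_path.write_text(transcript, encoding="utf-8")
    return transcript, "computed"


def fake_transcribe(video_id: str) -> str:
    """Stand-in for the slow download-and-transcribe step."""
    return f"transcript for {video_id}"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    print(cached_or_compute(root, "abc123", fake_transcribe)[1])  # computed
    print(cached_or_compute(root, "abc123", fake_transcribe)[1])  # cached
```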

Data Location

All claudetube data is stored under ~/.claudetube/ by default:

~/.claudetube/
├── config.yaml              # User configuration
├── db/
│   ├── claudetube.db        # Metadata database
│   └── claudetube-vectors.db # Vector embeddings
├── cache/
│   └── {video_id}/          # Per-video cache
│       ├── state.json       # Metadata (title, description, tags)
│       ├── audio.mp3        # Extracted audio
│       ├── audio.srt        # Timestamped transcript
│       ├── audio.txt        # Plain text transcript
│       ├── thumbnail.jpg    # Video thumbnail
│       ├── drill/           # Quick frames (480p)
│       ├── hq/              # High-quality frames (1280p)
│       ├── scenes/          # Scene segmentation data
│       └── entities/        # People tracking, knowledge graph
└── logs/                    # Application logs (future)
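
Because each video gets its own directory with a state.json, the cache can be inspected with a few lines of standard-library Python. In this sketch the "title" key is an assumption based on the comment in the tree above; the exact schema of state.json is not documented here.

```python
import json
from pathlib import Path


def list_cached(cache_dir: Path) -> list[tuple[str, str]]:
    """Return (video_id, title) pairs for every cached video with a state.json."""
    entries = []
    for state_file in sorted(cache_dir.glob("*/state.json")):
        state = json.loads(state_file.read_text(encoding="utf-8"))
        # "title" is an assumed key; fall back if it is absent.
        entries.append((state_file.parent.name, state.get("title", "(untitled)")))
    return entries


if __name__ == "__main__":
    for video_id, title in list_cached(Path.home() / ".claudetube" / "cache"):
        print(video_id, title)
```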

Configuration

Override the root directory: set the CLAUDETUBE_ROOT environment variable.

Override just the cache directory: the cache location is resolved in this priority order (highest first):

1. Environment variable: CLAUDETUBE_CACHE_DIR=/path/to/cache
2. Project config: .claudetube/config.yaml in your project
3. User config: ~/.claudetube/config.yaml
4. Default: ~/.claudetube/cache

Example project config:

# .claudetube/config.yaml
cache_dir: ./video_cache

See Configuration Guide for details.
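
The cache-directory priority order can be sketched as a simple first-match lookup. This is an illustration only: it takes already-parsed config dicts to avoid a YAML dependency, assumes the user config uses the same `cache_dir` key as the project config, and does not model how CLAUDETUBE_ROOT interacts with the default. `resolve_cache_dir` is a hypothetical name.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CACHE_DIR = "~/.claudetube/cache"


def resolve_cache_dir(
    env: Mapping[str, str],
    project_config: Optional[dict] = None,
    user_config: Optional[dict] = None,
) -> Path:
    """Pick the cache directory from the highest-priority source that sets one."""
    candidates = [
        env.get("CLAUDETUBE_CACHE_DIR"),          # 1. environment variable
        (project_config or {}).get("cache_dir"),  # 2. project config
        (user_config or {}).get("cache_dir"),     # 3. user config (assumed key)
        DEFAULT_CACHE_DIR,                        # 4. default
    ]
    chosen = next(value for value in candidates if value)
    return Path(chosen).expanduser()


print(resolve_cache_dir(os.environ, project_config={"cache_dir": "./video_cache"}))
```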

Architecture

claudetube uses a provider-based architecture with a modular design. Video downloading is handled through yt-dlp (1,500+ sites), while AI capabilities (transcription, vision analysis, reasoning, embeddings) are served by a configurable provider system supporting 11 providers (OpenAI, Anthropic, Google, Deepgram, AssemblyAI, Ollama, Voyage, and more). The MCP server exposes 40 tools for video processing, scene analysis, entity extraction, knowledge graphs, and accessibility features. See Architecture for details.
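
As a rough picture of what a provider-based design looks like, the sketch below defines a small interface and a name-to-implementation registry. All class and function names are hypothetical and chosen for illustration; they are not claudetube's actual classes, which are covered in the Architecture documentation.

```python
from typing import Protocol


class Transcriber(Protocol):
    """Anything that can turn an audio file path into text."""

    def transcribe(self, audio_path: str) -> str: ...


class EchoTranscriber:
    """Stand-in provider used only for this illustration."""

    def transcribe(self, audio_path: str) -> str:
        return f"(transcript of {audio_path})"


# Registry mapping a configured provider name to its implementation class.
PROVIDERS: dict[str, type] = {"echo": EchoTranscriber}


def get_transcriber(name: str) -> Transcriber:
    """Look up a provider by name, failing clearly on unknown names."""
    try:
        return PROVIDERS[name]()
    except KeyError:
        raise ValueError(f"unknown provider: {name!r}") from None


print(get_transcriber("echo").transcribe("audio.mp3"))
```

The point of the pattern is that calling code depends only on the interface, so swapping one provider for another is a configuration change rather than a code change.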

Development

git clone https://github.com/thoughtpunch/claudetube
cd claudetube
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
pytest

Linting

ruff check src/ tests/
ruff format src/ tests/
mypy src/

Documentation

Full documentation is available in the documentation/ folder:

Getting Started - Installation, quick start, MCP setup
Core Concepts - Video understanding, transcripts, frames, scenes
Architecture - Modules, data flow, tool wrappers
Vision - The problem space, roadmap, what makes claudetube different

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

1. Fork the repository
2. Create a feature branch (git checkout -b feature/my-feature)
3. Run tests and linting before committing
4. Open a pull request against main

Legal

This project is not affiliated with, endorsed by, or associated with YouTube, Google, or Alphabet Inc. "YouTube" is a trademark of Google LLC. This software is an independent, open-source tool that interacts with publicly available video content through third-party libraries (yt-dlp). Users are solely responsible for ensuring their use of this software complies with all applicable terms of service and laws.

License

MIT -- free to use, modify, and distribute.

Project details

Release history

- 1.0.0 (this version): Feb 5, 2026
- 0.1.1: Jan 27, 2026
- 0.1.0: Jan 27, 2026

Download files

Source Distribution

claudetube-1.0.0.tar.gz (601.0 kB)

Uploaded Feb 5, 2026 Source

Built Distribution

claudetube-1.0.0-py3-none-any.whl (422.1 kB)

Uploaded Feb 5, 2026 Python 3

File details

Details for the file claudetube-1.0.0.tar.gz.

File metadata

Download URL: claudetube-1.0.0.tar.gz
Upload date: Feb 5, 2026
Size: 601.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for claudetube-1.0.0.tar.gz
Algorithm	Hash digest
SHA256	`4e33116e93bd760188f4d727db61e48cb5a48d8c3cd67c5948a22247a58f4c14`
MD5	`735a5c8575800db1105b07a2b4b3b5a9`
BLAKE2b-256	`ec22ba683865462555d42a2aeed8ec65cdb294dd543a8d666b233733b6bb99fd`
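
To check a downloaded file against the SHA256 digest listed above, you can hash it locally and compare. A minimal sketch using the standard library; it assumes the source distribution has been saved under its published file name in the current directory, and `sha256_of` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 digest published for claudetube-1.0.0.tar.gz.
EXPECTED = "4e33116e93bd760188f4d727db61e48cb5a48d8c3cd67c5948a22247a58f4c14"

if __name__ == "__main__":
    sdist = Path("claudetube-1.0.0.tar.gz")  # assumed to be in the current directory
    if sdist.exists():
        print("match" if sha256_of(sdist) == EXPECTED else "MISMATCH")
```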

Provenance

The following attestation bundles were made for claudetube-1.0.0.tar.gz:

Publisher: publish.yml on thoughtpunch/claudetube

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: claudetube-1.0.0.tar.gz
- Subject digest: 4e33116e93bd760188f4d727db61e48cb5a48d8c3cd67c5948a22247a58f4c14
- Sigstore transparency entry: 915897035
- Sigstore integration time: Feb 5, 2026
Source repository:
- Permalink: thoughtpunch/claudetube@8617c1681155f9ecb2c0a306f1054d36a783246d
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/thoughtpunch
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8617c1681155f9ecb2c0a306f1054d36a783246d
- Trigger Event: push

File details

Details for the file claudetube-1.0.0-py3-none-any.whl.

File metadata

Download URL: claudetube-1.0.0-py3-none-any.whl
Upload date: Feb 5, 2026
Size: 422.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for claudetube-1.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`e8474d1b93a028efd0e8f293fd90a28116d60f2100bec1abe77a085f351cc10a`
MD5	`a21d79fee76a696f7283ae3851c1096b`
BLAKE2b-256	`6b903114ccca7144ca8de18ac6d48788e76ac9e50a02bc58604866f77da55581`

Provenance

The following attestation bundles were made for claudetube-1.0.0-py3-none-any.whl:

Publisher: publish.yml on thoughtpunch/claudetube

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: claudetube-1.0.0-py3-none-any.whl
- Subject digest: e8474d1b93a028efd0e8f293fd90a28116d60f2100bec1abe77a085f351cc10a
- Sigstore transparency entry: 915897112
- Sigstore integration time: Feb 5, 2026
Source repository:
- Permalink: thoughtpunch/claudetube@8617c1681155f9ecb2c0a306f1054d36a783246d
- Branch / Tag: refs/tags/v1.0.0
- Owner: https://github.com/thoughtpunch
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@8617c1681155f9ecb2c0a306f1054d36a783246d
- Trigger Event: push
