Extract video information from YouTube videos including title, description, channel name, publication date and views
YouTube Video Information Extractor
A robust Python library for extracting YouTube video metadata including title, description, channel name, publication date, and view count with multiple extraction strategies.
Features
- Multiple Extraction Strategies: YouTube Data API v3 (official), yt-dlp, pytubefix
- Automatic Fallback: Seamlessly switches between methods if one fails
- Simple Input: Accepts YouTube video IDs only (11 characters)
- Batch Processing: Extract information from multiple videos efficiently
- Multiple Output Formats: Text, JSON, CSV
- Command Line Interface: Easy-to-use CLI for quick extractions
- Python Library: Full programmatic access for integration
- Robust Error Handling: Graceful handling of failures with retry logic
- Rate Limiting: Built-in delays to respect YouTube's servers
Installation
pip install yt-info-extract
Optional Dependencies
For the best experience, install all extraction backends:
# For YouTube Data API v3 support (recommended)
pip install google-api-python-client
# For yt-dlp support (most robust fallback)
pip install yt-dlp
# For pytubefix support (lightweight fallback)
pip install pytubefix
Quick Start
Python Library Usage
from yt_info_extract import get_video_info
# Extract video information
info = get_video_info("jNQXAC9IVRw")
print(f"Title: {info['title']}")
print(f"Channel: {info['channel_name']}")
print(f"Views: {info['views']:,}")
print(f"Published: {info['publication_date']}")
Command Line Usage
# Extract video information
yt-info jNQXAC9IVRw
# Export to JSON
yt-info -f json -o video.json jNQXAC9IVRw
# Process multiple videos
yt-info --batch video_ids.txt --output-dir results/
# Use specific extraction strategy
yt-info -s api --api-key YOUR_KEY jNQXAC9IVRw
YouTube Data API v3 Setup (Recommended)
The YouTube Data API v3 is the official, most reliable method. It requires a free API key:
- Go to Google Cloud Console
- Create a new project or select an existing one
- Enable the YouTube Data API v3
- Create credentials (API Key)
- Restrict the API key to YouTube Data API v3
# Set your API key as environment variable
export YOUTUBE_API_KEY="your_api_key_here"
# Now use the library
yt-info jNQXAC9IVRw
Usage Examples
Basic Usage
from yt_info_extract import YouTubeVideoInfoExtractor
# Initialize extractor
extractor = YouTubeVideoInfoExtractor(api_key="your_key")
# Extract video info
info = extractor.get_video_info("jNQXAC9IVRw")
if info:
    print(f"Title: {info['title']}")
    print(f"Channel: {info['channel_name']}")
    print(f"Views: {info['views']:,}")
    print(f"Published: {info['publication_date']}")
    print(f"Description: {info['description'][:100]}...")
Batch Processing
from yt_info_extract import get_video_info_batch
video_ids = ["jNQXAC9IVRw", "dQw4w9WgXcQ", "_OBlgSz8sSM"]
results = get_video_info_batch(video_ids)
for result in results:
    if not result.get('error'):
        print(f"{result['title']} - {result['views']:,} views")
Export Data
from yt_info_extract import get_video_info, get_video_info_batch, export_video_info
# Get video info
info = get_video_info("jNQXAC9IVRw")
# Export to JSON
export_video_info(info, "video.json")
# Export batch results to CSV
batch_results = get_video_info_batch(["jNQXAC9IVRw", "dQw4w9WgXcQ"])
export_video_info(batch_results, "videos.csv", format_type="csv")
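For readers who want to see what a CSV export of batch results involves, here is a minimal standard-library sketch. It is not the implementation behind `export_video_info`; `results_to_csv` and `FIELDNAMES` are hypothetical names, and the column list is simply copied from the Data Structure section of this README.

```python
import csv
import io
from typing import Dict, Iterable

# Column names copied from the "Data Structure" section (assumed to be complete).
FIELDNAMES = [
    "title",
    "description",
    "channel_name",
    "publication_date",
    "views",
    "extraction_method",
]


def results_to_csv(results: Iterable[Dict[str, object]]) -> str:
    """Hypothetical helper: serialize successful result dicts to CSV text."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not in FIELDNAMES.
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, extrasaction="ignore")
    writer.writeheader()
    for row in results:
        if row.get("error"):
            # Failed entries have no metadata worth exporting.
            continue
        writer.writerow(row)
    return buffer.getvalue()
```

The `csv` module quotes multi-line descriptions automatically, so a description containing newlines still occupies a single CSV record.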
Video ID Format
from yt_info_extract import get_video_info
# Only video IDs are accepted (11 characters: A-Z, a-z, 0-9, _, -):
video_id = "jNQXAC9IVRw"
info = get_video_info(video_id)
print(f"✓ {video_id} -> {info['title']}")
# URLs are NOT supported - extract the video ID manually if needed:
# https://www.youtube.com/watch?v=jNQXAC9IVRw -> jNQXAC9IVRw
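Since URLs are not accepted, callers holding a URL have to pull the ID out themselves. The sketch below does this with the standard library only. `is_video_id` and `extract_video_id` are hypothetical helpers, not part of `yt_info_extract`, and only a few common URL shapes (`watch?v=`, `youtu.be/`, `/embed/`, `/shorts/`) are handled.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# 11 characters drawn from A-Z, a-z, 0-9, underscore, and hyphen.
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def is_video_id(value: str) -> bool:
    """Return True if value has the shape of a YouTube video ID."""
    return VIDEO_ID_RE.fullmatch(value) is not None


def extract_video_id(url_or_id: str) -> Optional[str]:
    """Hypothetical helper: return the video ID from a bare ID or a common URL form."""
    candidate = url_or_id.strip()
    if is_video_id(candidate):
        return candidate

    parsed = urlparse(candidate)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        query_ids = parse_qs(parsed.query).get("v")
        if query_ids:
            candidate = query_ids[0]
        else:
            # /embed/<id> and /shorts/<id> keep the ID in the second segment.
            parts = [p for p in parsed.path.split("/") if p]
            is_known_path = len(parts) >= 2 and parts[0] in ("embed", "shorts")
            candidate = parts[1] if is_known_path else ""
    else:
        return None

    return candidate if is_video_id(candidate) else None
```

The returned ID can then be passed to `get_video_info`; a `None` result means the input matched none of the handled forms.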
Command Line Interface
Basic Commands
# Extract video information
yt-info jNQXAC9IVRw
# Use different output formats
yt-info -f compact jNQXAC9IVRw
yt-info -f stats jNQXAC9IVRw
yt-info -f json jNQXAC9IVRw
# Export to file
yt-info -f json -o video.json jNQXAC9IVRw
yt-info -f csv -o video.csv jNQXAC9IVRw
Batch Processing
# Create a file with video IDs only (one per line)
echo "jNQXAC9IVRw" > video_ids.txt
echo "dQw4w9WgXcQ" >> video_ids.txt
# Process all videos
yt-info --batch video_ids.txt --output-dir results/
# With summary report
yt-info --batch video_ids.txt --summary --output-dir results/
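The same one-ID-per-line file can be loaded from Python before calling `get_video_info_batch`. This is a small sketch under stated assumptions: `read_video_ids` is a hypothetical helper, and skipping blank lines and duplicates is a choice made here, not documented behavior of the `yt-info` CLI.

```python
import re
from pathlib import Path
from typing import List

# Same shape as described under "Video ID Format": 11 URL-safe characters.
VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def read_video_ids(path: Path) -> List[str]:
    """Hypothetical helper: read one video ID per line, preserving first-seen order."""
    ids: List[str] = []
    seen = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        candidate = line.strip()
        if not candidate or candidate in seen:
            # Skip blank lines and repeated IDs (an assumption of this sketch).
            continue
        if not VIDEO_ID_RE.fullmatch(candidate):
            raise ValueError(f"not a video ID: {candidate!r}")
        seen.add(candidate)
        ids.append(candidate)
    return ids
```

Failing early on a malformed line keeps a typo in the input file from surfacing later as an extraction error.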
API Key Usage
# Method 1: Environment variable
export YOUTUBE_API_KEY="your_api_key_here"
yt-info jNQXAC9IVRw
# Method 2: Command line argument
yt-info --api-key "your_api_key_here" jNQXAC9IVRw
# Force specific strategy
yt-info -s api --api-key "your_key" jNQXAC9IVRw
yt-info -s yt_dlp jNQXAC9IVRw
Utility Commands
# Test API key
yt-info --test-api
# List available strategies
yt-info --list-strategies
# Verbose output
yt-info -v jNQXAC9IVRw
Extraction Strategies
1. YouTube Data API v3 (Recommended)
- Pros: Official, reliable, comprehensive data, compliant with ToS
- Cons: Requires API key, has quota limits (10,000 units/day free)
- Best for: Production applications, commercial use, reliable automation
extractor = YouTubeVideoInfoExtractor(api_key="your_key", strategy="api")
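To get a feel for the 10,000 units/day quota, the following sketch estimates how many units a job would consume. It relies on two facts from the YouTube Data API documentation: a `videos.list` call costs 1 unit and accepts up to 50 video IDs. Whether this library groups IDs into such requests is not stated here, so treat the per-request batch size as an assumption. `chunk_ids` and `estimate_quota_units` are hypothetical names.

```python
import math
from typing import Iterator, List, Sequence

DAILY_QUOTA_UNITS = 10_000  # free daily quota quoted above
IDS_PER_REQUEST = 50        # videos.list accepts up to 50 IDs per call
UNITS_PER_REQUEST = 1       # videos.list costs 1 quota unit per call


def chunk_ids(video_ids: Sequence[str], size: int = IDS_PER_REQUEST) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` IDs."""
    for start in range(0, len(video_ids), size):
        yield list(video_ids[start:start + size])


def estimate_quota_units(n_videos: int, ids_per_request: int = IDS_PER_REQUEST) -> int:
    """Estimate quota units for n_videos, given how many IDs go into each request."""
    return math.ceil(n_videos / ids_per_request) * UNITS_PER_REQUEST
```

With one ID per request, 120 videos cost 120 units; grouped 50 at a time they cost 3.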
2. yt-dlp (Most Robust Fallback)
- Pros: No API key needed, very robust, actively maintained
- Cons: Violates YouTube ToS, can break with YouTube updates
- Best for: Personal projects, research, when API quotas are insufficient
extractor = YouTubeVideoInfoExtractor(strategy="yt_dlp")
3. pytubefix (Lightweight Fallback)
- Pros: No API key needed, simple, lightweight
- Cons: Violates YouTube ToS, less robust than yt-dlp
- Best for: Simple scripts, minimal dependencies
extractor = YouTubeVideoInfoExtractor(strategy="pytubefix")
4. Auto Strategy (Default)
Automatically tries strategies in order of preference:
- YouTube Data API v3 (if API key available)
- yt-dlp (if installed)
- pytubefix (if installed)
extractor = YouTubeVideoInfoExtractor(strategy="auto") # Default
Data Structure
Each video information dictionary contains:
{
"title": "Video title",
"description": "Full video description",
"channel_name": "Channel name",
"publication_date": "2005-04-23T00:00:00Z", # ISO format
"views": 123456789, # Integer view count
"extraction_method": "youtube_api" # Method used
}
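For type-checked code, the dictionary above can be described with a `TypedDict`, and the ISO timestamp can be turned into a `datetime`. Both names below (`VideoInfo`, `parse_publication_date`) are hypothetical and defined only for this sketch. It also assumes every strategy returns the date in the `...Z` form shown above, which this README does not guarantee.

```python
from datetime import datetime, timezone
from typing import TypedDict


class VideoInfo(TypedDict):
    """Shape of the result dictionary as documented in this section."""

    title: str
    description: str
    channel_name: str
    publication_date: str  # ISO 8601 string, e.g. "2005-04-23T00:00:00Z"
    views: int
    extraction_method: str


def parse_publication_date(value: str) -> datetime:
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    if value.endswith("Z"):
        # datetime.fromisoformat only accepts 'Z' from Python 3.11 onward.
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)
```

For the example value above, `parse_publication_date` returns a timezone-aware `datetime` for 23 April 2005 at midnight UTC.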
Configuration Options
extractor = YouTubeVideoInfoExtractor(
api_key="your_key", # YouTube Data API key
strategy="auto", # Extraction strategy
timeout=30, # Request timeout (seconds)
max_retries=3, # Maximum retry attempts
backoff_factor=0.75, # Exponential backoff factor
rate_limit_delay=0.1, # Delay between requests
)
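To illustrate how `max_retries` and `backoff_factor` might interact, the sketch below computes retry delays with a common exponential scheme, `backoff_factor * 2 ** attempt`. The exact formula this library uses is not documented here, so the formula is an assumption, and `backoff_delays` and `call_with_retries` are hypothetical names.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int = 3, backoff_factor: float = 0.75) -> List[float]:
    """Delay before each retry, assuming delay = backoff_factor * 2 ** attempt."""
    return [backoff_factor * (2 ** attempt) for attempt in range(max_retries)]


def call_with_retries(
    func: Callable[[], T],
    max_retries: int = 3,
    backoff_factor: float = 0.75,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func once, then retry up to max_retries times with growing delays."""
    for delay in backoff_delays(max_retries, backoff_factor):
        try:
            return func()
        except Exception:
            # Wait before the next attempt; the delay doubles each time.
            sleep(delay)
    # Last attempt: any exception now propagates to the caller.
    return func()
```

Under this assumed formula the defaults shown above give waits of 0.75 s, 1.5 s, and 3.0 s. Passing `sleep` as a parameter makes the behavior easy to test without real waiting.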
Error Handling
The library handles errors gracefully. In the examples below, a failed single lookup yields a falsy result, and a failed batch entry carries an 'error' key:
from yt_info_extract import get_video_info, get_video_info_batch
info = get_video_info("invalid_video_id")
if info:
    print(f"Success: {info['title']}")
else:
    print("Failed to extract video information")
# For batch processing, check individual results
results = get_video_info_batch(["valid_id", "invalid_id"])
for result in results:
    if result.get('error'):
        print(f"Error: {result['error']}")
    else:
        print(f"Success: {result['title']}")
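When a batch is large, it can help to separate successes from failures once instead of branching inside every loop. The sketch below assumes only what the example above shows, namely that failed entries carry a truthy `error` key. `partition_results` is a hypothetical helper, and what else a failed entry contains is not specified here.

```python
from typing import Dict, Iterable, List, Tuple

Result = Dict[str, object]


def partition_results(results: Iterable[Result]) -> Tuple[List[Result], List[Result]]:
    """Split batch results into (successes, failures) based on the 'error' key."""
    successes: List[Result] = []
    failures: List[Result] = []
    for result in results:
        # An entry counts as failed when it has a truthy 'error' value.
        (failures if result.get("error") else successes).append(result)
    return successes, failures
```

The two lists can then be handled independently, for example exporting the successes and logging or retrying the failures.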
Legal and Compliance Notes
- YouTube Data API v3: Fully compliant with YouTube's Terms of Service
- yt-dlp and pytubefix: Violate YouTube's Terms of Service by scraping data
For commercial applications or production use, always use the YouTube Data API v3.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new functionality
- Submit a pull request
License
MIT License - see LICENSE file for details.
Support
- Documentation: GitHub README
- Issues: GitHub Issues
- API Key Setup: Google Cloud Console
Related Projects
- yt-ts-extract - YouTube transcript extraction
- yt-dlp - YouTube downloader (used as fallback)
- pytubefix - YouTube library (used as fallback)
File details
Details for the file yt_info_extract-1.1.0.tar.gz.
File metadata
- Download URL: yt_info_extract-1.1.0.tar.gz
- Upload date:
- Size: 38.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8a507dbf4717a77abf4bd07e143ef147dda28722b8f2f149775346d684d8683d |
| MD5 | 587875a4776c84385639b1905360e1b7 |
| BLAKE2b-256 | e6596927a9dde3b79a5758399b940dcaffd47bea693917aee7a306d2c3afc707 |
File details
Details for the file yt_info_extract-1.1.0-py3-none-any.whl.
File metadata
- Download URL: yt_info_extract-1.1.0-py3-none-any.whl
- Upload date:
- Size: 18.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a1b9ad12db0aa242a2523b6a5204a382fe5361f004f9f56c873efc5c166e2c8c |
| MD5 | 8040c13d36d70b4ad644558ef46258b5 |
| BLAKE2b-256 | a791daf20e0432fac4f8f950df67e080f0f9f73588f2d4da2839d3c7cd63b290 |
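The SHA256 digests listed above can be used to check a downloaded file before installing it. A minimal sketch with the standard library follows. `sha256_of_file` and `matches_sha256` are hypothetical helper names, and the local file path is whatever the download was saved as.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex digest from the tables above."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

For example, `matches_sha256(Path("yt_info_extract-1.1.0.tar.gz"), "<SHA256 from the table>")` should return `True` for an intact download of the source distribution.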