Extract video information from YouTube videos including title, description, channel name, publication date and views

YouTube Video Information Extractor

A robust Python library for extracting YouTube video metadata including title, description, channel name, publication date, and view count with multiple extraction strategies.

Features

  • Multiple Extraction Strategies: YouTube Data API v3 (official), yt-dlp, pytubefix
  • Automatic Fallback: Seamlessly switches between methods if one fails
  • Simple Input: Accepts YouTube video IDs only (11 characters)
  • Batch Processing: Extract information from multiple videos efficiently
  • Multiple Output Formats: Text, JSON, CSV
  • Command Line Interface: Easy-to-use CLI for quick extractions
  • Python Library: Full programmatic access for integration
  • Robust Error Handling: Graceful handling of failures with retry logic
  • Rate Limiting: Built-in delays to respect YouTube's servers

Installation

pip install yt-info-extract

Optional Dependencies

For the best experience, install all extraction backends:

# For YouTube Data API v3 support (recommended)
pip install google-api-python-client

# For yt-dlp support (most robust fallback)
pip install yt-dlp

# For pytubefix support (lightweight fallback)
pip install pytubefix

Quick Start

Python Library Usage

from yt_info_extract import get_video_info

# Extract video information (a falsy result means extraction failed)
info = get_video_info("jNQXAC9IVRw")

if info:
    print(f"Title: {info['title']}")
    print(f"Channel: {info['channel_name']}")
    print(f"Views: {info['views']:,}")
    print(f"Published: {info['publication_date']}")

Command Line Usage

# Extract video information
yt-info jNQXAC9IVRw

# Export to JSON
yt-info -f json -o video.json jNQXAC9IVRw

# Process multiple videos
yt-info --batch video_ids.txt --output-dir results/

# Use specific extraction strategy
yt-info -s api --api-key YOUR_KEY jNQXAC9IVRw

YouTube Data API v3 Setup (Recommended)

The YouTube Data API v3 is the official, most reliable method. It requires a free API key:

  1. Go to Google Cloud Console
  2. Create a new project or select an existing one
  3. Enable the YouTube Data API v3
  4. Create credentials (API Key)
  5. Restrict the API key to YouTube Data API v3

# Set your API key as an environment variable
export YOUTUBE_API_KEY="your_api_key_here"

# Now run the CLI; it reads the key from the environment
yt-info jNQXAC9IVRw

Usage Examples

Basic Usage

from yt_info_extract import YouTubeVideoInfoExtractor

# Initialize extractor
extractor = YouTubeVideoInfoExtractor(api_key="your_key")

# Extract video info
info = extractor.get_video_info("jNQXAC9IVRw")

if info:
    print(f"Title: {info['title']}")
    print(f"Channel: {info['channel_name']}")
    print(f"Views: {info['views']:,}")
    print(f"Published: {info['publication_date']}")
    print(f"Description: {info['description'][:100]}...")

Batch Processing

from yt_info_extract import get_video_info_batch

video_ids = ["jNQXAC9IVRw", "dQw4w9WgXcQ", "_OBlgSz8sSM"]
results = get_video_info_batch(video_ids)

for result in results:
    if not result.get('error'):
        print(f"{result['title']} - {result['views']:,} views")

Export Data

from yt_info_extract import export_video_info, get_video_info, get_video_info_batch

# Get video info
info = get_video_info("jNQXAC9IVRw")

# Export to JSON
export_video_info(info, "video.json")

# Export batch results to CSV
batch_results = get_video_info_batch(["jNQXAC9IVRw", "dQw4w9WgXcQ"])
export_video_info(batch_results, "videos.csv", format_type="csv")
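
When a custom column order or subset is needed, the result dictionaries can also be written with the standard library. The following is a minimal sketch, not the library's own exporter: `to_csv` and the `FIELDS` selection are hypothetical, and it assumes failed entries carry an `error` key as in the batch example above.

```python
import csv
import io
from typing import Dict, Iterable, List

# Column selection chosen for illustration only.
FIELDS = ["title", "channel_name", "publication_date", "views"]


def to_csv(rows: Iterable[Dict[str, object]], fields: List[str] = FIELDS) -> str:
    """Serialise successful results to CSV text, keeping only the listed fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",  # drop keys that are not in `fields`
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        if not row.get("error"):  # skip entries that failed to extract
            writer.writerow(row)
    return buffer.getvalue()


# Illustrative rows; real ones would come from get_video_info_batch().
rows = [
    {"title": "A, B", "channel_name": "C",
     "publication_date": "2005-04-23T00:00:00Z", "views": 1, "description": "x"},
    {"error": "bad"},
]
csv_text = to_csv(rows)
print(csv_text)
```

Titles containing commas are quoted automatically by the `csv` module, which is the main reason to prefer it over joining strings by hand.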

Video ID Format

from yt_info_extract import get_video_info

# Only video IDs are accepted (11 characters: A-Z, a-z, 0-9, _, -):
video_id = "jNQXAC9IVRw"
info = get_video_info(video_id)
if info:
    print(f"✓ {video_id} -> {info['title']}")

# URLs are NOT supported - extract the video ID manually if needed:
# https://www.youtube.com/watch?v=jNQXAC9IVRw -> jNQXAC9IVRw
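
Since only bare IDs are accepted, callers holding full URLs need a small pre-processing step. The following is a minimal sketch and not part of yt-info-extract: `extract_video_id` is a hypothetical helper, and it only covers the common `watch?v=`, `youtu.be/`, `/shorts/` and `/embed/` URL shapes.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Length and character set as stated above: 11 characters of A-Z, a-z, 0-9, _ and -.
_VIDEO_ID_RE = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(value: str) -> Optional[str]:
    """Return the 11-character video ID found in `value`, or None.

    Hypothetical helper (not provided by yt-info-extract).
    """
    value = value.strip()
    if _VIDEO_ID_RE.fullmatch(value):
        return value  # already a bare ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    candidate = None
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            parts = [p for p in parsed.path.split("/") if p]
            if len(parts) >= 2 and parts[0] in ("shorts", "embed"):
                candidate = parts[1]

    if candidate and _VIDEO_ID_RE.fullmatch(candidate):
        return candidate
    return None


print(extract_video_id("https://www.youtube.com/watch?v=jNQXAC9IVRw"))
```

Returning `None` for anything unrecognised keeps the caller in control: it can skip the entry or report it before any request is made.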

Command Line Interface

Basic Commands

# Extract video information
yt-info jNQXAC9IVRw

# Use different output formats
yt-info -f compact jNQXAC9IVRw
yt-info -f stats jNQXAC9IVRw
yt-info -f json jNQXAC9IVRw

# Export to file
yt-info -f json -o video.json jNQXAC9IVRw
yt-info -f csv -o video.csv jNQXAC9IVRw

Batch Processing

# Create a file with video IDs only (one per line)
echo "jNQXAC9IVRw" > video_ids.txt
echo "dQw4w9WgXcQ" >> video_ids.txt

# Process all videos
yt-info --batch video_ids.txt --output-dir results/

# With summary report
yt-info --batch video_ids.txt --summary --output-dir results/

API Key Usage

# Method 1: Environment variable
export YOUTUBE_API_KEY="your_api_key_here"
yt-info jNQXAC9IVRw

# Method 2: Command line argument
yt-info --api-key "your_api_key_here" jNQXAC9IVRw

# Force specific strategy
yt-info -s api --api-key "your_key" jNQXAC9IVRw
yt-info -s yt_dlp jNQXAC9IVRw

Utility Commands

# Test API key
yt-info --test-api

# List available strategies
yt-info --list-strategies

# Verbose output
yt-info -v jNQXAC9IVRw

Extraction Strategies

1. YouTube Data API v3 (Recommended)

  • Pros: Official, reliable, comprehensive data, compliant with ToS
  • Cons: Requires API key, has quota limits (10,000 units/day free)
  • Best for: Production applications, commercial use, reliable automation

extractor = YouTubeVideoInfoExtractor(api_key="your_key", strategy="api")
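
To put the 10,000 units/day figure in perspective: in the YouTube Data API v3, a `videos.list` request costs 1 quota unit and accepts up to 50 comma-separated IDs. Whether this library batches IDs that way is not stated here, so the sketch below is only back-of-the-envelope arithmetic; `chunked` and `estimate_quota` are hypothetical helpers.

```python
from typing import Iterator, List, Sequence

MAX_IDS_PER_CALL = 50  # videos.list accepts up to 50 IDs per request
UNITS_PER_CALL = 1     # quota cost of one videos.list request


def chunked(ids: Sequence[str], size: int = MAX_IDS_PER_CALL) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def estimate_quota(n_videos: int, batched: bool) -> int:
    """Estimate quota units for `n_videos`, with or without 50-ID batching."""
    calls = -(-n_videos // MAX_IDS_PER_CALL) if batched else n_videos  # ceiling division
    return calls * UNITS_PER_CALL


print(estimate_quota(1200, batched=False))  # 1200
print(estimate_quota(1200, batched=True))   # 24
```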

2. yt-dlp (Most Robust Fallback)

  • Pros: No API key needed, very robust, actively maintained
  • Cons: Violates YouTube ToS, can break with YouTube updates
  • Best for: Personal projects, research, when API quotas are insufficient

extractor = YouTubeVideoInfoExtractor(strategy="yt_dlp")

3. pytubefix (Lightweight Fallback)

  • Pros: No API key needed, simple, lightweight
  • Cons: Violates YouTube ToS, less robust than yt-dlp
  • Best for: Simple scripts, minimal dependencies

extractor = YouTubeVideoInfoExtractor(strategy="pytubefix")

4. Auto Strategy (Default)

Automatically tries strategies in order of preference:

  1. YouTube Data API v3 (if API key available)
  2. yt-dlp (if installed)
  3. pytubefix (if installed)

extractor = YouTubeVideoInfoExtractor(strategy="auto")  # Default
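
The fallback order above amounts to a "first success wins" loop. The following is a minimal sketch of that idea, not the library's actual implementation: `first_successful` and the two stand-in extractor functions are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

VideoInfo = Dict[str, object]
Strategy = Tuple[str, Callable[[str], Optional[VideoInfo]]]


def first_successful(video_id: str, strategies: Iterable[Strategy]) -> Optional[VideoInfo]:
    """Try each (name, extractor) pair in order; return the first non-empty result."""
    for name, extract in strategies:
        try:
            info = extract(video_id)
        except Exception:
            continue  # this strategy failed; fall through to the next one
        if info:
            # Record which strategy produced the data.
            return {**info, "extraction_method": name}
    return None


# Stand-in extractors for demonstration only.
def failing_api(video_id: str) -> Optional[VideoInfo]:
    raise RuntimeError("no API key configured")


def working_fallback(video_id: str) -> Optional[VideoInfo]:
    return {"title": f"Video {video_id}"}


result = first_successful(
    "jNQXAC9IVRw",
    [("youtube_api", failing_api), ("yt_dlp", working_fallback)],
)
print(result)
```

Catching every exception is deliberate in this sketch so that one broken backend never blocks the next; production code would likely log the failure before moving on.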

Data Structure

Each video information dictionary contains:

{
    "title": "Video title",
    "description": "Full video description",
    "channel_name": "Channel name",
    "publication_date": "2005-04-23T00:00:00Z",  # ISO format
    "views": 123456789,  # Integer view count
    "extraction_method": "youtube_api"  # Method used
}
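
Because `publication_date` is an ISO 8601 string, it usually needs parsing before any date arithmetic. A small sketch using only the standard library follows; `parse_publication_date` is a hypothetical helper, and treating timestamps without an offset as UTC is an assumption.

```python
from datetime import datetime, timezone


def parse_publication_date(value: str) -> datetime:
    """Parse a timestamp such as "2005-04-23T00:00:00Z" into an aware datetime."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so normalise it to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive values are UTC
    return parsed


# Illustrative dictionary shaped like the structure above.
info = {"title": "Video title", "publication_date": "2005-04-23T00:00:00Z", "views": 123456789}
published = parse_publication_date(info["publication_date"])
age_days = (datetime.now(timezone.utc) - published).days
print(f"Published {age_days} days ago")
```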

Configuration Options

extractor = YouTubeVideoInfoExtractor(
    api_key="your_key",           # YouTube Data API key
    strategy="auto",              # Extraction strategy
    timeout=30,                   # Request timeout (seconds)
    max_retries=3,                # Maximum retry attempts
    backoff_factor=0.75,          # Exponential backoff factor
    rate_limit_delay=0.1,         # Delay between requests
)
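
How `max_retries` and `backoff_factor` combine is not spelled out here. One common convention, used below purely as an assumption, is to wait `backoff_factor * 2**attempt` seconds before each retry. `backoff_delays` is a hypothetical helper, not a library function.

```python
from typing import List


def backoff_delays(max_retries: int = 3, backoff_factor: float = 0.75) -> List[float]:
    """Return the wait in seconds before each retry under the assumed formula."""
    # Assumed formula: backoff_factor * 2**attempt for attempt = 0, 1, 2, ...
    return [backoff_factor * (2 ** attempt) for attempt in range(max_retries)]


delays = backoff_delays()
print(delays)       # [0.75, 1.5, 3.0]
print(sum(delays))  # 5.25
```

Under this assumption, the defaults shown above add at most 5.25 seconds of waiting per video on top of the request timeouts.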

Error Handling

The library handles errors gracefully:

from yt_info_extract import get_video_info, get_video_info_batch

info = get_video_info("invalid_video_id")
if info:
    print(f"Success: {info['title']}")
else:
    print("Failed to extract video information")

# For batch processing, check individual results
results = get_video_info_batch(["valid_id", "invalid_id"])
for result in results:
    if result.get('error'):
        print(f"Error: {result['error']}")
    else:
        print(f"Success: {result['title']}")
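
For larger batches it can help to split the results once and then retry or report only the failures. The sketch below assumes, as in the example above, that failed entries carry a truthy `error` key; `partition_results` is a hypothetical helper and the sample data is made up for illustration.

```python
from typing import Dict, List, Tuple

Result = Dict[str, object]


def partition_results(results: List[Result]) -> Tuple[List[Result], List[Result]]:
    """Split batch results into (successes, failures) based on the 'error' key."""
    successes = [r for r in results if not r.get("error")]
    failures = [r for r in results if r.get("error")]
    return successes, failures


# Illustrative data shaped like the batch output described above.
sample = [
    {"title": "First video", "views": 10},
    {"error": "Video unavailable"},
]
ok, failed = partition_results(sample)
print(f"{len(ok)} succeeded, {len(failed)} failed")
```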

Legal and Compliance Notes

  • YouTube Data API v3: Fully compliant with YouTube's Terms of Service
  • yt-dlp and pytubefix: Violate YouTube's Terms of Service by scraping data

For commercial applications or production use, always use the YouTube Data API v3.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Submit a pull request

License

MIT License - see LICENSE file for details.

Related Projects

  • yt-ts-extract - YouTube transcript extraction
  • yt-dlp - YouTube downloader (used as fallback)
  • pytubefix - YouTube library (used as fallback)
