A Python library for scraping YouTube video data

NGTube

A comprehensive Python library for scraping YouTube data, including videos, comments, and channel profiles.

⚠️ Disclaimer

This library is provided for educational and research purposes only. Scraping YouTube data may violate YouTube's Terms of Service. Use at your own risk. The authors are not responsible for any misuse or legal consequences. Always respect robots.txt and implement appropriate rate limiting.

Features

  • Video Extraction: Extract detailed metadata from YouTube videos (title, views, likes, duration, tags, description, etc.)
  • Comment Extraction: Extract comments from videos, including loading additional comments via YouTube's internal API
  • Channel Extraction: Extract complete channel profile data (subscribers, description, featured video, video list with continuation support)
  • Flexible Video Loading: Load a specific number of videos or all available videos from a channel
  • Clean Data Output: Structured JSON-compatible data output
  • Modular Design: Separate classes for different extraction tasks

Installation

Option 1: Install from PyPI (Recommended)

pip install NGTube

NGTube is published on PyPI, so this installs the latest stable release.

Option 2: Install from Source

  1. Clone or download the repository.
  2. Navigate to the project directory.
  3. Install the package using pip:
pip install .

Option 3: Manual Installation

  1. Clone or download the repository.
  2. Ensure you have Python 3.6+ installed.
  3. Install required dependencies:
pip install requests demjson3
  4. Copy the NGTube folder to your project directory or add it to your Python path.

Using setup.py

The setup.py file is used for packaging and installation. You can also install manually:

python setup.py install

However, pip install . is recommended: invoking setup.py install directly is deprecated by setuptools, and pip handles modern Python packaging better.

Quick Start

Extract Video Metadata

from NGTube import Video

url = "https://www.youtube.com/watch?v=y1XrJyFF1O0"
video = Video(url)
metadata = video.extract_metadata()

print("Title:", metadata['title'])
print("Views:", metadata['view_count'])
print("Likes:", metadata['like_count'])
print("Duration:", metadata['duration_seconds'], "seconds")

Extract Comments

from NGTube import Comments

url = "https://www.youtube.com/watch?v=y1XrJyFF1O0"
comments = Comments(url)
comment_data = comments.get_comments()

print(f"Total comments: {len(comment_data['comments'])}")
for comment in comment_data['comments'][:3]:
    print(f"{comment['author']}: {comment['text'][:50]}...")

Extract Channel Profile

from NGTube import Channel

url = "https://www.youtube.com/@HandOfUncut"
channel = Channel(url)

# Load first 10 videos
profile = channel.extract_profile(max_videos=10)

print("Channel Title:", profile['title'])
print("Subscribers:", profile['subscribers'])
print("Videos loaded:", profile['loaded_videos_count'])

# Load all videos
profile_all = channel.extract_profile(max_videos='all')
print("Total videos:", profile_all['loaded_videos_count'])

Detailed Usage

Video Class

from NGTube import Video

video = Video("https://www.youtube.com/watch?v=VIDEO_ID")
metadata = video.extract_metadata()

# Available metadata keys:
# - title, view_count, like_count, duration_seconds
# - channel_name, channel_id, subscriber_count
# - description, tags, category, is_private
# - upload_date, published_time_text

Comments Class

from NGTube import Comments

comments = Comments("https://www.youtube.com/watch?v=VIDEO_ID")
data = comments.get_comments()

# Returns dictionary with:
# - 'top_comment': list of top comments
# - 'comments': list of regular comments

# Each comment contains:
# - author, text, like_count, published_time_text
# - author_thumbnail, comment_id, reply_count

Channel Class

from NGTube import Channel

channel = Channel("https://www.youtube.com/@ChannelHandle")

# Extract profile with specific number of videos
profile = channel.extract_profile(max_videos=50)

# Extract profile with all videos (may take time)
profile = channel.extract_profile(max_videos='all')

# Available profile data:
# - title, description, channel_id, channel_url
# - keywords, is_family_safe, links
# - subscriber_count_text, view_count_text, video_count_text
# - subscribers, total_views, video_count (parsed numbers)
# - featured_video (dict with videoId, title, description)
# - videos (list of video dictionaries)
# - loaded_videos_count

Examples

See the examples/ directory for complete working examples:

  • basic_usage.py: Extract video metadata and comments
  • batch_processing.py: Process multiple videos
  • channel_usage.py: Extract channel profile data
  • WEB/: Web-based demo application showcasing all features

Run any example:

python examples/basic_usage.py

Web Demo

For an interactive web interface:

cd examples/WEB
pip install flask
python app.py

Then open http://127.0.0.1:5000 in your browser to try all NGTube features through a user-friendly interface.

Screenshots

Figure: Home page of the web panel with all tabs

Figure: Video tab for metadata extraction

Figure: Comments tab for comment extraction

Figure: Channel tab for channel profile extraction

Figure: Search tab for YouTube search

API Reference

Core Classes

YouTubeCore

Base class for YouTube interactions.

  • __init__(url: str): Initialize with YouTube URL
  • fetch_html() -> str: Fetch HTML content
  • extract_ytinitialdata(html: str) -> dict: Extract ytInitialData
  • make_api_request(endpoint: str, payload: dict) -> dict: Make API requests
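
To illustrate what extracting ytInitialData from a page involves, here is a minimal sketch. It is not NGTube's actual implementation: the helper name find_initial_data and the sample HTML are assumptions made for illustration, and the sketch uses only the standard library. It assumes the page embeds the data as a JSON object assigned to ytInitialData.

```python
import json
import re

# Hypothetical helper, not part of NGTube: finds the "ytInitialData = "
# assignment and decodes the JSON object that follows it.
_MARKER = re.compile(r"ytInitialData\s*=\s*")


def find_initial_data(html: str) -> dict:
    match = _MARKER.search(html)
    if match is None:
        raise ValueError("ytInitialData not found in page")
    # raw_decode parses exactly one JSON value starting at the given index
    # and ignores whatever follows it (such as ";</script>").
    data, _end = json.JSONDecoder().raw_decode(html, match.end())
    return data


# Made-up sample page, far smaller than a real YouTube response.
sample_html = '<script>var ytInitialData = {"contents": {"videoId": "abc"}};</script>'
print(find_initial_data(sample_html)["contents"]["videoId"])
```

Decoding from the marker with raw_decode avoids guessing where the object ends, which a plain regular expression cannot do reliably for nested JSON.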

Video

Extract video metadata.

  • __init__(url: str): Initialize with video URL
  • extract_metadata() -> dict: Extract and return video metadata

Comments

Extract video comments.

  • __init__(url: str): Initialize with video URL
  • get_comments() -> dict: Extract and return comments data

Channel

Extract channel profile and videos.

  • __init__(url: str): Initialize with channel URL
  • extract_profile(max_videos: int | str = 200) -> dict: Extract profile data
    • max_videos: Number of videos to load, or 'all' for all videos

Utils Module

  • extract_number(text: str) -> int: Extract numbers from text (handles German formatting)
  • extract_links(text: str) -> list: Extract URLs from text
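
The sketch below shows one way the two utilities could behave. It is an assumption-based illustration, not the library's code: parse_german_count and find_links are hypothetical names. The first treats "." as a thousands separator, as in German-formatted counts like "159.000 Abonnenten"; abbreviated forms such as "1,4 Mio." are not handled.

```python
import re

# Hypothetical stand-ins for extract_number and extract_links.
_NUMBER = re.compile(r"\d[\d.]*")
_URL = re.compile(r"https?://[^\s<>\"']+")


def parse_german_count(text: str) -> int:
    """Return the first number in text, treating '.' as a thousands separator."""
    match = _NUMBER.search(text)
    if match is None:
        return 0
    return int(match.group(0).replace(".", ""))


def find_links(text: str) -> list:
    """Return all http(s) URLs found in text, in order of appearance."""
    return _URL.findall(text)


print(parse_german_count("159.000 Abonnenten"))
print(find_links("Site: https://example.com/a and http://example.org"))
```

Note that the URL pattern is deliberately simple: punctuation directly after a link (for example a trailing period) would be captured as part of it.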

Data Structures

Video Metadata

{
  "title": "Video Title",
  "view_count": 299955,
  "duration_in_seconds": 6994,
  "description": "Video description...",
  "tags": ["tag1", "tag2"],
  "video_id": "VIDEO_ID",
  "channel_id": "UC...",
  "is_owner_viewing": false,
  "is_crawlable": true,
  "thumbnail": {...},
  "allow_ratings": true,
  "author": "Channel Name",
  "is_private": false,
  "is_unplugged_corpus": false,
  "is_live_content": false,
  "like_count": 8547,
  "channel_name": "Channel Name",
  "category": "Gaming",
  "publish_date": "2023-12-01",
  "upload_date": "2023-12-01",
  "family_safe": true,
  "channel_url": "https://...",
  "subscriber_count": 1400000
}
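
Because the output is JSON-compatible, a metadata dictionary can be written to disk and read back with the standard json module. The following is a small sketch: save_metadata and load_metadata are hypothetical helpers, and the sample dictionary is a shortened, made-up stand-in for the result of extract_metadata().

```python
import json
import tempfile
from pathlib import Path


def save_metadata(metadata: dict, path: Path) -> None:
    # ensure_ascii=False keeps non-ASCII titles and descriptions readable.
    path.write_text(json.dumps(metadata, ensure_ascii=False, indent=2), encoding="utf-8")


def load_metadata(path: Path) -> dict:
    return json.loads(path.read_text(encoding="utf-8"))


# Shortened sample; a real result would come from Video(url).extract_metadata().
sample = {
    "title": "Video Title",
    "view_count": 299955,
    "tags": ["tag1", "tag2"],
    "is_private": False,
}

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "metadata.json"
    save_metadata(sample, target)
    restored = load_metadata(target)

print(restored == sample)
```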

Comment Data

{
  "top_comment": [...],
  "comments": [
    {
      "author": "Username",
      "text": "Comment text",
      "likeCount": 196,
      "publishedTimeText": "vor 1 Tag",
      "authorThumbnail": "https://...",
      "commentId": "...",
      "replyCount": 1
    }
  ]
}
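
The sample above uses camelCase keys such as likeCount, while the Detailed Usage section lists snake_case names such as like_count. Which spelling a given release returns is not confirmed here, so code that consumes comments may want to tolerate both. The sketch below assumes nothing about NGTube beyond the key names shown in this document; get_field is a hypothetical helper and the comment records are made up.

```python
def get_field(record: dict, snake_name: str, default=None):
    """Look up snake_name, falling back to its camelCase spelling."""
    if snake_name in record:
        return record[snake_name]
    head, *rest = snake_name.split("_")
    camel_name = head + "".join(part.capitalize() for part in rest)
    return record.get(camel_name, default)


# Made-up records mixing both key styles.
sample_comments = [
    {"author": "A", "text": "first", "likeCount": 196},
    {"author": "B", "text": "second", "like_count": 3},
    {"author": "C", "text": "third"},
]

# Rank comments by likes; a missing count is treated as 0.
ranked = sorted(
    sample_comments,
    key=lambda comment: get_field(comment, "like_count", 0),
    reverse=True,
)
print([comment["author"] for comment in ranked])
```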

Channel Profile

{
  "title": "Channel Title",
  "description": "Channel description...",
  "channelId": "UC...",
  "channelUrl": "https://...",
  "keywords": "keyword1 keyword2",
  "isFamilySafe": true,
  "links": ["https://..."],
  "subscriberCountText": "159.000 Abonnenten",
  "viewCountText": "84.770 Aufrufe",
  "videoCountText": "2583 Videos",
  "subscribers": 159000,
  "total_views": 84770,
  "video_count": 2583,
  "featured_video": {
    "videoId": "...",
    "title": "Featured Video Title",
    "description": "Featured video description..."
  },
  "videos": [
    {
      "videoId": "...",
      "title": "Video Title",
      "publishedTimeText": "vor 1 Tag",
      "viewCountText": "40.773 Aufrufe",
      "lengthText": "1:02:58",
      "thumbnails": [...]
    }
  ],
  "loaded_videos_count": 1
}
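
Each entry in videos carries its duration as display text (lengthText) rather than a number. If a numeric duration is needed, it can be derived from that text. The following is a minimal sketch, assuming the "H:MM:SS" or "M:SS" layout shown in the sample above; length_text_to_seconds is a hypothetical helper, not part of NGTube.

```python
def length_text_to_seconds(length_text: str) -> int:
    """Convert 'H:MM:SS' or 'M:SS' display text to a number of seconds."""
    seconds = 0
    for part in length_text.strip().split(":"):
        # Each further colon-separated field is 60 times finer than the last.
        seconds = seconds * 60 + int(part)
    return seconds


# Made-up entries shaped like items in profile["videos"].
sample_videos = [
    {"videoId": "a", "lengthText": "1:02:58"},
    {"videoId": "b", "lengthText": "4:05"},
]
total = sum(length_text_to_seconds(video["lengthText"]) for video in sample_videos)
print(total)
```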

Limitations

  • Rate Limiting: YouTube may rate-limit requests. Add delays between requests for bulk operations.
  • Comment Limits: Without authentication, typically 40-50 comments can be loaded per video.
  • Video Limits: Channel video extraction may be limited by YouTube's pagination.
  • Terms of Service: This library is for educational purposes. Respect YouTube's Terms of Service and robots.txt.
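
One way to add delays between requests during bulk operations is to wrap the per-item work in a loop that pauses for a randomized interval. This is only a sketch: process_with_delay is a hypothetical helper, and the delay bounds are arbitrary assumptions rather than values known to satisfy YouTube's limits.

```python
import random
import time
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_with_delay(
    items: Iterable[T],
    worker: Callable[[T], R],
    min_delay: float = 1.0,
    max_delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[R]:
    """Call worker on each item, pausing a random interval between calls."""
    results = []
    for index, item in enumerate(items):
        if index > 0:
            # No pause is needed before the first request.
            sleep(random.uniform(min_delay, max_delay))
        results.append(worker(item))
    return results


# Possible usage with NGTube (requires network access, so left as a comment):
# from NGTube import Video
# all_metadata = process_with_delay(urls, lambda url: Video(url).extract_metadata())
```

The sleep parameter exists so the pause can be replaced in tests; in normal use the default time.sleep applies.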

Troubleshooting

  • Import Errors: Ensure the NGTube package is installed, or that the NGTube folder is on your Python path
  • API Errors: YouTube changes its internal APIs frequently. The library targets the endpoints that were current as of December 2025.
  • Missing Data: Some videos or channels restrict access to parts of their data, so individual fields may be missing

Contributing

This library is maintained for educational purposes. Feel free to submit issues or improvements.

License

This project can be used by anyone with attribution.
