
A simple scraper for YouTube


Youtube Simple Scraper

This is a simple YouTube scraper that uses the YouTube API to retrieve a channel's metadata, together with the metadata and comments of its videos and shorts.

Features

Scrapes the following information from a channel:

  • Channel metadata
  • Videos metadata and comments
  • Shorts metadata and comments

Installation

pip install youtube_simple_scraper

Usage

from youtube_simple_scraper.entities import GetChannelOptions
from youtube_simple_scraper.list_video_comments import ApiVideoCommentRepository, ApiShortVideoCommentRepository
from youtube_simple_scraper.list_videos import ApiChannelRepository
from youtube_simple_scraper.logger import build_default_logger
from youtube_simple_scraper.stop_conditions import ListCommentMaxPagesStopCondition, ListVideoMaxPagesStopCondition

if __name__ == '__main__':
    logger = build_default_logger()
    # Separate comment repositories for regular videos and for shorts
    video_comment_repo = ApiVideoCommentRepository()
    short_comment_repo = ApiShortVideoCommentRepository()
    repo = ApiChannelRepository(
        video_comment_repo=video_comment_repo,
        shorts_comment_repo=short_comment_repo,
        logger=logger,
    )
    # Stop conditions limit how many pages of videos, shorts, and comments are scraped
    opts = GetChannelOptions(
        list_video_stop_conditions=[ListVideoMaxPagesStopCondition(2)],
        list_video_comment_stop_conditions=[ListCommentMaxPagesStopCondition(2)],
        list_short_stop_conditions=[ListVideoMaxPagesStopCondition(2)],
        list_short_comment_stop_conditions=[ListCommentMaxPagesStopCondition(2)],
    )
    # Scrape the channel by its name
    channel = repo.get_channel("IbaiLlanos", opts)
    print(channel.id)
    # Indexing with [0] assumes at least one video, short, and comment was scraped
    print(channel.videos[0].title)
    print(channel.videos[0].comments[0].text)
    print(channel.shorts[0].title)
    print(channel.shorts[0].comments[0].text)
    # Serialize the whole channel object to JSON
    print(channel.model_dump_json(indent=2))
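
The usage example indexes `videos[0]` and `comments[0]` directly, which would raise an `IndexError` if a list comes back empty. A minimal sketch of a defensive accessor follows. The helper `first_comment_text` is hypothetical and not part of the library, and the `SimpleNamespace` objects are stand-ins that only mimic the attribute names used above (`videos`, `shorts`, `comments`, `text`).

```python
from types import SimpleNamespace
from typing import Optional, Sequence


def first_comment_text(items: Sequence) -> Optional[str]:
    """Return the text of the first comment on the first item, or None.

    Hypothetical helper: `items` is assumed to be a list such as
    channel.videos or channel.shorts.
    """
    if not items:
        return None
    comments = getattr(items[0], "comments", None)
    if not comments:
        return None
    return comments[0].text


# Stand-in objects shaped like the scraped result (not real scraped data).
channel = SimpleNamespace(
    videos=[SimpleNamespace(title="a video", comments=[])],
    shorts=[
        SimpleNamespace(
            title="a short",
            comments=[SimpleNamespace(text="nice")],
        )
    ],
)

print(first_comment_text(channel.videos))  # None
print(first_comment_text(channel.shorts))  # nice
```

With the real `channel` object, the same helper could be called as `first_comment_text(channel.videos)`, assuming the attributes behave like ordinary lists.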

Example of the output channel object serialized to JSON:

{
  "id": "UCaY_-ksFSQtTGk0y1HA_3YQ",
  "name": "IbaiLlanos",
  "target_id": "668be16f-0000-20de-b6a2-582429cfbdec",
  "title": "Ibai",
  "description": "contenido premium ▶️\n",
  "subscriber_count": 11600000,
  "video_count": 1400,
  "videos": [
    {
      "id": "VFXu8gzcpNc",
      "title": "EL RESTAURANTE MÁS ÚNICO AL QUE HE IDO NUNCA",
      "description": "MI CANAL DE DIRECTOS: https://www.youtube.com/@Ibai_TV\nExtraído de mi canal de TWITCH: https://www.twitch.tv/ibai/\nMI PODCAST: \nhttps://www.youtube.com/channel/UC6jNDNkoOKQfB5djK2IBDoA\nTWITTER:...",
      "date": "2024-06-02T19:18:27.647137",
      "view_count": 1455817,
      "like_count": 0,
      "dislike_count": 0,
      "comment_count": 0,
      "thumbnail_url": "https://i.ytimg.com/vi/VFXu8gzcpNc/hqdefault.jpg?sqp=-oaymwEbCKgBEF5IVfKriqkDDggBFQAAiEIYAXABwAEG&rs=AOn4CLCEmoQtslruHk-droajdw0KJUI_KA",
      "comments": [
        {
          "id": "UgzV8lY8eJ4dyHjl9Bp4AaABAg",
          "text": "Todo muy rico pero....Y la cuenta?",
          "user": "@eliasabregu2813",
          "date": "2024-06-03T19:11:28.109467",
          "likes": 0
        },
        {
          "id": "UgwHtPZb8jprbCH-ysp4AaABAg",
          "text": "Que humilde Ibai, comiendo todo para generar ingresos a los nuevos negocios",
          "user": "@user-ui2sk7sr5i",
          "date": "2024-06-03T19:04:28.112228",
          "likes": 0
        }
      ]
    },
    // More videos ...
  ],
  "shorts": [
    // the shorts videos and comments
  ]
}
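
Once the channel has been dumped to JSON, it can be post-processed with the standard library alone. The sketch below flattens the nested comments into one row per comment. It assumes the field names shown in the sample output (`videos`, `shorts`, `comments`, `id`, `user`, `likes`). The function `flatten_comments` is a hypothetical helper, and the embedded string is a trimmed copy of the sample with the `//` placeholder lines removed, since those are not valid JSON.

```python
import json

# Trimmed copy of the sample output (placeholder comments removed,
# "shorts" left empty for brevity).
sample = """
{
  "id": "UCaY_-ksFSQtTGk0y1HA_3YQ",
  "name": "IbaiLlanos",
  "videos": [
    {
      "id": "VFXu8gzcpNc",
      "comments": [
        {"id": "UgzV8lY8eJ4dyHjl9Bp4AaABAg", "user": "@eliasabregu2813", "likes": 0},
        {"id": "UgwHtPZb8jprbCH-ysp4AaABAg", "user": "@user-ui2sk7sr5i", "likes": 0}
      ]
    }
  ],
  "shorts": []
}
"""


def flatten_comments(channel: dict) -> list:
    """Return one dict per comment, tagged with its parent video or short."""
    rows = []
    for kind in ("videos", "shorts"):
        for item in channel.get(kind, []):
            for comment in item.get("comments", []):
                rows.append(
                    {
                        "kind": kind,
                        "video_id": item["id"],
                        "comment_id": comment["id"],
                        "user": comment["user"],
                        "likes": comment["likes"],
                    }
                )
    return rows


channel = json.loads(sample)
rows = flatten_comments(channel)
print(len(rows))            # 2
print(rows[0]["video_id"])  # VFXu8gzcpNc
```

In a real run, `json.loads(channel.model_dump_json())` would take the place of the embedded sample string.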

Stop conditions

Stop conditions determine when the scraping process ends. The following stop conditions are available:

Videos stop conditions

  • ListVideoMaxPagesStopCondition: Stops the scraping process when the number of pages scraped is greater than the specified value.
  • ListVideoNeverStopCondition: Never stops early; the scraping process ends only when all the videos of the channel have been scraped.

Comments stop conditions

  • ListCommentMaxPagesStopCondition: Stops the scraping process when the number of pages scraped is greater than the specified value.
  • ListCommentNeverStopCondition: Never stops early; the scraping process ends only when all the comments of the video have been scraped.
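
To illustrate the idea behind page-based stop conditions, here is a minimal, self-contained sketch of a paginated loop that consults a list of conditions after each page. It is not the library's implementation: `MaxPagesStop`, `NeverStop`, `scrape_pages`, and `fake_fetch` are hypothetical names, and the exact boundary the library uses (stopping at the limit or one page past it) is an assumption here.

```python
from typing import Callable, List, Sequence


class MaxPagesStop:
    """Hypothetical condition: stop once max_pages pages have been scraped."""

    def __init__(self, max_pages: int) -> None:
        self.max_pages = max_pages

    def should_stop(self, pages_scraped: int) -> bool:
        return pages_scraped >= self.max_pages


class NeverStop:
    """Hypothetical condition: never stop early."""

    def should_stop(self, pages_scraped: int) -> bool:
        return False


def scrape_pages(
    fetch_page: Callable[[int], List[str]],
    stop_conditions: Sequence,
) -> List[str]:
    """Fetch pages until the source is exhausted or any condition fires."""
    items: List[str] = []
    pages_scraped = 0
    while True:
        page = fetch_page(pages_scraped)
        if not page:  # no more data available
            break
        items.extend(page)
        pages_scraped += 1
        if any(c.should_stop(pages_scraped) for c in stop_conditions):
            break
    return items


def fake_fetch(page_index: int) -> List[str]:
    """Fake data source: 5 pages with 2 items each."""
    if page_index >= 5:
        return []
    return [f"item-{page_index}-{i}" for i in range(2)]


limited = scrape_pages(fake_fetch, [MaxPagesStop(2)])
everything = scrape_pages(fake_fetch, [NeverStop()])
print(len(limited))     # 4
print(len(everything))  # 10
```

Passing the conditions as a list mirrors the `GetChannelOptions` arguments in the usage example, which also accept lists of stop conditions.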

