A simple scraper for YouTube
YouTube Simple Scraper
This is a simple YouTube scraper that uses the YouTube API to get the metadata and comments of a channel's videos.
Features
Scrapes the following information for a channel:
- Channel metadata
- Video metadata and comments
- Shorts metadata and comments
Installation
pip install youtube_simple_scraper
Usage
from youtube_simple_scraper.entities import GetChannelOptions
from youtube_simple_scraper.list_video_comments import ApiVideoCommentRepository, ApiShortVideoCommentRepository
from youtube_simple_scraper.list_videos import ApiVideoListRepository
from youtube_simple_scraper.logger import build_default_logger
from youtube_simple_scraper.stop_conditions import ListCommentMaxPagesStopCondition, ListVideoMaxPagesStopCondition

if __name__ == '__main__':
    logger = build_default_logger()

    # Comment repositories for regular videos and for shorts
    video_comment_repo = ApiVideoCommentRepository()
    short_comment_repo = ApiShortVideoCommentRepository()
    repo = ApiVideoListRepository(
        video_comment_repo=video_comment_repo,
        shorts_comment_repo=short_comment_repo,
        logger=logger,
    )

    # Limit every listing (videos, shorts, and their comments) with a max-pages stop condition
    opts = GetChannelOptions(
        list_video_stop_conditions=[ListVideoMaxPagesStopCondition(2)],
        list_video_comment_stop_conditions=[ListCommentMaxPagesStopCondition(2)],
        list_short_stop_conditions=[ListVideoMaxPagesStopCondition(2)],
        list_short_comment_stop_conditions=[ListCommentMaxPagesStopCondition(2)],
    )

    # Fetch the channel by name
    channel = repo.get_channel("IbaiLlanos", opts)
    print(channel.id)
    print(channel.videos[0].title)
    print(channel.videos[0].comments[0].text)
    print(channel.shorts[0].title)
    print(channel.shorts[0].comments[0].text)

    # Dump the whole channel object as JSON
    print(channel.model_dump_json(indent=2))
Example of the channel object serialized to JSON:
{
  "id": "UCaY_-ksFSQtTGk0y1HA_3YQ",
  "name": "IbaiLlanos",
  "target_id": "668be16f-0000-20de-b6a2-582429cfbdec",
  "title": "Ibai",
  "description": "contenido premium ▶️\n",
  "subscriber_count": 11600000,
  "video_count": 1400,
  "videos": [
    {
      "id": "VFXu8gzcpNc",
      "title": "EL RESTAURANTE MÁS ÚNICO AL QUE HE IDO NUNCA",
      "description": "MI CANAL DE DIRECTOS: https://www.youtube.com/@Ibai_TV\nExtraído de mi canal de TWITCH: https://www.twitch.tv/ibai/\nMI PODCAST: \nhttps://www.youtube.com/channel/UC6jNDNkoOKQfB5djK2IBDoA\nTWITTER:...",
      "date": "2024-06-02T19:18:27.647137",
      "view_count": 1455817,
      "like_count": 0,
      "dislike_count": 0,
      "comment_count": 0,
      "thumbnail_url": "https://i.ytimg.com/vi/VFXu8gzcpNc/hqdefault.jpg?sqp=-oaymwEbCKgBEF5IVfKriqkDDggBFQAAiEIYAXABwAEG&rs=AOn4CLCEmoQtslruHk-droajdw0KJUI_KA",
      "comments": [
        {
          "id": "UgzV8lY8eJ4dyHjl9Bp4AaABAg",
          "text": "Todo muy rico pero....Y la cuenta?",
          "user": "@eliasabregu2813",
          "date": "2024-06-03T19:11:28.109467",
          "likes": 0
        },
        {
          "id": "UgwHtPZb8jprbCH-ysp4AaABAg",
          "text": "Que humilde Ibai, comiendo todo para generar ingresos a los nuevos negocios",
          "user": "@user-ui2sk7sr5i",
          "date": "2024-06-03T19:04:28.112228",
          "likes": 0
        }
      ]
    },
    // More videos ...
  ],
  "shorts": [
    // the shorts videos and comments
  ]
}
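The dumped JSON can be processed with the standard library alone. The following is a minimal sketch that assumes the field names shown in the sample output (`videos`, `shorts`, `comments`, `view_count`). The helper name `summarize_channel` and the trimmed `SAMPLE` data are illustrative and not part of the package. Note that the `// ...` placeholder lines in the sample are not valid JSON, so they are left out here.

```python
import json

# Trimmed, illustrative sample that follows the structure of the output above.
SAMPLE = """
{
  "id": "UCaY_-ksFSQtTGk0y1HA_3YQ",
  "name": "IbaiLlanos",
  "videos": [
    {
      "id": "VFXu8gzcpNc",
      "view_count": 1455817,
      "comments": [
        {"id": "UgzV8lY8eJ4dyHjl9Bp4AaABAg", "user": "@eliasabregu2813", "likes": 0},
        {"id": "UgwHtPZb8jprbCH-ysp4AaABAg", "user": "@user-ui2sk7sr5i", "likes": 0}
      ]
    }
  ],
  "shorts": []
}
"""


def summarize_channel(raw_json: str) -> dict:
    """Count videos, shorts, comments, and views in a dumped channel object."""
    channel = json.loads(raw_json)
    videos = channel.get("videos", [])
    shorts = channel.get("shorts", [])
    items = videos + shorts
    return {
        "channel_id": channel["id"],
        "videos": len(videos),
        "shorts": len(shorts),
        "comments": sum(len(item.get("comments", [])) for item in items),
        "total_views": sum(item.get("view_count", 0) for item in items),
    }


if __name__ == "__main__":
    print(summarize_channel(SAMPLE))
```

In practice the input string would come from `channel.model_dump_json()` rather than from a hard-coded sample.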
Stop conditions
Stop conditions control when the scraping process ends. The following stop conditions are available:
Video stop conditions
- ListVideoMaxPagesStopCondition: Stops the scraping process when the number of video pages scraped is greater than the specified value.
- ListVideoNeverStopCondition: Never stops early; the scraping process ends only when all the videos of the channel have been scraped.
Comment stop conditions
- ListCommentMaxPagesStopCondition: Stops the scraping process when the number of comment pages scraped is greater than the specified value.
- ListCommentNeverStopCondition: Never stops early; the scraping process ends only when all the comments of the video have been scraped.
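To illustrate the idea behind stop conditions, here is a conceptual sketch of a paginated loop that consults a list of conditions before fetching each page. It is not the package's implementation: `MaxPagesStop`, `NeverStop`, `should_stop`, `scrape_pages`, and `make_fake_source` are hypothetical names, and the point at which the check runs (before each fetch) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MaxPagesStop:
    """Hypothetical stand-in for a max-pages stop condition."""
    max_pages: int

    def should_stop(self, page_number: int) -> bool:
        # Stop once the page about to be fetched exceeds the limit.
        return page_number > self.max_pages


class NeverStop:
    """Hypothetical stand-in for a never-stop condition."""

    def should_stop(self, page_number: int) -> bool:
        return False


def scrape_pages(fetch_page, stop_conditions):
    """Collect items page by page until the source is exhausted
    or any stop condition fires."""
    items = []
    token = None
    page_number = 1
    while not any(c.should_stop(page_number) for c in stop_conditions):
        page_items, token = fetch_page(token)
        items.extend(page_items)
        if token is None:  # no more pages available
            break
        page_number += 1
    return items


def make_fake_source(pages):
    """Build an in-memory fetch function; the token is the next page index."""
    def fetch_page(token):
        index = 0 if token is None else token
        next_token = index + 1 if index + 1 < len(pages) else None
        return pages[index], next_token
    return fetch_page


if __name__ == "__main__":
    pages = [["a", "b"], ["c"], ["d"], ["e"]]
    print(scrape_pages(make_fake_source(pages), [MaxPagesStop(2)]))
    print(scrape_pages(make_fake_source(pages), [NeverStop()]))
```

Passing the conditions as a list mirrors the `GetChannelOptions` arguments in the usage example, where each listing receives its own list of stop conditions.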
File details
Details for the file youtube_simple_scraper-0.0.2.tar.gz.
File metadata
- Download URL: youtube_simple_scraper-0.0.2.tar.gz
- Size: 17.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.12.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 90b646d721ff5791c1cbb46672e0cd46ce729e41a7658283316dba498126985d
MD5 | f12a98b01f8140925c8a90518065f367
BLAKE2b-256 | c20f2a7f1af9052741982126feda7ed09d831da1201dd40c92e515e06531fe95
File details
Details for the file youtube_simple_scraper-0.0.2-py3-none-any.whl.
File metadata
- Download URL: youtube_simple_scraper-0.0.2-py3-none-any.whl
- Size: 17.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.0 CPython/3.12.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | a2078401c6c967f0205519d9c32542ee8e06cf05ad8c68580cbad2199c14ae23
MD5 | 9e89c0375c870f8e6943dd5e8c660d6d
BLAKE2b-256 | 5b23c943e074e325f44503f96c12279d89847b200cb1a11432f65ff3afa67755
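A downloaded file can be checked against the SHA256 digests listed above. The sketch below uses only the standard library `hashlib` module. The helper names `sha256_of_file` and `matches_published_digest` are illustrative, and the script assumes the wheel was saved to the current directory under its original file name.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for the wheel file.
WHEEL_NAME = "youtube_simple_scraper-0.0.2-py3-none-any.whl"
WHEEL_SHA256 = "a2078401c6c967f0205519d9c32542ee8e06cf05ad8c68580cbad2199c14ae23"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's digest with an expected hex string,
    ignoring surrounding whitespace and letter case."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    wheel = Path(WHEEL_NAME)
    if wheel.exists():
        print("digest matches:", matches_published_digest(wheel, WHEEL_SHA256))
    else:
        print(f"{WHEEL_NAME} not found in the current directory")
```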