YouTube Crawler
An asynchronous web crawler designed specifically for YouTube. It interacts with web pages in headless mode and extracts video comments and video information directly from the platform.
Features
- Asynchronous Crawling: Uses Python's asynchronous features for fast and efficient data extraction.
- Headless Interaction: Runs in headless mode, without opening a browser window.
- Modular Design: Built with modularity in mind, allowing for easy extension and maintenance.
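To illustrate why asynchronous crawling helps, here is a minimal sketch of running several network-bound tasks concurrently with `asyncio.gather`. It does not use this package: `fetch_page` and `crawl_all` are hypothetical stand-ins, and the `asyncio.sleep` call only simulates waiting on the network.

```python
import asyncio
import time


async def fetch_page(name: str, delay: float) -> str:
    """Stand-in for one network-bound crawl step (hypothetical, not library code)."""
    await asyncio.sleep(delay)  # simulates waiting on the network
    return f"{name}: done"


async def crawl_all(names: list[str], delay: float = 0.1) -> list[str]:
    """Run all stand-in fetches concurrently; results keep the input order."""
    return await asyncio.gather(*(fetch_page(n, delay) for n in names))


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(crawl_all(["video_a", "video_b", "video_c"]))
    elapsed = time.perf_counter() - start
    print(results)
    # The waits overlap, so the total is close to one delay rather than three.
    print(f"elapsed: {elapsed:.2f}s")
```

Because the tasks spend most of their time waiting, overlapping them shortens the total run time without using threads.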
Modules
1. YoutubeCommentCrawler
Extract comments from any given YouTube video.
Usage:
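    import asyncio
    from youtube_crawler import YoutubeCommentCrawler

    async def main():
        with YoutubeCommentCrawler() as crawler:
            comments = await crawler.crawl('YOUR_YOUTUBE_VIDEO_URL', n_target=NUMBER_OF_COMMENTS_TO_FETCH)

    asyncio.run(main())  # crawl() is awaited, so it must run inside an event loop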
2. YoutubeVideoInfoCrawler
Search for videos based on a keyword and extract their information.
Usage:
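    import asyncio
    from youtube_crawler import YoutubeVideoInfoCrawler

    async def main():
        with YoutubeVideoInfoCrawler() as crawler:
            videos_info = await crawler.crawl('YOUR_SEARCH_TERM', n_target=NUMBER_OF_VIDEOS_TO_FETCH, filter_options=YOUR_FILTER_OPTIONS)

    asyncio.run(main())  # crawl() is awaited, so it must run inside an event loop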
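Since both modules expose an awaitable `crawl(...)` call, a long crawl can be bounded with a time limit. The following is a minimal sketch, not part of the package: `crawl_with_timeout` is a hypothetical helper, and `FakeCrawler` is a stand-in that only mimics the `crawl(target, n_target=...)` call shape shown above so the example runs without a browser.

```python
import asyncio
from typing import Any, Optional


async def crawl_with_timeout(crawler: Any, target: str, n_target: int,
                             timeout_s: float) -> Optional[Any]:
    """Await crawler.crawl(...) but give up after timeout_s seconds.

    Returns the crawl result, or None if the time limit was hit.
    """
    try:
        return await asyncio.wait_for(
            crawler.crawl(target, n_target=n_target), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        return None


class FakeCrawler:
    """Hypothetical stand-in so the sketch runs without a browser."""

    def __init__(self, delay: float) -> None:
        self.delay = delay

    async def crawl(self, target: str, n_target: int) -> list:
        await asyncio.sleep(self.delay)  # simulates page loading and scrolling
        return [f"{target} item {i}" for i in range(n_target)]


if __name__ == "__main__":
    # Finishes in time: prints the two fake items.
    print(asyncio.run(crawl_with_timeout(FakeCrawler(0.01), "demo", 2, timeout_s=1.0)))
    # Exceeds the limit: prints None.
    print(asyncio.run(crawl_with_timeout(FakeCrawler(1.0), "demo", 2, timeout_s=0.05)))
```

Note that a timed-out call returns `None` here, so any partially collected results are discarded. Whether the real crawlers can return partial results is not covered by this sketch.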
Getting Started
- Clone the repository:

      git clone https://github.com/stevieflyer/youtube_crawler.git

- Navigate to the project directory:

      cd youtube_crawler

- Install the dependencies:

      pip install -r requirements.txt

- Start crawling: use the examples in the Modules section above.
Contributing
Contributions are welcome! Please raise an issue or submit a pull request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Happy crawling! :)