Youcreep
An asynchronous web crawler designed specifically for YouTube. It drives web pages in headless mode and extracts video comments and video information directly from the platform.
Features
- Asynchronous Crawling: Uses Python's asynchronous features for fast and efficient data extraction.
- Headless Interaction: Runs in headless mode, so no visible browser window is opened.
- Modular Design: Built with modularity in mind, allowing for easy extension and maintenance.
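To illustrate why asynchronous crawling helps, here is a minimal sketch using only the standard library. It does not use youcreep itself: `fake_crawl` is a hypothetical stand-in that sleeps instead of loading a page, and `crawl_all` and `timed_run` are illustrative names, not part of the package.

```python
import asyncio
import time


async def fake_crawl(target: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for a crawl call; sleeps instead of loading a page."""
    await asyncio.sleep(delay)
    return f"done: {target}"


async def crawl_all(targets: list[str]) -> list[str]:
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fake_crawl(t) for t in targets))


def timed_run(targets: list[str]) -> tuple[list[str], float]:
    """Run all targets concurrently and report the wall-clock time taken."""
    start = time.perf_counter()
    results = asyncio.run(crawl_all(targets))
    return results, time.perf_counter() - start
```

Because the waits overlap, three targets should finish in roughly the time of one simulated wait rather than three.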
Modules
1. YoutubeCommentCrawler
Extract comments from any given YouTube video.
Usage:
import asyncio
from youcreep import YoutubeCommentCrawler

async def main():
    with YoutubeCommentCrawler() as crawler:
        comments = await crawler.crawl('YOUR_YOUTUBE_VIDEO_URL', n_target=NUMBER_OF_COMMENTS_TO_FETCH)

asyncio.run(main())
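Before passing a URL to the crawler, it can help to check that it actually points at a video. The helper below is a sketch and not part of youcreep; `video_id_from_url` is a hypothetical name, and it only covers the two common URL shapes (`youtube.com/watch?v=...` and `youtu.be/...`).

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def video_id_from_url(url: str) -> Optional[str]:
    """Return the video ID from a watch or youtu.be URL, or None if not found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None
    if host == "youtube.com" or host.endswith(".youtube.com"):
        # Standard watch pages carry the ID in the `v` query parameter.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

A `None` result is a cheap signal to skip the URL before starting a crawl.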
2. YoutubeVideoInfoCrawler
Search for videos based on a keyword and extract their information.
Usage:
import asyncio
from youcreep import YoutubeVideoInfoCrawler

async def main():
    with YoutubeVideoInfoCrawler() as crawler:
        videos_info = await crawler.crawl('YOUR_SEARCH_TERM', n_target=NUMBER_OF_VIDEOS_TO_FETCH,
                                          filter_options=YOUR_FILTER_OPTIONS)

asyncio.run(main())
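Results from either crawler usually need to be stored somewhere. The sketch below writes them to a JSON file; it assumes the returned value is an iterable of JSON-friendly records (for example dicts), which this README does not confirm. `save_records` is a hypothetical helper, not part of youcreep.

```python
import json
from pathlib import Path
from typing import Any, Iterable


def save_records(records: Iterable[Any], path: str) -> int:
    """Write records to `path` as a JSON array and return how many were written."""
    items = list(records)
    # default=str is a fallback for values json cannot encode natively (e.g. datetimes).
    Path(path).write_text(
        json.dumps(items, ensure_ascii=False, indent=2, default=str),
        encoding="utf-8",
    )
    return len(items)
```

For example, `save_records(videos_info, "videos.json")` would be called right after the crawl returns, provided the assumption about the record shape holds.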
Getting Started
- Clone the repository
  git clone https://github.com/stevieflyer/youtube_crawler.git
- Navigate to the project directory (by default, git names it after the repository)
  cd youtube_crawler
- Install the dependencies
  pip install -r requirements.txt
- Start crawling
  Use the examples provided in the Modules section above.
Contributing
Contributions are welcome! Please raise an issue or submit a pull request.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Happy crawling! :)
Built Distribution
Hashes for youcreep-0.1.13-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | dec2993cd6a67fc5ae8846503c172b6b11f7b8d3bfedc40a4780cbd2e00b7a1d
MD5 | a986f2f65cead820f5478bace0b9dc9e
BLAKE2b-256 | 372294a530f61c5973d7af3f7cd117b11b2c7ebd08d4a548bf80bf34c06a356b