YouTube Crawler

An asynchronous web crawler built specifically for YouTube. It interacts with web pages in headless mode and extracts video comments and video information directly from the platform.

Features

  • Asynchronous Crawling: Uses Python's asynchronous features for fast and efficient data extraction.
  • Headless Interaction: Runs in headless mode, so no browser window is opened.
  • Modular Design: Built with modularity in mind, allowing for easy extension and maintenance.
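
To illustrate why asynchronous crawling helps, here is a minimal sketch that uses only the standard library's asyncio. The `fake_fetch` coroutine is a hypothetical stand-in for a network-bound page load and is not part of this package; the sketch only shows how several waits can overlap instead of running back to back.

```python
import asyncio
import time


async def fake_fetch(page_id: int, delay: float = 0.1) -> str:
    """Hypothetical stand-in for a network-bound page load."""
    await asyncio.sleep(delay)  # yields control while "waiting on the network"
    return f"page-{page_id}"


async def fetch_all(n_pages: int) -> list:
    """Start all fetches at once and wait for them together."""
    return await asyncio.gather(*(fake_fetch(i) for i in range(n_pages)))


def timed_run(n_pages: int):
    """Return (results, elapsed seconds) for one concurrent run."""
    start = time.perf_counter()
    results = asyncio.run(fetch_all(n_pages))
    return results, time.perf_counter() - start
```

Calling `timed_run(5)` should take roughly one delay rather than five, because the simulated waits overlap.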

Modules

1. YoutubeCommentCrawler

Extract comments from any given YouTube video.

Usage:

import asyncio
from youtube_crawler import YoutubeCommentCrawler

async def main():  # `await` is only valid inside a coroutine
    with YoutubeCommentCrawler() as crawler:
        comments = await crawler.crawl('YOUR_YOUTUBE_VIDEO_URL', n_target=NUMBER_OF_COMMENTS_TO_FETCH)

asyncio.run(main())
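
If you need comments from several videos, one possible approach is to loop over the URLs with a single crawler. The sketch below assumes only the `crawl(url, n_target=...)` call shown above; `crawl_many` and `StubCrawler` are hypothetical names introduced for illustration, and the stub exists so the example runs without a browser. The shape of the returned comments is not documented here, so results are stored as returned.

```python
import asyncio


async def crawl_many(crawler, urls, n_target):
    """Crawl each URL in turn and map URL -> whatever crawl() returns."""
    results = {}
    for url in urls:
        # Sequential on purpose: whether one crawler instance supports
        # concurrent crawl() calls is not documented, so none are assumed.
        results[url] = await crawler.crawl(url, n_target=n_target)
    return results


class StubCrawler:
    """Hypothetical stand-in so the sketch runs without a browser."""

    async def crawl(self, url, n_target):
        await asyncio.sleep(0)  # pretend to do asynchronous work
        return [f"comment {i} on {url}" for i in range(n_target)]


if __name__ == "__main__":
    out = asyncio.run(crawl_many(StubCrawler(), ["video-a", "video-b"], n_target=2))
    print(out)
```

In real use, you would pass the `crawler` obtained from `YoutubeCommentCrawler()` in place of the stub.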

2. YoutubeVideoInfoCrawler

Search for videos based on a keyword and extract their information.

Usage:

import asyncio
from youtube_crawler import YoutubeVideoInfoCrawler

async def main():  # `await` is only valid inside a coroutine
    with YoutubeVideoInfoCrawler() as crawler:
        videos_info = await crawler.crawl('YOUR_SEARCH_TERM', n_target=NUMBER_OF_VIDEOS_TO_FETCH, filter_options=YOUR_FILTER_OPTIONS)

asyncio.run(main())
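
Because a search crawl waits on live pages, you may want to bound how long it can run. The sketch below wraps an awaitable `crawl(...)` call in the standard library's `asyncio.wait_for`. The names `crawl_with_timeout` and `SlowStubCrawler` are hypothetical, and `filter_options` is left out because its format is not described here.

```python
import asyncio


async def crawl_with_timeout(crawler, query, n_target, timeout_s):
    """Return crawl results, or None if the crawl exceeds timeout_s seconds."""
    try:
        return await asyncio.wait_for(
            crawler.crawl(query, n_target=n_target), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        return None


class SlowStubCrawler:
    """Hypothetical stand-in that takes `delay` seconds per crawl."""

    def __init__(self, delay):
        self.delay = delay

    async def crawl(self, query, n_target):
        await asyncio.sleep(self.delay)
        return [{"title": f"{query} #{i}"} for i in range(n_target)]
```

Returning `None` on timeout is one design choice among several; re-raising the exception may suit callers that need to distinguish a timeout from an empty result.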

Getting Started

  1. Clone the repository

    git clone https://github.com/stevieflyer/youtube_crawler.git
    
  2. Navigate to the project directory

    cd youtube_crawler
    
  3. Install the dependencies

    pip install -r requirements.txt
    
  4. Start crawling

    Use the examples provided in the modules section above.
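
Before running the examples, a quick check that the package can be imported may save some debugging. This is a minimal sketch using the standard library's `importlib`; `is_importable` is a hypothetical helper name, and `youtube_crawler` is the import name taken from the usage examples. It assumes you run it from an environment (or directory) where the package is on the import path.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "youtube_crawler"  # import name used in the usage examples
    print(f"{name} importable: {is_importable(name)}")
```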

Contributing

Contributions are welcome! Please raise an issue or submit a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Happy crawling! :)

Project details


This version

0.1

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

youtube_crawl-0.1-py3-none-any.whl (15.5 kB)

Uploaded Python 3

File details

Details for the file youtube_crawl-0.1-py3-none-any.whl.

File metadata

  • Download URL: youtube_crawl-0.1-py3-none-any.whl
  • Upload date:
  • Size: 15.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for youtube_crawl-0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 1ba3a8fe9e7fd670a3d7096927cbebb63cf113ac103b1848f3fec5fa9b7c2e15
MD5 3f3a22545794a2cd6912d3aa5b48c397
BLAKE2b-256 27a902f67bef5ba15466cf18d36732e8abdbc1384bb4c8e1ddcf720dedebc082
