Clark University package for crawling YouTube and cleaning the collected data
Project description
clarku-youtube-crawler
Version 0.0.1->0.0.3
These are untested beta releases, since Python packaging is a pain. Please don't install these versions.
Version 0.0.5
Finally figured out testing. It works okay. More documentation to come. To install:
pip install clarku-youtube-crawler
Version 0.0.7.DEV
All features are finished, but this may not be a stable release.
Example usage
First, run only the import to generate config.ini:
from clarku_youtube_crawler import *
or
from clarku_youtube_crawler import RawCrawler, ChannelCrawler, JSONConverter
After running the import, open config.ini to configure file paths. Make sure DEVELOPER_KEY.txt (or, if the filename differs, the name you set in config.ini) is in the same folder. Then run:
# Crawl videos for a search keyword over a date window, then merge the results
test = RawCrawler.RawCrawler()
test.__build__()
test.crawl("food", start_date=1, start_month=12, start_year=2020, day_count=1)
test.crawl_videos_in_list(comment_page_count=1)
test.merge_all()

# Crawl channels, filtered by subscriber cutoff and keyword, then merge the results
channel = ChannelCrawler.ChannelCrawler()
channel.__build__()
channel.setup_channel(subscriber_cutoff=1, keyword="")
channel.crawl()
channel.crawl_videos_in_list(comment_page_count=1)
channel.merge_all()

# Load the merged channel JSON into the converter
jsonn = JSONConverter.JSONConverter()
jsonn.load_json("FINAL_channel_merged.json")
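Since the crawlers expect config.ini and the developer key file to sit in the same folder, a quick check before crawling can save a failed run. The following is a minimal sketch, not part of the package: the helper name missing_setup_files is hypothetical, and it assumes both files are looked up in the folder you run the script from.

```python
from pathlib import Path


def missing_setup_files(folder=".", key_filename="DEVELOPER_KEY.txt"):
    """Return the names of expected setup files that are absent from `folder`.

    Hypothetical helper: it only checks that the files exist, not that
    their contents are valid.
    """
    expected = ["config.ini", key_filename]
    return [name for name in expected if not (Path(folder) / name).is_file()]


if __name__ == "__main__":
    missing = missing_setup_files()
    if missing:
        print("Missing before crawling:", ", ".join(missing))
    else:
        print("config.ini and the key file are both present.")
```

If the key file has a different name, pass it as key_filename so the check matches what you set in config.ini.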
If requirements are missing (all dependencies are already included, so this shouldn't happen), download requirements.txt from this repo and run:
$ pip install -r requirements.txt
Hashes for clarku_youtube_crawler-0.0.7.dev0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | e49a94fcd2221ef0426c0a0a5c046f86c71ada95870b0a71a1a48a60ba92d99d
MD5 | 8701552359417454f4f02e85c2a9f03b
BLAKE2b-256 | d986d3c3fae424ffef85b395f8297b43ffe0017296ca864c48522d0912bd9e54
Hashes for clarku_youtube_crawler-0.0.7.dev0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 4079aab41183cc2bc104f6d8acb40cfc41501975ba08509a6eadfde5d6184e2d
MD5 | e84616e1a5fcf5532c9479ffd6d2d8c5
BLAKE2b-256 | c5f4eee1921d16643b1d8c68dd75394903218d3358c2d347f83517416faad4d3
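If you download one of the files by hand, you can compare it against the SHA256 digests listed above. This is a minimal sketch using only the standard library; the function names sha256_of_file and matches_published_digest are hypothetical, and the file path is whatever you saved the download as.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, published_hex):
    """Return True if the file's SHA256 equals the published hex digest."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

For example, calling matches_published_digest on your copy of clarku_youtube_crawler-0.0.7.dev0.tar.gz with the SHA256 value listed for that file would be expected to return True for an unmodified download.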