Command line utility querying the YouTube API v3.

YouTube Scraper

A simple command line utility to extract information from the YouTube API v3 for scientific purposes.

About

This Python-based command line utility makes it easy to extract information from the YouTube API (version 3). Currently, it supports only a small subset of the API and focuses on extracting related videos from a given starting point.

Installation

Install yt-scraper by using pip:

sudo -H pip install yt-scraper

Update by adding the --upgrade flag:

sudo -H pip install --upgrade yt-scraper

Usage

Commands

Currently, there is only one command supported by yt-scraper: search

search

The search command starts a video search from a given starting point, such as a search term or a video itself.

For example, the following command returns the first video found when searching for cat:

$ yt-scraper search term 'cat'

VideoNode(videoId='0A2R27kCeD4', depth=0, rank=0, relatedVideos=('XewbmK0kmpI',))

One can also provide a video ID or a video URL as a starting point, which is more interesting when combined with the --max-depth option:

$ yt-scraper search id '0A2R27kCeD4' --max-depth 2

VideoNode(videoId='0A2R27kCeD4', depth=0, rank=0, relatedVideos=('XewbmK0kmpI',))
VideoNode(videoId='XewbmK0kmpI', depth=1, rank=0, relatedVideos=('hJpfROXlaPc',))
VideoNode(videoId='hJpfROXlaPc', depth=2, rank=0, relatedVideos=('dElQqMWhDgA',))
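For further analysis, the printed lines can be turned back into structured records. The following is a minimal sketch, assuming the output format is exactly as shown above. The VideoNode class, parse_node and related_edges are hypothetical helpers written for this illustration and are not part of yt-scraper.

```python
import ast
import re
from typing import Iterable, List, NamedTuple, Tuple


class VideoNode(NamedTuple):
    # Field names are copied from the printed output; the types are assumptions.
    videoId: str
    depth: int
    rank: int
    relatedVideos: Tuple[str, ...]


# Matches name=value pairs whose value is a tuple, a quoted string, or an integer.
_FIELD = re.compile(r"(\w+)=(\(.*?\)|'[^']*'|\d+)")


def parse_node(line: str) -> VideoNode:
    """Parse one printed 'VideoNode(...)' line into a VideoNode record."""
    text = line.strip()
    if not (text.startswith("VideoNode(") and text.endswith(")")):
        raise ValueError(f"not a VideoNode line: {line!r}")
    inner = text[len("VideoNode("):-1]
    # literal_eval only accepts Python literals, so no code is executed.
    fields = {name: ast.literal_eval(value) for name, value in _FIELD.findall(inner)}
    return VideoNode(**fields)


def related_edges(nodes: Iterable[VideoNode]) -> List[Tuple[str, str]]:
    """Return (video, related video) pairs, i.e. the edges of the related-video graph."""
    return [(node.videoId, related) for node in nodes for related in node.relatedVideos]


SAMPLE_OUTPUT = """
VideoNode(videoId='0A2R27kCeD4', depth=0, rank=0, relatedVideos=('XewbmK0kmpI',))
VideoNode(videoId='XewbmK0kmpI', depth=1, rank=0, relatedVideos=('hJpfROXlaPc',))
VideoNode(videoId='hJpfROXlaPc', depth=2, rank=0, relatedVideos=('dElQqMWhDgA',))
"""

nodes = [parse_node(line) for line in SAMPLE_OUTPUT.splitlines() if line.strip()]
for parent, child in related_edges(nodes):
    print(parent, "->", child)
```

The edge list is a convenient input for graph libraries or a spreadsheet. If the output format changes in a later version, the regular expression would need to be adapted.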

Additionally, one can specify the number of videos returned on each level with the --number option. For example, the following command returns two related videos for a given video (specified by its URL) and then only one related video for each of those two:

$ yt-scraper search url 'https://www.youtube.com/watch?v=0A2R27kCeD4' --max-depth 1 --number 2 --number 1

VideoNode(videoId='0A2R27kCeD4', depth=0, rank=0, relatedVideos=('XewbmK0kmpI', 'U5KLMeFK_UY'))
VideoNode(videoId='XewbmK0kmpI', depth=1, rank=0, relatedVideos=('hJpfROXlaPc',))
VideoNode(videoId='U5KLMeFK_UY', depth=1, rank=1, relatedVideos=('nFrb-C6I6Ps',))

For the sake of brevity, you can shorten --number to -n and --max-depth to -d.
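Because every level multiplies the number of videos, it can help to estimate the size of a search before running it. The sketch below is an illustration only: nodes_per_level is a hypothetical helper, it assumes that levels without an explicit --number value fall back to the default of 1, and it ignores the possibility that the same video appears more than once.

```python
from itertools import accumulate
from operator import mul
from typing import List, Sequence


def nodes_per_level(numbers: Sequence[int], max_depth: int) -> List[int]:
    """Estimate how many videos are listed on each level, from 0 to max_depth.

    numbers[i] is the number of related videos fetched for each video on level i.
    """
    if max_depth < 0:
        raise ValueError("max_depth must be non-negative")
    # Only the first max_depth entries create new levels; missing entries
    # are padded with 1 (assumed to match the documented default).
    branching = list(numbers[:max_depth])
    branching += [1] * (max_depth - len(branching))
    # Level 0 is the single starting video; each level multiplies the previous one.
    return list(accumulate([1] + branching, mul))


levels = nodes_per_level([2, 1], max_depth=1)
print(levels, "total:", sum(levels))
```

For the command above, this gives one starting video and two videos on level 1, which matches the three lines of output.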

Options
| Search options | Default | Description |
| --- | --- | --- |
| -n, --number | 1 | Number of videos fetched per level. Can be specified multiple times, once per level. |
| -d, --max-depth | 0 | Number of recursion steps to perform. |
| -k, --api-key | Required | The API key used to query the YouTube API v3. |

Global Options

Global options are specified before the command. For example, to get more output during the program execution, specify -v right after yt-scraper:

$ yt-scraper -v search term 'cat'

All global options:

| Global options | Default | Description |
| --- | --- | --- |
| -c, --config-path | System-specific | Specifies a configuration file. For details, see configuration. |
| -v, --verbose | False | Shows more output during program execution. |

Configuration

Instead of repeatedly passing the same options to yt-scraper, one can specify them in a config.toml file. These values are used in all future queries unless they are overridden by command line options.
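The precedence described here (command line first, then the configuration file, then built-in defaults) can be sketched with a ChainMap. This is an illustration of the idea, not yt-scraper's actual implementation. The effective_options helper is hypothetical, and the key names and default values are assumptions taken from the configuration keys and the option tables in this document.

```python
from collections import ChainMap
from typing import Any, Dict, Mapping

# Assumed defaults, based on the option tables in this document.
DEFAULTS: Dict[str, Any] = {"number": [1], "depth": 0, "verbose": False}


def effective_options(cli: Mapping[str, Any], config: Mapping[str, Any]) -> Dict[str, Any]:
    """Merge options so that command line values win over config file values."""
    # Options that were not given on the command line are represented as None
    # and must not shadow values from the configuration file.
    given = {key: value for key, value in cli.items() if value is not None}
    # ChainMap looks keys up from left to right, so the first match wins.
    return dict(ChainMap(given, dict(config), DEFAULTS))


config_file = {"api_key": "ABCDEFGH", "number": [4, 3, 2, 1], "depth": 3}
command_line = {"depth": 1, "number": None}

options = effective_options(command_line, config_file)
print(options["depth"], options["number"], options["verbose"])
```

Here the depth comes from the command line, the number list and the API key come from the configuration file, and verbose falls back to its default.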

For example, to always use the API key ABCDEFGH and a search depth of 3, returning one video fewer on each successive level, create the following configuration file:

config.toml

api_key = "ABCDEFGH"
number = [ 4, 3, 2, 1 ]
depth = 3
verbose = true

An example TOML file is included: config.toml

Then put this file in your standard configuration folder. Typically this folder can be found at the following system-specific locations:

  • Mac OS X: ~/Library/Application Support/YouTube Scraper
  • Unix: ~/.config/youtube-scraper
  • Windows: C:\Users\<user>\AppData\Roaming\YouTube Scraper

If the folder does not exist, you may need to create it.
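The locations listed above can be resolved in a script, for example to create the folder before copying the file into it. The following is a minimal sketch that mirrors the list; default_config_dir is a hypothetical helper, and the lookup that yt-scraper itself performs may differ in details such as environment variable handling.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional, Union


def default_config_dir(
    platform: str = sys.platform,
    env: Mapping[str, str] = os.environ,
    home: Optional[Union[str, Path]] = None,
) -> Path:
    """Return the configuration folder listed above for the given platform."""
    home_dir = Path(home) if home is not None else Path.home()
    if platform == "darwin":
        return home_dir / "Library" / "Application Support" / "YouTube Scraper"
    if platform.startswith("win"):
        # APPDATA normally points at C:\Users\<user>\AppData\Roaming.
        roaming = env.get("APPDATA")
        base = Path(roaming) if roaming else home_dir / "AppData" / "Roaming"
        return base / "YouTube Scraper"
    # Everything else is treated as Unix.
    return home_dir / ".config" / "youtube-scraper"


config_path = default_config_dir() / "config.toml"
print(config_path)
# To create the folder if it is missing:
# config_path.parent.mkdir(parents=True, exist_ok=True)
```

The arguments exist mainly so that the function can be tried for another platform without changing machines.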

Release History

  • 0.2.6
  • 0.3.0
  • 0.4.0
    • New command search
  • 0.5.0
    • Option --depth renamed to --max-depth
    • Video attributes, such as title, description, channel are fetched.
    • More consistent option handling
  • 0.6.0
    • New export feature: csv
    • New command: config
    • New API options: region-code, lang-code and safe-search

Roadmap

Each of these features will ship as its own minor patch:

  • Add node video data attributes, such as title and description.
  • Add possibility to specify more than one API key to switch seamlessly.
  • Add possibility to query more than 50 videos on one level.
  • Add youtube-dl integration for downloading subtitles.
  • Add a testing suite.
  • [o] Add export functionality to CSV, SQLite or Pandas.

Contributing

If you found a bug or have a suggestion, please don't hesitate to file an issue.

Contributions in any form are welcome. I will accept pull requests if they extend yt-scraper's functionality.

To set up the development environment, please install Poetry and run poetry install inside the project. A test suite will be added soon.

In general, the contribution process looks roughly like this:

  1. Fork it ($ git clone https://github.com/rattletat/yt-scraper)
  2. Create your feature branch ($ git checkout -b feature/fooBar)
  3. Commit your changes ($ git commit -am 'Add some fooBar')
  4. Push to the branch ($ git push origin feature/fooBar)
  5. Create a new Pull Request

Author

Michael Brauweiler

License

This utility is free and unencumbered software released into the public domain.

For more information, see the included UNLICENSE file.
