Command line utility querying the YouTube API v3.

YouTube Scraper

A simple command-line utility to extract information from the YouTube API v3 for scientific purposes.

About

This Python-based command-line utility makes it easy to extract information from the YouTube API (version 3). Currently, it supports only a small subset of the API and focuses on extracting related videos from a given starting point.

Installation

Install yt-scraper by using pip:

sudo -H pip install yt-scraper

Update by adding the --upgrade flag:

sudo -H pip install --upgrade yt-scraper

Usage

Commands

Currently, there is only one command supported by yt-scraper: search

search

The search command starts a video search from a given starting point, such as a search term or a video itself.

For example, the following command returns the first video found when searching for cat:

$ yt-scraper search term 'cat'

VideoNode(videoId='0A2R27kCeD4', depth=0, rank=0, relatedVideos=('XewbmK0kmpI',))

One can also provide a video ID or a video URL as a starting point, which is more interesting when combined with the --depth option:

$ yt-scraper search id '0A2R27kCeD4' --depth 2

VideoNode(videoId='0A2R27kCeD4', depth=0, rank=0, relatedVideos=('XewbmK0kmpI',))
VideoNode(videoId='XewbmK0kmpI', depth=1, rank=0, relatedVideos=('hJpfROXlaPc',))
VideoNode(videoId='hJpfROXlaPc', depth=2, rank=0, relatedVideos=('dElQqMWhDgA',))
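When a URL is used as the starting point, the video ID has to be recovered from it first. The following is a minimal sketch of one way to do that with the standard library. The function name extract_video_id is hypothetical and not part of yt-scraper, and the sketch only handles URLs that carry the ID in the v query parameter, as in the watch URLs shown in this README.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the value of the `v` query parameter, or None if it is absent.

    Hypothetical helper for illustration; yt-scraper may parse URLs differently.
    """
    query = parse_qs(urlparse(url).query)
    values = query.get("v")
    return values[0] if values else None


if __name__ == "__main__":
    print(extract_video_id("https://www.youtube.com/watch?v=0A2R27kCeD4"))
```

Other URL shapes (for example shortened links) would need additional handling that this sketch does not attempt.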

Additionally, one can specify the number of videos returned on each level with the --number option. For example, the following command returns two related videos for a given video (specified by its URL) and then one related video for each of those two:

$ yt-scraper search url 'https://www.youtube.com/watch?v=0A2R27kCeD4' --depth 1 --number 2 --number 1

VideoNode(videoId='0A2R27kCeD4', depth=0, rank=0, relatedVideos=('XewbmK0kmpI', 'U5KLMeFK_UY'))
VideoNode(videoId='XewbmK0kmpI', depth=1, rank=0, relatedVideos=('hJpfROXlaPc',))
VideoNode(videoId='U5KLMeFK_UY', depth=1, rank=1, relatedVideos=('nFrb-C6I6Ps',))

For the sake of brevity, you can shorten --number to -n and --depth to -d.
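The interplay of --depth and --number in the examples above can be pictured as a level-by-level traversal. The sketch below is an illustration only, not yt-scraper's actual implementation: crawl, fetch_related and FAKE_GRAPH are hypothetical names, the lookup table stands in for real API calls, and reusing the last --number value for deeper levels is an assumption.

```python
from collections import deque
from typing import Callable, Dict, List, NamedTuple, Sequence, Tuple


class VideoNode(NamedTuple):
    # Field names mirror the output shown in this README.
    videoId: str
    depth: int
    rank: int
    relatedVideos: Tuple[str, ...]


def crawl(
    start_id: str,
    depth: int,
    numbers: Sequence[int],
    fetch_related: Callable[[str, int], Sequence[str]],
) -> List[VideoNode]:
    """Expand related videos breadth-first down to the given depth."""
    nodes: List[VideoNode] = []
    queue = deque([(start_id, 0, 0)])  # (video id, level, rank among siblings)
    while queue:
        video_id, level, rank = queue.popleft()
        # Assumption: if fewer numbers than levels are given, reuse the last one.
        limit = numbers[min(level, len(numbers) - 1)]
        related = tuple(fetch_related(video_id, limit))[:limit]
        nodes.append(VideoNode(video_id, level, rank, related))
        if level < depth:
            for child_rank, child_id in enumerate(related):
                queue.append((child_id, level + 1, child_rank))
    return nodes


# Stand-in for the YouTube API: a fixed table built from the example output.
FAKE_GRAPH: Dict[str, List[str]] = {
    "0A2R27kCeD4": ["XewbmK0kmpI", "U5KLMeFK_UY"],
    "XewbmK0kmpI": ["hJpfROXlaPc"],
    "U5KLMeFK_UY": ["nFrb-C6I6Ps"],
}


def fake_fetch_related(video_id: str, limit: int) -> List[str]:
    return FAKE_GRAPH.get(video_id, [])[:limit]


if __name__ == "__main__":
    # Corresponds to: --depth 1 --number 2 --number 1
    for node in crawl("0A2R27kCeD4", 1, [2, 1], fake_fetch_related):
        print(node)
```

With the fake table, the run yields one node at depth 0 with two related videos and two nodes at depth 1 with one related video each, the same shape as the example output above.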

Options
Search options    Default     Description
-n, --number      1           Number of videos fetched per level. Can be specified multiple times, once for each level.
-d, --depth       0           Number of recursion steps to perform.
-k, --api-key     Required    The API key used to query the YouTube API v3.
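To make the "can be specified multiple times" behaviour concrete, here is a small sketch of how such options could be declared with argparse from the standard library. This is an assumption for illustration: the README does not say which argument parser yt-scraper uses, and build_search_parser and parse_search_options are hypothetical names.

```python
import argparse
from typing import List


def build_search_parser() -> argparse.ArgumentParser:
    """Declare the three search options listed in the table above."""
    parser = argparse.ArgumentParser(prog="yt-scraper search")
    # action="append" collects every occurrence of -n into a list.
    parser.add_argument("-n", "--number", type=int, action="append", default=None,
                        help="number of videos fetched per level (repeatable)")
    parser.add_argument("-d", "--depth", type=int, default=0,
                        help="number of recursion steps to perform")
    parser.add_argument("-k", "--api-key", required=True,
                        help="API key used to query the YouTube API v3")
    return parser


def parse_search_options(argv: List[str]) -> argparse.Namespace:
    args = build_search_parser().parse_args(argv)
    # Apply the default here; a list default combined with "append"
    # would otherwise be extended rather than replaced.
    if args.number is None:
        args.number = [1]
    return args


if __name__ == "__main__":
    print(parse_search_options(["-d", "1", "-n", "2", "-n", "1", "-k", "ABCDEFGH"]))
```

Setting the default after parsing avoids a common pitfall where values given on the command line are appended to the default list instead of replacing it.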

Global Options

Global options are specified before the command. For example, to get more output during the program execution, specify -v right after yt-scraper:

$ yt-scraper -v search term 'cat'

All global options:

Global options       Default            Description
-c, --config-path    System-specific    Specifies a configuration file. For details, see configuration.
-v, --verbose        False              Shows more output during program execution.

Configuration

Instead of repeatedly passing the same options to yt-scraper, one can specify these options in a config.toml file. These values are used in all future queries unless they are overridden by options given on the command line.

For example, to always use the API key ABCDEFGH and a search depth of 3, with one fewer video returned on each level, create the following configuration file:

config.toml

api_key = "ABCDEFGH"
number = [ 4, 3, 2, 1 ]
depth = 3

An example TOML file is included: config.toml.example

Then put this file in your standard configuration folder. Typically this folder can be found at the following system-specific locations:

  • Mac OS X: ~/Library/Application Support/YouTube Scraper
  • Unix: ~/.config/youtube-scraper
  • Windows: C:\Users\<user>\AppData\Roaming\YouTube Scraper

If the folder does not exist, you may need to create it.
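The folder lookup listed above could be approximated as follows. This is a rough sketch, not the lookup yt-scraper actually performs: default_config_dir and ensure_config_file_path are hypothetical names, the paths are simply the ones listed in this README, and environment variables such as XDG_CONFIG_HOME or APPDATA are deliberately ignored.

```python
import sys
from pathlib import Path
from typing import Optional


def default_config_dir(platform: str = sys.platform,
                       home: Optional[Path] = None) -> Path:
    """Return the configuration folder for the given platform string."""
    home = home if home is not None else Path.home()
    if platform == "darwin":  # macOS
        return home / "Library" / "Application Support" / "YouTube Scraper"
    if platform.startswith("win"):  # Windows
        return home / "AppData" / "Roaming" / "YouTube Scraper"
    return home / ".config" / "youtube-scraper"  # other Unix-like systems


def ensure_config_file_path(config_dir: Path) -> Path:
    """Create the folder if it is missing and return the config.toml path."""
    config_dir.mkdir(parents=True, exist_ok=True)
    return config_dir / "config.toml"


if __name__ == "__main__":
    # Prints the folder for the current platform without creating anything.
    print(default_config_dir())
```

Passing the platform and home directory as parameters keeps the function easy to check on any machine.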

Release History

  • 0.2.6
  • 0.3.0
  • 0.4.0
    • Command search released

Roadmap

Each of these features is planned as a minor patch:

  • Add node video data attributes, such as title and description.
  • Add possibility to specify more than one API key to switch seamlessly.
  • Add possibility to query more than 50 videos on one level.
  • Add youtube-dl integration for downloading subtitles.
  • Add a testing suite.
  • Add export functionality to SQLite or pandas.

Contributing

If you found a bug or have a suggestion, please don't hesitate to file an issue.

Contributions in any form are welcome. I will accept pull requests if they extend yt-scraper's functionality.

To set up the development environment, please install Poetry and run poetry install inside the project. A test suite will be added soon.

In general, the contribution process looks roughly like this:

  1. Fork it ($ git clone https://github.com/rattletat/yt-scraper)
  2. Create your feature branch ($ git checkout -b feature/fooBar)
  3. Commit your changes ($ git commit -am 'Add some fooBar')
  4. Push to the branch ($ git push origin feature/fooBar)
  5. Create a new Pull Request

Author

Michael Brauweiler

License

This plugin is free and unencumbered software released into the public domain.

For more information, see the included UNLICENSE file.
