Download YouTube metadata for videos relating to a search query
This is a Python script that downloads metadata (including comments and likes) for YouTube videos relating to a search query. It uses the YouTube Data API v3 and saves the metadata in a PostgreSQL database.
TODO If you use metatube for scientific research, please cite it in your publication:
Fink, C. (2020): metatube: Python script to download YouTube metadata. doi:10.5281/zenodo.3773303.
Dependencies
The script is written in Python 3 and depends on the Python modules dateparser, psycopg2, PyYAML and Requests.
To install dependencies on a Debian-based system, run:
apt-get update -y &&
apt-get install -y python3-dev python3-pip python3-virtualenv
(There’s an Arch Linux AUR package that pulls in all dependencies, see Installation below.)
Installation
- using pip or similar:
  pip3 install metatube
- OR: manually:
  - Clone this repository:
    git clone https://gitlab.com/helics-lab/metatube.git
  - Change to the cloned directory
  - Use the Python setuptools to install the package:
    cd metatube
    python3 ./setup.py install
- OR: (Arch Linux only) from AUR:
  # e.g. using yay
  yay python-metatube
Configuration
Copy the example configuration file metatube.yml.example to a suitable location, depending on your operating system:
- on Linux systems:
  - system-wide configuration: /etc/metatube.yml
  - per-user configuration: ~/.config/metatube.yml OR ${XDG_CONFIG_HOME}/metatube.yml
- on macOS systems:
  - per-user configuration: ${XDG_CONFIG_HOME}/metatube.yml
- on Microsoft Windows systems:
  - per-user configuration: %APPDATA%\metatube.yml
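The locations listed above can be turned into a small lookup helper, for example to check where a configuration file would be picked up. The following is a minimal sketch, not metatube's actual lookup code: the function name `candidate_config_paths`, its parameters, and the order in which candidates are returned are assumptions made for illustration.

```python
import os
import sys
from pathlib import Path

CONFIG_NAME = "metatube.yml"


def candidate_config_paths(platform=sys.platform, environ=os.environ):
    """Return possible metatube.yml locations for the given platform.

    Hypothetical helper: the ordering (per-user before system-wide)
    is an assumption, not documented metatube behaviour.
    """
    paths = []

    if platform.startswith("win"):
        # Windows: per-user configuration in %APPDATA%
        appdata = environ.get("APPDATA")
        if appdata:
            paths.append(Path(appdata) / CONFIG_NAME)
        return paths

    # Linux and macOS: ${XDG_CONFIG_HOME}/metatube.yml, if the variable is set
    xdg_config_home = environ.get("XDG_CONFIG_HOME")
    if xdg_config_home:
        paths.append(Path(xdg_config_home) / CONFIG_NAME)

    if platform.startswith("linux"):
        # Linux only: ~/.config/metatube.yml and the system-wide /etc/metatube.yml
        home = environ.get("HOME")
        if home:
            paths.append(Path(home) / ".config" / CONFIG_NAME)
        paths.append(Path("/etc") / CONFIG_NAME)

    return paths


if __name__ == "__main__":
    for path in candidate_config_paths():
        marker = "found" if path.is_file() else "missing"
        print(f"{path}: {marker}")
```

Running the file directly prints each candidate path for the current platform and whether a file exists there.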
Adapt the configuration:
- Configure a PostgreSQL connection string (connection_string), pointing to an existing database
- Configure an API access key to the YouTube Data API v3 (youtube_api_key)
- Define search terms (search_terms)
All of these configuration options can alternatively be supplied as command line arguments to metatube (see Usage) or as a config dict passed directly to the constructor of YoutubeVideoMetadataDownloader. Command line options (see metatube --help) and the config dict both override the config file.
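The override rule can be pictured as a simple dictionary merge. This is a sketch of the idea only, not metatube's internal code: `merge_config` and `missing_keys` are hypothetical helpers, and treating all three options as required is an assumption based on the list above.

```python
# The three option names come from the configuration section above;
# treating all of them as mandatory is an assumption of this sketch.
REQUIRED_KEYS = ("connection_string", "youtube_api_key", "search_terms")


def merge_config(file_config, overrides=None):
    """Return a new dict in which overrides win over config file values.

    `overrides` stands for either parsed command line options or a
    config dict passed to the constructor. Values of None are treated
    as "not supplied" and leave the config file value untouched.
    """
    merged = dict(file_config)
    for key, value in (overrides or {}).items():
        if value is not None:
            merged[key] = value
    return merged


def missing_keys(config):
    """List the required options that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not config.get(key)]


if __name__ == "__main__":
    from_file = {
        "connection_string": "dbname=metatube",
        "youtube_api_key": "abcdefghijklmn",
    }
    config = merge_config(from_file, {"search_terms": "how to build a tallbike"})
    print(config)
    print("missing:", missing_keys(config))
```

Copying the file values first and then applying the overrides keeps the original dictionary unchanged, so the same file configuration can be reused for several downloads with different search terms.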
Usage
Command line executable
metatube \
--postgresql-connection-string "dbname=metatube" \
--youtube-api-key "abcdefghijklmn" \
"how to build a tallbike"
Python
Import the metatube module. Instantiate a YoutubeVideoMetadataDownloader, optionally supplying a config dictionary. Then run the instance’s download() method.
import metatube

# config from config file
downloader = metatube.YoutubeVideoMetadataDownloader()
downloader.download()

# config from config file,
# overriding `search_terms`
downloader = metatube.YoutubeVideoMetadataDownloader({
    "search_terms": "Critical Mass Vladivostok"
})
downloader.download()

# entire config from dictionary
downloader = metatube.YoutubeVideoMetadataDownloader({
    "youtube_api_key": "opqrstuvwxyz",
    "connection_string": "dbname=metatube host=server1 user=bicyclelover123",
    "search_terms": "dashcam bicycle commute albuquerque"
})
downloader.download()