Python package to extract YouTube video titles and corresponding URLs for a specific channel

Project description

Automate a Videos List Creation for a YouTube Channel

Overview

This repo provides a quick, simple way to list every video posted to a YouTube channel, given only the URL of that channel's videos page. The URL takes one of two forms: https://www.youtube.com/user/TheChannelYouWantToScrape/videos OR
https://www.youtube.com/channel/TheChannelYouWantToScrape/videos,
with TheChannelYouWantToScrape replaced by the channel's username (first form) or channel ID (second form).
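
As a minimal sketch of the two URL forms described above, the helper below assembles the videos-page URL from a channel identifier and its type. The name build_videos_url is hypothetical and is not part of yt-videos-list; it only illustrates how the pieces fit together.

```python
def build_videos_url(channel: str, channel_type: str) -> str:
    """Return the videos-page URL for a channel.

    Hypothetical helper for illustration only; not part of yt-videos-list.
    """
    if channel_type not in ("user", "channel"):
        raise ValueError("channel_type must be 'user' or 'channel'")
    return f"https://www.youtube.com/{channel_type}/{channel}/videos"


# A username goes with the 'user' form, a channel ID with the 'channel' form.
print(build_videos_url("sentdex", "user"))
print(build_videos_url("UCiGm_E4ZwYSHV3bcW1pnSeQ", "channel"))
```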

Quick Start

pip install -U yt-videos-list

Setting up Selenium dependencies for Mac

########################
# MacOS geckodriver (Mozilla Firefox) tar.gz file:
curl -SL https://github.com/mozilla/geckodriver/releases/download/v0.26.0/geckodriver-v0.26.0-macos.tar.gz | tar -xzvf - -C /usr/local/bin

# Mac64 chromedriver (Google Chrome 78.0.3904.70) zip folder:
curl -SL https://chromedriver.storage.googleapis.com/78.0.3904.70/chromedriver_mac64.zip | tar -xzvf - -C /usr/local/bin

# Mac64 operadriver (Opera Browser 76.0.3809.132; 77.0.3865.120 had compatibility issues) zip file:
curl -SL https://github.com/operasoftware/operachromiumdriver/releases/download/v.76.0.3809.132/operadriver_mac64.zip | tar -xzvf - -C /usr/local/bin --strip-components=1 && rm /usr/local/bin/sha512_sum
########################

Setting up Selenium dependencies for Linux64

########################
# Linux64 geckodriver (Mozilla Firefox) tar.gz file:
curl -SL https://github.com/mozilla/geckodriver/releases/download/v0.26.0/geckodriver-v0.26.0-linux64.tar.gz | tar -xzvf - -C /usr/local/bin/

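# Linux64 chromedriver (Google Chrome 78.0.3904.70) zip file.
# GNU tar cannot read zip archives, so download the file and extract it with unzip:
curl -SL https://chromedriver.storage.googleapis.com/78.0.3904.70/chromedriver_linux64.zip -o /tmp/chromedriver_linux64.zip && unzip -o /tmp/chromedriver_linux64.zip -d /usr/local/bin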

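# Linux64 operadriver (Opera Browser 76.0.3809.132; 77.0.3865.120 had compatibility issues) zip file.
# unzip -j drops the archive's top-level folder, mirroring --strip-components=1 in the Mac command:
curl -SL https://github.com/operasoftware/operachromiumdriver/releases/download/v.76.0.3809.132/operadriver_linux64.zip -o /tmp/operadriver_linux64.zip && unzip -j -o /tmp/operadriver_linux64.zip -d /usr/local/bin && rm -f /usr/local/bin/sha512_sum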
########################

Setting up Selenium dependencies for Linux32

########################
# Linux 32 geckodriver (Mozilla Firefox) tar.gz file:
curl -SL https://github.com/mozilla/geckodriver/releases/download/v0.26.0/geckodriver-v0.26.0-linux32.tar.gz | tar -xzvf - -C /usr/local/bin
########################

Setting up Selenium dependencies for Windows64

# Windows64 geckodriver (Mozilla Firefox) zip folder
curl -SL https://github.com/mozilla/geckodriver/releases/download/v0.26.0/geckodriver-v0.26.0-win64.zip | tar -xzvf - -C /usr/local/bin

# Windows64 operadriver (Opera Browser 76.0.3809.132; 77.0.3865.120 had compatibility issues) zip file:
curl -SL https://github.com/operasoftware/operachromiumdriver/releases/download/v.76.0.3809.132/operadriver_win64.zip | tar -xzvf - -C /usr/local/bin --strip-components=1

Setting up Selenium dependencies for Windows32

# Windows32 geckodriver (Mozilla Firefox) zip folder
curl -SL https://github.com/mozilla/geckodriver/releases/download/v0.26.0/geckodriver-v0.26.0-win32.zip | tar -xzvf - -C /usr/local/bin

# Windows32 chromedriver (Google Chrome 78.0.3904.70) zip folder:
curl -SL https://chromedriver.storage.googleapis.com/78.0.3904.70/chromedriver_win32.zip | tar -xzvf - -C /usr/local/bin

# Windows32 operadriver (Opera Browser 76.0.3809.132; 77.0.3865.120 had compatibility issues) zip file:
curl -SL https://github.com/operasoftware/operachromiumdriver/releases/download/v.76.0.3809.132/operadriver_win32.zip | tar -xzvf - -C /usr/local/bin --strip-components=1

Running the module

python3
from yt_videos_list import ListGenerator
LG = ListGenerator()
LG.generate_list(channel='schafer5', channelType='user', fileName='CoreySchafer_ProgrammingTutorials')
LG.generate_list(channel='UC8butISFwT-Wl7EV0hUK0BQ', channelType='channel', fileName='freeCodeCamp.org')

user channelType (example uses sentdex):

LG.generate_list(channel='sentdex', channelType='user')

channel channelType (example uses Billie Eilish):

LG.generate_list(channel='UCiGm_E4ZwYSHV3bcW1pnSeQ', channelType='channel')

Naming the output file

For a more descriptive file name, pass the name you want as the optional third argument (fileName):

LG.generate_list(channel='UCiGm_E4ZwYSHV3bcW1pnSeQ', channelType='channel', fileName='BillieEilish')

Understanding the API

There are two types of YouTube channels: user channels, whose URLs contain /user/ followed by a username, and channel channels, whose URLs contain /channel/ followed by a channel ID.

To scrape the video titles along with the link to each video, call the generate_list(channel, channelType) method on the ListGenerator object you just created, passing the name of the channel as the channel argument and the type of channel as the channelType argument. By default, the output file is named channelVideosList.ext, where .ext is .csv or .txt depending on the file type(s) you enabled.
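
The naming rule above can be sketched as follows. This is an illustration under stated assumptions, not the package's code: expected_output_files is a hypothetical helper, and it assumes each enabled file type yields one file named after fileName (or channelVideosList when fileName is omitted) plus the matching extension.

```python
def expected_output_files(file_name="channelVideosList", csv=True, txt=True):
    """List the file names a run would be expected to produce.

    Hypothetical helper; assumes one output file per enabled file type.
    """
    names = []
    if txt:
        names.append(f"{file_name}.txt")
    if csv:
        names.append(f"{file_name}.csv")
    return names


# Default name with both file types enabled.
print(expected_output_files())
# Custom name with only the csv file enabled.
print(expected_output_files("BillieEilish", txt=False))
```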

For more control:

ListGenerator(csv=True, csvWriteFormat='x', txt=True, txtWriteFormat='x', docx=False,
              docxWriteFormat='x', chronological=True,
              headless=False, scrollPauseTime=0.7, browser='Firefox')

You can specify a number of optional arguments when instantiating the ListGenerator object. The values shown above are the defaults; for more flexibility, you can specify:

  • Options for the browser arguments are
    • Firefox (default)
    • Chrome
    • Opera
    • Safari
  • Options for the file type arguments are
    • True (default) - create a file for the specified type
    • False - do not create a file for the specified type.
      • txt=True (default) OR txt=False
      • csv=True (default) OR csv=False
      • docx=True (unsupported) OR docx=False
  • Options for the write formats are
    • 'x' (default) - does not overwrite an existing file with the same name
    • 'w' - if an existing file with the same name exists, it will be overwritten
    • NOTE: if you specify the file type argument to be False, you don't need to touch this - the program will automatically skip this step.
      • txtWriteFormat='x' (default) OR txtWriteFormat='w'
      • csvWriteFormat='x' (default) OR csvWriteFormat='w'
      • docxWriteFormat='x' (unsupported) OR docxWriteFormat='w'
  • Options for the chronological argument are
    • True (this is the only chronological option currently supported :D) - write the files in order from oldest videos to most recent
    • False (currently UNSUPPORTED!) - write the files in order from most recent to oldest.
      • chronological=True (default) OR chronological=False
  • Options for the headless option are
    • False (default) - run the browser with an open Selenium instance for viewing
    • True - run the browser in "invisible" mode.
      • headless=False (default) OR headless=True
  • Options for the scrollPauseTime argument are any float values greater than 0 (default 0.8). The value you provide will be how long the program waits before trying to scroll the videos list page down for the channel you want to scrape. For fast internet connections, you may want to reduce the value, and for slow connections you may want to increase the value.
    • scrollPauseTime=0.8 (default)
    • CAUTION: reducing this value too much will result in the program not capturing all the videos, so be careful! Experiment :)
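
To see why scrollPauseTime matters, here is a minimal sketch of the general scroll-until-stable technique. It is not taken from the yt-videos-list source: scroll_to_bottom, get_height, scroll_down and FakePage are all hypothetical names, and the page is simulated so the example runs without a browser.

```python
import time


def scroll_to_bottom(get_height, scroll_down, scroll_pause_time, sleep=time.sleep):
    """Scroll until the page height stops growing; return the number of scrolls.

    get_height and scroll_down are caller-supplied callables (hypothetical),
    for example thin wrappers around a Selenium driver's execute_script.
    """
    scrolls = 0
    last_height = get_height()
    while True:
        scroll_down()
        scrolls += 1
        sleep(scroll_pause_time)  # give newly requested videos time to load
        new_height = get_height()
        if new_height == last_height:
            # No growth after the pause: assume the bottom was reached.
            return scrolls
        last_height = new_height


class FakePage:
    """Simulated page that grows by one screen per scroll until content runs out."""

    def __init__(self, screens):
        self.remaining = screens
        self.height = 1000

    def scroll(self):
        if self.remaining > 0:
            self.remaining -= 1
            self.height += 1000


page = FakePage(screens=3)
# The sleep is replaced with a no-op so the simulation finishes instantly.
count = scroll_to_bottom(lambda: page.height, page.scroll,
                         scroll_pause_time=0, sleep=lambda seconds: None)
print(count)
```

On a real page, new content arrives some time after the scroll. If the pause is shorter than that delay, the height looks unchanged, the loop concludes it has reached the bottom, and the remaining videos are never loaded.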

Running as a script (coming in 0.2.x!)

The following is deprecated... Enter the directory that contains pyYT_videos_list.py and execute.py (they should both be in the same directory to avoid reference issues), and run the following command from your command line:

python3 yt_videos_list

You should see the following:

What is the name of the YouTube channel you want to generate the list for?
If you're unsure, click on the channel and look at the URL.
It should be in the format:
https://www.youtube.com/user/YourChannelName
OR
https://www.youtube.com/channel/YourChannelName
Substitute what you see for YourChannelName and type it in below:

Enter the name of the channel or user that you wish to scrape, and the program will do the rest for you!
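
If you have the full channel URL on hand, the two values the program needs can be pulled out of it mechanically. The sketch below is only an illustration: parse_channel_url is a hypothetical helper, not part of yt-videos-list, and it assumes the URL follows one of the two formats shown in the prompt above.

```python
from urllib.parse import urlparse


def parse_channel_url(url: str):
    """Split a channel URL into (channel, channel_type).

    Hypothetical helper for illustration only; not part of yt-videos-list.
    """
    # Path segments, e.g. ['user', 'schafer5'] or ['channel', 'UC...', 'videos'].
    parts = [segment for segment in urlparse(url).path.split("/") if segment]
    if len(parts) < 2 or parts[0] not in ("user", "channel"):
        raise ValueError(f"unrecognized channel URL: {url}")
    return parts[1], parts[0]


print(parse_channel_url("https://www.youtube.com/user/schafer5"))
print(parse_channel_url("https://www.youtube.com/channel/UC8butISFwT-Wl7EV0hUK0BQ/videos"))
```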

Future Features
