Archives YouTube channels by automatically uploading their videos to archive.org
Internetarchive-YouTube
🚀 GitHub Action and CLI to archive YouTube channels by uploading the channel's videos to archive.org.
- 🧑‍💻 To use this tool as a command line interface (CLI), jump to CLI: Getting Started.
- ⚡️ To use this tool as a GitHub Action, jump to GitHub Action: Getting Started.
CLI: Getting Started 🧑‍💻
Requirements:
⬇️ Installation:
pip install internetarchive-youtube
🗃️ Backend database:
- Create a backend database (or JSON bin) to track the overall download/upload progress (see Creating A Backend Database).
- If you picked option 1 (MongoDB), export the MongoDB connection string as an environment variable:
export MONGODB_CONNECTION_STRING=mongodb://username:password@host:port
- If you picked option 2 (JSON bin), export the JSONBin master key as an environment variable:
export JSONBIN_KEY=xxxxxxxxxxxxxxxxx
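Before starting a long run, it can help to confirm that one of the two variables above is actually exported. The following is a minimal sketch, not part of the tool: `pick_backend` is a hypothetical helper, and checking MongoDB before JSONBin is an assumption made here for illustration.

```python
import os


def pick_backend(env=None):
    """Return "mongodb" or "jsonbin" depending on which variable is set.

    Hypothetical helper for illustration; the precedence (MongoDB first)
    is an assumption, not documented behavior of ia-yt.
    """
    env = os.environ if env is None else env
    if env.get("MONGODB_CONNECTION_STRING"):
        return "mongodb"
    if env.get("JSONBIN_KEY"):
        return "jsonbin"
    raise RuntimeError(
        "Set MONGODB_CONNECTION_STRING or JSONBIN_KEY before running ia-yt"
    )
```

Calling `pick_backend()` with no arguments reads the current process environment; passing a dict makes the check easy to try without touching your shell.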
⌨️ Usage:
usage: ia-yt [-h] [-p PRIORITIZE] [-s SKIP_LIST] [-f] [-t TIMEOUT] [-n] [-a] [-c CHANNELS_FILE] [-S] [-C]
optional arguments:
-h, --help show this help message and exit
-p PRIORITIZE, --prioritize PRIORITIZE
Comma-separated list of channel names to prioritize when processing videos
-s SKIP_LIST, --skip-list SKIP_LIST
Comma-separated list of channel names to skip
-f, --force-refresh Refresh the database after every video (Can slow down the workflow significantly, but is useful when running multiple concurrent
jobs)
-t TIMEOUT, --timeout TIMEOUT
Kill the job after n hours (default: 5.5)
-n, --no-logs Don't print any log messages
-a, --add-channel Add a channel interactively to the list of channels to archive
-c CHANNELS_FILE, --channels-file CHANNELS_FILE
Path to the channels list file to use if the environment variable `CHANNELS` is not set (default: ~/.yt_channels.txt)
-S, --show-channels Show the list of channels in the channels file
-C, --create-collection
Creates/appends to the backend database from the channels list
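If you launch `ia-yt` from a script, the flags listed above can be assembled programmatically. This is a rough sketch under stated assumptions: `build_ia_yt_command` is a hypothetical helper, it uses only flags shown in the help text, and it assumes the comma-separated lists are written without spaces.

```python
def build_ia_yt_command(prioritize=(), skip=(), timeout_hours=None,
                        force_refresh=False):
    """Build an argument list for the ia-yt CLI (hypothetical helper).

    Only flags that appear in the help output are used.
    """
    cmd = ["ia-yt"]
    if prioritize:
        # --prioritize takes a comma-separated list of channel names
        cmd += ["--prioritize", ",".join(prioritize)]
    if skip:
        # --skip-list takes a comma-separated list of channel names
        cmd += ["--skip-list", ",".join(skip)]
    if force_refresh:
        cmd.append("--force-refresh")
    if timeout_hours is not None:
        # --timeout is given in hours (the tool's default is 5.5)
        cmd += ["--timeout", str(timeout_hours)]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once the package is installed and the backend variable is exported.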
GitHub Action: Getting Started ⚡️
- Fork this repository.
- Create a backend database (or JSON bin).
- Add your Archive.org credentials to the repository's Actions secrets:
ARCHIVE_USER_EMAIL
ARCHIVE_PASSWORD
- Add a list of the channels you want to archive to the repository's Actions secrets:
The `CHANNELS` secret should be formatted like this example:
CHANNEL_NAME: CHANNEL_URL
FOO: CHANNEL_URL
FOOBAR: CHANNEL_URL
SOME_CHANNEL: CHANNEL_URL
Don't add any quotes around the name or the URL, and make sure to keep one space between the colon and the URL.
- Add the database secret(s) to the repository's Actions secrets:
If you picked option 1 (MongoDB), add this additional secret:
MONGODB_CONNECTION_STRING
If you picked option 2 (JSON bin), add this additional secret:
JSONBIN_KEY
- Run the workflow under Actions manually with a `workflow_dispatch`, or wait for it to run automatically.
That's it!
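The `CHANNEL_NAME: CHANNEL_URL` format described in the steps above is easy to get subtly wrong (a missing space, stray quotes). Below is a minimal validation sketch you could run locally before saving the secret. `parse_channels` is a hypothetical helper, not the tool's own parser, and the URLs in the usage note are placeholders.

```python
def parse_channels(text):
    """Parse 'CHANNEL_NAME: CHANNEL_URL' lines into a dict.

    Hypothetical validator: enforces one space after the colon and
    rejects names or URLs that start with a quote character.
    """
    channels = {}
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        # Split on the first colon-plus-space; URL colons are followed by "/"
        name, sep, url = line.partition(": ")
        url = url.strip()
        if not sep or not name or not url:
            raise ValueError(
                f"line {line_no}: expected 'CHANNEL_NAME: CHANNEL_URL'"
            )
        if name.startswith(('"', "'")) or url.startswith(('"', "'")):
            raise ValueError(f"line {line_no}: remove the quotes")
        channels[name] = url
    return channels
```

For example, `parse_channels("FOO: https://example.com/foo")` returns a one-entry dict, while `FOO:https://example.com/foo` (no space after the colon) raises `ValueError`.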
🏗️ Creating A Backend Database
- Option 1: MongoDB (recommended).
- Self-hosted (see: Alyetama/quick-MongoDB or dockerhub image).
- Free database on Atlas.
- Option 2: JSON bin (if you want a quick start).
- Sign up to JSONBin here.
- Click on VIEW MASTER KEY, then copy the key.
📝 Notes
- Information about the `MONGODB_CONNECTION_STRING` can be found here.
- Jobs can run for a maximum of 6 hours. If you're archiving a large channel, the job might die before it finishes, but it will resume in a new job the next time it's scheduled to run.
- Instead of raw text, you can pass a file path or a file URL with a list of channels formatted as `CHANNEL_NAME: CHANNEL_URL` or in JSON format `{"CHANNEL_NAME": "CHANNEL_URL"}`.
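The two list formats mentioned in the last note carry the same information. As an illustration only, the sketch below normalizes either one to `CHANNEL_NAME: CHANNEL_URL` lines. `to_channel_lines` is a hypothetical helper, and treating any content that starts with `{` as JSON is an assumption made here, not the tool's documented detection logic.

```python
import json


def to_channel_lines(content):
    """Normalize a channels list to 'CHANNEL_NAME: CHANNEL_URL' lines.

    Hypothetical helper: content starting with "{" is treated as a flat
    JSON object; anything else is treated as plain-text lines.
    """
    stripped = content.strip()
    if stripped.startswith("{"):
        data = json.loads(stripped)
        if not all(isinstance(url, str) for url in data.values()):
            raise ValueError("expected a flat JSON object of name/URL strings")
        return [f"{name}: {url}" for name, url in data.items()]
    # Plain text: keep non-empty lines, trimmed
    return [line.strip() for line in stripped.splitlines() if line.strip()]
```

Reading the content from a file path or URL is left out on purpose; only the format handling is shown.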
Hashes for internetarchive-youtube-0.1.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 5832fd9c2e66713a83166b7939e895efd694bcd3ba5353aacd81521bdb98bbba
MD5 | d5b955e3cfcbab18a180c7c62d0c5a6f
BLAKE2b-256 | 227d22e57a3ae878e189cf68bc3d296abca5b189a1d57cfe4bf928864aab4253
Hashes for internetarchive_youtube-0.1.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0844b191c2f87c1c5ed0f394ac30df67eb07d5bd284aff5ec76fb24304eb6c8b
MD5 | 91973831958a02a67bb2a7b84a9c8216
BLAKE2b-256 | 84859aeb5d19ea8e289df62cee96946235fd1eb873e3b0a77b08ef1b04cc76ab