Script to download and segment YouTube videos automatically.
YTCompDL
Command-line program to download and segment YouTube videos automatically.
Getting Started
Getting a YouTube Data API Key
Follow these instructions.
Store your API key in a .env file in the main working directory.
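As a rough illustration of what a .env file holds, the sketch below reads KEY=VALUE lines into a dictionary using only the standard library. The variable name YOUTUBE_API_KEY in the usage note is hypothetical: this README does not state which variable name YTCompDL expects, and the tool's own loader may work differently.

```python
from pathlib import Path


def read_env_file(path):
    """Parse KEY=VALUE lines from a .env file into a dict."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # Not a KEY=VALUE line.
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values
```

For example, a file containing the line YOUTUBE_API_KEY="abc123" (hypothetical name) would yield a dictionary with that key mapped to abc123.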
Setup
venv
# Make sure ffmpeg is installed.
sudo apt install ffmpeg
virtualenv venv
source venv/bin/activate
# Install the package from PyPI.
pip install ytcompdl
ytcompdl -h
Conda
# Setup env.
conda env create -f envs/env.yaml -n ytcompdl
conda activate ytcompdl
ytcompdl -h
Docker
ffmpeg comes installed with the Docker image.
Arguments are passed after the image name.
# Image wd set to /ytcompdl
docker run --rm -v /$PWD:/ytcompdl koisland/ytcompdl:latest -h
To build the image locally:
docker build . -t ytcompdl:latest
Usage
# Download audio of video.
ytcompdl -u "https://www.youtube.com/watch?v=gIsHl7swEgk" -k .env -o "audio" -x config/config_regex.yaml
# Download split audio of video and save comment/desc used to timestamp.
ytcompdl -u "https://www.youtube.com/watch?v=gIsHl7swEgk" \
-k .env \
-o "audio" \
-x config/config_regex.yaml \
-t -s
Options
usage: main.py [-h] -u URL -o OUTPUT_TYPE -x REGEX_CFG [-d DIRECTORY] [-n N_CORES] [-r RESOLUTION] [-m METADATA] [-c] [-t] [-s] [-f FADE] [-ft FADE_TIME]
Command-line program to download and segment Youtube videos.
options:
-h, --help show this help message and exit
-u URL, --url URL Youtube URL
-o OUTPUT_TYPE, --output_type OUTPUT_TYPE
Desired output (audio/video)
-x REGEX_CFG, --regex_cfg REGEX_CFG
Path to regex config file (.yaml)
-d DIRECTORY, --directory DIRECTORY
Output directory.
-n N_CORES, --n_cores N_CORES
Use n cores to process tracks in parallel.
-r RESOLUTION, --resolution RESOLUTION
Desired resolution (video only)
-m METADATA, --metadata METADATA
Path to optional metadata (.json)
-c, --comment Select comment.
-t, --timestamps Save timestamps as .txt file.
-s, --slice Slice output.
-f FADE, --fade FADE Fade (in/out/both/none)
-ft FADE_TIME, --fade_time FADE_TIME
Fade time in seconds.
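To make the fade options concrete, here is a minimal sketch of how the -f and -ft values could map onto ffmpeg's afade audio filter. The function name build_afade_filter and the duration argument are assumptions for illustration; this README does not show how YTCompDL applies fades internally.

```python
def build_afade_filter(fade, fade_time, duration):
    """Return an ffmpeg audio filter string for the requested fade, or None."""
    if fade not in {"in", "out", "both", "none"}:
        raise ValueError(f"Unsupported fade: {fade!r}")
    if fade == "none" or fade_time <= 0:
        return None
    parts = []
    if fade in {"in", "both"}:
        # Fade in from the very start of the track.
        parts.append(f"afade=t=in:st=0:d={fade_time}")
    if fade in {"out", "both"}:
        # Fade out so that it finishes exactly at the end of the track.
        start = max(duration - fade_time, 0)
        parts.append(f"afade=t=out:st={start}:d={fade_time}")
    return ",".join(parts)
```

The resulting string could be passed to ffmpeg through its -af option. Video fades would need a separate video filter and are not covered here.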
Regular Expressions
To set your own regular expressions to search for in video comments/descriptions, modify config/config_regex.yaml.
config/config_regex.yaml
ignored_spacers: # Optional
- "―"
- "―"
- "-"
- "\\s"
- "["
- "]"
time: "\\d{1,2}:?\\d*:\\d{2}" # Optional
# Required
start_timestamp: "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"
duration_timestamp: "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"
For some examples, check these patterns:
- Start Timestamps
- Duration Timestamps
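The placeholders in the config can be read as templates that are filled in before matching. The sketch below shows one way the start_timestamp pattern might be expanded and applied to a line of text. It rests on two assumptions that this README does not confirm: that the spacers are joined as regex alternatives, and that entries beginning with a backslash (such as \s) are regex tokens while all other entries are literals to be escaped. The function name build_pattern is hypothetical.

```python
import re

# Values mirrored from config/config_regex.yaml after YAML unescaping.
IGNORED_SPACERS = ["―", "―", "-", r"\s", "[", "]"]
TIME = r"\d{1,2}:?\d*:\d{2}"
START_TIMESTAMP = "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"


def build_pattern(template, spacers, time):
    """Fill the {ignored_spacers} and {time} placeholders and compile."""
    # Assumption: entries starting with a backslash are regex tokens;
    # everything else is treated as a literal and escaped.
    alternatives = [s if s.startswith("\\") else re.escape(s) for s in spacers]
    return re.compile(
        template.format(ignored_spacers="|".join(alternatives), time=time)
    )


pattern = build_pattern(START_TIMESTAMP, IGNORED_SPACERS, TIME)

for line in ["Intro - 0:00", "[1:02:33] Finale"]:
    match = pattern.match(line)
    if match:
        # Groups: text before the timestamp, the timestamp, text after it.
        print(match.groups())
```

Under these assumptions the track title can land in either the first or the last group, depending on whether the timestamp follows or precedes it.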
Workflow
- Query YouTube's Data API for selected video.
- Search description and comments for timestamps ranked by similarity to video duration.
- Parse timestamps with regular expressions.
- Download video and/or audio streams from YouTube.
- Process streams.
  - Merge or convert streams.
  - Slice by found timestamps.
  - Apply file metadata.
  - Add audio and/or video fade.
- Cleanup.
  - Remove intermediate outputs.
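The slicing step in the workflow above can be sketched as follows: once start timestamps are parsed, each track runs from its own start to the start of the next track, and the last track runs to the end of the video. The function names to_seconds and build_segments are hypothetical and this is not YTCompDL's actual implementation.

```python
def to_seconds(timestamp):
    """Convert "M:SS" or "H:MM:SS" to a number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def build_segments(tracks, video_duration):
    """Turn (title, start) pairs into (title, start_s, end_s) segments."""
    starts = sorted((to_seconds(ts), title) for title, ts in tracks)
    segments = []
    for index, (start, title) in enumerate(starts):
        # A track ends where the next one begins; the last ends with the video.
        if index + 1 < len(starts):
            end = starts[index + 1][0]
        else:
            end = video_duration
        segments.append((title, start, end))
    return segments
```

Each segment's start and end could then be handed to ffmpeg, for instance through its -ss and -to options, to cut the corresponding file.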
Build from Source
virtualenv venv && source venv/bin/activate
python setup.py sdist bdist_wheel
# Install the wheel that was just built.
pip install dist/*.whl
ytcompdl -h
TO-DO:
- Testing
  - Add unit tests.
Download files
Source Distribution
ytcompdl-1.0.2.tar.gz (18.4 kB)
Built Distribution
ytcompdl-1.0.2-py3-none-any.whl (18.6 kB)
File details
Details for the file ytcompdl-1.0.2.tar.gz.
File metadata
- Download URL: ytcompdl-1.0.2.tar.gz
- Upload date:
- Size: 18.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.11
File hashes
Algorithm | Hash digest
---|---
SHA256 | c7a633b04856872d95e4ac75fee4558b4a1b8803bd2b293d47b210774625410a
MD5 | 205ca5536d36d4d7a385936d8498d2e4
BLAKE2b-256 | 78ec01dda724e132b6bf1b17c6042bb3967da11c90da61a1a228394e2265fc6f
File details
Details for the file ytcompdl-1.0.2-py3-none-any.whl.
File metadata
- Download URL: ytcompdl-1.0.2-py3-none-any.whl
- Upload date:
- Size: 18.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.11
File hashes
Algorithm | Hash digest
---|---
SHA256 | f381cf6148c59127dc95cf1da3e5b2ce99b04ae5eb8487c24ae994ff6a88e9df
MD5 | 10339345ce6de64d44552ff6f192fdd6
BLAKE2b-256 | 1ad27c523fdac03ff896f3514d23421cbdac623884efad0be76408e8162675a6