Script to download and segment YouTube videos automatically.
YTCompDL
Command-line program to download and segment YouTube videos automatically.
Getting Started
Getting a YouTube Data API Key
Follow these instructions.
Store your API key in a .env file in the main working directory.
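As a rough illustration of what a .env file holds, the sketch below parses simple `KEY=value` lines with the standard library. The variable name `YOUTUBE_API_KEY` and the helper `read_env` are hypothetical and used only for illustration; the name ytcompdl actually expects is not stated here, so check the project source before relying on it.

```python
def read_env(text):
    """Parse KEY=value lines from .env-style text into a dict."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical file content; the real variable name may differ.
example = '# YouTube Data API key\nYOUTUBE_API_KEY="abc123"\n'
settings = read_env(example)
```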
Setup
venv
# Make sure ffmpeg is installed.
sudo apt install ffmpeg
virtualenv venv
source venv/bin/activate
# Install from PyPI.
pip install ytcompdl
ytcompdl -h
Conda
# Setup env.
conda env create -f envs/env.yaml -n ytcompdl
conda activate ytcompdl
ytcompdl -h
Docker
ffmpeg comes installed with the docker image.
Arguments are passed after the image name.
# Image wd set to /ytcompdl
docker run --rm -v /$PWD:/ytcompdl koisland/ytcompdl:latest -h
To build the image locally:
docker build . -t ytcompdl:latest
Usage
# Download audio of video.
ytcompdl -u "https://www.youtube.com/watch?v=gIsHl7swEgk" -k .env -o "audio" -x config/config_regex.yaml
# Download split audio of video and save comment/desc used to timestamp.
ytcompdl -u "https://www.youtube.com/watch?v=gIsHl7swEgk" \
-k .env \
-o "audio" \
-x config/config_regex.yaml \
-t -s
Options
usage: main.py [-h] -u URL -o OUTPUT_TYPE -x REGEX_CFG [-d DIRECTORY] [-n N_CORES] [-r RESOLUTION] [-m METADATA] [-c] [-t] [-s] [-f FADE] [-ft FADE_TIME]
Command-line program to download and segment Youtube videos.
options:
-h, --help show this help message and exit
-u URL, --url URL Youtube URL
-o OUTPUT_TYPE, --output_type OUTPUT_TYPE
Desired output (audio/video)
-x REGEX_CFG, --regex_cfg REGEX_CFG
Path to regex config file (.yaml)
-d DIRECTORY, --directory DIRECTORY
Output directory.
-n N_CORES, --n_cores N_CORES
Use n cores to process tracks in parallel.
-r RESOLUTION, --resolution RESOLUTION
Desired resolution (video only)
-m METADATA, --metadata METADATA
Path to optional metadata (.json)
-c, --comment Select comment.
-t, --timestamps Save timestamps as .txt file.
-s, --slice Slice output.
-f FADE, --fade FADE Fade (in/out/both/none)
-ft FADE_TIME, --fade_time FADE_TIME
Fade time in seconds.
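For readers who want to see how the flags fit together, the sketch below reconstructs a parser from the help text above using `argparse`. It is not the project's actual code: the `type` conversions, the `choices` lists, and the absence of defaults are assumptions inferred from the option descriptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented options (a reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="ytcompdl",
        description="Command-line program to download and segment Youtube videos.",
    )
    # Required arguments, per the usage line.
    parser.add_argument("-u", "--url", required=True, help="Youtube URL")
    parser.add_argument("-o", "--output_type", required=True,
                        choices=["audio", "video"], help="Desired output (audio/video)")
    parser.add_argument("-x", "--regex_cfg", required=True,
                        help="Path to regex config file (.yaml)")
    # Optional arguments; the types below are assumptions.
    parser.add_argument("-d", "--directory", help="Output directory.")
    parser.add_argument("-n", "--n_cores", type=int,
                        help="Use n cores to process tracks in parallel.")
    parser.add_argument("-r", "--resolution", help="Desired resolution (video only)")
    parser.add_argument("-m", "--metadata", help="Path to optional metadata (.json)")
    parser.add_argument("-c", "--comment", action="store_true", help="Select comment.")
    parser.add_argument("-t", "--timestamps", action="store_true",
                        help="Save timestamps as .txt file.")
    parser.add_argument("-s", "--slice", action="store_true", help="Slice output.")
    parser.add_argument("-f", "--fade", choices=["in", "out", "both", "none"],
                        help="Fade (in/out/both/none)")
    parser.add_argument("-ft", "--fade_time", type=float, help="Fade time in seconds.")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
example_args = build_parser().parse_args([
    "-u", "https://www.youtube.com/watch?v=gIsHl7swEgk",
    "-o", "audio",
    "-x", "config/config_regex.yaml",
    "-t", "-s",
])
```

The boolean flags (`-c`, `-t`, `-s`) take no value, which is why the second usage example can pass `-t -s` without arguments.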
Regular Expressions
To set your own regular expressions to search for in video comments/descriptions, modify config/config_regex.yaml.
config/config_regex.yaml
ignored_spacers: # Optional
- "―"
- "―"
- "-"
- "\\s"
- "["
- "]"
time: "\\d{1,2}:?\\d*:\\d{2}" # Optional
# Required
start_timestamp: "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"
duration_timestamp: "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"
For some examples, check the patterns below:
- Start Timestamps
- Duration Timestamps
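To make the templates in config/config_regex.yaml more concrete, here is a minimal sketch of how the `{ignored_spacers}` and `{time}` placeholders could be expanded into working patterns. How ytcompdl performs this substitution internally is an assumption: the sketch joins the spacers with `|`, escapes literal entries such as `[`, and leaves entries that start with a backslash (such as `\s`) untouched. The helper names are hypothetical.

```python
import re

# Values as they would look after YAML parsing ("\\s" in the file becomes \s).
ignored_spacers = ["―", "―", "-", r"\s", "[", "]"]
time = r"\d{1,2}:?\d*:\d{2}"
start_timestamp = "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"
duration_timestamp = (
    "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*({time})"
    "(?:{ignored_spacers})*(.*)"
)


def spacer_to_regex(spacer):
    # Assumption: entries starting with a backslash are already regex syntax;
    # everything else is a literal character that must be escaped.
    return spacer if spacer.startswith("\\") else re.escape(spacer)


def compile_template(template):
    """Fill in the placeholders and compile the resulting pattern."""
    spacers = "|".join(spacer_to_regex(s) for s in ignored_spacers)
    return re.compile(template.format(ignored_spacers=spacers, time=time))


start_pattern = compile_template(start_timestamp)
duration_pattern = compile_template(duration_timestamp)

# Text may appear before or after the timestamp, so two groups capture it.
title_first = start_pattern.match("Intro - 0:00").groups()
time_first = start_pattern.match("1:02:03 Track two").groups()
bracketed = start_pattern.match("[12:34] Outro").groups()
with_end = duration_pattern.match("Intro 0:00 - 1:30").groups()
```

In this sketch the start pattern yields three groups (text before, timestamp, text after) and the duration pattern yields four (text before, start, end, text after).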
Workflow
- Query YouTube's Data API for the selected video.
- Search the description and comments for timestamps, ranked by similarity to the video duration.
- Parse timestamps with regular expressions.
- Download video and/or audio streams from YouTube.
- Process streams.
  - Merge or convert streams.
  - Slice by found timestamps.
  - Apply file metadata.
  - Add audio and/or video fade.
- Clean up.
  - Remove intermediate outputs.
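The slicing step in the workflow above can be pictured with a short sketch: convert each start timestamp to seconds, then end every track where the next one begins, with the last track ending at the video duration. The function names and the tuple layout are hypothetical and are not taken from the ytcompdl source.

```python
def to_seconds(timestamp):
    """Convert "SS", "MM:SS", or "HH:MM:SS" to a number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def build_segments(tracks, video_duration):
    """Turn ordered (title, start timestamp) pairs into (title, start, end) in seconds."""
    starts = [to_seconds(timestamp) for _, timestamp in tracks]
    # Each track ends where the next begins; the last one ends with the video.
    ends = starts[1:] + [video_duration]
    return [
        (title, start, end)
        for (title, _), start, end in zip(tracks, starts, ends)
    ]


segments = build_segments([("Intro", "0:00"), ("Track two", "2:30")], 300)
```

Each resulting (start, end) pair could then be handed to a tool such as ffmpeg to cut the corresponding track.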
Build from Source
virtualenv venv && source venv/bin/activate
python setup.py sdist bdist_wheel
# Install the built wheel.
pip install dist/*.whl
ytcompdl -h
TO-DO:
- Testing
  - Add unit tests.
File details
Details for the file ytcompdl-1.0.2.tar.gz.
File metadata
- Download URL: ytcompdl-1.0.2.tar.gz
- Upload date:
- Size: 18.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c7a633b04856872d95e4ac75fee4558b4a1b8803bd2b293d47b210774625410a |
| MD5 | 205ca5536d36d4d7a385936d8498d2e4 |
| BLAKE2b-256 | 78ec01dda724e132b6bf1b17c6042bb3967da11c90da61a1a228394e2265fc6f |
File details
Details for the file ytcompdl-1.0.2-py3-none-any.whl.
File metadata
- Download URL: ytcompdl-1.0.2-py3-none-any.whl
- Upload date:
- Size: 18.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.11
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f381cf6148c59127dc95cf1da3e5b2ce99b04ae5eb8487c24ae994ff6a88e9df |
| MD5 | 10339345ce6de64d44552ff6f192fdd6 |
| BLAKE2b-256 | 1ad27c523fdac03ff896f3514d23421cbdac623884efad0be76408e8162675a6 |