Script to download and segment YouTube videos automatically.
YTCompDL, a YouTube Video Segmenter
Command-line program to download and segment YouTube videos automatically.
Getting Started
Getting a YouTube Data API Key
Follow these instructions.
Store your API key in a .env file in the main working directory.
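As a rough illustration of what storing the key in a .env file implies, the sketch below parses simple KEY=VALUE lines. The function name `read_env_file` and the variable name `YOUTUBE_API_KEY` are hypothetical; check the project's source for the name it actually expects.

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values
```

With a .env file containing a line such as `YOUTUBE_API_KEY=<your key>` (hypothetical name), `read_env_file()["YOUTUBE_API_KEY"]` would return the key as a string.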
Conda
conda env create -f environment.yaml
conda activate YTCompDL
# Download audio of video.
python main.py -u "https://www.youtube.com/watch?v=gIsHl7swEgk" -o "audio" -x config/config_regex.yaml
# Download split audio of video and save comment/desc used to timestamp.
python main.py \
-u "https://www.youtube.com/watch?v=gIsHl7swEgk" \
-o "audio" \
-x config/config_regex.yaml \
-t -s
Options
usage: main.py [-h] -u URL -o OUTPUT_TYPE -x REGEX_CFG [-d DIRECTORY] [-n N_CORES] [-r RESOLUTION] [-m METADATA] [-c] [-t] [-s] [-f FADE] [-ft FADE_TIME]
Command-line program to download and segment Youtube videos.
options:
-h, --help show this help message and exit
-u URL, --url URL Youtube URL
-o OUTPUT_TYPE, --output_type OUTPUT_TYPE
Desired output (audio/video)
-x REGEX_CFG, --regex_cfg REGEX_CFG
Path to regex config file (.yaml)
-d DIRECTORY, --directory DIRECTORY
Output directory.
-n N_CORES, --n_cores N_CORES
Use n cores to process tracks in parallel.
-r RESOLUTION, --resolution RESOLUTION
Desired resolution (video only)
-m METADATA, --metadata METADATA
Path to optional metadata (.json)
-c, --comment Select comment.
-t, --timestamps Save timestamps as .txt file.
-s, --slice Slice output.
-f FADE, --fade FADE Fade (in/out/both/none)
-ft FADE_TIME, --fade_time FADE_TIME
Fade time in seconds.
Regular Expressions
To set your own regular expressions to search for in video comments/descriptions, modify config/config_regex.yaml.
config/config_regex.yaml
ignored_spacers: # Optional
- "―"
- "―"
- "-"
- "\\s"
- "["
- "]"
time: "\\d{1,2}:?\\d*:\\d{2}" # Optional
# Required
start_timestamp: "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"
duration_timestamp: "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"
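To make the placeholders concrete, here is a minimal sketch of how the `{ignored_spacers}` and `{time}` fields could be expanded into working patterns. It assumes the spacers are joined with `|` and substituted with `str.format`; this is an assumption about the mechanism, not the project's confirmed implementation. `build_pattern` is a hypothetical helper, and the brackets are escaped here so they match literally.

```python
import re

# Values mirroring config/config_regex.yaml after YAML unescaping.
# Assumption: brackets are escaped so they are treated as literal characters.
IGNORED_SPACERS = ["―", "-", r"\s", r"\[", r"\]"]
TIME = r"\d{1,2}:?\d*:\d{2}"
START_TIMESTAMP = "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"
DURATION_TIMESTAMP = (
    "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*({time})"
    "(?:{ignored_spacers})*(.*)"
)


def build_pattern(template):
    """Fill the template placeholders and compile the result (hypothetical helper)."""
    return re.compile(
        template.format(ignored_spacers="|".join(IGNORED_SPACERS), time=TIME)
    )


start_re = build_pattern(START_TIMESTAMP)
duration_re = build_pattern(DURATION_TIMESTAMP)

# Groups are (text before, timestamp, text after).
print(start_re.match("0:00 - Intro").groups())
# Groups are (text before, start, end, text after).
print(duration_re.match("0:00 - 3:15 Intro").groups())
```

Under these assumptions a title can appear on either side of the timestamp, which is why both the leading and trailing groups are captured.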
For some examples, check the patterns below:
- Start Timestamps
- Duration Timestamps
Workflow
- Query YouTube's Data API for selected video.
- Search description and comments for timestamps ranked by similarity to video duration.
- Parse timestamps with regular expressions.
- Download video and/or audio streams from YouTube.
- Process streams.
  - Merge or convert streams.
  - Slice by found timestamps.
  - Apply file metadata.
  - Add audio and/or video fade.
- Cleanup
  - Remove intermediate outputs.
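The slicing step in the workflow above can be pictured with a small sketch: start timestamps are converted to seconds, and each track ends where the next one begins (or at the end of the video). The names `to_seconds` and `build_segments` are hypothetical and are not taken from the project's code.

```python
def to_seconds(timestamp):
    """Convert 'SS', 'MM:SS' or 'H:MM:SS' into a number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def build_segments(tracks, video_duration):
    """Turn (title, start timestamp) pairs into (title, start, end) in seconds.

    Each segment ends where the next one starts; the last one ends at
    video_duration (also in seconds).
    """
    ordered = sorted((to_seconds(ts), title) for title, ts in tracks)
    segments = []
    for index, (start, title) in enumerate(ordered):
        is_last = index + 1 == len(ordered)
        end = video_duration if is_last else ordered[index + 1][0]
        segments.append((title, start, end))
    return segments
```

For example, `build_segments([("Intro", "0:00"), ("Song A", "3:15")], 400)` yields one segment from 0 to 195 seconds and a second from 195 to 400 seconds.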
Build from Source
virtualenv venv && source venv/Scripts/activate # source venv/bin/activate
python setup.py sdist bdist_wheel
pip install dist/*.whl # install the built wheel so the ytcompdl command is available
ytcompdl -h
TO-DO:
- Testing
- Add more unit tests.
- PyPI package.