
Script to download and segment YouTube videos automatically.


YTCompDL


Command-line program to download and segment YouTube videos automatically.

Getting Started


Getting a YouTube Data API Key

Follow these instructions.

Store your API key in a .env file in the main working directory.

Setup

venv

# Make sure ffmpeg is installed.
sudo apt install ffmpeg
virtualenv venv
source venv/bin/activate
# Install the package from PyPI.
pip install ytcompdl
ytcompdl -h

Conda

# Setup env.
conda env create -f envs/env.yaml -n ytcompdl
conda activate ytcompdl
ytcompdl -h

Docker

ffmpeg comes installed with the docker image.

Arguments are passed after the image name.

# Image wd set to /ytcompdl
docker run --rm -v /$PWD:/ytcompdl koisland/ytcompdl:latest -h

To build the image locally:

docker build . -t ytcompdl:latest

Usage

# Download audio of video.
ytcompdl -u "https://www.youtube.com/watch?v=gIsHl7swEgk" -k .env -o "audio" -x config/config_regex.yaml

# Download split audio of video and save comment/desc used to timestamp.
ytcompdl -u "https://www.youtube.com/watch?v=gIsHl7swEgk" \
  -k .env \
  -o "audio" \
  -x config/config_regex.yaml \
  -t -s
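To process several videos in one go, the command above can be assembled from Python. This is only a sketch: `build_command` is a hypothetical helper, not part of ytcompdl, and it uses only the flags shown in the usage examples (`-u`, `-k`, `-o`, `-x`, `-t`, `-s`).

```python
import subprocess


def build_command(url, output_type="audio", regex_cfg="config/config_regex.yaml",
                  env_file=".env", save_timestamps=True, slice_output=True):
    """Assemble a ytcompdl call from the flags shown in the usage examples."""
    cmd = ["ytcompdl", "-u", url, "-k", env_file, "-o", output_type, "-x", regex_cfg]
    if save_timestamps:
        cmd.append("-t")  # Save the timestamps as a .txt file.
    if slice_output:
        cmd.append("-s")  # Slice the output by the timestamps found.
    return cmd


urls = ["https://www.youtube.com/watch?v=gIsHl7swEgk"]
commands = [build_command(url) for url in urls]

# To actually run them (requires ytcompdl on PATH and a valid API key):
# for cmd in commands:
#     subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with URLs that contain `&` or `?`.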

Options


usage: main.py [-h] -u URL -o OUTPUT_TYPE -x REGEX_CFG [-d DIRECTORY] [-n N_CORES] [-r RESOLUTION] [-m METADATA] [-c] [-t] [-s] [-f FADE] [-ft FADE_TIME]

Command-line program to download and segment Youtube videos.

options:
  -h, --help            show this help message and exit
  -u URL, --url URL     Youtube URL
  -o OUTPUT_TYPE, --output_type OUTPUT_TYPE
                        Desired output (audio/video)
  -x REGEX_CFG, --regex_cfg REGEX_CFG
                        Path to regex config file (.yaml)
  -d DIRECTORY, --directory DIRECTORY
                        Output directory.
  -n N_CORES, --n_cores N_CORES
                        Use n cores to process tracks in parallel.
  -r RESOLUTION, --resolution RESOLUTION
                        Desired resolution (video only)
  -m METADATA, --metadata METADATA
                        Path to optional metadata (.json)
  -c, --comment         Select comment.
  -t, --timestamps      Save timestamps as .txt file.
  -s, --slice           Slice output.
  -f FADE, --fade FADE  Fade (in/out/both/none)
  -ft FADE_TIME, --fade_time FADE_TIME
                        Fade time in seconds.

Regular Expressions

To set your own regular expressions to search for in video comments/descriptions, modify config/config_regex.yaml.

config/config_regex.yaml

ignored_spacers: # Optional
  - "―"
  - "―"
  - "-"
  - "\\s"
  - "["
  - "]"

time: "\\d{1,2}:?\\d*:\\d{2}" # Optional

# Required
start_timestamp: "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"
duration_timestamp: "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"

For some examples, check these patterns below:

  • Start Timestamps
  • Duration Timestamps
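To get a feel for how the `{ignored_spacers}` and `{time}` placeholders in `start_timestamp` might expand, here is a minimal sketch. How ytcompdl actually combines the spacers is an assumption here: `build_pattern` is a hypothetical helper that joins them as alternatives and escapes the literal characters, which may differ from the tool's own logic.

```python
import re

# Values as they would look after the YAML above is parsed.
IGNORED_SPACERS = ["―", "―", "-", r"\s", "[", "]"]
TIME = r"\d{1,2}:?\d*:\d{2}"
START_TIMESTAMP = "(.*?)(?:{ignored_spacers})*({time})(?:{ignored_spacers})*(.*)"


def build_pattern(template, spacers, time):
    """Fill the placeholders in a timestamp template and compile the result."""
    # Assumption: spacers are alternatives. Entries that start with a backslash
    # are kept as regex fragments; everything else is escaped as a literal.
    alternatives = "|".join(
        s if s.startswith("\\") else re.escape(s) for s in spacers
    )
    filled = template.replace("{ignored_spacers}", alternatives)
    filled = filled.replace("{time}", time)
    return re.compile(filled)


pattern = build_pattern(START_TIMESTAMP, IGNORED_SPACERS, TIME)

# Title before the timestamp.
print(pattern.match("Opening Theme - 1:23:45").groups())
# Title after the timestamp.
print(pattern.match("0:00 Intro").groups())
```

The first and last capture groups hold whatever text surrounds the timestamp, so the same pattern covers titles written before or after the time.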

Workflow


  • Query YouTube's Data API for the selected video.
  • Search the description and comments for timestamps, ranked by similarity to the video duration.
  • Parse timestamps with regular expressions.
  • Download video and/or audio streams from YouTube.
  • Process streams.
    • Merge or convert streams.
    • Slice by found timestamps.
    • Apply file metadata.
    • Add audio and/or video fade.
  • Cleanup
    • Remove intermediate outputs.
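The "slice by found timestamps" step can be pictured as turning a list of start times into (start, end) intervals. The sketch below illustrates that idea only; `to_seconds` and `to_segments` are hypothetical names, and ytcompdl's own implementation may differ.

```python
def to_seconds(timestamp):
    """Convert "SS", "MM:SS" or "HH:MM:SS" to a number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def to_segments(starts, video_duration):
    """Pair each start time with the next one; the last segment ends with the video."""
    bounds = sorted(to_seconds(t) for t in starts) + [video_duration]
    return list(zip(bounds[:-1], bounds[1:]))


# Three start timestamps in a video that is 4000 seconds long.
segments = to_segments(["0:00", "3:15", "1:02:30"], video_duration=4000)
print(segments)
```

Each resulting pair could then be handed to ffmpeg as the start and end of one track.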

Build from Source

virtualenv venv && source venv/bin/activate
python setup.py sdist bdist_wheel
# Install the wheel that was just built into dist/.
pip install dist/ytcompdl-*.whl
ytcompdl -h

TO-DO:


  • Testing
    • Add unit tests.
