ytslice

Batch-extract time-bounded audio/video segments from long YouTube videos via a JSON config — cached, fail-soft, no manual yt-dlp + ffmpeg.


Prerequisites

| Dependency | Why | Install |
|---|---|---|
| Python ≥ 3.10 | runtime | apt install -y python3 python3-pip |
| ffmpeg (+ ffprobe) | cuts segments, parses durations | apt install -y ffmpeg |
| Deno on PATH | yt-dlp's n-parameter challenge solver (see below) | curl -fsSL https://deno.land/install.sh \| sh, then export PATH="$HOME/.deno/bin:$PATH" |

Why Deno

YouTube's player obfuscates stream URLs behind a JavaScript challenge (the n parameter). yt-dlp resolves it via the ejs:github remote-components plugin, which requires a JS runtime; deno is the default. Without deno on PATH, downloads of newer YouTube videos fail mid-stream with HTTP 403 Forbidden, and the in-process fallback emits a warning. Install deno with the one-liner above and confirm that deno --version prints a version before you proceed.
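The pre-flight check described above can be scripted. The following is a minimal sketch, not part of ytslice: the helper names find_js_runtime and runtime_version are hypothetical, and it only assumes that the runtime answers to --version, as deno does.

```python
import shutil
import subprocess
from typing import Optional


def find_js_runtime(name: str = "deno") -> Optional[str]:
    """Return the absolute path of the runtime if it is on PATH, else None."""
    return shutil.which(name)


def runtime_version(path: str) -> str:
    """Run `<runtime> --version` and return the first line it prints."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


# Hypothetical usage: report the situation before starting a long batch run.
deno_path = find_js_runtime()
if deno_path is None:
    print("deno not found on PATH; newer videos may fail with HTTP 403")
else:
    print(runtime_version(deno_path))
```

Checking up front is cheaper than discovering the missing runtime partway through a download.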


Install

pip install .
# or, for development:
pip install -e ".[dev]"

The ytslice command is installed as a console script declared under [project.scripts] in pyproject.toml.


Quick start

config.json:

{
  "output_dir": "./output",
  "videos": [
    {
      "url": "https://www.youtube.com/watch?v=REPLACE_ME",
      "quality": "720",
      "segments": [
        { "name": "intro",     "start": "0:00:00", "end": "0:00:30", "mode": "audio" },
        { "name": "highlight", "start": "0:01:00", "end": "0:02:15", "mode": "both"  }
      ]
    }
  ]
}

Run:

ytslice --config config.json

Expected outputs (one file per segment, except that mode both produces an MP3/MP4 pair):

output/
└── <video_id>/
    ├── intro-<video_id>.mp3
    ├── highlight-<video_id>.mp3
    └── highlight-<video_id>.mp4
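To make the mapping from config to files concrete, here is a small sketch that lists the paths a collision-free run would be expected to write. It is an illustration only: expected_outputs and MODE_EXTENSIONS are hypothetical names, the video ID is passed in by hand (ytslice resolves it through yt-dlp), and name sanitization and collision suffixes are ignored.

```python
from pathlib import Path

# Extensions per segment mode: audio gives MP3, video gives MP4, both gives a pair.
MODE_EXTENSIONS = {"audio": [".mp3"], "video": [".mp4"], "both": [".mp3", ".mp4"]}


def expected_outputs(output_dir: str, video_id: str, segments: list) -> list:
    """List <output_dir>/<video_id>/<name>-<video_id><ext> for every segment."""
    base = Path(output_dir) / video_id
    paths = []
    for seg in segments:
        for ext in MODE_EXTENSIONS[seg["mode"]]:
            paths.append(base / f"{seg['name']}-{video_id}{ext}")
    return paths


segments = [
    {"name": "intro", "start": "0:00:00", "end": "0:00:30", "mode": "audio"},
    {"name": "highlight", "start": "0:01:00", "end": "0:02:15", "mode": "both"},
]
for path in expected_outputs("./output", "VIDEO_ID", segments):
    print(path)
```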

Config schema

Top level:

| Field | Type | Required | Notes |
|---|---|---|---|
| output_dir | string | no | default ./output; outputs go to <output_dir>/<video_id>/ |
| videos | array | yes | non-empty list of video objects |

Per video:

| Field | Type | Required | Notes |
|---|---|---|---|
| url | string | yes | any URL yt-dlp accepts |
| quality | string | no | one of best, 1080, 720, 480, 360; default 1080 |
| segments | array | yes | non-empty list of segment objects |

Per segment:

| Field | Type | Required | Notes |
|---|---|---|---|
| name | string | yes | becomes the on-disk basename after sanitization |
| start | string | yes | H:MM:SS or MM:SS, minutes/seconds 0–59 |
| end | string | yes | same format; must be > start |
| mode | string | yes | one of audio (MP3), video (MP4 with audio), both (MP3 + MP4) |

The quality enum is exactly what ytslice/config.py validates against. See config.example.json for a runnable example.
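The start/end rules in the segment table can be checked with a few lines of Python. This is a sketch under assumptions, not the code in ytslice/config.py: parse_timestamp and segment_bounds are hypothetical names, and the regex assumes two-digit minutes and seconds with an optional leading hours field.

```python
import re

# H:MM:SS or MM:SS; minutes and seconds are restricted to 00-59.
_TIMESTAMP = re.compile(r"(?:(\d+):)?([0-5]\d):([0-5]\d)")


def parse_timestamp(text: str) -> int:
    """Convert 'H:MM:SS' or 'MM:SS' to a number of seconds."""
    match = _TIMESTAMP.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid timestamp: {text!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds)


def segment_bounds(start: str, end: str) -> tuple:
    """Return (start_s, end_s), rejecting segments whose end is not after start."""
    start_s, end_s = parse_timestamp(start), parse_timestamp(end)
    if end_s <= start_s:
        raise ValueError(f"end {end!r} must be after start {start!r}")
    return start_s, end_s


print(segment_bounds("0:01:00", "0:02:15"))  # (60, 135)
```

Working in whole seconds keeps the end > start comparison trivial and gives the cutter a duration by subtraction.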


CLI flags

| Flag | Description |
|---|---|
| --config <path> | path to JSON config (required) |
| --output-dir <path> | override output_dir from config |
| --keep-cache | keep source in cache after a successful run (disables auto-evict) |
| --no-cache | bypass cache: do not look up, do not store; downloads to a tempdir and removes it after the run |
| --clear-cache | empty the cache root before the run |
| --dry-run | validate config and exit without downloading or cutting |
| -h, --help | show help |

--no-cache and --keep-cache are mutually exclusive.

--verbose was removed in 1.1.x — use PYTHONLOGLEVEL=DEBUG if you need more detail.
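For readers wiring a similar interface, the flags above map naturally onto argparse. The sketch below is an assumption about how such a parser could look, not ytslice's actual parser; build_parser is a hypothetical name, and only the flag names and the exclusivity rule come from the table.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the documented flags; -h/--help is added by argparse."""
    parser = argparse.ArgumentParser(prog="ytslice")
    parser.add_argument("--config", required=True, metavar="<path>")
    parser.add_argument("--output-dir", metavar="<path>")

    # --keep-cache and --no-cache may not be combined.
    cache = parser.add_mutually_exclusive_group()
    cache.add_argument("--keep-cache", action="store_true")
    cache.add_argument("--no-cache", action="store_true")

    parser.add_argument("--clear-cache", action="store_true")
    parser.add_argument("--dry-run", action="store_true")
    return parser


args = build_parser().parse_args(["--config", "config.json", "--dry-run"])
print(args.config, args.dry_run)
```

With a mutually exclusive group, argparse itself rejects the invalid combination and exits with a usage error, so no hand-written check is needed.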

Environment variables

| Variable | Effect |
|---|---|
| YTSLICE_NO_PROGRESS=1 | force-disable the rich progress UI (non-TTY opt-out) |
| YTSLICE_CACHE_DIR=<path> | override cache root (default ~/.cache/ytslice/) |
| YTSLICE_E2E=1 | enable network-using tests under pytest |

The progress UI also auto-disables when stderr is not a TTY.
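The two runtime variables can be read as simple decision rules. The following sketch illustrates them; progress_enabled and cache_root are hypothetical helper names, and the exact value matching (only the string "1" disables progress) is an assumption based on the table.

```python
import os
import sys
from pathlib import Path


def progress_enabled(environ=os.environ, stream=sys.stderr) -> bool:
    """Progress UI is off when YTSLICE_NO_PROGRESS=1 or the stream is not a TTY."""
    if environ.get("YTSLICE_NO_PROGRESS") == "1":
        return False
    return stream.isatty()


def cache_root(environ=os.environ) -> Path:
    """Return YTSLICE_CACHE_DIR if set, else the default ~/.cache/ytslice."""
    override = environ.get("YTSLICE_CACHE_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".cache" / "ytslice"


print(progress_enabled(), cache_root())
```

Passing the environment and stream as parameters keeps both helpers easy to test without touching the real process state.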


Cache lifecycle

| Mode | Behavior |
|---|---|
| default | per-video auto-evict on success: the source file is removed from ~/.cache/ytslice/ once all that video's segments succeed |
| --keep-cache | source files persist (useful when iterating on segment timestamps) |
| --no-cache | source is downloaded to a tempdir, used once, and removed after the run |
| --clear-cache | the entire cache root is purged at the start of the run (sentinel-guarded) |

A failed segment keeps the source so a rerun does not re-download. Cache root is ~/.cache/ytslice/ (override via YTSLICE_CACHE_DIR). The directory is marked with a .ytslice-cache sentinel file, and --clear-cache refuses to purge any directory that lacks it.
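The eviction rule and the sentinel guard can be sketched as follows. This is an illustration of the behavior described above, not the project's implementation: init_cache, clear_cache, and should_evict are hypothetical names, and keeping the sentinel in place after a purge is an assumption.

```python
import shutil
from pathlib import Path

SENTINEL = ".ytslice-cache"


def init_cache(root: Path) -> None:
    """Create the cache root and mark it with the sentinel file."""
    root.mkdir(parents=True, exist_ok=True)
    (root / SENTINEL).touch()


def clear_cache(root: Path) -> None:
    """Purge everything under root except the sentinel; refuse unmarked directories."""
    if not (root / SENTINEL).is_file():
        raise RuntimeError(f"refusing to purge {root}: no {SENTINEL} sentinel")
    for entry in root.iterdir():
        if entry.name == SENTINEL:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()


def should_evict(segment_ok: list, keep_cache: bool) -> bool:
    """Evict a video's source only when every one of its segments succeeded."""
    return not keep_cache and bool(segment_ok) and all(segment_ok)


print(should_evict([True, True], keep_cache=False))
```

The sentinel check protects against a mistyped YTSLICE_CACHE_DIR pointing at a directory that holds unrelated files.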


Output filename rules

The full V1 rule, in order:

<sanitize_filename(seg.name)>-<video_id><collision_suffix?><extension>
  • sanitize_filename strips path separators, control chars, leading/trailing dots+spaces; collapses whitespace; truncates at 200 chars; falls back to unnamed on empty input.
  • <video_id> is the YouTube ID resolved by yt-dlp (e.g. dQw4w9WgXcQ).
  • <collision_suffix?> is empty when the target path is free; otherwise _1, _2, … (1-indexed, no zero-pad), allocated to the lowest free index. For both-mode segments, the same _N is applied to both the .mp3 and the .mp4 so the pair never splits across different suffixes.
  • <extension> is .mp3 / .mp4 per mode.
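The sanitization bullet can be approximated in a few lines. The sketch below is one possible reading of those rules, not the function shipped in ytslice: the order of the steps, the exact character classes, and the re-strip after truncation are assumptions.

```python
import re

MAX_LEN = 200


def sanitize_filename(name: str) -> str:
    """Apply the listed rules in one plausible order and return a safe basename."""
    # Remove path separators and ASCII control characters.
    cleaned = re.sub(r"[/\\\x00-\x1f\x7f]", "", name)
    # Collapse runs of whitespace into a single space.
    cleaned = re.sub(r"\s+", " ", cleaned)
    # Strip leading and trailing dots and spaces.
    cleaned = cleaned.strip(". ")
    # Truncate, then strip again in case the cut left a trailing dot or space.
    cleaned = cleaned[:MAX_LEN].rstrip(". ")
    # Fall back to a fixed name when nothing is left.
    return cleaned or "unnamed"


print(sanitize_filename("  my   clip. "))  # my clip
```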

Rerun caveat: re-running the same config without clearing ./output/<video_id>/ will produce _1, _2, … duplicates because pre-existing files are treated as collisions (by design — no silent overwrites). Either rm -rf ./output/<video_id>/ between runs, or rely on the _N accumulation if you want a versioned history.
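The collision rule, including the pair-consistent suffix for both mode, can be sketched as a search for the lowest index at which every required extension is free. allocate_paths is a hypothetical name and the loop is an assumption about how such allocation could work, not ytslice's code.

```python
from pathlib import Path


def allocate_paths(directory: Path, stem: str, extensions: list) -> list:
    """Return paths sharing the lowest suffix ('' then _1, _2, ...) free for all extensions."""
    index = 0
    while True:
        suffix = "" if index == 0 else f"_{index}"
        candidates = [directory / f"{stem}{suffix}{ext}" for ext in extensions]
        # Accept the suffix only if none of the candidate files already exists.
        if not any(path.exists() for path in candidates):
            return candidates
        index += 1


# Hypothetical usage for a both-mode segment named "highlight".
for path in allocate_paths(Path("output/VIDEO_ID"), "highlight-VIDEO_ID", [".mp3", ".mp4"]):
    print(path)
```

Because existing files count as taken, running this against a populated output directory reproduces the _1, _2, … accumulation described in the rerun caveat.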


Clean-Debian walkthrough

End-to-end in debian:stable-slim (container runs as root — no sudo; on a non-container Debian with a non-root user, prefix the apt commands with sudo yourself):

docker run --rm -it debian:stable-slim bash

# system deps
apt update
apt install -y python3 python3-pip python3-venv ffmpeg curl git

# deno (for yt-dlp's n-challenge solver)
curl -fsSL https://deno.land/install.sh | sh
export PATH="$HOME/.deno/bin:$PATH"
deno --version

# ytslice
git clone https://gitlab.com/MisterJB/ytslice.git
cd ytslice
python3 -m venv .venv
. .venv/bin/activate
pip install .

ytslice --help

# run the example (replace REPLACE_ME with a real YouTube ID first)
cp config.example.json config.json
sed -i 's/REPLACE_ME/dQw4w9WgXcQ/' config.json
sed -i 's/ANOTHER/dQw4w9WgXcQ/' config.json
ytslice --config config.json

# verify outputs
ls -la output/dQw4w9WgXcQ/
# ffprobe accepts one input per invocation, so loop over the files
for f in output/dQw4w9WgXcQ/*.mp3; do ffprobe -v error -show_entries format=duration "$f"; done | head

# verify cache auto-evicted on success
ls -la ~/.cache/ytslice/
# Expected: directory exists, contains the .ytslice-cache sentinel,
# but no <video_id>.mp4 — auto-evict-on-success worked.

Testing

pytest                  # offline suite (default; mocks downloader + cutter)
YTSLICE_E2E=1 pytest    # additionally runs network-using tests

Done-criteria coverage lives in tests/test_done_criteria.py, mapping every PROJECT.md success metric to a runnable assertion.


License

GPL-3.0-or-later. See the LICENSE file for the full text.


Changelog (1.1.x)

  • Added --keep-cache, --no-cache, --clear-cache, --dry-run.
  • Added per-video quality field (best | 1080 | 720 | 480 | 360, default 1080).
  • Added rich progress UI (auto-disables in non-TTY; opt out via YTSLICE_NO_PROGRESS=1).
  • Filename collisions now resolved as _1, _2, … (pair-consistent for both mode).
  • LICENSE: GPL-3.0-or-later (PEP 639 SPDX).
  • Removed: --verbose (use PYTHONLOGLEVEL=DEBUG).
