Extract videos from web pages and download with yt-dlp

m3u8-extractor

Extract m3u8 stream URLs from web pages and download them with yt-dlp. Uses Selenium to render JavaScript-heavy pages, then hands the discovered stream URL to yt-dlp for reliable downloading.

Features

  • Automatic m3u8 extraction — loads pages with headless Chrome, finds m3u8 URLs in the rendered source
  • Smart extractor routing — tries yt-dlp's native extractors first, falls back to Selenium m3u8 only when needed
  • Three config sources — CLI flags, environment variables, and TOML config file (priority: CLI > env > TOML > defaults)
  • URL rules — pattern-matched per-site config in TOML (e.g. always use audio-only for music sites)
  • Batch downloads — read URLs from one or more files (or directories of files), with per-URL and group option overrides
  • Multiple output paths — save downloads to several directories at once (first is primary, extras receive copies)
  • Parallel downloads — download all URLs simultaneously by default, or limit concurrency
  • Clipboard watch mode — monitors clipboard for URLs and downloads automatically
  • Multiple m3u8 handling — warns when multiple streams are found, with options to select or filter
  • Adblock — optionally loads uBlock Origin Lite to bypass ad-heavy pages
  • Proxy support — separate proxies for browser and downloader
  • System yt-dlp — use the system binary or a custom yt-dlp path instead of the Python library
  • Pretty output — colored, symbol-coded progress with yt-dlp's built-in progress bar

Installation

Requirements: Python 3.11+, Chrome/Chromium, ChromeDriver

pipx install m3u8-extractor

Or install from source:

pipx install -e .

Make sure ChromeDriver is in your PATH and matches your Chrome version.

Quick start

# Download a single URL
m3u8-extractor "https://example.com/video-page"

# Download from a URL list (reads urls.txt by default when no URL is given)
m3u8-extractor

# Watch clipboard and auto-download
m3u8-extractor --watch

# Audio only, with adblock
m3u8-extractor --audio-only --adblock "https://example.com/video-page"

Usage

m3u8-extractor [url] [options]

If no URL is given and --watch is not set, URLs are read from a file (urls.txt by default).

General options

Flag Description
url URL to download directly (positional, optional)
-f, --urls-file Path to URL list file or directory (repeatable; directories load all .txt files)
-o, --output-path Output directory or filename template (repeatable; first is download target, extras receive copies)
--title-prefix String to prepend to every filename
--title-postfix String to append to every filename (before extension)
--referrer Referer header for requests
--use-base-url-as-referrer Auto-set referer from each page's base URL
--cookies Path to Netscape-format cookies file
--user-agent Custom User-Agent for yt-dlp and browser requests
-q, --quality yt-dlp format selector (e.g. bestvideo+bestaudio)
--transcode Transcode to format after download (e.g. mp4, mkv)
-c, --config Path to TOML config file or directory (repeatable, later files override; directories load all .toml files)
--scan-depth Max directory recursion depth for -f/-c directories (0 = top-level only, the default; 1 = one level of subdirectories; -1 = unlimited)

Download modes

Flag Description
--thumbnail Download thumbnail alongside video
--thumbnail-only Download only the thumbnail
--captions Download captions alongside video
--captions-only Download only captions
--audio-only Download only the audio stream
--video-only Download only the video stream (no audio)
--video-and-captions-only Download video and captions (no audio)
--overwrite Overwrite existing files (default)
--no-overwrite Skip download if output file exists

yt-dlp binary

Flag Description
--use-system-ytdlp Use the system yt-dlp binary instead of the Python library
--yt-dlp-path Path to a specific yt-dlp binary
--ytdlp-args Extra raw arguments forwarded to yt-dlp (e.g. '--limit-rate 1M')
--generic-impersonate Pass --extractor-args "generic:impersonate" for Cloudflare 403 challenges

Parallelism

Flag Description
-p, --parallel Number of parallel downloads: a number, all (default), cores, or logical_cores
--speed-unit Speed display in progress bar: bytes (default, e.g. MB/s) or bits (e.g. Mbps)

Stream selection

Flag Description
--stream-type Which stream types to look for: both (default), m3u8, or video
--m3u8-select Which stream when multiple found: first (default), last, all, or interactive
--m3u8-filter Regular expression to filter m3u8 URLs before selection
--video-filter Regular expression to filter direct video URLs (mp4, webm, etc.) before selection

Adblock

Flag Description
--adblock Load uBlock Origin Lite in Chrome (auto-downloaded on first use)
--adblock-strictness Filtering level: basic, optimal, or complete (default)
--adblock-extension Path to a custom .crx adblocker extension

Extractor selection

Flag Description
--extractor Strategy: auto (default, try yt-dlp native then m3u8), ytdlp (native only), m3u8 (Selenium only)
--extractors Comma-separated allowlist of yt-dlp extractor names (e.g. youtube,vimeo)
--use-selenium-session-for-download Reuse Selenium request headers/cookies for extracted stream URL downloads

Proxy

Flag Description
--proxy Proxy for yt-dlp downloads (e.g. socks5://127.0.0.1:1080)
--browser-proxy Proxy for Chrome (defaults to --proxy if not set)

SSL

Flag Description
--ignore-ssl-errors Ignore SSL certificate errors in browser and yt-dlp

localStorage

Flag Description
--localstorage KEY=VALUE Set a localStorage entry before page load (repeatable)

Example: --localstorage "jwplayer.qualityLabel=HQ" to force HQ quality on JWPlayer sites.

Headers & authentication

Flag Description
--header NAME=VALUE Custom HTTP header for browser & yt-dlp (repeatable)
--auth USER:PASS HTTP basic auth credentials

Cookies can be supplied either as a path to a Netscape-format cookie file or as direct cookie values in a TOML [cookies] table. Both forms are applied to Selenium and yt-dlp for auth-gated pages.

Watch mode

Flag Description
-w, --watch Watch clipboard for URLs and download automatically
--watch-interval Polling interval in seconds (default: 1.0)
--watch-use-current Download the current clipboard URL immediately when watch starts (default)
--no-watch-use-current Ignore the current clipboard contents when watch starts

Configuration

Settings are resolved with this priority:

  1. CLI arguments (highest)
  2. Per-URL flags (in URL list file)
  3. Group directives (in URL list file)
  4. URL rules (pattern-matched from TOML config)
  5. Environment variables
  6. TOML config file
  7. Built-in defaults (lowest)
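The priority order above can be pictured as a stack of dictionaries where the first layer that defines a key wins. This is a minimal sketch assuming each source has already been parsed into a plain dict containing only the keys it explicitly sets; the function name `resolve_settings` and its parameters are hypothetical and not part of the tool.

```python
from collections import ChainMap


def resolve_settings(cli=None, per_url=None, group=None, url_rules=None,
                     env=None, toml=None, defaults=None) -> dict:
    """Merge setting layers; earlier arguments take priority over later ones."""
    layers = [cli, per_url, group, url_rules, env, toml, defaults]
    # ChainMap looks a key up in each mapping in turn and returns the first hit.
    return dict(ChainMap(*[layer or {} for layer in layers]))


settings = resolve_settings(
    cli={"quality": "best"},
    group={"audio_only": True, "quality": "bestaudio"},
    env={"parallel": "4"},
    toml={"parallel": "2", "transcode": "mp4"},
    defaults={"parallel": "all", "audio_only": False},
)
print(settings["quality"], settings["parallel"])
```

Here the CLI value for quality beats the group directive, and the environment value for parallel beats both the TOML file and the built-in default.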

Config file

Place a config.toml in the current directory or ~/.config/m3u8-extractor/config.toml:

# config.toml

urls_file = "urls.txt"                # or a list: ["batch1.txt", "batch2.txt"]
output_path = "downloads/"            # or a list: ["downloads/", "/mnt/backup/"]
title_prefix = ""
title_postfix = ""
quality = "bestvideo+bestaudio"
transcode = "mp4"
parallel = "all"
speed_unit = "bytes"   # "bytes" (KB/s, MB/s) or "bits" (Kbps, Mbps)
scan_depth = 0         # directory recursion depth (0 = top-level, -1 = unlimited)
watch_use_current = true  # download current clipboard URL when --watch starts

referrer = ""
use_base_url_as_referrer = false
# cookies = "/path/to/cookies.txt"  # Netscape cookie file

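# Alternative: direct cookie values. If you uncomment this table, move it to
# the end of the file: in TOML, keys after a [table] header belong to that table.
#[cookies]
#sessionid = "abc123"
#cf_clearance = "your_token_here"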

use_system_ytdlp = false
# yt_dlp_path = "/usr/local/bin/yt-dlp"
# generic_impersonate = false   # adds --extractor-args "generic:impersonate"

extractor = "auto"    # "auto", "ytdlp", or "m3u8"
# extractors = "youtube,vimeo"  # restrict yt-dlp to these extractors
# use_selenium_session_for_download = false  # replay Selenium headers/cookies in yt-dlp

m3u8_select = "first"    # "first", "last", "all", or "interactive"
# m3u8_filter = "pattern"

adblock = false
# adblock_extension = "/path/to/extension.crx"

# proxy = "socks5://127.0.0.1:1080"
# browser_proxy = "http://127.0.0.1:8080"

ignore_ssl_errors = false

thumbnail = false
thumbnail_only = false
captions = false
captions_only = false
audio_only = false
video_only = false
video_and_captions_only = false

URL rules

Define per-site config using regular expression patterns in [[url_rules]] sections:

# Use yt-dlp native extractor for YouTube
[[url_rules]]
pattern = "youtube\\.com|youtu\\.be"
extractor = "ytdlp"

# Audio only for a music site
[[url_rules]]
pattern = "example\\.com/music"
audio_only = true
quality = "bestaudio"
output_path = "music/"

# Extra options for a sketchy site
[[url_rules]]
pattern = "sketchy-site\\.com"
adblock = true
generic_impersonate = true
ignore_ssl_errors = true
proxy = "socks5://127.0.0.1:1080"

Rules are checked in order — all matching rules are merged, with later rules overriding earlier ones. Any config option can be used in a rule.
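The merge behaviour described above can be sketched in a few lines. This example assumes patterns are matched anywhere in the URL (as with `re.search`), which the unanchored patterns above suggest but the source does not state; `apply_url_rules` is a hypothetical helper name.

```python
import re


def apply_url_rules(url: str, rules: list[dict]) -> dict:
    """Merge the options of every rule whose pattern matches the URL.

    Rules are visited in order, so a later matching rule overrides
    options set by an earlier one.
    """
    merged: dict = {}
    for rule in rules:
        if re.search(rule["pattern"], url):
            merged.update({k: v for k, v in rule.items() if k != "pattern"})
    return merged


rules = [
    {"pattern": r"example\.com", "quality": "bestvideo+bestaudio", "adblock": True},
    {"pattern": r"example\.com/music", "audio_only": True, "quality": "bestaudio"},
]
print(apply_url_rules("https://example.com/music/track1", rules))
```

For the music URL both rules match, so adblock comes from the first rule while quality is overridden by the second.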

Environment variables

Every option has a corresponding environment variable prefixed with M3U8_ (m3u8_select and m3u8_filter map to M3U8_SELECT and M3U8_FILTER rather than repeating the prefix):

M3U8_URLS_FILE=urls.txt
M3U8_OUTPUT_PATH=downloads/
M3U8_TITLE_PREFIX=""
M3U8_TITLE_POSTFIX=""
M3U8_REFERRER=""
M3U8_USE_BASE_URL_AS_REFERRER=false
M3U8_COOKIES=""
M3U8_QUALITY="bestvideo+bestaudio"
M3U8_TRANSCODE=mp4
M3U8_YT_DLP_PATH=""
M3U8_USE_SYSTEM_YTDLP=false
M3U8_GENERIC_IMPERSONATE=false
M3U8_EXTRACTOR=auto
M3U8_EXTRACTORS=""
M3U8_PARALLEL=all
M3U8_SPEED_UNIT=bytes
M3U8_SELECT=first
M3U8_FILTER=""
M3U8_ADBLOCK=false
M3U8_ADBLOCK_EXTENSION=""
M3U8_PROXY=""
M3U8_BROWSER_PROXY=""
M3U8_IGNORE_SSL_ERRORS=false
M3U8_THUMBNAIL=false
M3U8_THUMBNAIL_ONLY=false
M3U8_CAPTIONS=false
M3U8_CAPTIONS_ONLY=false
M3U8_AUDIO_ONLY=false
M3U8_VIDEO_ONLY=false
M3U8_VIDEO_AND_CAPTIONS_ONLY=false
M3U8_SCAN_DEPTH=0
M3U8_WATCH_USE_CURRENT=true

Boolean variables are treated as true when set to 1, true, yes, or on (case-insensitive).
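A boolean parser along these lines could look like the sketch below. It assumes that any other value counts as false and that an unset variable falls back to a default; neither detail is stated above, and `env_flag` is a hypothetical name.

```python
TRUE_VALUES = {"1", "true", "yes", "on"}


def env_flag(name: str, environ: dict, default: bool = False) -> bool:
    """Read a boolean setting from an environment-like mapping."""
    raw = environ.get(name)
    if raw is None:
        return default  # assumption: unset variables keep the default
    # Assumption: anything outside TRUE_VALUES is treated as false.
    return raw.strip().lower() in TRUE_VALUES


env = {"M3U8_ADBLOCK": "Yes", "M3U8_AUDIO_ONLY": "0"}
print(env_flag("M3U8_ADBLOCK", env), env_flag("M3U8_AUDIO_ONLY", env))
```

Passing the mapping explicitly keeps the function easy to test; in real use `os.environ` would be passed in.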

Config file resolution

Both config.toml and urls.txt are looked up in the following locations, in order:

  1. Current working directory
  2. ~/.config/m3u8-extractor/ (respects $XDG_CONFIG_HOME)

Use -c or -f to specify an explicit path.
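The lookup order can be sketched as follows. This is an illustration under assumptions: the helper name `find_file` is hypothetical, and falling back to ~/.config when $XDG_CONFIG_HOME is unset follows the usual XDG convention rather than anything confirmed by the source.

```python
import os
from pathlib import Path


def find_file(filename: str, cwd=None, environ=None):
    """Return the first existing candidate for filename, or None."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)
    xdg = environ.get("XDG_CONFIG_HOME")
    config_home = Path(xdg) if xdg else Path.home() / ".config"
    candidates = (
        cwd / filename,                            # 1. current working directory
        config_home / "m3u8-extractor" / filename,  # 2. per-user config directory
    )
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


print(find_file("config.toml"))
```

A file in the working directory shadows the per-user one, so a project-local config.toml can override a global setup without any flags.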

Multiple files and directories

-f and -c are repeatable and accept directories:

# Multiple URL files
m3u8-extractor -f batch1.txt -f batch2.txt

# A directory of URL files (loads all .txt files, sorted alphabetically)
m3u8-extractor -f ~/url-batches/

# A directory of config files (loads all .toml files, later override earlier)
m3u8-extractor -c /etc/m3u8-extractor/conf.d/

# Control recursion depth (default: 0 = top-level only)
m3u8-extractor -f ~/url-batches/ --scan-depth 1    # one level of subdirs
m3u8-extractor -f ~/url-batches/ --scan-depth -1   # unlimited recursion

# Mix files and directories freely
m3u8-extractor -f batch1.txt -f ~/more-urls/ -c base.toml -c ~/overrides.d/

Files within a directory are sorted alphabetically, so numeric prefixes like 01-music.txt, 02-videos.txt control ordering. Hidden files (starting with .) are skipped.
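The directory rules above (alphabetical order, hidden files skipped, depth-limited recursion) can be sketched with pathlib. The function name `collect_files` is hypothetical, and listing a directory's own files before those of its subdirectories is an assumption; the source only says that files within a directory are sorted.

```python
from pathlib import Path


def collect_files(root, suffix: str = ".txt", depth: int = 0) -> list[Path]:
    """Collect files with the given suffix; depth 0 = top level, -1 = unlimited."""
    root = Path(root)
    if root.is_file():
        return [root]
    found: list[Path] = []
    subdirs: list[Path] = []
    for entry in sorted(root.iterdir(), key=lambda p: p.name):
        if entry.name.startswith("."):
            continue  # hidden files and directories are skipped
        if entry.is_file() and entry.suffix == suffix:
            found.append(entry)
        elif entry.is_dir():
            subdirs.append(entry)
    if depth != 0:
        # A negative depth never reaches 0, so recursion is unlimited.
        for sub in subdirs:
            found.extend(collect_files(sub, suffix, depth - 1))
    return found
```

Calling `collect_files("conf.d", suffix=".toml")` would give the equivalent ordering for config directories.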

TOML config also supports lists:

urls_file = ["batch1.txt", "batch2.txt"]
# or a directory (use one form only; TOML does not allow defining a key twice)
# urls_file = "url-batches/"

Multiple output paths

-o is repeatable. The first path is the download target; after each successful download, files are copied to every additional path:

m3u8-extractor -o downloads/ -o /mnt/nas/videos/ -o ~/backup/

TOML config also supports a list:

output_path = ["downloads/", "/mnt/nas/videos/", "~/backup/"]

All related files (video, subtitles, thumbnails) are copied. Destination directories are created automatically.
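The copy step might be sketched as below. Treating "related files" as files in the primary directory that share the download's base name is an assumption for this example, and `copy_to_extra_paths` is a hypothetical helper, not the tool's API.

```python
import shutil
from pathlib import Path


def copy_to_extra_paths(primary_file, extra_dirs) -> list[Path]:
    """Copy a download and its sibling files into each extra directory."""
    primary_file = Path(primary_file)
    prefix = primary_file.stem + "."
    # Assumed meaning of "related": same base name, any extension(s).
    related = sorted(
        p for p in primary_file.parent.iterdir()
        if p.is_file() and p.name.startswith(prefix)
    )
    copied: list[Path] = []
    for directory in extra_dirs:
        dest = Path(directory).expanduser()
        dest.mkdir(parents=True, exist_ok=True)  # create destination if missing
        for src in related:
            copied.append(Path(shutil.copy2(src, dest)))
    return copied
```

`shutil.copy2` is used here because it also preserves file timestamps, which keeps the copies comparable to the originals.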

URL list format

The URL list file supports three formats per line, plus group directives:

# Comments start with #

# 1. Just a URL (uses page title as filename)
https://example.com/video1

# 2. URL followed by a title/output path (space-separated)
https://example.com/video2 My Custom Title

# 3. URL with per-URL option flags
https://example.com/video3 --audio-only -o "music/song"
https://example.com/video4 --captions -q "bestvideo+bestaudio"
https://example.com/video5 -o "downloads/" --thumbnail --transcode mkv

Per-URL flags override the global config for that specific download. All CLI flags are supported, including repeatable -o for multiple output paths:

https://example.com/important -o downloads/ -o /mnt/backup/

Group directives

Use --- to set options for a group of URLs:

# Start an audio-only group
--- --audio-only -q "bestaudio"
https://example.com/song1
https://example.com/song2
https://example.com/song3

# Switch to a different group with captions
--- --captions --transcode mkv
https://example.com/lecture1
https://example.com/lecture2

# Reset to global defaults
---
https://example.com/normal-video

# Per-URL options still override the group
--- --audio-only
https://example.com/song4
https://example.com/video5 --video-only   # overrides audio-only for this URL

Group options apply to all URLs that follow, until the next --- directive. Use --- alone to reset back to global defaults.
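To make the line formats and group directives concrete, here is a minimal parser sketch. It assumes that text after the URL is a flag list when it starts with a dash and a title otherwise, and it does not handle trailing inline comments; the function name `parse_url_list` and the returned dict layout are hypothetical.

```python
import shlex


def parse_url_list(text: str) -> list[dict]:
    """Parse URL list text into entries carrying group and per-URL flags."""
    entries: list[dict] = []
    group_flags: list[str] = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        parts = line.split(None, 1)
        head = parts[0]
        rest = parts[1].strip() if len(parts) > 1 else ""
        if head == "---":
            # A bare --- yields an empty list, resetting to global defaults.
            group_flags = shlex.split(rest)
            continue
        entry = {"url": head, "title": None,
                 "group_flags": list(group_flags), "url_flags": []}
        if rest.startswith("-"):
            entry["url_flags"] = shlex.split(rest)
        elif rest:
            entry["title"] = rest
        entries.append(entry)
    return entries


sample = """
# comment
https://example.com/video1
https://example.com/video2 My Custom Title
--- --audio-only -q "bestaudio"
https://example.com/song1
https://example.com/video5 --video-only
---
https://example.com/normal-video
"""
for entry in parse_url_list(sample):
    print(entry["url"], entry["group_flags"], entry["url_flags"])
```

Keeping group flags and per-URL flags in separate lists lets a later step apply them in priority order, with per-URL flags overriding the group.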

Examples

# Basic one-off download
m3u8-extractor "https://example.com/video"

# Download with custom output and quality
m3u8-extractor -o my-video -q "bestvideo+bestaudio" \
  "https://example.com/video"

# Batch download with 4 parallel workers, using adblock
m3u8-extractor -p 4 --adblock

# Audio only, through a proxy
m3u8-extractor --audio-only \
  --proxy socks5://127.0.0.1:1080 \
  "https://example.com/video"

# Watch clipboard, download captions too
m3u8-extractor --watch --captions

# Watch clipboard but skip whatever is currently copied
m3u8-extractor --watch --no-watch-use-current

# Save to multiple locations
m3u8-extractor -o downloads/ -o /mnt/nas/videos/ \
  "https://example.com/video"

# Batch from a directory of URL files
m3u8-extractor -f ~/url-batches/

# Multiple configs: base + overrides
m3u8-extractor -c base.toml -c site-overrides.toml

# Use system yt-dlp at a custom path
m3u8-extractor --yt-dlp-path /opt/bin/yt-dlp "https://example.com/video"

# Download all m3u8 streams found on a page
m3u8-extractor --m3u8-select all "https://example.com/multi-stream"

# Filter m3u8 URLs by pattern
m3u8-extractor --m3u8-filter "1080p" "https://example.com/video"

# Use yt-dlp native extractor only (skip Selenium)
m3u8-extractor --extractor ytdlp "https://youtube.com/watch?v=abc123"

# Restrict to specific extractors
m3u8-extractor --extractors "youtube,vimeo" "https://youtube.com/watch?v=abc123"

License

This project is licensed under the GNU Affero General Public License v3.0 only (AGPL-3.0-only).

See LICENSE for the full text.

Project details

Release files for version 1.1.0, published through publish.yml on Skyluker4/m3u8-extractor (Trusted Publishing; uploaded via twine/6.1.0 CPython/3.13.7):

Source distribution: m3u8_extractor-1.1.0.tar.gz (61.6 kB)

  • SHA256: 491836f824617a0a961e6bafff63b706a8a53fa47cab80b6fde0f8591fc0e761
  • MD5: b70b0a25f5971cb90f8e99a1038534e6
  • BLAKE2b-256: 773f507bee48b1a765dcfa50b45f9420fd1c595a0b470c028e97aee50f7fad69

Built distribution: m3u8_extractor-1.1.0-py3-none-any.whl (47.2 kB, Python 3)

  • SHA256: ce41a3789100daffb9d3424feb5fbaff19642d0f7af9c021e38eedc2c709bdac
  • MD5: a4073d07cfa9a2ae564c8e5657b9ac64
  • BLAKE2b-256: 8f94f18bd0a558f0fe5ea35faac7b3d6b578b479d3ff887873f39b83c5446725
