
Python API wrapper and CLI for MediathekViewWeb


mediathek-py

Python 3.12+ License: MIT

A Python API wrapper and CLI for MediathekViewWeb, the search interface for German public broadcasting media libraries (ARD, ZDF, Arte, 3Sat, SWR, BR, MDR, NDR, WDR, HR, RBB, ORF, SRF, and more).

Features

  • 🔍 Powerful search with prefix syntax for filtering by channel, topic, title, and description
  • 📺 Download videos in HD, medium, or low quality with progress bars
  • 📦 Batch download entire series seasons with automatic episode detection
  • 🐍 Fluent Python API with builder pattern for programmatic use
  • 💻 Beautiful CLI with Rich-formatted tables and panels
  • 📋 Pydantic models for type-safe request/response handling

Installation

# Install with uv (recommended)
uv add mediathek-py

# Or install with pip
pip install mediathek-py

Requirements: Python 3.12+

Quick Start

CLI

# Search for content
mediathek search "tagesschau"

# Search with filters
mediathek search "!ard #tagesschau"

# Get detailed info about the first result
mediathek info "tagesschau"

# Download a video
mediathek download "tagesschau" --quality hd -o video.mp4

# Batch download an entire series
mediathek batch "#Feuer,&,Flamme" --season 1 --quality hd -o ./downloads/

Python Library

from mediathek_py import Mediathek, QueryField, SortField, SortOrder

with Mediathek() as m:
    # Fluent builder API
    result = (
        m.search()
        .query([QueryField.TOPIC, QueryField.TITLE], "tagesschau")
        .duration_min(10)  # in minutes
        .sort_by(SortField.TIMESTAMP)
        .sort_order(SortOrder.DESCENDING)
        .size(10)
        .execute()
    )

    for item in result.results:
        print(f"{item.channel}: {item.title}")

CLI Reference

Global Options

Option Description
--version Show version and exit
--help Show help message and exit

mediathek search

Search the MediathekViewWeb database.

mediathek search [OPTIONS] QUERY

Arguments

Argument Description
QUERY Search query using prefix syntax

Options

| Option | Type | Default | Description |
|---|---|---|---|
| --sort-by | channel, timestamp, duration | – | Sort results by field |
| --sort-order | asc, desc | – | Sort direction |
| --size | Integer | 15 | Number of results to return |
| --offset | Integer | 0 | Pagination offset |
| --future / --no-future | Flag | --no-future | Include future broadcasts |
| --everywhere | Flag | – | Search all fields for unprefixed terms |

Examples

# Basic search
mediathek search "tagesschau"

# Filter by channel and topic
mediathek search "!ard #tagesschau"

# Search with sorting and pagination
mediathek search "dokumentation" --sort-by timestamp --sort-order desc --size 20

# Include future broadcasts
mediathek search "live" --future

# Search everywhere (all fields)
mediathek search "klimawandel" --everywhere

mediathek info

Display detailed information about the first search result.

mediathek info [OPTIONS] QUERY

Arguments

Argument Description
QUERY Search query using prefix syntax

Options

Option Description
--everywhere Search all fields for unprefixed terms

Examples

# Get info about a specific show
mediathek info "#tagesschau"

# Get info from a specific channel
mediathek info "!zdf #heute"

Output includes:

  • Channel, topic, and title
  • Duration and broadcast date
  • Description (if available)
  • Website URL
  • Video URLs (standard, HD, low quality)
  • Subtitle URL (if available)

mediathek download

Download a video from search results.

mediathek download [OPTIONS] QUERY

Arguments

Argument Description
QUERY Search query using prefix syntax

Options

| Option | Type | Default | Description |
|---|---|---|---|
| --quality | hd, medium, low | hd | Video quality preference |
| -o, --output | Path | Auto-generated | Output file path |
| --everywhere | Flag | – | Search all fields for unprefixed terms |

Quality Fallback

If the preferred quality is not available, the downloader automatically falls back:

  • hd → medium → low
  • medium → hd → low
  • low → medium → hd

Examples

# Download in HD quality
mediathek download "#tagesschau" --quality hd

# Download with custom filename
mediathek download "!arte #dokumentation" -o doku.mp4

# Download in low quality (smaller file)
mediathek download "nachrichten" --quality low

mediathek batch

Batch download all episodes of a series. Season and episode numbers are detected automatically from two title patterns: (SXX/EXX) and Folge N.

mediathek batch [OPTIONS] QUERY

Arguments

Argument Description
QUERY Search query using prefix syntax

Options

| Option | Type | Default | Description |
|---|---|---|---|
| -s, --season | Integer | – | Filter to a specific season number |
| -e, --episode | Integer | – | Filter to a specific episode number |
| --quality | hd, medium, low | hd | Video quality preference |
| -o, --output | Path | . | Output directory (episodes saved to {output}/{topic}/) |
| -y, --yes | Flag | – | Skip confirmation prompt |
| --list-url | Flag | – | Print download URLs instead of downloading |
| --use-date | Flag | – | Use broadcast year as season number |
| --episode-pattern | String | – | Regex with one capture group to extract episode number |
| --episode-field | Choice | title | Item field for --episode-pattern (title, topic, etc.) |
| --season-pattern | String | – | Regex with one capture group to extract season number |
| --season-field | Choice | title | Item field for --season-pattern (title, topic, etc.) |

Behavior

  1. Searches all results matching the query, paginating automatically
  2. Parses season/episode info from titles (deduplicates, sorts by season then episode)
  3. Displays a preview table of found episodes
  4. Prompts for confirmation (unless --yes)
  5. Downloads episodes sequentially to {output}/{topic}/, naming each file in s01e01.mp4 format
  6. Skips files that already exist, continues past individual failures
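
Steps 5 and 6 above (target paths and skipping existing files) can be sketched as follows. The helpers `target_path` and `pending` are hypothetical, and the sketch uses the topic name unchanged as the directory name, which is an assumption; the real tool may sanitize it.

```python
import tempfile
from pathlib import Path


def target_path(output: Path, topic: str, season: int, episode: int) -> Path:
    """Build {output}/{topic}/sXXeYY.mp4 (topic used as-is; an assumption)."""
    return output / topic / f"s{season:02d}e{episode:02d}.mp4"


def pending(output: Path, topic: str, episodes: list[tuple[int, int]]) -> list[Path]:
    """Return target paths that do not exist yet, preserving input order."""
    todo = []
    for season, episode in episodes:
        path = target_path(output, topic, season, episode)
        if not path.exists():  # step 6: skip files that already exist
            todo.append(path)
    return todo


# Demo in a temporary directory: s01e01.mp4 exists, s01e02.mp4 does not.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    existing = target_path(out, "Feuer & Flamme", 1, 1)
    existing.parent.mkdir(parents=True)
    existing.touch()
    names = [p.name for p in pending(out, "Feuer & Flamme", [(1, 1), (1, 2)])]
    print(names)  # ['s01e02.mp4']
```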

Examples

# Preview all episodes (will prompt before downloading)
mediathek batch "#Feuer,&,Flamme"

# Download only season 3 in HD
mediathek batch "#Feuer,&,Flamme" --season 3

# Download everything, skip confirmation
mediathek batch "#Feuer,&,Flamme" --yes -o ./downloads/

# Download in low quality to save space
mediathek batch "#Tatortreiniger" --quality low -o ./shows/

# Filter by channel and topic with duration
mediathek batch "!ARD #tagesschau >10"

# Download sport events with custom episode extraction
mediathek batch "#Sportschau,Tour,de,France >60" --use-date --episode-pattern "^(\d+)(?=\.\s*Etappe)"

# List download URLs for use with external tools
mediathek batch "#Feuer,&,Flamme" --season 3 --list-url

# Pipe URLs to wget
mediathek batch "#Feuer,&,Flamme" --list-url | wget -i -

Search Prefix Syntax

The search query supports a powerful prefix syntax for filtering:

| Prefix | Field | Example | Description |
|---|---|---|---|
| ! | Channel | !ard | Filter by channel name |
| # | Topic | #tagesschau | Filter by topic/show name |
| + | Title | +nachrichten | Filter by title |
| * | Description | *klimawandel | Filter by description |
| > | Min duration | >10 | Minimum duration in minutes |
| < | Max duration | <30 | Maximum duration in minutes |
| (none) | Topic + Title | tagesschau | Search topic and title fields |

Syntax Rules

  1. Multiple filters can be combined in one query:

    mediathek search "!ard #tagesschau >10 <30"
    
  2. Commas in prefixed tokens are replaced with spaces:

    mediathek search "#sturm,der,liebe"  # Searches for "sturm der liebe"
    
  3. Unprefixed terms search topic and title by default. Use --everywhere to search all fields.

  4. Duration values are specified in minutes (converted to seconds for the API).

Examples

# ARD channel, tagesschau topic, 10-30 minutes
mediathek search "!ard #tagesschau >10 <30"

# ZDF documentaries about nature
mediathek search "!zdf #dokumentation *natur"

# Any content about climate, at least 15 minutes
mediathek search "klimawandel >15" --everywhere

# Arte films, sorted by most recent
mediathek search "!arte #spielfilm" --sort-by timestamp --sort-order desc

Python Library API

Basic Usage

from mediathek_py import Mediathek

# Using context manager (recommended)
with Mediathek() as m:
    result = m.search_by_string("!ard #tagesschau")
    for item in result.results:
        print(f"{item.channel}: {item.title}")

# Manual lifecycle management
m = Mediathek()
try:
    result = m.search_by_string("tagesschau")
finally:
    m.close()

Client Options

from mediathek_py import Mediathek

m = Mediathek(
    user_agent="my-app/1.0",  # Custom User-Agent header
    base_url="https://mediathekviewweb.de"  # API base URL
)

Fluent Builder API

The SearchBuilder provides a chainable interface for constructing queries:

from mediathek_py import Mediathek, QueryField, SortField, SortOrder

with Mediathek() as m:
    result = (
        m.search()
        .query([QueryField.TOPIC], "tagesschau")
        .query([QueryField.CHANNEL], "ARD")  # Multiple queries (AND logic)
        .duration_min(10)  # Minimum 10 minutes
        .duration_max(60)  # Maximum 60 minutes
        .include_future(False)  # Exclude future broadcasts
        .sort_by(SortField.TIMESTAMP)
        .sort_order(SortOrder.DESCENDING)
        .size(20)  # Return 20 results
        .offset(0)  # Start from first result
        .execute()
    )

Search Builder Methods

| Method | Parameters | Description |
|---|---|---|
| query(fields, text) | fields: list of QueryField, text: str | Add a query filter |
| duration_min(minutes) | minutes: int | Set minimum duration in minutes |
| duration_max(minutes) | minutes: int | Set maximum duration in minutes |
| include_future(value) | value: bool | Include/exclude future broadcasts |
| sort_by(field) | field: SortField | Set sort field |
| sort_order(order) | order: SortOrder | Set sort direction |
| size(n) | n: int | Number of results to return |
| offset(n) | n: int | Pagination offset |
| execute() | – | Execute the query and return results |
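
The size and offset methods can be combined to walk through a large result set page by page. The sketch below shows the loop with a stand-in data source so it runs offline; `paginate` and `fake_fetch` are hypothetical names. With the real client, a fetch function could plausibly wrap `m.search()...size(size).offset(offset).execute()` and return `(result.results, result.query_info.total_results)`, assuming offset counts items rather than pages.

```python
from typing import Callable, Iterator


def paginate(
    fetch_page: Callable[[int, int], tuple[list, int]],
    page_size: int = 50,
) -> Iterator:
    """Yield items page by page until the reported total is reached."""
    offset = 0
    while True:
        items, total = fetch_page(offset, page_size)
        yield from items
        offset += len(items)
        # Stop on an empty page too, in case the total is unreliable.
        if not items or offset >= total:
            break


# Stand-in data source: seven items served in slices.
DATA = list(range(7))


def fake_fetch(offset: int, size: int) -> tuple[list, int]:
    return DATA[offset:offset + size], len(DATA)


print(list(paginate(fake_fetch, page_size=3)))  # [0, 1, 2, 3, 4, 5, 6]
```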

String-Based Search

For quick searches using prefix syntax:

from mediathek_py import Mediathek, SortField

with Mediathek() as m:
    # Simple string search
    result = m.search_by_string("!ard #tagesschau >10 <60")

    # Search everywhere (all fields)
    result = m.search_by_string("klimawandel", search_everywhere=True)

    # Get a builder for further customization
    builder = m.build_from_string("!zdf #heute")
    result = builder.size(5).sort_by(SortField.TIMESTAMP).execute()

Downloading Videos

from pathlib import Path
from mediathek_py import Mediathek

with Mediathek() as m:
    result = m.search_by_string("#tagesschau")
    item = result.results[0]

    # Basic download
    m.download(item.url_video, Path("video.mp4"))

    # Download with progress callback
    def on_progress(downloaded: int, total: int | None):
        if total:
            percent = (downloaded / total) * 100
            print(f"Downloaded: {percent:.1f}%")

    m.download(
        item.url_video_hd or item.url_video,
        Path("video_hd.mp4"),
        progress_callback=on_progress
    )

Series & Batch Operations

Collect and process entire series programmatically:

from mediathek_py import Mediathek, collect_series, parse_episode_info

# Parse episode info from a title string
info = parse_episode_info("Folge 6: Explosion bei Brand (S06/E06)")
print(info.season, info.episode)  # 6, 6

# Also handles "Folge N" format (defaults to season 1)
info = parse_episode_info("Folge 3: Some Episode")
print(info.season, info.episode)  # 1, 3

# Collect all episodes matching a query (paginates automatically)
with Mediathek() as m:
    episodes = collect_series(m, "#Feuer,&,Flamme")

    for ep in episodes:
        print(f"S{ep.info.season:02d}E{ep.info.episode:02d}: {ep.item.title}")
        print(f"  File: {ep.filename()}")  # s06e06.mp4
        print(f"  URL:  {ep.item.url_video_hd}")

collect_series() handles pagination, deduplication (by season/episode), filtering of unparseable titles, and returns episodes sorted by season then episode. Deduplication keeps the earliest-timestamp occurrence.
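
The deduplication and ordering described above can be sketched with plain data. `Ep` and `dedupe_and_sort` are hypothetical stand-ins for illustration; the actual collect_series() works on SeriesEpisode objects and may differ in detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ep:
    """Hypothetical stand-in for a parsed episode."""
    season: int
    episode: int
    timestamp: int  # Unix broadcast timestamp
    title: str


def dedupe_and_sort(episodes: list[Ep]) -> list[Ep]:
    """Keep the earliest broadcast per (season, episode), then sort."""
    earliest: dict[tuple[int, int], Ep] = {}
    for ep in episodes:
        key = (ep.season, ep.episode)
        if key not in earliest or ep.timestamp < earliest[key].timestamp:
            earliest[key] = ep
    return sorted(earliest.values(), key=lambda ep: (ep.season, ep.episode))


episodes = [
    Ep(2, 1, 300, "S02E01"),
    Ep(1, 2, 200, "S01E02 (rerun)"),
    Ep(1, 2, 100, "S01E02 (first broadcast)"),
    Ep(1, 1, 50, "S01E01"),
]
for ep in dedupe_and_sort(episodes):
    print(ep.season, ep.episode, ep.title)
```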


Data Models

QueryResult

Returned by search operations:

class QueryResult:
    query_info: QueryInfo  # Metadata about the query
    results: list[Item]    # List of matching items

QueryInfo

Metadata about the search:

class QueryInfo:
    filmliste_timestamp: int   # Timestamp of the media list
    result_count: int          # Number of results returned
    total_results: int         # Total matching results
    search_engine_time: float  # Query execution time

Item

A single media item:

class Item:
    channel: str              # Broadcasting channel (e.g., "ARD", "ZDF")
    topic: str                # Show/topic name
    title: str                # Episode/video title
    description: str | None   # Description text
    timestamp: int            # Broadcast timestamp (Unix)
    duration: int | None      # Duration in seconds (None for livestreams)
    size: int | None          # File size in bytes
    url_website: str          # Website URL
    url_video: str            # Standard quality video URL
    url_video_hd: str | None  # HD video URL
    url_video_low: str | None # Low quality video URL
    url_subtitle: str | None  # Subtitle URL
    filmliste_timestamp: int  # Media list timestamp
    id: str                   # Unique item ID

SeriesEpisode

Returned by collect_series():

class SeriesEpisode:
    item: Item          # The full media item from the API
    info: EpisodeInfo   # Parsed season/episode numbers

    def filename(self) -> str:  # Returns "s01e06.mp4" format
        ...

EpisodeInfo

class EpisodeInfo:
    season: int    # Season number (defaults to 1 for "Folge N" format)
    episode: int   # Episode number

Enums

from mediathek_py import QueryField, SortField, SortOrder

# Query fields for filtering
QueryField.CHANNEL      # "channel"
QueryField.TOPIC        # "topic"
QueryField.TITLE        # "title"
QueryField.DESCRIPTION  # "description"

# Sort fields
SortField.CHANNEL       # "channel"
SortField.TIMESTAMP     # "timestamp"
SortField.DURATION      # "duration"

# Sort order
SortOrder.ASCENDING     # "asc"
SortOrder.DESCENDING    # "desc"

Error Handling

The library provides a hierarchy of exceptions:

from mediathek_py import Mediathek, MediathekError, ApiError, EmptyResponseError

try:
    with Mediathek() as m:
        result = m.search_by_string("test")
except ApiError as e:
    # API returned an error response
    print(f"API errors: {e.messages}")
except EmptyResponseError:
    # API returned no result and no error
    print("Empty response from API")
except MediathekError as e:
    # Base exception for all library errors (including HTTP errors)
    print(f"Error: {e}")

Exception Types

Exception Description
MediathekError Base exception for all library errors
ApiError API returned an error response (has .messages list)
EmptyResponseError API returned neither result nor error
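
Because every exception derives from the base class, the order of except clauses matters: specific handlers must come before the base handler, or they never run. The sketch below demonstrates this with stand-in classes (`BaseLibraryError`, `ApiLikeError`, `EmptyLikeError` are hypothetical and only mirror the documented hierarchy; they are not the library's classes).

```python
class BaseLibraryError(Exception):
    """Stand-in for the base exception."""


class ApiLikeError(BaseLibraryError):
    """Stand-in for an API error carrying a list of messages."""

    def __init__(self, messages: list[str]):
        super().__init__("; ".join(messages))
        self.messages = messages


class EmptyLikeError(BaseLibraryError):
    """Stand-in for an empty-response error."""


def classify(exc: Exception) -> str:
    """Route an exception through handlers ordered from specific to general."""
    try:
        raise exc
    except ApiLikeError as e:
        return f"api: {e.messages}"
    except EmptyLikeError:
        return "empty"
    except BaseLibraryError:
        return "other"


print(classify(ApiLikeError(["bad query"])))
print(classify(EmptyLikeError()))
print(classify(BaseLibraryError("connection failed")))
```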

Development

Setup

# Clone the repository
git clone https://github.com/maxboettinger/mediathek-py.git
cd mediathek-py

# Install with uv (installs editable + dev dependencies)
uv sync

Running Tests

# Run all tests
uv run pytest

# Run with verbose output
uv run pytest -v

# Run a specific test file
uv run pytest tests/test_client.py

# Run a specific test
uv run pytest tests/test_client.py::TestSearchBuilder::test_sends_correct_request -v

Publishing to PyPI

The project includes a publish script for automated version bumping, building, and publishing:

# Set up PyPI token (first time only)
cp .env.example .env
# Edit .env and add your PyPI token

# Publish with version bump (patch by default)
./scripts/publish.sh [major|minor|patch]

# Examples:
./scripts/publish.sh patch  # 0.1.1 → 0.1.2
./scripts/publish.sh minor  # 0.1.1 → 0.2.0
./scripts/publish.sh major  # 0.1.1 → 1.0.0

The script will:

  1. Prompt for confirmation
  2. Update version in pyproject.toml
  3. Clean and build the package with uv build
  4. Commit the version bump and create a git tag
  5. Publish to PyPI using uv publish
  6. Push changes and tags to the repository

Requirements:

  • uv installed
  • PyPI API token in .env file
  • Git repository configured

Project Structure

mediathek-py/
├── src/mediathek_py/
│   ├── __init__.py      # Public API exports
│   ├── client.py        # Mediathek client and SearchBuilder
│   ├── models.py        # Pydantic request/response models
│   ├── exceptions.py    # Exception hierarchy
│   ├── series.py        # Series episode parsing and collection
│   └── cli.py           # Click CLI with Rich output
├── tests/
│   ├── conftest.py      # Test fixtures
│   ├── test_client.py   # Client/builder tests
│   ├── test_models.py   # Model validation tests
│   ├── test_series.py   # Series parsing/collection tests
│   └── test_cli.py      # CLI integration tests
└── pyproject.toml       # Project configuration (uv/hatchling)

License

MIT License - see LICENSE for details.

Acknowledgments
