Python API wrapper and CLI for MediathekViewWeb

mediathek-py

Python 3.12+ License: MIT

A Python API wrapper and CLI for MediathekViewWeb, the search interface for German public broadcasting media libraries (ARD, ZDF, Arte, 3Sat, SWR, BR, MDR, NDR, WDR, HR, RBB, ORF, SRF, and more).

Features

  • 🔍 Powerful search with prefix syntax for filtering by channel, topic, title, and description
  • 📺 Download videos in HD, medium, or low quality with progress bars
  • 📦 Batch download entire series seasons with automatic episode detection
  • 🐍 Fluent Python API with builder pattern for programmatic use
  • 💻 Beautiful CLI with Rich-formatted tables and panels
  • 📋 Pydantic models for type-safe request/response handling

Installation

# Install with uv (recommended)
uv add mediathek-py

# Or install with pip
pip install mediathek-py

Requirements: Python 3.12+

Quick Start

CLI

# Search for content
mediathek search "tagesschau"

# Search with filters
mediathek search "!ard #tagesschau"

# Get detailed info about the first result
mediathek info "tagesschau"

# Download a video
mediathek download "tagesschau" --quality hd -o video.mp4

# Batch download an entire series
mediathek batch "#Feuer & Flamme" --season 1 --quality hd -o ./downloads/

Python Library

from mediathek_py import Mediathek, QueryField, SortField, SortOrder

with Mediathek() as m:
    # Fluent builder API
    result = (
        m.search()
        .query([QueryField.TOPIC, QueryField.TITLE], "tagesschau")
        .duration_min(10)  # in minutes
        .sort_by(SortField.TIMESTAMP)
        .sort_order(SortOrder.DESCENDING)
        .size(10)
        .execute()
    )

    for item in result.results:
        print(f"{item.channel}: {item.title}")

CLI Reference

Global Options

| Option | Description |
|---|---|
| --version | Show version and exit |
| --help | Show help message and exit |

mediathek search

Search the MediathekViewWeb database.

mediathek search [OPTIONS] QUERY

Arguments

| Argument | Description |
|---|---|
| QUERY | Search query using prefix syntax |

Options

| Option | Type | Default | Description |
|---|---|---|---|
| --sort-by | channel, timestamp, duration | | Sort results by field |
| --sort-order | asc, desc | | Sort direction |
| --size | Integer | 15 | Number of results to return |
| --offset | Integer | 0 | Pagination offset |
| --future / --no-future | Flag | --no-future | Include future broadcasts |
| --everywhere | Flag | | Search all fields for unprefixed terms |

Examples

# Basic search
mediathek search "tagesschau"

# Filter by channel and topic
mediathek search "!ard #tagesschau"

# Search with sorting and pagination
mediathek search "dokumentation" --sort-by timestamp --sort-order desc --size 20

# Include future broadcasts
mediathek search "live" --future

# Search everywhere (all fields)
mediathek search "klimawandel" --everywhere

mediathek info

Display detailed information about the first search result.

mediathek info [OPTIONS] QUERY

Arguments

| Argument | Description |
|---|---|
| QUERY | Search query using prefix syntax |

Options

| Option | Description |
|---|---|
| --everywhere | Search all fields for unprefixed terms |

Examples

# Get info about a specific show
mediathek info "#tagesschau"

# Get info from a specific channel
mediathek info "!zdf #heute"

Output includes:

  • Channel, topic, and title
  • Duration and broadcast date
  • Description (if available)
  • Website URL
  • Video URLs (standard, HD, low quality)
  • Subtitle URL (if available)

mediathek download

Download a video from search results.

mediathek download [OPTIONS] QUERY

Arguments

| Argument | Description |
|---|---|
| QUERY | Search query using prefix syntax |

Options

| Option | Type | Default | Description |
|---|---|---|---|
| --quality | hd, medium, low | hd | Video quality preference |
| -o, --output | Path | Auto-generated | Output file path |
| --everywhere | Flag | | Search all fields for unprefixed terms |

Quality Fallback

If the preferred quality is not available, the downloader automatically falls back:

  • hd → medium → low
  • medium → hd → low
  • low → medium → hd
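
The fallback orders above can be sketched as a small lookup. This is only an illustration, not the library's implementation: `FALLBACK_ORDER` and `pick_quality_url` are hypothetical names, and the example URLs are placeholders.

```python
from typing import Optional

# Fallback orders as listed above; the first entry is the preferred quality.
FALLBACK_ORDER = {
    "hd": ["hd", "medium", "low"],
    "medium": ["medium", "hd", "low"],
    "low": ["low", "medium", "hd"],
}


def pick_quality_url(
    urls: dict[str, Optional[str]], preferred: str
) -> Optional[tuple[str, str]]:
    """Return (quality, url) for the first available quality, or None."""
    for quality in FALLBACK_ORDER[preferred]:
        url = urls.get(quality)
        if url:  # skips missing keys as well as None or empty values
            return quality, url
    return None


# Hypothetical item that has no HD rendition.
urls = {
    "hd": None,
    "medium": "https://example.org/medium.mp4",
    "low": "https://example.org/low.mp4",
}
print(pick_quality_url(urls, "hd"))
```

With no HD URL present, the call falls through to the medium entry.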

Examples

# Download in HD quality
mediathek download "#tagesschau" --quality hd

# Download with custom filename
mediathek download "!arte #dokumentation" -o doku.mp4

# Download in low quality (smaller file)
mediathek download "nachrichten" --quality low

mediathek batch

Batch download all episodes of a series. Season and episode numbers are detected automatically from two title patterns: (SXX/EXX) and Folge N.

mediathek batch [OPTIONS] QUERY

Arguments

| Argument | Description |
|---|---|
| QUERY | Show topic to search for. Use #topic prefix or plain text |

Options

| Option | Type | Default | Description |
|---|---|---|---|
| -s, --season | Integer | | Filter to a specific season number |
| --quality | hd, medium, low | hd | Video quality preference |
| -o, --output | Path | . | Output directory (episodes saved to {output}/{topic}/) |
| -y, --yes | Flag | | Skip confirmation prompt |

Behavior

  1. Searches all results for the given topic, paginating automatically
  2. Parses season/episode info from titles (deduplicates, sorts by season then episode)
  3. Displays a preview table of found episodes
  4. Prompts for confirmation (unless --yes)
  5. Downloads episodes sequentially to {output}/{topic}/, naming each file in s01e01.mp4 format
  6. Skips files that already exist, continues past individual failures
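
Steps 5 and 6 can be pictured with a short, standard-library sketch. `plan_downloads` is a hypothetical helper, not part of mediathek-py, and it assumes the topic is used as the directory name unchanged; whether the CLI sanitizes it is not stated here.

```python
import tempfile
from pathlib import Path


def plan_downloads(
    output: Path, topic: str, episodes: list[tuple[int, int]]
) -> list[Path]:
    """Return target paths for (season, episode) pairs not yet on disk."""
    target_dir = output / topic
    pending = []
    # Deduplicate, then order by season and episode.
    for season, episode in sorted(set(episodes)):
        target = target_dir / f"s{season:02d}e{episode:02d}.mp4"
        if target.exists():
            continue  # skip files that already exist
        pending.append(target)
    return pending


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "Show").mkdir()
    (out / "Show" / "s01e01.mp4").touch()  # pretend episode 1 is downloaded
    pending = plan_downloads(out, "Show", [(1, 2), (1, 1), (1, 2)])
    print([p.name for p in pending])  # ['s01e02.mp4']
```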

Examples

# Preview all episodes (will prompt before downloading)
mediathek batch "#Feuer & Flamme"

# Download only season 3 in HD
mediathek batch "#Feuer & Flamme" --season 3

# Download everything, skip confirmation
mediathek batch "#Feuer & Flamme" --yes -o ./downloads/

# Download in low quality to save space
mediathek batch "Tatortreiniger" --quality low -o ./shows/

Search Prefix Syntax

The search query supports a powerful prefix syntax for filtering:

| Prefix | Field | Example | Description |
|---|---|---|---|
| ! | Channel | !ard | Filter by channel name |
| # | Topic | #tagesschau | Filter by topic/show name |
| + | Title | +nachrichten | Filter by title |
| * | Description | *klimawandel | Filter by description |
| > | Min duration | >10 | Minimum duration in minutes |
| < | Max duration | <30 | Maximum duration in minutes |
| (none) | Topic + Title | tagesschau | Search topic and title fields |

Syntax Rules

  1. Multiple filters can be combined in one query:

    mediathek search "!ard #tagesschau >10 <30"
    
  2. Commas in prefixed tokens are replaced with spaces:

    mediathek search "#sturm,der,liebe"  # Searches for "sturm der liebe"
    
  3. Unprefixed terms search topic and title by default. Use --everywhere to search all fields.

  4. Duration values are specified in minutes (converted to seconds for the API).
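
To make the rules above concrete, here is a rough sketch of how such a query string could be broken down. `parse_query` and its return shape are hypothetical and only mirror the rules listed; the real parser may differ, and treating "all fields" as the four fields in the prefix table is an assumption.

```python
PREFIX_FIELDS = {"!": "channel", "#": "topic", "+": "title", "*": "description"}
ALL_FIELDS = ["channel", "topic", "title", "description"]  # assumed meaning of "all fields"


def parse_query(query: str, everywhere: bool = False) -> dict:
    """Split a prefix-syntax query into field filters and duration bounds."""
    default_fields = ALL_FIELDS if everywhere else ["topic", "title"]
    parsed = {"queries": [], "duration_min": None, "duration_max": None}
    generic = []
    for token in query.split():
        prefix, rest = token[0], token[1:]
        if prefix in PREFIX_FIELDS and rest:
            # Rule 2: commas in prefixed tokens become spaces.
            text = rest.replace(",", " ")
            parsed["queries"].append({"fields": [PREFIX_FIELDS[prefix]], "query": text})
        elif prefix in "<>" and rest.isdigit():
            # Rule 4: minutes are converted to seconds.
            key = "duration_min" if prefix == ">" else "duration_max"
            parsed[key] = int(rest) * 60
        else:
            generic.append(token)
    if generic:
        # Rule 3: unprefixed terms search topic and title unless everywhere is set.
        parsed["queries"].append({"fields": default_fields, "query": " ".join(generic)})
    return parsed


print(parse_query("!ard #sturm,der,liebe >10 <30"))
```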

Examples

# ARD channel, tagesschau topic, 10-30 minutes
mediathek search "!ard #tagesschau >10 <30"

# ZDF documentaries about nature
mediathek search "!zdf #dokumentation *natur"

# Any content about climate, at least 15 minutes
mediathek search "klimawandel >15" --everywhere

# Arte films, sorted by most recent
mediathek search "!arte #spielfilm" --sort-by timestamp --sort-order desc

Python Library API

Basic Usage

from mediathek_py import Mediathek

# Using context manager (recommended)
with Mediathek() as m:
    result = m.search_by_string("!ard #tagesschau")
    for item in result.results:
        print(f"{item.channel}: {item.title}")

# Manual lifecycle management
m = Mediathek()
try:
    result = m.search_by_string("tagesschau")
finally:
    m.close()

Client Options

from mediathek_py import Mediathek

m = Mediathek(
    user_agent="my-app/1.0",  # Custom User-Agent header
    base_url="https://mediathekviewweb.de"  # API base URL
)

Fluent Builder API

The SearchBuilder provides a chainable interface for constructing queries:

from mediathek_py import Mediathek, QueryField, SortField, SortOrder

with Mediathek() as m:
    result = (
        m.search()
        .query([QueryField.TOPIC], "tagesschau")
        .query([QueryField.CHANNEL], "ARD")  # Multiple queries (AND logic)
        .duration_min(10)  # Minimum 10 minutes
        .duration_max(60)  # Maximum 60 minutes
        .include_future(False)  # Exclude future broadcasts
        .sort_by(SortField.TIMESTAMP)
        .sort_order(SortOrder.DESCENDING)
        .size(20)  # Return 20 results
        .offset(0)  # Start from first result
        .execute()
    )

Search Builder Methods

| Method | Parameters | Description |
|---|---|---|
| query(fields, text) | fields: list of QueryField, text: str | Add a query filter |
| duration_min(minutes) | minutes: int | Set minimum duration in minutes |
| duration_max(minutes) | minutes: int | Set maximum duration in minutes |
| include_future(value) | value: bool | Include/exclude future broadcasts |
| sort_by(field) | field: SortField | Set sort field |
| sort_order(order) | order: SortOrder | Set sort direction |
| size(n) | n: int | Number of results to return |
| offset(n) | n: int | Pagination offset |
| execute() | | Execute the query and return results |

String-Based Search

For quick searches using prefix syntax:

from mediathek_py import Mediathek, SortField

with Mediathek() as m:
    # Simple string search
    result = m.search_by_string("!ard #tagesschau >10 <60")
    
    # Search everywhere (all fields)
    result = m.search_by_string("klimawandel", search_everywhere=True)
    
    # Get a builder for further customization
    builder = m.build_from_string("!zdf #heute")
    result = builder.size(5).sort_by(SortField.TIMESTAMP).execute()

Downloading Videos

from pathlib import Path
from mediathek_py import Mediathek

with Mediathek() as m:
    result = m.search_by_string("#tagesschau")
    item = result.results[0]
    
    # Basic download
    m.download(item.url_video, Path("video.mp4"))
    
    # Download with progress callback
    def on_progress(downloaded: int, total: int | None):
        if total:
            percent = (downloaded / total) * 100
            print(f"Downloaded: {percent:.1f}%")
    
    m.download(
        item.url_video_hd or item.url_video,
        Path("video_hd.mp4"),
        progress_callback=on_progress
    )
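
Because the callback's total may be None, a callback can fall back to reporting the amount downloaded instead of a percentage. The helper below is a hypothetical, standard-library sketch; `format_progress` is not part of the library.

```python
from typing import Optional


def format_progress(downloaded: int, total: Optional[int]) -> str:
    """Format progress as a percentage, or as megabytes if the total is unknown."""
    if total:
        return f"{downloaded / total * 100:.1f}%"
    # Unknown (or zero) total: report the downloaded amount instead.
    return f"{downloaded / 1_000_000:.1f} MB"


print(format_progress(50, 200))
print(format_progress(1_500_000, None))
```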

Series & Batch Operations

Collect and process entire series programmatically:

from mediathek_py import Mediathek, collect_series, parse_episode_info

# Parse episode info from a title string
info = parse_episode_info("Folge 6: Explosion bei Brand (S06/E06)")
print(info.season, info.episode)  # 6, 6

# Also handles "Folge N" format (defaults to season 1)
info = parse_episode_info("Folge 3: Some Episode")
print(info.season, info.episode)  # 1, 3

# Collect all episodes for a topic (paginates automatically)
with Mediathek() as m:
    episodes = collect_series(m, "Feuer & Flamme")

    for ep in episodes:
        print(f"S{ep.info.season:02d}E{ep.info.episode:02d}: {ep.item.title}")
        print(f"  File: {ep.filename()}")  # s06e06.mp4
        print(f"  URL:  {ep.item.url_video_hd}")

collect_series() handles pagination, deduplication (by season/episode), filtering of unparseable titles, and returns episodes sorted by season then episode. Deduplication keeps the earliest-timestamp occurrence.
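
The deduplication and ordering described above can be sketched independently of the library. `Candidate` and `dedupe_and_sort` are hypothetical names used only for illustration; the actual implementation may be structured differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    season: int
    episode: int
    timestamp: int  # Unix broadcast timestamp
    title: str


def dedupe_and_sort(candidates: list[Candidate]) -> list[Candidate]:
    """Keep the earliest-timestamp entry per (season, episode), then sort."""
    earliest: dict[tuple[int, int], Candidate] = {}
    for c in candidates:
        key = (c.season, c.episode)
        kept = earliest.get(key)
        if kept is None or c.timestamp < kept.timestamp:
            earliest[key] = c
    # Sorting the keys orders by season first, then by episode.
    return [earliest[key] for key in sorted(earliest)]


sample = [
    Candidate(2, 1, 300, "rerun of S02E01"),
    Candidate(1, 2, 200, "S01E02"),
    Candidate(2, 1, 100, "first airing of S02E01"),
]
for c in dedupe_and_sort(sample):
    print(c.season, c.episode, c.title)
```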


Data Models

QueryResult

Returned by search operations:

class QueryResult:
    query_info: QueryInfo  # Metadata about the query
    results: list[Item]    # List of matching items

QueryInfo

Metadata about the search:

class QueryInfo:
    filmliste_timestamp: int   # Timestamp of the media list
    result_count: int          # Number of results returned
    total_results: int         # Total matching results
    search_engine_time: float  # Query execution time
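
The difference between result_count and total_results is what makes pagination possible: keep requesting pages until the offset reaches the total. The sketch below illustrates that loop with a fake page source; `iter_all`, `fake_fetch`, and the `(results, total)` page shape are assumptions for illustration, not library API.

```python
from typing import Callable, Iterator

Page = tuple[list[str], int]  # (results on this page, total matching results)


def iter_all(fetch_page: Callable[[int, int], Page], page_size: int = 50) -> Iterator[str]:
    """Yield every result by advancing the offset until the total is reached."""
    offset = 0
    while True:
        results, total = fetch_page(offset, page_size)
        yield from results
        offset += len(results)
        # Stop on an empty page as well, in case the total is stale.
        if not results or offset >= total:
            break


DATA = [f"item-{i}" for i in range(7)]


def fake_fetch(offset: int, size: int) -> Page:
    """Stand-in for a search call that accepts offset and size."""
    return DATA[offset:offset + size], len(DATA)


print(list(iter_all(fake_fetch, page_size=3)))
```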

Item

A single media item:

class Item:
    channel: str              # Broadcasting channel (e.g., "ARD", "ZDF")
    topic: str                # Show/topic name
    title: str                # Episode/video title
    description: str | None   # Description text
    timestamp: int            # Broadcast timestamp (Unix)
    duration: int | None      # Duration in seconds (None for livestreams)
    size: int | None          # File size in bytes
    url_website: str          # Website URL
    url_video: str            # Standard quality video URL
    url_video_hd: str | None  # HD video URL
    url_video_low: str | None # Low quality video URL
    url_subtitle: str | None  # Subtitle URL
    filmliste_timestamp: int  # Media list timestamp
    id: str                   # Unique item ID
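
Since timestamp is a Unix timestamp and duration is given in seconds (or None), both usually need converting before display. A minimal standard-library sketch follows; `describe` is a hypothetical helper, and rendering in UTC is a choice made for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def describe(timestamp: int, duration: Optional[int]) -> str:
    """Render a Unix timestamp and a duration in seconds as readable text."""
    broadcast = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    # duration is None for livestreams, so there is no length to show.
    length = str(timedelta(seconds=duration)) if duration is not None else "live"
    return f"{broadcast:%Y-%m-%d %H:%M} UTC, {length}"


print(describe(0, 930))  # 1970-01-01 00:00 UTC, 0:15:30
```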

SeriesEpisode

Returned by collect_series():

class SeriesEpisode:
    item: Item          # The full media item from the API
    info: EpisodeInfo   # Parsed season/episode numbers

    def filename(self) -> str:  # Returns "s01e06.mp4" format
        ...

EpisodeInfo

class EpisodeInfo:
    season: int    # Season number (defaults to 1 for "Folge N" format)
    episode: int   # Episode number
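
For intuition, the two title patterns could be matched with regular expressions along these lines. This is a sketch, not the library's parser: `parse_title`, `ParsedEpisode`, the exact patterns, and returning None for unparseable titles are all assumptions.

```python
import re
from typing import NamedTuple, Optional


class ParsedEpisode(NamedTuple):
    season: int
    episode: int


SEASON_EPISODE = re.compile(r"\(S(\d+)/E(\d+)\)")  # e.g. "(S06/E06)"
FOLGE = re.compile(r"\bFolge\s+(\d+)\b")           # e.g. "Folge 3"


def parse_title(title: str) -> Optional[ParsedEpisode]:
    """Prefer the explicit (SXX/EXX) form; fall back to "Folge N" in season 1."""
    match = SEASON_EPISODE.search(title)
    if match:
        return ParsedEpisode(int(match.group(1)), int(match.group(2)))
    match = FOLGE.search(title)
    if match:
        return ParsedEpisode(1, int(match.group(1)))
    return None  # no recognizable season/episode information


print(parse_title("Folge 6: Explosion bei Brand (S06/E06)"))
print(parse_title("Folge 3: Some Episode"))
```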

Enums

from mediathek_py import QueryField, SortField, SortOrder

# Query fields for filtering
QueryField.CHANNEL      # "channel"
QueryField.TOPIC        # "topic"
QueryField.TITLE        # "title"
QueryField.DESCRIPTION  # "description"

# Sort fields
SortField.CHANNEL       # "channel"
SortField.TIMESTAMP     # "timestamp"
SortField.DURATION      # "duration"

# Sort order
SortOrder.ASCENDING     # "asc"
SortOrder.DESCENDING    # "desc"

Error Handling

The library provides a hierarchy of exceptions:

from mediathek_py import Mediathek, MediathekError, ApiError, EmptyResponseError

try:
    with Mediathek() as m:
        result = m.search_by_string("test")
except ApiError as e:
    # API returned an error response
    print(f"API errors: {e.messages}")
except EmptyResponseError:
    # API returned no result and no error
    print("Empty response from API")
except MediathekError as e:
    # Base exception for all library errors (including HTTP errors)
    print(f"Error: {e}")

Exception Types

| Exception | Description |
|---|---|
| MediathekError | Base exception for all library errors |
| ApiError | API returned an error response (has .messages list) |
| EmptyResponseError | API returned neither result nor error |

Development

Setup

# Clone the repository
git clone https://github.com/maxboettinger/mediathek-py.git
cd mediathek-py

# Install with uv (installs editable + dev dependencies)
uv sync

Running Tests

# Run all tests
uv run pytest

# Run with verbose output
uv run pytest -v

# Run a specific test file
uv run pytest tests/test_client.py

# Run a specific test
uv run pytest tests/test_client.py::TestSearchBuilder::test_sends_correct_request -v

Project Structure

mediathek-py/
├── src/mediathek_py/
│   ├── __init__.py      # Public API exports
│   ├── client.py        # Mediathek client and SearchBuilder
│   ├── models.py        # Pydantic request/response models
│   ├── exceptions.py    # Exception hierarchy
│   ├── series.py        # Series episode parsing and collection
│   └── cli.py           # Click CLI with Rich output
├── tests/
│   ├── conftest.py      # Test fixtures
│   ├── test_client.py   # Client/builder tests
│   ├── test_models.py   # Model validation tests
│   ├── test_series.py   # Series parsing/collection tests
│   └── test_cli.py      # CLI integration tests
└── pyproject.toml       # Project configuration (uv/hatchling)

License

MIT License - see LICENSE for details.

Acknowledgments
