Skip to main content

Python WebVTT API Implementation

Project description

WebVTT Python Parser

A Python implementation of the WebVTT (Web Video Text Tracks) format parser with strict validation and error handling.

Features

  • Full compliance with W3C WebVTT specification
  • Strict and lenient parsing modes
  • Comprehensive model validation:
    • Timestamp format validation
    • Cue timing consistency checks
    • Region setting validation
    • Position/size value range checks
  • Detailed error reporting with context
  • Support for:
    • Header metadata
    • Regions with scroll settings
    • Cue positioning and alignment
    • Multi-line cues
    • Voice spans and basic styling

Installation

pip install webvtt-python

Or using uv for faster installation:

uv add webvtt-python

Quick Start

from webvtt_python import WebVTTParser, WebVTT

# Parse from string
parser = WebVTTParser(strict=True)
content = """WEBVTT

00:00:01.000 --> 00:00:02.000
Hello world!

00:00:02.500 --> 00:00:05.000 position:50%
Multi-line
subtitle
"""

webvtt: WebVTT = parser.parse(content)
for cue in webvtt.cues:
    print(f"{cue.start_time:.1f}-{cue.end_time:.1f}s: {cue.text}")

Advanced Usage

Region Handling

content = """WEBVTT
REGION
id:test
width:40%
lines:3
regionanchor:0%,100%
viewportanchor:10%,90%
"""

webvtt = parser.parse(content)
region = webvtt.regions[0]
print(f"Region {region.id}: {region.width}% width, {region.lines} lines")

Error Handling

try:
    parser.parse("00:00:01.000 --> 00:00:00.500\nInvalid timing")
except ValueError as e:
    print(f"Validation error: {e}")

Cue Settings

cue = webvtt.cues[0]
print(f"Position: {cue.position}%")
print(f"Alignment: {cue.text_alignment.value}")
print(f"Writing direction: {cue.writing_direction}")

Architecture

The WebVTT parser implements the W3C WebVTT specification.

graph TD
    subgraph "Parser Pipeline"
        A[".vtt File/String"] --> B["WebVTTParser"]
        B --> C{"Valid<br>WEBVTT?"}
        C -->|No| D["MalformedVTTError"]

        C -->|Yes| E["Header Processing"]
        E --> F["Style & Region<br>Parsing"]

        F --> G["Cue Processing"]
        G --> H{"Validation"}

        H -->|Invalid| I["ValidationError"]
        H -->|Valid| J["WebVTTFile"]

        J --> K["Output Formats"]
        K --> L1["JSON"]
        K --> L2["SRT"]
        K --> L3["WebVTT"]
    end

    classDef default fill:#2A2A2A,stroke:#666,color:#DDD
    classDef process fill:#4a90e2,stroke:#2171C7,color:white
    classDef error fill:#e74c3c,stroke:#c0392b,color:white
    classDef decision fill:#f39c12,stroke:#d35400,color:white
    classDef output fill:#2ecc71,stroke:#27ae60,color:white

    class A,B default
    class C,H decision
    class D,I error
    class E,F,G process
    class J,K,L1,L2,L3 output

Key Components

  1. Input Processing

    • File or string input
    • WEBVTT validation
    • BOM handling
  2. Content Parsing

    • Header and metadata
    • Styles and regions
    • Cue timing and text
  3. Output Options

    • JSON serialization
    • SRT conversion
    • WebVTT formatting

API Reference

WebVTTParser

WebVTTParser(strict: bool = True)
  • strict: Raise errors for invalid content (default True)

Methods:

  • parse(content: str | TextIO) -> WebVTT

WebVTT Model

class WebVTT:
    cues: List[WebVTTCue]
    regions: List[WebVTTRegion]
    styles: List[str]
    header_comments: List[str]

WebVTTCue

class WebVTTCue:
    start_time: float
    end_time: float
    text: str
    identifier: Optional[str]
    region: Optional[str]
    position: Optional[float]
    size: float
    text_alignment: TextAlignment
    # ... other properties

WebVTTRegion

class WebVTTRegion:
    id: str
    width: float
    lines: int
    region_anchor: Tuple[float, float]
    viewport_anchor: Tuple[float, float]
    scroll: str

Development

git clone https://github.com/yourusername/webvtt-python.git
cd webvtt-python
uv venv
source .venv/bin/activate
uv sync --system

Running Tests

uv run pytest tests/ -v

License

This software is licensed under the MIT License (MIT). See LICENSE.md for details.

Contributing

Contributions welcome! Please open an issue first to discuss proposed changes.

Acknowledgments

  • W3C WebVTT specification team
  • Python datetime module for timestamp parsing inspiration

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

webvtt_python-0.2.0.tar.gz (104.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

webvtt_python-0.2.0-py3-none-any.whl (9.2 kB view details)

Uploaded Python 3

File details

Details for the file webvtt_python-0.2.0.tar.gz.

File metadata

  • Download URL: webvtt_python-0.2.0.tar.gz
  • Upload date:
  • Size: 104.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: uv/0.6.1

File hashes

Hashes for webvtt_python-0.2.0.tar.gz
Algorithm Hash digest
SHA256 a1d48853895059f0b37a660774710cad7daffeecfd649067febf3eae33da3544
MD5 dc4eff9d1187e5b08060b3eba5afecbe
BLAKE2b-256 cc7d6b0a941f4a06e50cc3423832254112b47969439f3085e143025443be225e

See more details on using hashes here.

File details

Details for the file webvtt_python-0.2.0-py3-none-any.whl.

File metadata

File hashes

Hashes for webvtt_python-0.2.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4d4f0beb2654818a0984cd3074c698c3571bef215c593b7e0d208a5a29ceba35
MD5 faf827310767cf145c491f34f3357266
BLAKE2b-256 3207e321c790bcab0fdcf59b9799bb91b4756dabc2a34d628254b357853f777a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page