Python WebVTT API Implementation

WebVTT Python Parser

A Python parser for the WebVTT (Web Video Text Tracks) format, with strict validation and error handling.

Features

  • Full compliance with W3C WebVTT specification
  • Strict and lenient parsing modes
  • Comprehensive model validation:
    • Timestamp format validation
    • Cue timing consistency checks
    • Region setting validation
    • Position/size value range checks
  • Detailed error reporting with context
  • Support for:
    • Header metadata
    • Regions with scroll settings
    • Cue positioning and alignment
    • Multi-line cues
    • Voice spans and basic styling
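
To make the timestamp validation listed above concrete, here is a minimal, self-contained sketch of how a WebVTT timestamp could be checked and converted to seconds. It does not use this package: `parse_timestamp` and `_TIMESTAMP_RE` are illustrative names, not part of the library's API, and the library's own implementation may differ.

```python
import re

# WebVTT timestamps are `mm:ss.ttt` or `hh:mm:ss.ttt`; the hours part is optional.
_TIMESTAMP_RE = re.compile(r"(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})")


def parse_timestamp(value: str) -> float:
    """Convert a WebVTT timestamp to seconds (hypothetical helper)."""
    match = _TIMESTAMP_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"Invalid WebVTT timestamp: {value!r}")
    hours, minutes, seconds, millis = match.groups()
    return (
        int(hours or 0) * 3600
        + int(minutes) * 60
        + int(seconds)
        + int(millis) / 1000
    )


print(parse_timestamp("00:00:02.500"))
print(parse_timestamp("01:30.000"))
```

Minutes and seconds are restricted to 00-59 by the pattern, so a value such as `00:61:00.000` is rejected with a `ValueError`.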

Installation

pip install webvtt-python

Or using uv for faster installation:

uv add webvtt-python

Quick Start

from webvtt_python import WebVTTParser, WebVTT

# Parse from string
parser = WebVTTParser(strict=True)
content = """WEBVTT

00:00:01.000 --> 00:00:02.000
Hello world!

00:00:02.500 --> 00:00:05.000 position:50%
Multi-line
subtitle
"""

webvtt: WebVTT = parser.parse(content)
for cue in webvtt.cues:
    print(f"{cue.start_time:.1f}-{cue.end_time:.1f}s: {cue.text}")

Advanced Usage

Region Handling

# A blank line must separate the WEBVTT header from the REGION block
content = """WEBVTT

REGION
id:test
width:40%
lines:3
regionanchor:0%,100%
viewportanchor:10%,90%
"""

webvtt = parser.parse(content)
region = webvtt.regions[0]
print(f"Region {region.id}: {region.width}% width, {region.lines} lines")
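
As a rough illustration of what happens to the setting lines in a REGION block, the sketch below splits `name:value` pairs and parses an anchor into a pair of percentages. It is standalone code under assumed names (`parse_region_settings`, `parse_anchor`), not the library's internals.

```python
def parse_region_settings(lines: list[str]) -> dict[str, str]:
    """Split `name:value` region settings into a dict (hypothetical helper)."""
    settings: dict[str, str] = {}
    for line in lines:
        # A single line may carry several space-separated settings.
        for token in line.split():
            name, sep, value = token.partition(":")
            if not sep or not name or not value:
                raise ValueError(f"Malformed region setting: {token!r}")
            settings[name] = value
    return settings


def parse_anchor(value: str) -> tuple[float, float]:
    """Parse an anchor such as `10%,90%` into (x, y) percentages."""
    x, sep, y = value.partition(",")
    if not sep or not x.endswith("%") or not y.endswith("%"):
        raise ValueError(f"Malformed anchor: {value!r}")
    return float(x[:-1]), float(y[:-1])


block = [
    "id:test",
    "width:40%",
    "lines:3",
    "regionanchor:0%,100%",
    "viewportanchor:10%,90%",
]
settings = parse_region_settings(block)
print(settings["id"], settings["width"])
print(parse_anchor(settings["viewportanchor"]))
```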

Error Handling

try:
    # Valid header, but the cue ends before it starts
    parser.parse("WEBVTT\n\n00:00:01.000 --> 00:00:00.500\nInvalid timing")
except ValueError as e:
    print(f"Validation error: {e}")
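
The timing check behind an error like this can be sketched in a few lines. The function below is a standalone illustration of strict versus lenient handling. How the library's lenient mode actually reports problems is not documented here, so the "collect and return" behaviour is an assumption, as is the name `check_cue_timing`.

```python
def check_cue_timing(start: float, end: float, strict: bool = True) -> list[str]:
    """Return the timing problems found; raise ValueError instead when strict."""
    problems = []
    if start < 0:
        problems.append(f"start time {start} is negative")
    # A cue's end time must be greater than its start time.
    if end <= start:
        problems.append(f"end time {end} is not after start time {start}")
    if problems and strict:
        raise ValueError("; ".join(problems))
    return problems


# Lenient: problems are reported but nothing is raised.
print(check_cue_timing(1.0, 0.5, strict=False))

# Strict: the same cue raises.
try:
    check_cue_timing(1.0, 0.5)
except ValueError as exc:
    print(f"Validation error: {exc}")
```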

Cue Settings

# Using the document parsed in Quick Start; its second cue sets position:50%
cue = webvtt.cues[1]
print(f"Position: {cue.position}%")
print(f"Alignment: {cue.text_alignment.value}")
print(f"Writing direction: {cue.writing_direction}")
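
For context, cue settings come from the text that follows the end timestamp on a cue's timing line. The sketch below shows one way to extract them and range-check a percentage. `parse_cue_settings` and `parse_percentage` are hypothetical helpers, and the sketch ignores the optional alignment suffix that a `position` value may carry.

```python
def parse_cue_settings(timing_line: str) -> dict[str, str]:
    """Return the `name:value` settings that follow the timestamps."""
    _, _, rest = timing_line.partition("-->")
    tokens = rest.split()
    settings: dict[str, str] = {}
    # tokens[0] is the end timestamp; everything after it is a setting.
    for token in tokens[1:]:
        name, sep, value = token.partition(":")
        if sep and name and value:
            settings[name] = value
    return settings


def parse_percentage(value: str) -> float:
    """Parse `50%` into 50.0, rejecting values outside 0-100."""
    if not value.endswith("%"):
        raise ValueError(f"Expected a percentage, got {value!r}")
    number = float(value[:-1])
    if not 0 <= number <= 100:
        raise ValueError(f"Percentage out of range: {value!r}")
    return number


settings = parse_cue_settings("00:00:02.500 --> 00:00:05.000 position:50% align:start")
print(settings)
print(parse_percentage(settings["position"]))
```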

Architecture

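The parser follows the W3C WebVTT specification and works in stages: it checks the WEBVTT signature, processes the header, parses style and region blocks, then parses and validates each cue. The Mermaid diagram below shows the pipeline.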

graph TD
    subgraph "Parser Pipeline"
        A[".vtt File/String"] --> B["WebVTTParser"]
        B --> C{"Valid<br>WEBVTT?"}
        C -->|No| D["MalformedVTTError"]

        C -->|Yes| E["Header Processing"]
        E --> F["Style & Region<br>Parsing"]

        F --> G["Cue Processing"]
        G --> H{"Validation"}

        H -->|Invalid| I["ValidationError"]
        H -->|Valid| J["WebVTT model"]

        J --> K["Output Formats"]
        K --> L1["JSON"]
        K --> L2["SRT"]
        K --> L3["WebVTT"]
    end

    classDef default fill:#2A2A2A,stroke:#666,color:#DDD
    classDef process fill:#4a90e2,stroke:#2171C7,color:white
    classDef error fill:#e74c3c,stroke:#c0392b,color:white
    classDef decision fill:#f39c12,stroke:#d35400,color:white
    classDef output fill:#2ecc71,stroke:#27ae60,color:white

    class A,B default
    class C,H decision
    class D,I error
    class E,F,G process
    class J,K,L1,L2,L3 output

Key Components

  1. Input Processing
    • File or string input
    • WEBVTT validation
    • BOM handling

  2. Content Parsing
    • Header and metadata
    • Styles and regions
    • Cue timing and text

  3. Output Options
    • JSON serialization
    • SRT conversion
    • WebVTT formatting
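
The input-processing step above (BOM handling and WEBVTT validation) can be sketched as follows. This is a standalone illustration with assumed helper names (`strip_bom`, `has_webvtt_signature`), not the library's code. It relies on the rule that a WebVTT file starts with an optional byte order mark, then the string `WEBVTT`, optionally followed by a space or tab and free text.

```python
def strip_bom(text: str) -> str:
    """Drop a leading U+FEFF byte order mark if present."""
    return text[1:] if text.startswith("\ufeff") else text


def has_webvtt_signature(text: str) -> bool:
    """Check that the first line is a valid WEBVTT signature line."""
    lines = strip_bom(text).splitlines()
    if not lines:
        return False
    first = lines[0]
    # "WEBVTT" alone, or followed by a space/tab and arbitrary header text.
    return first == "WEBVTT" or first.startswith(("WEBVTT ", "WEBVTT\t"))


print(has_webvtt_signature("\ufeffWEBVTT\n\n00:00:01.000 --> 00:00:02.000\nHi\n"))
print(has_webvtt_signature("WEBVTT - English captions\n"))
print(has_webvtt_signature("00:00:01.000 --> 00:00:02.000\nHi\n"))
```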

API Reference

WebVTTParser

WebVTTParser(strict: bool = True)
  • strict: Raise errors for invalid content (default True)

Methods:

  • parse(content: str | TextIO) -> WebVTT
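
Accepting either a string or an open text stream, as the `parse` signature above indicates, usually comes down to a small normalization step. The helper below is a sketch of that idea under an assumed name (`read_content`). It treats a `str` as the document content itself, as in the Quick Start, not as a file path.

```python
import io
from typing import TextIO, Union


def read_content(content: Union[str, TextIO]) -> str:
    """Return the full text whether given a string or an open text stream."""
    if isinstance(content, str):
        return content
    return content.read()


print(read_content("WEBVTT\n").splitlines()[0])
print(read_content(io.StringIO("WEBVTT\n")).splitlines()[0])
```

Given the documented signature, a file object opened in text mode (for example with `encoding="utf-8"`) should be usable with `parser.parse` in the same way as a string.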

WebVTT Model

class WebVTT:
    cues: List[WebVTTCue]
    regions: List[WebVTTRegion]
    styles: List[str]
    header_comments: List[str]

WebVTTCue

class WebVTTCue:
    start_time: float
    end_time: float
    text: str
    identifier: Optional[str]
    region: Optional[str]
    position: Optional[float]
    size: float
    text_alignment: TextAlignment
    # ... other properties
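
Because cue times are plain floats in seconds, converting cues to SRT is mostly a matter of formatting. The sketch below uses a stand-in dataclass (`CueSketch`) with only the three fields it needs. It is not the library's converter, whose API is not shown in this document, and it leaves any WebVTT markup in the cue text untouched.

```python
from dataclasses import dataclass


@dataclass
class CueSketch:
    """Stand-in carrying only the fields the conversion needs."""
    start_time: float
    end_time: float
    text: str


def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, `hh:mm:ss,mmm`."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def cues_to_srt(cues: list[CueSketch]) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(cue.start_time)} --> {format_srt_time(cue.end_time)}\n"
            f"{cue.text}\n"
        )
    return "\n".join(blocks)


print(cues_to_srt([
    CueSketch(1.0, 2.0, "Hello world!"),
    CueSketch(2.5, 5.0, "Multi-line\nsubtitle"),
]))
```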

WebVTTRegion

class WebVTTRegion:
    id: str
    width: float
    lines: int
    region_anchor: Tuple[float, float]
    viewport_anchor: Tuple[float, float]
    scroll: str
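
The range checks mentioned under Features can be pictured on a model like this one. The sketch mirrors the fields above in a standalone dataclass (`RegionSketch`, a hypothetical name) and validates them on construction. The default values follow the W3C specification's region defaults; whether the library applies the same defaults, and how it represents `scroll`, are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RegionSketch:
    id: str
    width: float = 100.0
    lines: int = 3
    region_anchor: Tuple[float, float] = (0.0, 100.0)
    viewport_anchor: Tuple[float, float] = (0.0, 100.0)
    scroll: str = ""  # "" means no scrolling; "up" is the only other value

    def __post_init__(self) -> None:
        # Width and both anchors are percentages and must lie in 0-100.
        percentages = (self.width, *self.region_anchor, *self.viewport_anchor)
        if any(not 0 <= value <= 100 for value in percentages):
            raise ValueError(f"Region {self.id!r}: percentage outside 0-100")
        if self.lines < 0:
            raise ValueError(f"Region {self.id!r}: lines must be non-negative")
        if self.scroll not in ("", "up"):
            raise ValueError(f"Region {self.id!r}: invalid scroll {self.scroll!r}")


region = RegionSketch(id="test", width=40.0, viewport_anchor=(10.0, 90.0))
print(f"Region {region.id}: {region.width}% width, {region.lines} lines")
```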

Development

git clone https://github.com/yourusername/webvtt-python.git
cd webvtt-python
uv venv
source .venv/bin/activate
uv sync

Running Tests

uv run pytest tests/ -v

License

MIT License

Contributing

Contributions welcome! Please open an issue first to discuss proposed changes.

Acknowledgments

  • W3C WebVTT specification team
  • Python datetime module for timestamp parsing inspiration
