Python WebVTT API Implementation

WebVTT Python Parser

A Python parser for the WebVTT (Web Video Text Tracks) format, with strict validation and error handling.

Features

  • Full compliance with W3C WebVTT specification
  • Strict and lenient parsing modes
  • Comprehensive model validation:
    • Timestamp format validation
    • Cue timing consistency checks
    • Region setting validation
    • Position/size value range checks
  • Detailed error reporting with context
  • Support for:
    • Header metadata
    • Regions with scroll settings
    • Cue positioning and alignment
    • Multi-line cues
    • Voice spans and basic styling
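To illustrate what timestamp format validation involves, here is a minimal standalone sketch. It does not use or reproduce this package's code: `parse_timestamp` is a hypothetical helper, and the regular expression is an assumption based on the WebVTT timestamp syntax (`hh:mm:ss.ttt`, with the hours part optional).

```python
import re

# Hypothetical pattern: optional hours (two or more digits), then mm:ss.ttt.
# Minutes and seconds are limited to 00-59; milliseconds need exactly 3 digits.
_TIMESTAMP_RE = re.compile(r"(?:(\d{2,}):)?([0-5]\d):([0-5]\d)\.(\d{3})")


def parse_timestamp(value: str) -> float:
    """Convert a WebVTT timestamp to seconds; raise ValueError if malformed."""
    match = _TIMESTAMP_RE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"Invalid WebVTT timestamp: {value!r}")
    hours, minutes, seconds, millis = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


print(parse_timestamp("00:00:02.500"))  # 2.5
print(parse_timestamp("01:30.000"))     # 90.0
```

A value such as `00:61:00.000` fails the pattern and raises `ValueError`, the same exception type used in the Error Handling example of this README.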

Installation

pip install webvtt-python

Or add it to a uv-managed project:

uv add webvtt-python

Quick Start

from webvtt_python import WebVTTParser, WebVTT

# Parse from string
parser = WebVTTParser(strict=True)
content = """WEBVTT

00:00:01.000 --> 00:00:02.000
Hello world!

00:00:02.500 --> 00:00:05.000 position:50%
Multi-line
subtitle
"""

webvtt: WebVTT = parser.parse(content)
for cue in webvtt.cues:
    print(f"{cue.start_time:.1f}-{cue.end_time:.1f}s: {cue.text}")

Advanced Usage

Region Handling

content = """WEBVTT

REGION
id:test
width:40%
lines:3
regionanchor:0%,100%
viewportanchor:10%,90%
"""

webvtt = parser.parse(content)
region = webvtt.regions[0]
print(f"Region {region.id}: {region.width}% width, {region.lines} lines")
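For readers curious how a region block maps onto fields like the ones printed above, the following is a rough standalone sketch. It is not the package's implementation: `RegionSketch` and `parse_region_block` are hypothetical names, the defaults follow the WebVTT specification's region defaults, and ignoring unknown settings is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RegionSketch:
    """Hypothetical stand-in for a parsed region, using spec default values."""
    id: str = ""
    width: float = 100.0
    lines: int = 3
    region_anchor: tuple[float, float] = (0.0, 100.0)
    viewport_anchor: tuple[float, float] = (0.0, 100.0)
    scroll: str = ""


def _percent(value: str) -> float:
    """Parse 'NN%' and check that it lies in the range 0-100."""
    if not value.endswith("%"):
        raise ValueError(f"Expected a percentage, got {value!r}")
    number = float(value[:-1])
    if not 0.0 <= number <= 100.0:
        raise ValueError(f"Percentage out of range: {value!r}")
    return number


def _anchor(value: str) -> tuple[float, float]:
    """Parse an 'X%,Y%' anchor pair."""
    x, comma, y = value.partition(",")
    if not comma:
        raise ValueError(f"Expected 'X%,Y%', got {value!r}")
    return (_percent(x), _percent(y))


def parse_region_block(block: str) -> RegionSketch:
    """Parse the settings that follow a REGION line (assumed name:value tokens)."""
    region = RegionSketch()
    for token in block.split():
        name, colon, value = token.partition(":")
        if not colon or not value:
            raise ValueError(f"Malformed region setting: {token!r}")
        if name == "id":
            region.id = value
        elif name == "width":
            region.width = _percent(value)
        elif name == "lines":
            region.lines = int(value)
        elif name == "regionanchor":
            region.region_anchor = _anchor(value)
        elif name == "viewportanchor":
            region.viewport_anchor = _anchor(value)
        elif name == "scroll":
            if value != "up":
                raise ValueError(f"Unsupported scroll value: {value!r}")
            region.scroll = value
        # Unknown setting names are skipped in this sketch.
    return region


region = parse_region_block(
    "id:test\nwidth:40%\nlines:3\nregionanchor:0%,100%\nviewportanchor:10%,90%"
)
print(f"Region {region.id}: {region.width}% width, {region.lines} lines")
```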

Error Handling

try:
    # The end time precedes the start time, so the cue timing is invalid
    parser.parse("WEBVTT\n\n00:00:01.000 --> 00:00:00.500\nInvalid timing")
except ValueError as e:
    print(f"Validation error: {e}")

Cue Settings

# Uses the `webvtt` object from Quick Start; its second cue sets position:50%
cue = webvtt.cues[1]
print(f"Position: {cue.position}%")
print(f"Alignment: {cue.text_alignment.value}")
print(f"Writing direction: {cue.writing_direction}")
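The settings read above come from the text that follows the end timestamp on a cue timing line. As a hedged illustration only (the helper names `split_timing_line` and `position_percent` are hypothetical and not part of this package), such a line could be split and range-checked like this:

```python
from typing import Optional


def split_timing_line(line: str) -> tuple[str, str, dict[str, str]]:
    """Split 'start --> end [settings]' into start, end, and a settings dict."""
    start, arrow, remainder = line.partition("-->")
    parts = remainder.split()
    if not arrow or not start.strip() or not parts:
        raise ValueError(f"Not a cue timing line: {line!r}")
    end, setting_tokens = parts[0], parts[1:]
    settings: dict[str, str] = {}
    for token in setting_tokens:
        name, colon, value = token.partition(":")
        if not (name and colon and value):
            raise ValueError(f"Malformed cue setting: {token!r}")
        settings[name] = value
    return start.strip(), end, settings


def position_percent(settings: dict[str, str]) -> Optional[float]:
    """Return the position as a number in 0-100, or None if it is not set."""
    raw = settings.get("position")
    if raw is None:
        return None
    number_text = raw.split(",")[0]  # drop an optional alignment suffix
    if not number_text.endswith("%"):
        raise ValueError(f"Position must be a percentage: {raw!r}")
    value = float(number_text[:-1])
    if not 0.0 <= value <= 100.0:
        raise ValueError(f"Position out of range: {raw!r}")
    return value


start, end, settings = split_timing_line("00:00:02.500 --> 00:00:05.000 position:50%")
print(start, end, position_percent(settings))
```

Returning `None` for an unset position mirrors the `Optional[float]` type of `position` in the API reference; how the package itself represents defaults is not shown here.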

Architecture

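The parser follows the W3C WebVTT specification and processes input in stages: it checks the WEBVTT signature, reads the header, parses style and region blocks, then parses and validates cues before producing the result.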

graph TD
    subgraph "Parser Pipeline"
        A[".vtt File/String"] --> B["WebVTTParser"]
        B --> C{"Valid<br>WEBVTT?"}
        C -->|No| D["MalformedVTTError"]

        C -->|Yes| E["Header Processing"]
        E --> F["Style & Region<br>Parsing"]

        F --> G["Cue Processing"]
        G --> H{"Validation"}

        H -->|Invalid| I["ValidationError"]
        H -->|Valid| J["WebVTT"]

        J --> K["Output Formats"]
        K --> L1["JSON"]
        K --> L2["SRT"]
        K --> L3["WebVTT"]
    end

    classDef default fill:#2A2A2A,stroke:#666,color:#DDD
    classDef process fill:#4a90e2,stroke:#2171C7,color:white
    classDef error fill:#e74c3c,stroke:#c0392b,color:white
    classDef decision fill:#f39c12,stroke:#d35400,color:white
    classDef output fill:#2ecc71,stroke:#27ae60,color:white

    class A,B default
    class C,H decision
    class D,I error
    class E,F,G process
    class J,K,L1,L2,L3 output
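The first decision in the diagram ("Valid WEBVTT?") can be pictured with a small standalone sketch. This is an assumption about how such a check might look, not the package's code: `check_signature` is a hypothetical helper, and it raises a plain `ValueError` as a stand-in for the `MalformedVTTError` named in the diagram.

```python
def check_signature(content: str) -> str:
    """Strip an optional BOM, verify the WEBVTT signature, return the rest."""
    # A leading byte order mark is allowed before the signature.
    if content.startswith("\ufeff"):
        content = content[1:]
    # Normalize CRLF and lone CR line endings to LF.
    content = content.replace("\r\n", "\n").replace("\r", "\n")
    first_line, _, rest = content.partition("\n")
    # The signature may be followed by a space or tab and arbitrary header text.
    if first_line != "WEBVTT" and not first_line.startswith(("WEBVTT ", "WEBVTT\t")):
        raise ValueError("Missing WEBVTT signature on the first line")
    return rest


body = check_signature("\ufeffWEBVTT\r\n\r\n00:00:01.000 --> 00:00:02.000\r\nHello world!\r\n")
print(repr(body))
```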

Key Components

  1. Input Processing

    • File or string input
    • WEBVTT validation
    • BOM handling
  2. Content Parsing

    • Header and metadata
    • Styles and regions
    • Cue timing and text
  3. Output Options

    • JSON serialization
    • SRT conversion
    • WebVTT formatting
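As a rough picture of the SRT conversion listed under Output Options, the sketch below renders cues as SubRip text. It is independent of this package: `format_srt_time` and `cues_to_srt` are hypothetical helpers, and cues are represented as plain `(start, end, text)` tuples for the sake of the example.

```python
def format_srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm (comma before millis)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def cues_to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{format_srt_time(start)} --> {format_srt_time(end)}"
        blocks.append(f"{number}\n{timing}\n{text}")
    return "\n\n".join(blocks) + "\n"


print(cues_to_srt([
    (1.0, 2.0, "Hello world!"),
    (2.5, 5.0, "Multi-line\nsubtitle"),
]))
```

SRT has no equivalent of WebVTT cue settings or regions, so a conversion along these lines drops positioning information.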

API Reference

WebVTTParser

WebVTTParser(strict: bool = True)
  • strict: Raise errors for invalid content (default True)

Methods:

  • parse(content: str | TextIO) -> WebVTT
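The difference between the two modes can be sketched as follows. Only the strict behavior (raising on invalid content) is stated above; the lenient behavior shown here, skipping invalid cues and collecting messages, is an assumption for illustration, and `validate_cues` is a hypothetical function rather than part of this package.

```python
def validate_cues(
    cues: list[tuple[float, float, str]], strict: bool = True
) -> tuple[list[tuple[float, float, str]], list[str]]:
    """Check that each cue ends after it starts.

    In strict mode the first problem raises ValueError. In lenient mode
    (assumed behavior) the invalid cue is skipped and the problem recorded.
    """
    valid: list[tuple[float, float, str]] = []
    problems: list[str] = []
    for index, (start, end, text) in enumerate(cues):
        if end <= start:
            message = f"cue {index}: end {end:.3f}s is not after start {start:.3f}s"
            if strict:
                raise ValueError(message)
            problems.append(message)
            continue
        valid.append((start, end, text))
    return valid, problems


cues = [(1.0, 2.0, "Hello world!"), (1.0, 0.5, "Invalid timing")]
valid, problems = validate_cues(cues, strict=False)
print(valid)
print(problems)
```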

WebVTT Model

class WebVTT:
    cues: List[WebVTTCue]
    regions: List[WebVTTRegion]
    styles: List[str]
    header_comments: List[str]

WebVTTCue

class WebVTTCue:
    start_time: float
    end_time: float
    text: str
    identifier: Optional[str]
    region: Optional[str]
    position: Optional[float]
    size: float
    text_alignment: TextAlignment
    # ... other properties
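To show how a model with these fields could feed the JSON output mentioned under Output Options, here is a small sketch using a standard-library dataclass. `CueSketch` is a hypothetical stand-in that copies a subset of the fields above; it is not the package's `WebVTTCue`, the defaults are assumptions, and the `duration` property is added only for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CueSketch:
    """Hypothetical stand-in with a subset of the fields listed above."""
    start_time: float
    end_time: float
    text: str
    identifier: Optional[str] = None
    region: Optional[str] = None
    position: Optional[float] = None
    size: float = 100.0  # assumed default: full width

    @property
    def duration(self) -> float:
        """Length of the cue in seconds (illustrative extra, not in the API)."""
        return self.end_time - self.start_time


cue = CueSketch(2.5, 5.0, "Multi-line\nsubtitle", position=50.0)
print(cue.duration)
print(json.dumps(asdict(cue), indent=2))
```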

WebVTTRegion

class WebVTTRegion:
    id: str
    width: float
    lines: int
    region_anchor: Tuple[float, float]
    viewport_anchor: Tuple[float, float]
    scroll: str

Development

git clone https://github.com/yourusername/webvtt-python.git
cd webvtt-python
uv venv
source .venv/bin/activate
uv sync

Running Tests

uv run pytest tests/ -v

License

MIT License

Contributing

Contributions welcome! Please open an issue first to discuss proposed changes.

Acknowledgments

  • W3C WebVTT specification team
  • Python datetime module for timestamp parsing inspiration
