Python WebVTT API Implementation
WebVTT Python Parser
A Python implementation of the WebVTT (Web Video Text Tracks) format parser with strict validation and error handling.
Features
- Full compliance with W3C WebVTT specification
- Strict and lenient parsing modes
- Comprehensive model validation:
  - Timestamp format validation
  - Cue timing consistency checks
  - Region setting validation
  - Position/size value range checks
- Detailed error reporting with context
- Support for:
  - Header metadata
  - Regions with scroll settings
  - Cue positioning and alignment
  - Multi-line cues
  - Voice spans and basic styling
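As a rough illustration of the timestamp and cue-timing checks listed above, here is a minimal standard-library sketch. It is not the package's actual implementation; `parse_timestamp` and `parse_timing_line` are hypothetical helper names chosen for this example.

```python
import re

# WebVTT timestamps look like [hh:]mm:ss.ttt; hours are optional,
# minutes and seconds must be in the range 00-59.
_TIMESTAMP = re.compile(r"(?:([0-9]{2,}):)?([0-5][0-9]):([0-5][0-9])\.([0-9]{3})")


def parse_timestamp(value: str) -> float:
    """Return the timestamp in seconds, or raise ValueError if it is malformed."""
    match = _TIMESTAMP.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid timestamp: {value!r}")
    hours, minutes, seconds, millis = match.groups()
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_timing_line(line: str) -> tuple[float, float]:
    """Parse 'start --> end [settings]' and check that the cue ends after it starts."""
    start_text, arrow, rest = line.partition("-->")
    end_parts = rest.split()
    if not arrow or not end_parts:
        raise ValueError(f"malformed timing line: {line!r}")
    start = parse_timestamp(start_text.strip())
    end = parse_timestamp(end_parts[0])  # anything after the end time is cue settings
    if end <= start:
        raise ValueError(f"cue ends at {end}s, which is not after its start at {start}s")
    return start, end
```

Raising `ValueError` mirrors the exception caught in this README's error-handling example; whether the package raises it directly or via a subclass is not stated here.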
Installation
pip install webvtt-python
Or using uv for faster installation:
uv add webvtt-python
Quick Start
from webvtt_python import WebVTTParser, WebVTT

# Parse from string
parser = WebVTTParser(strict=True)
content = """WEBVTT

00:00:01.000 --> 00:00:02.000
Hello world!

00:00:02.500 --> 00:00:05.000 position:50%
Multi-line
subtitle
"""
webvtt: WebVTT = parser.parse(content)

# Print each cue's time range (in seconds) and its text
for cue in webvtt.cues:
    print(f"{cue.start_time:.1f}-{cue.end_time:.1f}s: {cue.text}")
Advanced Usage
Region Handling
# Reuses `parser` from the Quick Start
content = """WEBVTT

REGION
id:test
width:40%
lines:3
regionanchor:0%,100%
viewportanchor:10%,90%
"""
webvtt = parser.parse(content)
region = webvtt.regions[0]
print(f"Region {region.id}: {region.width}% width, {region.lines} lines")
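To make the region settings above more concrete, the following sketch shows one way such a block could be turned into validated values using only the standard library. It is an assumption-laden illustration, not the package's parser: `parse_percentage`, `parse_anchor`, and `parse_region_settings` are hypothetical names, and the returned dictionary keys simply reuse the setting names from the file.

```python
def parse_percentage(value: str) -> float:
    """Parse a value such as '40%' and check that it lies within 0-100."""
    if not value.endswith("%"):
        raise ValueError(f"expected a percentage, got {value!r}")
    number = float(value[:-1])  # float() raises ValueError on non-numeric text
    if not 0.0 <= number <= 100.0:
        raise ValueError(f"percentage out of range: {value!r}")
    return number


def parse_anchor(value: str) -> tuple[float, float]:
    """Parse an anchor such as '0%,100%' into an (x, y) pair of percentages."""
    x_text, comma, y_text = value.partition(",")
    if not comma:
        raise ValueError(f"expected 'x%,y%', got {value!r}")
    return parse_percentage(x_text), parse_percentage(y_text)


def parse_region_settings(lines: list[str]) -> dict:
    """Parse the 'name:value' lines that follow a REGION line."""
    settings: dict = {}
    for line in lines:
        name, colon, value = line.strip().partition(":")
        if not colon or not value:
            raise ValueError(f"malformed region setting: {line!r}")
        if name == "id":
            settings["id"] = value
        elif name == "width":
            settings["width"] = parse_percentage(value)
        elif name == "lines":
            settings["lines"] = int(value)
        elif name in ("regionanchor", "viewportanchor"):
            settings[name] = parse_anchor(value)
        elif name == "scroll":
            if value != "up":  # "up" is the only scroll value in the WebVTT spec
                raise ValueError(f"invalid scroll value: {value!r}")
            settings["scroll"] = value
        else:
            raise ValueError(f"unknown region setting: {name!r}")
    return settings
```

A lenient mode could skip unknown or malformed settings instead of raising; this sketch only shows the strict behaviour.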
Error Handling
try:
    # The end time precedes the start time, so strict parsing should reject this cue
    parser.parse("WEBVTT\n\n00:00:01.000 --> 00:00:00.500\nInvalid timing")
except ValueError as e:
    print(f"Validation error: {e}")
Cue Settings
# Assumes `webvtt` was parsed from content with at least one cue, as in the Quick Start
cue = webvtt.cues[0]
print(f"Position: {cue.position}%")
print(f"Alignment: {cue.text_alignment.value}")
print(f"Writing direction: {cue.writing_direction}")
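For readers curious how the settings that trail a timing line (for example `position:50% align:start`) might become the attributes shown above, here is a hedged standard-library sketch. The `TextAlignment` enum below is a stand-in whose members are assumed from the WebVTT specification, and `parse_cue_settings` is a hypothetical helper; neither is taken from the package's source.

```python
from enum import Enum


class TextAlignment(Enum):
    """Stand-in enum; the package's own TextAlignment may differ."""
    START = "start"
    CENTER = "center"
    END = "end"
    LEFT = "left"
    RIGHT = "right"


def parse_percentage(value: str) -> float:
    """Parse a value such as '50%' and check that it lies within 0-100."""
    if not value.endswith("%"):
        raise ValueError(f"expected a percentage, got {value!r}")
    number = float(value[:-1])
    if not 0.0 <= number <= 100.0:
        raise ValueError(f"percentage out of range: {value!r}")
    return number


def parse_cue_settings(text: str) -> dict:
    """Parse space-separated 'name:value' cue settings into validated values."""
    settings: dict = {}
    for token in text.split():
        name, colon, value = token.partition(":")
        if not colon or not value:
            raise ValueError(f"malformed cue setting: {token!r}")
        if name == "position":
            # A position may carry a suffix such as '50%,line-left';
            # this sketch keeps only the percentage part.
            settings["position"] = parse_percentage(value.split(",")[0])
        elif name == "size":
            settings["size"] = parse_percentage(value)
        elif name == "align":
            settings["align"] = TextAlignment(value)  # ValueError if unknown
        elif name == "vertical":
            if value not in ("rl", "lr"):
                raise ValueError(f"invalid writing direction: {value!r}")
            settings["vertical"] = value
        else:
            raise ValueError(f"setting not handled in this sketch: {name!r}")
    return settings
```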
Architecture
The WebVTT parser implements the W3C WebVTT specification.
graph TD
subgraph "Parser Pipeline"
A[".vtt File/String"] --> B["WebVTTParser"]
B --> C{"Valid<br>WEBVTT?"}
C -->|No| D["MalformedVTTError"]
C -->|Yes| E["Header Processing"]
E --> F["Style & Region<br>Parsing"]
F --> G["Cue Processing"]
G --> H{"Validation"}
H -->|Invalid| I["ValidationError"]
H -->|Valid| J["WebVTTFile"]
J --> K["Output Formats"]
K --> L1["JSON"]
K --> L2["SRT"]
K --> L3["WebVTT"]
end
classDef default fill:#2A2A2A,stroke:#666,color:#DDD
classDef process fill:#4a90e2,stroke:#2171C7,color:white
classDef error fill:#e74c3c,stroke:#c0392b,color:white
classDef decision fill:#f39c12,stroke:#d35400,color:white
classDef output fill:#2ecc71,stroke:#27ae60,color:white
class A,B default
class C,H decision
class D,I error
class E,F,G process
class J,K,L1,L2,L3 output
Key Components
- Input Processing
  - File or string input
  - WEBVTT validation
  - BOM handling
- Content Parsing
  - Header and metadata
  - Styles and regions
  - Cue timing and text
- Output Options
  - JSON serialization
  - SRT conversion
  - WebVTT formatting
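The input-processing step (BOM handling followed by the `WEBVTT` signature check) can be sketched as follows. This is an illustrative assumption about how such a check could look, based on the file signature rules in the WebVTT specification; `strip_bom` and `has_webvtt_signature` are hypothetical names, not functions exported by the package.

```python
def strip_bom(text: str) -> str:
    """Drop a leading U+FEFF byte order mark, if present."""
    return text[1:] if text.startswith("\ufeff") else text


def has_webvtt_signature(text: str) -> bool:
    """Check that the first line is 'WEBVTT', optionally followed by a space or tab and a title."""
    # WebVTT lines may end in CRLF, LF, or CR; normalise before splitting.
    normalised = strip_bom(text).replace("\r\n", "\n").replace("\r", "\n")
    first_line = normalised.split("\n", 1)[0]
    if first_line == "WEBVTT":
        return True
    return first_line.startswith(("WEBVTT ", "WEBVTT\t"))
```

In the pipeline above, a failed check would correspond to the "No" branch of the "Valid WEBVTT?" decision.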
API Reference
WebVTTParser
WebVTTParser(strict: bool = True)
strict: Raise errors for invalid content (default True)
Methods:
parse(content: str | TextIO) -> WebVTT
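Accepting either a string or an open text stream, as the `parse` signature indicates, usually comes down to a small normalisation step. The sketch below shows one plausible way to do that; `read_content` is a hypothetical helper and is not claimed to exist in the package.

```python
import io
from typing import TextIO, Union


def read_content(content: Union[str, TextIO]) -> str:
    """Return the full text whether given a string or an open text stream."""
    if isinstance(content, str):
        return content
    return content.read()


# Both call styles yield the same text
from_string = read_content("WEBVTT\n")
from_stream = read_content(io.StringIO("WEBVTT\n"))
```

With a real file, the stream would typically come from `open(path, encoding="utf-8")` used as a context manager.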
WebVTT Model
class WebVTT:
    cues: List[WebVTTCue]
    regions: List[WebVTTRegion]
    styles: List[str]
    header_comments: List[str]
WebVTTCue
class WebVTTCue:
    start_time: float
    end_time: float
    text: str
    identifier: Optional[str]
    region: Optional[str]
    position: Optional[float]
    size: float
    text_alignment: TextAlignment
    # ... other properties
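The cue fields listed above map naturally onto a dataclass, which also makes the JSON and SRT outputs mentioned in this README easy to picture. The sketch below is an assumption about how that could look with the standard library only: `CueSketch` covers just a subset of the fields, and `format_srt_timestamp`, `cue_to_srt`, and `cue_to_json` are hypothetical helpers rather than the package's API.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CueSketch:
    """Reduced stand-in for WebVTTCue with only a few of its fields."""
    start_time: float
    end_time: float
    text: str
    identifier: Optional[str] = None


def format_srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (SRT uses a comma before the milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def cue_to_srt(index: int, cue: CueSketch) -> str:
    """Render one cue as a numbered SRT block."""
    timing = f"{format_srt_timestamp(cue.start_time)} --> {format_srt_timestamp(cue.end_time)}"
    return f"{index}\n{timing}\n{cue.text}\n"


def cue_to_json(cue: CueSketch) -> str:
    """Serialise a cue's fields to a JSON object."""
    return json.dumps(asdict(cue))
```

Note that SRT has no equivalent for WebVTT positioning or regions, so a conversion along these lines would drop those settings.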
WebVTTRegion
class WebVTTRegion:
    id: str
    width: float
    lines: int
    region_anchor: Tuple[float, float]
    viewport_anchor: Tuple[float, float]
    scroll: str
Development
git clone https://github.com/yourusername/webvtt-python.git
cd webvtt-python
uv venv
source .venv/bin/activate
uv sync
Running Tests
uv run pytest tests/ -v
License
MIT License
Contributing
Contributions welcome! Please open an issue first to discuss proposed changes.
Acknowledgments
- W3C WebVTT specification team
- Python datetime module for timestamp parsing inspiration
File details
Details for the file webvtt_python-0.1.0.tar.gz.
File metadata
- Download URL: webvtt_python-0.1.0.tar.gz
- Upload date:
- Size: 103.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.6.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 38870592a80d3dd35b30f0683db4e45bebff57cd3075cd7028347757725ceb8e |
| MD5 | 4d85c1d8d830f6680bba64525f592d48 |
| BLAKE2b-256 | f660b50bee018a0458c92b2c3b404a5b8ebdd16b0b1cef25866b6ed6ab2989b0 |
File details
Details for the file webvtt_python-0.1.0-py3-none-any.whl.
File metadata
- Download URL: webvtt_python-0.1.0-py3-none-any.whl
- Upload date:
- Size: 8.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: uv/0.6.1
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d5a4e68f6bc8afe13f999ae7c3ca39973434132be568106577c96a7dd077dfdd |
| MD5 | b91d84448578a47f085be8468ed2d35b |
| BLAKE2b-256 | f7629dc1eed7ecf76f777b6c4452e0b5a3f9870c82eaa8ad195868a2a5eee65f |