Convert transcript JSONs to/from IETF World Transcription Format (WTF)

vCon WTF

A Python library for converting transcript JSON to and from the IETF World Transcription Format (WTF). It converts between provider-specific transcription formats and the standardized WTF format, and integrates with vCon containers.

Features

  • Multi-Provider Support: Convert transcripts from Whisper, Deepgram, AssemblyAI, Rev.ai, Canary (NVIDIA), and Parakeet (NVIDIA); support for Google Cloud Speech-to-Text, Amazon Transcribe, Azure Speech Services, and Speechmatics is planned
  • WTF Compliance: Full adherence to the IETF World Transcription Format specification
  • vCon Integration: Seamless integration with vcon-lib for vCon container operations
  • Command-Line Interface: Easy-to-use CLI for batch conversion and validation
  • Comprehensive Validation: Robust validation of WTF documents and input formats
  • Quality Metrics: Built-in quality assessment and confidence score normalization

Installation

# Install from PyPI
pip install vcon-wtf

# Or install from source
git clone https://github.com/vcon-dev/wtf-transcript-converter.git
cd wtf-transcript-converter
uv sync

Quick Start

Basic Usage

from wtf_transcript_converter import WTFDocument, validate_wtf_document
from wtf_transcript_converter.providers import WhisperConverter

# Convert Whisper output to WTF
whisper_data = {...}  # Your Whisper JSON data
converter = WhisperConverter()
wtf_doc = converter.convert(whisper_data)

# Validate the WTF document
is_valid, errors = validate_wtf_document(wtf_doc)
if is_valid:
    print("WTF document is valid!")
else:
    print(f"Validation errors: {errors}")
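
To make the conversion step more concrete, here is a minimal standard-library sketch of the kind of mapping a Whisper converter might perform. It is not the library's implementation. The function name `whisper_to_wtf_sketch` is hypothetical, the input follows Whisper's verbose JSON shape (`text`, `language`, `segments` with `avg_logprob`), and the output layout (`transcript`, `segments`, `metadata`) is an assumption about the WTF document structure. Turning `avg_logprob` into a confidence with `exp()` is likewise one possible normalization, not a confirmed one.

```python
import math
from datetime import datetime, timezone


def whisper_to_wtf_sketch(whisper):
    """Map Whisper-style JSON onto an assumed WTF-shaped dict (illustrative only)."""
    segments = []
    for seg in whisper.get("segments", []):
        segments.append({
            "id": seg["id"],
            "start": seg["start"],
            "end": seg["end"],
            "text": seg["text"].strip(),
            # Whisper reports an average log-probability (<= 0); exp() maps it
            # into (0, 1]. This normalization is an assumption of the sketch.
            "confidence": min(1.0, math.exp(seg["avg_logprob"])),
        })

    duration = segments[-1]["end"] if segments else 0.0
    confidence = (
        sum(s["confidence"] for s in segments) / len(segments) if segments else 0.0
    )
    now = datetime.now(timezone.utc).isoformat()

    return {
        "transcript": {
            "text": whisper.get("text", "").strip(),
            "language": whisper.get("language", "und"),  # "und" = undetermined
            "duration": duration,
            "confidence": confidence,
        },
        "segments": segments,
        "metadata": {"created_at": now, "provider": "whisper"},
    }


sample_whisper = {
    "text": " Hello world. How are you?",
    "language": "en",
    "segments": [
        {"id": 0, "start": 0.0, "end": 1.5, "text": " Hello world.", "avg_logprob": -0.1},
        {"id": 1, "start": 1.5, "end": 3.0, "text": " How are you?", "avg_logprob": -0.3},
    ],
}

sketch_doc = whisper_to_wtf_sketch(sample_whisper)
print(sketch_doc["transcript"]["text"])
print(len(sketch_doc["segments"]), "segments")
```

For real conversions, use `WhisperConverter` as shown above; the sketch only illustrates the idea of mapping provider fields onto a common structure.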

Command-Line Usage

# Convert to WTF format
vcon-wtf to-wtf input.json --provider whisper --output output.json

# Convert from WTF format
vcon-wtf from-wtf input.json --provider deepgram --output output.json

# Validate WTF document
vcon-wtf validate input.json

# Batch conversion
vcon-wtf batch --input-dir ./transcripts --output-dir ./wtf --provider auto
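
The `--provider auto` option implies that the provider can be inferred from the input. How the CLI actually does this is not documented here; the sketch below shows one plausible heuristic, based on top-level keys that commonly appear in each provider's JSON output. The function name `guess_provider` and the returned labels (in particular `rev_ai`) are hypothetical.

```python
def guess_provider(data):
    """Guess the transcription provider from top-level JSON keys (heuristic sketch)."""
    results = data.get("results")

    if "monologues" in data:
        # Rev.ai transcripts group speech into "monologues".
        return "rev_ai"
    if isinstance(results, dict) and "channels" in results:
        # Deepgram nests alternatives under results.channels.
        return "deepgram"
    if "utterances" in data or "audio_duration" in data:
        # AssemblyAI responses carry these fields alongside "text" and "words".
        return "assemblyai"
    if "segments" in data and "text" in data:
        # Whisper verbose JSON has a flat "text" plus timed "segments".
        return "whisper"
    return None


print(guess_provider({"text": "hi", "segments": []}))
print(guess_provider({"results": {"channels": []}}))
```

A heuristic like this can misfire on unfamiliar inputs, so passing an explicit `--provider` remains the safer choice when the source is known.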

vCon Integration

from vcon import Vcon
from wtf_transcript_converter import VConWTFAttachment

# Create vCon container with WTF transcription
vcon = Vcon()
wtf_attachment = VConWTFAttachment.create_from_wtf(wtf_doc)
vcon = wtf_attachment.add_to_vcon(vcon)
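
Conceptually, attaching a WTF transcription means adding one entry to the vCon's `attachments` array. The sketch below works on plain dicts rather than `vcon-lib` objects. The helper name `attach_wtf_sketch` is hypothetical, and the attachment fields used here (`type`, `encoding`, `body`) and the `wtf_transcription` type value are assumptions about the attachment layout, not something this README confirms.

```python
import json
import uuid


def attach_wtf_sketch(vcon_dict, wtf_doc):
    """Return a copy of a vCon-like dict with a WTF attachment appended (illustrative)."""
    attachment = {
        "type": "wtf_transcription",  # assumed attachment type label
        "encoding": "json",           # body is carried as inline JSON
        "body": wtf_doc,
    }
    updated = dict(vcon_dict)
    # Build a new list so the caller's vCon dict is left unmodified.
    updated["attachments"] = list(vcon_dict.get("attachments", [])) + [attachment]
    return updated


empty_vcon = {"uuid": str(uuid.uuid4()), "attachments": []}
minimal_wtf = {"transcript": {"text": "Hello"}, "segments": [], "metadata": {}}

vcon_with_wtf = attach_wtf_sketch(empty_vcon, minimal_wtf)
print(json.dumps(vcon_with_wtf["attachments"][0], indent=2))
```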

Supported Providers

  • Whisper: OpenAI's speech recognition system
  • Deepgram: Real-time speech-to-text API
  • AssemblyAI: AI-powered transcription service
  • Rev.ai: Professional transcription service
  • Canary: NVIDIA's advanced speech recognition model (via Hugging Face)
  • Parakeet: NVIDIA's efficient transducer-based model (via Hugging Face)
  • Google Cloud Speech-to-Text: Google's speech recognition service (planned)
  • Amazon Transcribe: AWS speech-to-text service (planned)
  • Azure Speech Services: Microsoft's speech recognition platform (planned)
  • Speechmatics: Real-time and batch speech recognition (planned)

Hugging Face Integration

The library includes support for NVIDIA's Canary and Parakeet models via Hugging Face:

Setup for Hugging Face Models

# Install with Hugging Face dependencies
uv add --group integration transformers torch librosa soundfile

# Set your Hugging Face token (optional, for gated models)
export HF_TOKEN=your_huggingface_token_here

Using Canary and Parakeet

from wtf_transcript_converter.providers import CanaryConverter, ParakeetConverter

# Canary converter
canary_converter = CanaryConverter()
canary_result = canary_converter.transcribe_audio("audio.wav", language="en")
wtf_doc = canary_converter.convert_to_wtf(canary_result)

# Parakeet converter
parakeet_converter = ParakeetConverter()
parakeet_result = parakeet_converter.transcribe_audio("audio.wav", language="en")
wtf_doc = parakeet_converter.convert_to_wtf(parakeet_result)

Command-Line Usage

# Convert with Canary
vcon-wtf to-wtf canary_output.json --provider canary --output result.wtf.json

# Convert with Parakeet
vcon-wtf to-wtf parakeet_output.json --provider parakeet --output result.wtf.json

Cross-Provider Testing

The library includes comprehensive cross-provider testing capabilities:

Consistency Testing

# Test consistency across all providers
vcon-wtf cross-provider consistency input.json --verbose

# Generate detailed consistency report
vcon-wtf cross-provider consistency input.json --output consistency_report.json
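
The contents of the consistency report are not described here. As a rough idea of what a consistency check between two converted documents could measure, the sketch below compares transcript text, duration, and segment count. The function name `consistency_sketch`, the metric names, and the WTF-shaped input (`transcript.text`, `transcript.duration`, `segments`) are all assumptions.

```python
from difflib import SequenceMatcher


def consistency_sketch(doc_a, doc_b):
    """Compare two WTF-shaped dicts on text, duration, and segment count."""
    text_a = doc_a["transcript"]["text"].lower()
    text_b = doc_b["transcript"]["text"].lower()
    return {
        # 1.0 means the texts are identical after lowercasing.
        "text_similarity": SequenceMatcher(None, text_a, text_b).ratio(),
        "duration_delta": abs(
            doc_a["transcript"]["duration"] - doc_b["transcript"]["duration"]
        ),
        "segment_count_delta": abs(len(doc_a["segments"]) - len(doc_b["segments"])),
    }


doc_one = {
    "transcript": {"text": "Hello world", "duration": 3.0},
    "segments": [{"id": 0}, {"id": 1}],
}
doc_two = {
    "transcript": {"text": "Hello word", "duration": 3.2},
    "segments": [{"id": 0}],
}

consistency = consistency_sketch(doc_one, doc_two)
print(consistency)
```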

Performance Benchmarking

# Benchmark performance across providers
vcon-wtf cross-provider performance input.json --iterations 5

# Compare conversion speed and resource usage
vcon-wtf cross-provider performance input.json --output performance_report.json
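
For timing conversions from Python instead of the CLI, a small harness along these lines may be enough. It is a sketch, not the tool's benchmarking code: `benchmark_sketch` and `stand_in_converter` are hypothetical names, and the stand-in function simply takes the place of a real converter's `convert` method.

```python
import statistics
import time


def benchmark_sketch(func, payload, iterations=5):
    """Time repeated calls of func(payload) and summarize the wall-clock results."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        func(payload)
        timings.append(time.perf_counter() - start)
    return {
        "iterations": iterations,
        "mean_s": statistics.mean(timings),
        "min_s": min(timings),
        "max_s": max(timings),
    }


def stand_in_converter(data):
    # Placeholder workload; swap in a real converter call when measuring.
    return {"segments": [dict(seg) for seg in data["segments"]]}


payload = {"segments": [{"id": i, "text": "word"} for i in range(1000)]}
bench = benchmark_sketch(stand_in_converter, payload, iterations=5)
print(f"mean {bench['mean_s']:.6f}s over {bench['iterations']} runs")
```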

Quality Comparison

# Compare quality metrics across providers
vcon-wtf cross-provider quality input.json --verbose

# Generate quality analysis report
vcon-wtf cross-provider quality input.json --output quality_report.json
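
Which quality metrics the report contains is not specified here. The sketch below shows the sort of summary that can be derived from per-segment confidence scores, assuming a WTF-shaped dict whose `segments` each carry a `confidence` in the range 0 to 1. The function name `quality_summary_sketch` and the `low_threshold` default of 0.5 are hypothetical choices.

```python
def quality_summary_sketch(wtf_doc, low_threshold=0.5):
    """Summarize per-segment confidence scores of a WTF-shaped dict."""
    scores = [seg["confidence"] for seg in wtf_doc.get("segments", [])]
    if not scores:
        return {"mean_confidence": None, "min_confidence": None, "low_confidence_segments": 0}
    return {
        "mean_confidence": sum(scores) / len(scores),
        "min_confidence": min(scores),
        # Segments below the threshold are candidates for manual review.
        "low_confidence_segments": sum(1 for s in scores if s < low_threshold),
    }


scored_doc = {
    "segments": [
        {"id": 0, "confidence": 0.9},
        {"id": 1, "confidence": 0.4},
        {"id": 2, "confidence": 0.8},
    ]
}

quality = quality_summary_sketch(scored_doc)
print(quality)
```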

Comprehensive Testing

# Run all cross-provider tests
vcon-wtf cross-provider all input.json --output-dir reports/

# This generates:
# - reports/consistency_report.json
# - reports/performance_report.json
# - reports/quality_report.json

Development

Setup Development Environment

git clone https://github.com/vcon-dev/wtf-transcript-converter.git
cd wtf-transcript-converter
uv sync --dev
uv run pre-commit install

Running Tests

uv run pytest
uv run pytest --cov=src/wtf_transcript_converter --cov-report=html

Code Quality

uv run black src tests
uv run isort src tests
uv run flake8 src tests
uv run mypy src

Contributing

We welcome contributions! Please see our Contributing Guide for details.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • vCon Working Group for the WTF specification
  • Transcription provider communities for format insights
  • IETF for standardization framework
