
A Python wrapper for the Standard Transcription JSON (STJ) format.


STJLib


A Python library for the Standard Transcription JSON (STJ) format.

Overview

STJLib provides data classes and utilities for working with STJ files, which are used to represent transcribed audio and video data in a structured, machine-readable JSON format.

For more information about the STJ format, please refer to the STJ Specification.

Documentation

Full documentation is available at stjlib.readthedocs.io. This includes:

  • Detailed API reference
  • Usage examples
  • Advanced usage guides
  • Contributing guidelines

Features

  • Full support for STJ format version 0.6.0
  • Comprehensive validation system with severity levels (ERROR, WARNING, INFO)
  • Time value precision handling with IEEE 754 round-to-nearest-even
  • Strict language code validation (ISO 639-1/639-3)
  • Support for zero-duration segments
  • Word timing modes (complete, partial, none)
  • Enhanced speaker and style validation
  • Extensions field validation with reserved namespace protection
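The time-precision feature can be illustrated with the standard library alone. The sketch below is not STJLib's implementation: the helper name `round_time` and the three-decimal (millisecond) precision are assumptions made for illustration, so check the STJ Specification for the actual limit. It also applies the half-even tie rule in decimal arithmetic via the `decimal` module, which is a stand-in for the IEEE 754 binary rounding named above.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Assumption: time values are kept to three decimal places (milliseconds).
TIME_QUANTUM = Decimal("0.001")


def round_time(value: float) -> float:
    """Round a time value in seconds to TIME_QUANTUM, sending ties to the even digit."""
    # Going through str() rounds the decimal text a user would see,
    # rather than the binary approximation stored in the float.
    as_decimal = Decimal(str(value))
    return float(as_decimal.quantize(TIME_QUANTUM, rounding=ROUND_HALF_EVEN))


print(round_time(1.2345))  # tie: the last kept digit (4) is already even
print(round_time(1.2355))  # tie: the last kept digit (5) is odd, so it moves up
```

Half-even rounding avoids the small upward bias that always rounding ties up would add across many timestamps.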

Quick Start

Installation

pip install stjlib

Basic Usage

from stjlib import StandardTranscriptionJSON

# Load and validate an existing STJ file
stj = StandardTranscriptionJSON.from_file('path/to/file.stjson', validate=True)

# Or create a new STJ document
stj = StandardTranscriptionJSON(version="0.6.0")

# Add transcriber information
stj.metadata.transcriber = {
    "name": "TestTranscriber",
    "version": "1.0"
}

# Add a simple segment
segment = {
    "start": 0.0,
    "end": 2.0,
    "text": "Hello world"
}
stj.transcript.segments.append(segment)

# Save to file
stj.save('output.stjson')

# Access metadata and transcript data
print(stj.metadata)
print(stj.transcript)

For more examples and detailed usage instructions, please refer to our documentation.

File Format Support

STJLib supports the Standard Transcription JSON (STJ) format with the following file extensions:

  • Primary (Recommended): .stjson
  • Alternative: .stj
  • Alternative: .stj.json (for systems that support double extensions)

MIME Type: application/vnd.stj+json
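When scanning a directory or handling uploads, it can help to recognize these extensions before trying to parse a file. The following is a minimal sketch using only the standard library; `has_stj_extension` and the two constants are hypothetical names, not part of STJLib, and the case-insensitive match is an assumption.

```python
from pathlib import Path

# Extensions and MIME type as listed in this README.
STJ_EXTENSIONS = (".stjson", ".stj", ".stj.json")
STJ_MIME_TYPE = "application/vnd.stj+json"


def has_stj_extension(path: str) -> bool:
    """Return True if the file name ends with a recognized STJ extension."""
    # Assumption: extensions are matched case-insensitively.
    return Path(path).name.lower().endswith(STJ_EXTENSIONS)


for name in ("interview.stjson", "interview.stj.json", "interview.json"):
    print(name, has_stj_extension(name))
```

Matching on the full file name rather than `Path.suffix` is what lets the double extension `.stj.json` be told apart from a plain `.json` file.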

Validation Features

  • Severity levels: ERROR, WARNING, INFO
  • Detailed location information in error messages
  • Time value precision validation
  • Language code validation (ISO 639-1/639-3)
  • Segment ordering and overlap validation
  • Speaker and style validation
  • URI format validation
  • Extensions validation with namespace protection
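To make the idea of severity-tagged issues with location information concrete, here is a small self-contained sketch of a segment ordering and overlap check. It does not use or reproduce STJLib's validator: `Severity`, `Issue`, and `check_segments` are hypothetical names, and the choice of which condition maps to which severity is an assumption for illustration only.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    ERROR = "error"
    WARNING = "warning"
    INFO = "info"


@dataclass
class Issue:
    severity: Severity
    location: str  # path-like pointer to the offending item
    message: str


def check_segments(segments: list[dict]) -> list[Issue]:
    """Check start/end consistency, ordering, and overlap of segments."""
    issues: list[Issue] = []
    previous = None  # (start, end) of the preceding segment
    for index, segment in enumerate(segments):
        location = f"transcript.segments[{index}]"
        start, end = segment["start"], segment["end"]
        if start > end:
            issues.append(Issue(Severity.ERROR, location, "start is after end"))
        elif start == end:
            # Zero-duration segments are allowed; report them for information only.
            issues.append(Issue(Severity.INFO, location, "zero-duration segment"))
        if previous is not None:
            if start < previous[0]:
                issues.append(Issue(Severity.ERROR, location, "not ordered by start time"))
            elif start < previous[1]:
                issues.append(Issue(Severity.ERROR, location, "overlaps the previous segment"))
        previous = (start, end)
    return issues


sample = [
    {"start": 0.0, "end": 2.0, "text": "Hello world"},
    {"start": 1.5, "end": 3.0, "text": "Overlapping"},
    {"start": 3.0, "end": 3.0, "text": ""},
]
for issue in check_segments(sample):
    print(issue.severity.name, issue.location, issue.message)
```

Collecting issues in a list instead of raising on the first problem lets a caller decide which severity levels should block further processing.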

Development

Setting Up Development Environment

# Clone the repository
git clone https://github.com/yaniv-golan/stjlib.git
cd stjlib

# Install development dependencies
pip install -e .
pip install -r requirements-dev.txt

Running Tests

pytest

Building Documentation Locally

cd docs
make html

The documentation will be available in docs/build/html.

Contributing

We welcome contributions to stjlib! Please see our Contributing Guide for more details on how to get started.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

