A Python wrapper for the Standard Transcription JSON (STJ) format.
STJLib
A Python library for the Standard Transcription JSON (STJ) format.
Overview
STJLib provides data classes and utilities for working with STJ files, which are used to represent transcribed audio and video data in a structured, machine-readable JSON format.
For more information about the STJ format, please refer to the STJ Specification.
Documentation
Full documentation is available at stjlib.readthedocs.io. This includes:
- Detailed API reference
- Usage examples
- Advanced usage guides
- Contributing guidelines
Features
- Full support for STJ format version 0.6.0
- Comprehensive validation system with severity levels (ERROR, WARNING, INFO)
- Time value precision handling with IEEE 754 round-to-nearest-even
- Strict language code validation (ISO 639-1/639-3)
- Support for zero-duration segments
- Word timing modes (complete, partial, none)
- Enhanced speaker and style validation
- Extensions field validation with reserved namespace protection
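To make the rounding rule in the feature list concrete, here is a minimal sketch of round-to-nearest with ties going to the even digit, applied to a time value in seconds. It is not STJLib's implementation: the helper name `round_time` and the default of three decimal places are assumptions made for illustration, and the sketch uses only the standard `decimal` module.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_time(value: float, places: int = 3) -> float:
    """Round a time value in seconds to `places` decimals, ties to even.

    Hypothetical helper for illustration; not part of stjlib.
    """
    # Build the rounding step, e.g. Decimal("0.001") when places=3.
    quantum = Decimal(1).scaleb(-places)
    # Convert through str() so the tie check sees the decimal digits as
    # written (1.2345) rather than the float's binary expansion.
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_EVEN)
    return float(rounded)


print(round_time(1.2345))  # tie: rounds down to the even digit 4
print(round_time(1.2355))  # tie: rounds up to the even digit 6
```

Ties-to-even avoids the systematic upward drift that "round half up" introduces when many timestamps are rounded.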
Quick Start
Installation
pip install stjlib
Basic Usage
from stjlib import StandardTranscriptionJSON
# Load and validate an existing STJ file
stj = StandardTranscriptionJSON.from_file('path/to/file.stjson', validate=True)
# Or create a new STJ document
stj = StandardTranscriptionJSON(version="0.6.0")
# Add transcriber information
stj.metadata.transcriber = {
    "name": "TestTranscriber",
    "version": "1.0"
}
# Add a simple segment
segment = {
    "start": 0.0,
    "end": 2.0,
    "text": "Hello world"
}
stj.transcript.segments.append(segment)
# Save to file
stj.save('output.stjson')
# Access metadata and transcript data
print(stj.metadata)
print(stj.transcript)
For more examples and detailed usage instructions, please refer to our documentation.
File Format Support
STJLib supports the Standard Transcription JSON (STJ) format with the following file extensions:
- Primary (recommended): .stjson
- Alternative: .stj
- Alternative: .stj.json (for systems that support double extensions)
MIME Type: application/vnd.stj+json
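The extensions and MIME type above can be wired into your own tooling with the standard library alone. The sketch below is an illustration, not part of STJLib: `is_stj_path`, `STJ_SUFFIXES`, and `STJ_MIME_TYPE` are hypothetical names. It registers the MIME type only for the two single-extension forms, because `mimetypes` looks at the last suffix and would treat `.stj.json` as plain `.json`.

```python
import mimetypes
from pathlib import PurePath

# Extensions and MIME type as listed in this section.
STJ_SUFFIXES = (".stjson", ".stj", ".stj.json")
STJ_MIME_TYPE = "application/vnd.stj+json"


def is_stj_path(path: str) -> bool:
    """Return True if the file name ends with a recognized STJ extension.

    Hypothetical helper for illustration; not part of stjlib.
    """
    return PurePath(path).name.lower().endswith(STJ_SUFFIXES)


# Teach the standard mimetypes registry about the single-extension forms.
for ext in (".stjson", ".stj"):
    mimetypes.add_type(STJ_MIME_TYPE, ext)

print(is_stj_path("interview.stjson"))
print(is_stj_path("notes.json"))
print(mimetypes.guess_type("interview.stjson")[0])
```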
Validation Features
- Severity levels: ERROR, WARNING, INFO
- Detailed location information in error messages
- Time value precision validation
- Language code validation (ISO 639-1/639-3)
- Segment ordering and overlap validation
- Speaker and style validation
- URI format validation
- Extensions validation with namespace protection
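As a rough picture of how severity levels, location information, and segment ordering checks fit together, here is a self-contained sketch. It does not use or reproduce STJLib's validator: the `Severity`, `Issue`, and `check_segments` names are hypothetical, and the choice to report overlaps as ERROR and zero-duration segments as INFO is an assumption, not a statement of what the STJ specification requires.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    ERROR = "error"
    WARNING = "warning"
    INFO = "info"


@dataclass
class Issue:
    severity: Severity
    message: str
    location: str  # path-like pointer to the offending element


def check_segments(segments: list[dict]) -> list[Issue]:
    """Check start/end consistency, ordering, and overlap of segments.

    Hypothetical checker for illustration; not stjlib's validation API.
    """
    issues: list[Issue] = []
    previous_end = None
    for index, segment in enumerate(segments):
        location = f"transcript.segments[{index}]"
        start, end = segment["start"], segment["end"]
        if start > end:
            issues.append(Issue(Severity.ERROR, "start is after end", location))
        elif start == end:
            # Zero-duration segments are allowed, so only note them.
            issues.append(Issue(Severity.INFO, "zero-duration segment", location))
        if previous_end is not None and start < previous_end:
            issues.append(
                Issue(Severity.ERROR, "segment starts before the previous one ends", location)
            )
        # Track the latest end seen so far for the next overlap check.
        previous_end = end if previous_end is None else max(previous_end, end)
    return issues


segments = [
    {"start": 0.0, "end": 2.0, "text": "Hello world"},
    {"start": 1.5, "end": 3.0, "text": "overlaps the first segment"},
    {"start": 3.0, "end": 3.0, "text": "zero duration"},
]
for issue in check_segments(segments):
    print(f"{issue.severity.name} {issue.location}: {issue.message}")
```

Returning a list of issues rather than raising on the first problem lets a caller decide which severities should block further processing.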
Development
Setting Up Development Environment
# Clone the repository
git clone https://github.com/yaniv-golan/stjlib.git
cd stjlib
# Install development dependencies
pip install -e .
pip install -r requirements-dev.txt
Running Tests
pytest
Building Documentation Locally
cd docs
make html
The documentation will be available in docs/build/html.
Contributing
We welcome contributions to stjlib! Please see our Contributing Guide for more details on how to get started.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Contact
- For bugs and feature requests, please open an issue
- For other questions, start a GitHub Discussion